This guide shows you how to work with text data using the grep, sed, and awk commands.
grep
grep is a family of utilities that includes grep, egrep, and fgrep (the latter two are equivalent to grep -E and grep -F). It is used to search for text inside files on Linux.
Step 1: Type the following in a terminal:
$ grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash
As you can see, this command prints every line that contains "root"; here that is only the root user's entry.
Step 2: You can also filter the output of other commands by piping it to grep:
$ ps aux | grep bash
user 18783 0.0 0.1 22704 5208 pts/0 Ss 20:03 0:00 -bash
user 20193 0.0 0.0 14432 956 pts/0 S+ 20:17 0:00 grep --color=auto bash
Step 3: To exclude some lines, such as the grep process itself in the process list, use the -v option:
$ ps aux | grep bash | grep -v grep
user 18783 0.0 0.1 22704 5208 pts/0 Ss 20:03 0:00 -bash
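A common alternative to the extra grep -v grep stage is to wrap one character of the pattern in brackets. The sketch below illustrates the idea on simulated ps aux output (the two lines are adapted from the listing above, not captured live), so it behaves the same on any machine; the variable names are illustrative.

```shell
#!/bin/sh
# Simulated `ps aux` output (hypothetical, adapted from the listing above).
ps_output='user 18783 0.0 0.1 22704 5208 pts/0 Ss 20:03 0:00 -bash
user 20193 0.0 0.0 14432 956 pts/0 S+ 20:17 0:00 grep [b]ash'

# The regex [b]ash matches the text "bash". The grep command line itself
# contains the literal text "[b]ash", which that regex does not match,
# so grep no longer lists itself.
matches=$(printf '%s\n' "$ps_output" | grep '[b]ash')

echo "$matches"
```

On a live system the equivalent would be ps aux | grep '[b]ash'.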
Step 4: You can also search files recursively with grep:
$ grep -r "swap" /etc
/etc/fstab:# swap was on /dev/sdb during installation
/etc/fstab:/dev/vdb swap swap defaults 0 0
Step 5: Use grep to match whole words only:
$ grep -w login /etc/passwd
The output is empty even though the file contains the word "nologin", because -w makes grep match "login" only when it stands alone as a whole word.
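The difference between a plain search and a whole-word search can be checked on a small sample file instead of the real /etc/passwd. This is a minimal sketch: the file contents (including the made-up "login" account) and the variable names are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical sample data in /etc/passwd format (not from a real system).
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
login:x:1001:1001::/home/login:/bin/bash
EOF

# Plain substring search: "nologin" contains "login", so the daemon line counts too.
substring_hits=$(grep -c login "$sample")

# Whole-word search: only lines where "login" stands alone as a word count.
word_hits=$(grep -cw login "$sample")

echo "substring matches: $substring_hits"
echo "whole-word matches: $word_hits"

rm -f "$sample"
```

The -c option makes grep print the number of matching lines instead of the lines themselves, which keeps the comparison short.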
sed
sed is a stream editor that is commonly used to find and replace text inside files.
Step 1: Replace text in a file:
$ sed -i 's/some-old-text/replace-with-new-one/g' somefile.txt
Note that if the text you want to replace contains the "/" character, you need to escape it with "\":
$ sed -i 's/http:\/\/google.com/http:\/\/bing.com/g' somefile.txt
Optional
You can use a different delimiter such as "@" instead of "/", which avoids the need to escape slashes, e.g.:
$ sed -i 's@old/text@new/text@g' somefile.txt
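Because sed -i overwrites the file in place, it can help to preview the substitution first or to keep a backup. The sketch below assumes a throwaway file created with mktemp; the file contents and the .bak suffix are illustrative choices, not part of the original commands.

```shell
#!/bin/sh
# Throwaway file with hypothetical contents.
file=$(mktemp)
printf '%s\n' 'see http://google.com/search and old/text here' > "$file"

# Without -i, sed prints the result to stdout and leaves the file untouched,
# which makes it a safe preview. "@" is the delimiter, so "/" needs no escaping.
preview=$(sed 's@old/text@new/text@g' "$file")
echo "$preview"

# -i.bak edits in place and keeps the original as "$file.bak".
# The dot is escaped so that it matches a literal "." rather than any character.
sed -i.bak 's@http://google\.com@http://bing.com@g' "$file"
edited=$(cat "$file")
backup=$(cat "$file.bak")
echo "$edited"

rm -f "$file" "$file.bak"
```

Attaching the suffix directly to -i (as in -i.bak) is the form accepted by both GNU and BSD sed.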
awk
awk is a scripting language for manipulating text data. Here are some usage examples:
Step 1: Print the first column of the /etc/passwd file:
$ awk -F ":" '{print $1}' /etc/passwd
root
daemon
bin
sys
sync
games
man
lp
mail
news
uucp
proxy
www-data
backup
list
irc
gnats
nobody
systemd-network
systemd-resolve
syslog
messagebus
_apt
lxd
uuidd
dnsmasq
landscape
sshd
user
The -F option sets the field separator, in this case ":". $1 refers to the first field, so only the first column is printed. You can also print two columns:
$ awk -F ":" '{print $1 " " $7}' /etc/passwd
root /bin/bash
daemon /usr/sbin/nologin
bin /usr/sbin/nologin
sys /usr/sbin/nologin
sync /bin/sync
games /usr/sbin/nologin
man /usr/sbin/nologin
lp /usr/sbin/nologin
mail /usr/sbin/nologin
news /usr/sbin/nologin
uucp /usr/sbin/nologin
proxy /usr/sbin/nologin
www-data /usr/sbin/nologin
backup /usr/sbin/nologin
list /usr/sbin/nologin
irc /usr/sbin/nologin
gnats /usr/sbin/nologin
nobody /usr/sbin/nologin
systemd-network /usr/sbin/nologin
systemd-resolve /usr/sbin/nologin
syslog /usr/sbin/nologin
messagebus /usr/sbin/nologin
_apt /usr/sbin/nologin
lxd /bin/false
uuidd /usr/sbin/nologin
dnsmasq /usr/sbin/nologin
landscape /usr/sbin/nologin
sshd /usr/sbin/nologin
user /bin/bash
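awk can also filter rows by placing a condition in front of the action block. As a minimal sketch, assuming a small made-up file in /etc/passwd format, the following prints only the accounts whose login shell (field 7) is /bin/bash:

```shell
#!/bin/sh
# Hypothetical sample data in /etc/passwd format.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
user:x:1000:1000:user:/home/user:/bin/bash
EOF

# The condition $7 == "/bin/bash" selects rows; the action prints two fields.
bash_users=$(awk -F ":" '$7 == "/bin/bash" {print $1 " " $7}' "$sample")
echo "$bash_users"

rm -f "$sample"
```

Running the same awk program against /etc/passwd itself would list the real accounts that use bash.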
Step 2: You can also run system commands from awk, for example to kill processes:
$ ps aux | grep apache | grep -v grep | awk '{system("kill -9 " $2)}'
This command sends SIGKILL to every process whose ps aux line contains "apache"; $2 is the PID column.
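Since kill -9 cannot be undone, it can be worth printing the commands first and running them only once the list looks right. The sketch below uses simulated ps aux lines (the PIDs and process names are invented) and swaps system() for print, so nothing is actually killed:

```shell
#!/bin/sh
# Simulated `ps aux` lines; PIDs and command names are hypothetical.
ps_output='root 1201 0.0 0.3 73960 6120 ? Ss 19:50 0:00 /usr/sbin/apache2 -k start
www-data 1205 0.0 0.2 74020 4100 ? S 19:50 0:00 /usr/sbin/apache2 -k start
user 18783 0.0 0.1 22704 5208 pts/0 Ss 20:03 0:00 -bash'

# awk can do the matching itself, so no separate grep stages are needed.
# The pattern is written as [a]pache so that, on a live system, awk's own
# command line (which contains "[a]pache") would not match it.
# print only shows the command that system() would have executed.
dry_run=$(printf '%s\n' "$ps_output" | awk '/[a]pache/ {print "kill -9 " $2}')
echo "$dry_run"
```

Once the printed list is correct, replacing print with system( ... ) in the awk program would execute the commands instead of displaying them.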