Awk vs Sed vs Grep in Linux – Differences Explained

Text processing matters to almost every computer user: the computer can carry out tedious, repetitive tasks on text that would be impractical to do by hand.

Using text processing tools enables the user to search for a specific pattern match, replace matches with other text of the user’s choice, invoke an action upon the presence of a certain condition, or even do more complex tasks.

This article explains the differences between the three best-known text processing tools in Linux: awk, sed and grep.

The tools are covered in order from the richest and most complex (awk) to the simplest (grep). Before delving into the body of the article, though, we need to know a little about regex.

What Is Regex (Regular Expressions)

Regular expressions (regex) are a way to specify a search pattern in text: you tell the computer what you are looking for in the text as a sequence of characters. All three tools discussed here use regex.

For example, to find all occurrences of the whole word text, you would write \btext\b, where \b stands for a word boundary; in the paragraph above, that results in two matches.

Regex can also express more complex patterns, such as numerals or text that satisfies a certain set of rules.
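As a rough illustration of both ideas, the sketch below uses grep -o (which prints each match on its own line) to count whole-word matches and to pull out runs of digits. The sample sentence and the variable names are made up for this example, and \b is assumed to be supported, as it is in GNU grep.

```shell
# Hypothetical sample: "text" appears twice as a whole word and once inside "context".
sample='a pattern in text, found in the text, not in context'

# \b marks a word boundary, so the "text" inside "context" is not counted (assumes GNU grep).
word_matches=$(printf '%s\n' "$sample" | grep -o '\btext\b' | wc -l | tr -d ' ')

# With -E, [0-9]+ matches one or more digits; paste joins the matches on one line.
numbers=$(printf '%s\n' '3-2-2021 sugar 12$' | grep -Eo '[0-9]+' | paste -sd ' ' -)

echo "$word_matches"
echo "$numbers"
```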

Let’s assume we have this text file (called feb_groc.txt); all examples will refer to this file:

feb_groc.txt
In February 2021, I bought these groceries:
2-2-2021 apples 10$
3-2-2021 sugar 12$
3-2-2021 toast 20$
3-2-2021 apples 07$
3-2-2021 tomatoes 05$
4-2-2021 meat 35$
4-2-2021 toast 10$
…

Using the Awk Command

The most powerful of the three tools, awk is a scripting language for scanning and processing text. Its name is derived from the initials of its authors (Aho, Weinberger, Kernighan).

You can use awk for tasks ranging from simple ones, such as printing matches, to complex ones, such as doing arithmetic on numerals. You can also set conditions under which commands are executed. Another strength of awk is that it can operate on multiple files in a single invocation.

Search & Print All Occurrences of a Sequence of Characters

Search and print all occurrences of a sequence of characters in a text file (say apples in feb_groc.txt):

awk '/apples/{print $0}' feb_groc.txt
Output
2-2-2021 apples   10$
3-2-2021 apples   07$
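An awk program is a series of pattern { action } pairs, and when the action is left out awk prints the matching line. The sketch below compares the two forms on a copy of the sample file. It assumes a scratch directory from mktemp, single spaces between columns, and leaves out the trailing "…" line; the variable names are only illustrative.

```shell
# Recreate the sample file in a scratch directory (assumed layout, "…" line omitted).
dir=$(mktemp -d)
cat > "$dir/feb_groc.txt" <<'EOF'
In February 2021, I bought these groceries:
2-2-2021 apples 10$
3-2-2021 sugar 12$
3-2-2021 toast 20$
3-2-2021 apples 07$
3-2-2021 tomatoes 05$
4-2-2021 meat 35$
4-2-2021 toast 10$
EOF

explicit=$(awk '/apples/ {print $0}' "$dir/feb_groc.txt")  # pattern with an explicit action
implicit=$(awk '/apples/' "$dir/feb_groc.txt")             # same pattern, default action (print)

echo "$implicit"
rm -rf "$dir"
```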

Find & Replace

Find and replace (replace apples with bananas and save the output to bfeb_groc.txt):

awk '{gsub(/apples/, "bananas")}{print}' feb_groc.txt > bfeb_groc.txt
Output (contents of bfeb_groc.txt)
In February 2021, I bought these groceries:
2-2-2021 bananas   10$
3-2-2021 sugar    12$
3-2-2021 toast    20$
3-2-2021 bananas   07$
3-2-2021 tomatoes 05$
4-2-2021 meat     35$
4-2-2021 toast    10$
…

Sum a Column of Numerals

Sum a column of numerals (the cost column, omitting the $ and skipping the first line):

awk -v sum=0 'NR>1{sum += int($3)} END {print sum}' feb_groc.txt
Output
99
  • -v sets a variable before the program runs (here sum=0)
  • NR is the number of the current line, so NR>1 skips the first line
  • $3 means the third field in the line
  • int converts the field to an integer, which drops the trailing $
  • END runs its block once, after all input lines have been processed
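The same sum can be written with each step spelled out. This is only a sketch under a few assumptions: the file is recreated in a scratch directory without the trailing "…" line, the NF == 3 check is an extra guard that is not in the original command, and the $ is removed with sub() instead of relying on int().

```shell
# Recreate the sample file in a scratch directory (assumed layout, "…" line omitted).
dir=$(mktemp -d)
cat > "$dir/feb_groc.txt" <<'EOF'
In February 2021, I bought these groceries:
2-2-2021 apples 10$
3-2-2021 sugar 12$
3-2-2021 toast 20$
3-2-2021 apples 07$
3-2-2021 tomatoes 05$
4-2-2021 meat 35$
4-2-2021 toast 10$
EOF

total=$(awk 'NR > 1 && NF == 3 {
    cost = $3
    sub(/\$$/, "", cost)   # remove the trailing dollar sign explicitly
    sum += cost            # the string "07" is converted to the number 7 here
} END { print sum }' "$dir/feb_groc.txt")

echo "$total"
rm -rf "$dir"
```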

Sum of Specific Numerals

Sum the cost of a specific grocery from a set of invoices (say toast):

awk -v sum=0 'NR>1 && /toast/ {sum += int($3)} END {print sum}' feb_groc.txt
Output
30
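If a total is wanted for every grocery rather than just one, an awk array keyed on the item name is one possible approach. The sketch below is an illustration rather than part of the original examples: the array name cost and the scratch file are assumptions, and the result is piped through sort because awk does not guarantee the order of a for (item in cost) loop.

```shell
# Recreate the sample file in a scratch directory (assumed layout, "…" line omitted).
dir=$(mktemp -d)
cat > "$dir/feb_groc.txt" <<'EOF'
In February 2021, I bought these groceries:
2-2-2021 apples 10$
3-2-2021 sugar 12$
3-2-2021 toast 20$
3-2-2021 apples 07$
3-2-2021 tomatoes 05$
4-2-2021 meat 35$
4-2-2021 toast 10$
EOF

# cost[$2] accumulates the third column per item name (second column).
totals=$(awk 'NR > 1 && NF == 3 { cost[$2] += int($3) }
     END { for (item in cost) print item, cost[item] }' "$dir/feb_groc.txt" | LC_ALL=C sort)

echo "$totals"
rm -rf "$dir"
```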

Search & Print for Multiple Files

Search and print across multiple files (here the file was copied under a different extension and some changes were made):

awk '/\$/{print $0}' feb_groc*
Output
2-2-2021 apples   10$
3-2-2021 sugar    12$
3-2-2021 toast    20$
3-2-2021 apples   07$
3-2-2021 tomatoes 05$
4-2-2021 meat     35$
4-2-2021 toast    10$

Add header and footer (let’s take the previous command and edit it):

awk 'BEGIN{print "\n\nDate Item Cost\n-----------------------"} /\$/{print $0} END {print "-----------------------\n total cost = $$$"}' feb_groc*
Output
Date Item Cost
-----------------------
2-2-2021 apples   10$
3-2-2021 sugar    12$
3-2-2021 toast    20$
3-2-2021 apples   07$
3-2-2021 tomatoes 05$
4-2-2021 meat     35$
4-2-2021 toast    10$
-----------------------
 total cost = $$$
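The footer above prints a placeholder instead of a real total. A possible variation, sketched here on a single scratch copy of the file (an assumption, as are the variable names), accumulates the cost column while printing and reports the sum in the END block.

```shell
# Recreate the sample file in a scratch directory (assumed layout, "…" line omitted).
dir=$(mktemp -d)
cat > "$dir/feb_groc.txt" <<'EOF'
In February 2021, I bought these groceries:
2-2-2021 apples 10$
3-2-2021 sugar 12$
3-2-2021 toast 20$
3-2-2021 apples 07$
3-2-2021 tomatoes 05$
4-2-2021 meat 35$
4-2-2021 toast 10$
EOF

# BEGIN prints the header, the middle rule prints and sums priced lines, END prints the footer.
report=$(awk 'BEGIN { print "Date Item Cost"; print "-----------------------" }
     /\$/  { print; total += int($3) }
     END   { print "-----------------------"; print "total cost = " total "$" }' "$dir/feb_groc.txt")

echo "$report"
rm -rf "$dir"
```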

These examples are only demos; awk has many more uses, such as:

  • input/output statements
  • getting the index of a substring within the text
  • splitting a text into an array of elements around a separator
  • changing the case of the characters of the text
  • math functions (sin(), cos(), rand(), exp(), … etc.)

To learn more, read the manual page: man awk
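A few of the built-in functions mentioned above can be tried without any input file by putting everything in a BEGIN block. The strings used here are made up for the example; index(), split() and toupper() are standard awk functions.

```shell
out=$(awk 'BEGIN {
    s = "3-2-2021 toast 20$"
    print index(s, "toast")         # position of the substring, counting from 1
    n = split("3-2-2021", d, "-")   # split on "-" into array d; n is the number of pieces
    print n, d[3]
    print toupper("toast")          # change case
}')

echo "$out"
```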

Using the Sed Command

The name sed is short for stream editor. Sed is simpler to use than awk and is best suited to finding and substituting patterns, although it can perform other tasks as well.

Search & Print All Occurrences of a Sequence of Characters

Search and print all occurrences of a sequence of characters in a text file (let’s search for toast):

sed -n '/toast/ p' feb_groc.txt
Output
3-2-2021 toast    20$
4-2-2021 toast    10$
  • -n suppresses the default printing of every line, so only the lines we explicitly print are shown.
  • p is the print command.

Find & Replace

Sed uses the substitute command (s) to replace one piece of text with another:

sed 's/tomatoes/potatoes/' feb_groc.txt
Output
In February 2021, I bought these groceries:
2-2-2021 apples   10$
3-2-2021 sugar    12$
3-2-2021 toast    20$
3-2-2021 apples   07$
3-2-2021 potatoes 05$
4-2-2021 meat     35$
4-2-2021 toast    10$
…

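Instead of redirecting the output to a file, we can edit the file in place with the in-place flag (-i). A suffix written directly after -i (here pot) tells sed to keep a backup of the original file under the old name plus that suffix (feb_groc.txtpot), while feb_groc.txt itself receives the change: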

sed -ipot 's/tomatoes/potatoes/' feb_groc.txt
Output (contents of feb_groc.txt after the edit)
In February 2021, I bought these groceries:
2-2-2021 apples   10$
3-2-2021 sugar    12$
3-2-2021 toast    20$
3-2-2021 apples   07$
3-2-2021 potatoes 05$
4-2-2021 meat     35$
4-2-2021 toast    10$
…
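To see which file ends up holding what, the following sketch runs the same command on a one-line scratch file and reads back both files. It assumes GNU sed behaviour for -i with an attached suffix; the directory and variable names are hypothetical.

```shell
# One-line scratch copy of the file, enough to show the effect.
dir=$(mktemp -d)
printf '%s\n' '3-2-2021 tomatoes 05$' > "$dir/feb_groc.txt"

sed -ipot 's/tomatoes/potatoes/' "$dir/feb_groc.txt"

edited=$(cat "$dir/feb_groc.txt")      # the file itself is changed in place
backup=$(cat "$dir/feb_groc.txtpot")   # the original content is kept under name + suffix

echo "edited: $edited"
echo "backup: $backup"
rm -rf "$dir"
```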

Limit Search/Substitution Within Specific Lines

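You can limit a search or substitution to a single line or to a range of lines.

First, without any limit, change the first a on every line to j (s replaces only the first match on a line unless the g flag is added):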

sed 's/a/j/' feb_groc.txt
Output
In Februjry 2021, I bought these groceries:
2-2-2021 jpples   10$
3-2-2021 sugjr    12$
3-2-2021 tojst    20$
3-2-2021 jpples   07$
3-2-2021 potjtoes 05$
4-2-2021 mejt     35$
4-2-2021 tojst    10$
…
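In the sample file no line contains more than one a, so the difference between s/a/j/ and s/a/j/g does not show. A small sketch on a made-up line with several a’s (not part of feb_groc.txt) makes it visible:

```shell
line='4-2-2021 bananas 08$'   # hypothetical line with three a characters

first_only=$(printf '%s\n' "$line" | sed 's/a/j/')    # only the first a on the line
every=$(printf '%s\n' "$line" | sed 's/a/j/g')        # g flag: every a on the line

echo "$first_only"
echo "$every"
```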

Changing the first a in the 4th line only to j:

sed '4s/a/j/' feb_groc.txt
Output
In February 2021, I bought these groceries:
2-2-2021 apples   10$
3-2-2021 sugar    12$
3-2-2021 tojst    20$
3-2-2021 apples   07$
3-2-2021 potatoes 05$
4-2-2021 meat     35$
4-2-2021 toast    10$
…

Changing the first a on each line within the range of lines (3,7) to j:

sed '3,7s/a/j/' feb_groc.txt
Output
In February 2021, I bought these groceries:
2-2-2021 apples   10$
3-2-2021 sugjr    12$
3-2-2021 tojst    20$
3-2-2021 jpples   07$
3-2-2021 potjtoes 05$
4-2-2021 mejt     35$
4-2-2021 toast    10$
…

Note: a range bound can also be a regex. Here the range starts at line 4 and ends at the first following line that matches es:

sed -n '4,/es/ p' feb_groc.txt
Output
3-2-2021 toast    20$
3-2-2021 apples   07$
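Both ends of a range may be line numbers or regexes. The sketch below repeats the command above and adds a range with two regex bounds, run against a scratch copy of the original file (an assumption: tomatoes not yet replaced, single spaces, "…" line left out).

```shell
# Recreate the sample file in a scratch directory (assumed layout, "…" line omitted).
dir=$(mktemp -d)
cat > "$dir/feb_groc.txt" <<'EOF'
In February 2021, I bought these groceries:
2-2-2021 apples 10$
3-2-2021 sugar 12$
3-2-2021 toast 20$
3-2-2021 apples 07$
3-2-2021 tomatoes 05$
4-2-2021 meat 35$
4-2-2021 toast 10$
EOF

num_to_regex=$(sed -n '4,/es/p' "$dir/feb_groc.txt")            # line 4 up to the next line matching es
regex_to_regex=$(sed -n '/sugar/,/meat/p' "$dir/feb_groc.txt")  # from the sugar line to the meat line

echo "$num_to_regex"
echo "$regex_to_regex"
rm -rf "$dir"
```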

Using the Grep Command

Grep is the simplest of the three. Its name stands for Global Regular Expression Print.

Grep works simply: you pass it a pattern and a text file in which you want to find all occurrences of that pattern. Grep reads the file line by line into a buffer and checks each line for a match; when it finds one, it prints the matching line. So, the best usage of grep is finding and printing pattern matches.

Search & Print All Occurrences of a Sequence of Characters

Search and print all occurrences of a sequence of characters in a text file (let’s search for every line that contains an s in our file):

grep s feb_groc.txt
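Because grep prints whole lines, this command returns every line that contains at least one s. On a scratch copy of the sample file (assumed layout, with the trailing "…" line left out), that is every line except the one for meat, as this sketch shows:

```shell
# Recreate the sample file in a scratch directory (assumed layout, "…" line omitted).
dir=$(mktemp -d)
cat > "$dir/feb_groc.txt" <<'EOF'
In February 2021, I bought these groceries:
2-2-2021 apples 10$
3-2-2021 sugar 12$
3-2-2021 toast 20$
3-2-2021 apples 07$
3-2-2021 tomatoes 05$
4-2-2021 meat 35$
4-2-2021 toast 10$
EOF

matches=$(grep s "$dir/feb_groc.txt")                          # every line containing an s
match_count=$(printf '%s\n' "$matches" | wc -l | tr -d ' ')    # number of lines printed

echo "$matches"
echo "$match_count"
rm -rf "$dir"
```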

There are various flags that can be used with grep like:

  • -A [n_lines]: print n_lines of context after each match.
  • -B [n_lines]: print n_lines of context before each match.
  • -C [n_lines]: print n_lines of context both before and after each match.
  • -F: search for literal strings instead of regex patterns.
  • -l: when searching in multiple files, print only the names of the files that contain a match.
  • -i: ignore case.
  • -n: add a line number before every output line.
  • -v: negate the match (output the lines that don’t match the pattern).
  • -x: match only whole lines.
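A few of these flags can be tried on a scratch copy of the sample file. The sketch below is only an illustration: the file layout and variable names are assumptions, and -A is taken to behave as in GNU grep.

```shell
# Recreate the sample file in a scratch directory (assumed layout, "…" line omitted).
dir=$(mktemp -d)
f="$dir/feb_groc.txt"
cat > "$f" <<'EOF'
In February 2021, I bought these groceries:
2-2-2021 apples 10$
3-2-2021 sugar 12$
3-2-2021 toast 20$
3-2-2021 apples 07$
3-2-2021 tomatoes 05$
4-2-2021 meat 35$
4-2-2021 toast 10$
EOF

numbered=$(grep -n meat "$f")        # -n: line number before the matching line
literal=$(grep -F '10$' "$f")        # -F: the $ is a plain character, not an end-of-line anchor
after=$(grep -A 1 sugar "$f")        # -A 1: the match plus one line after it
no_price=$(grep -vF '$' "$f")        # -v: lines that do not contain a $ at all
any_case=$(grep -i APPLES "$f")      # -i: matches apples despite the different case

echo "$numbered"
rm -rf "$dir"
```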

Conclusion

In this article we covered some examples of how Awk, Sed and Grep work with the aim of giving you an idea of how they are different. If you have any feedback or questions feel free to leave a comment and we’ll get back to you as soon as we can.
