Text processing matters to any computer user because the computer can take over tedious tasks, such as scanning thousands of lines for a pattern, that would be impractical to do by hand.

Using text processing tools enables the user to search for a specific pattern match, replace matches with other text of the user’s choice, invoke an action upon the presence of a certain condition, or even do more complex tasks.

This article explains the differences between the three best-known text processing tools in Linux: awk, sed, and grep.

They are covered in order from the richest and most complex tool (awk) to the simplest (grep). Before delving into the body of the article, though, we need to know a little about regex.

What are Regex (regular expressions)
Using the Awk Command
Using the Sed Command
Using the Grep Command
1. Search & Print All Occurrences of a Sequence of Characters
Conclusion

What are Regex (regular expressions)

Regular expressions (regex) are a way to specify a search pattern in text: you tell the computer what you are looking for in the form of a sequence of characters. All three tools discussed here use regex.

For example, to find all occurrences of the word text, you would write \btext\b, where \b stands for a word boundary. The pattern matches text as a whole word, but not when it is part of a longer word such as context.

There are more complex patterns that can be achieved using regex, such as finding numerals or a text that satisfies a certain set of rules.
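
As a rough illustration of the word-boundary and numeral patterns mentioned above, the sketch below runs grep on a made-up sentence and on one row of the grocery list. The sample sentence is an assumption for demonstration only, and \b is a GNU grep extension, so other grep implementations may behave differently.

```shell
# Hypothetical sample sentence (not taken from the article's data file).
sample='a text editor edits text, but context is not a whole-word match'

# -o prints every match on its own line; \b marks a word boundary (GNU grep).
word_matches=$(printf '%s\n' "$sample" | grep -o '\btext\b')
word_count=$(printf '%s\n' "$word_matches" | grep -c 'text')
echo "whole-word matches: $word_count"   # prints: whole-word matches: 2

# -E enables extended regex: one or more digits followed by a literal $.
price=$(printf '%s\n' '2-2-2021 apples 10$' | grep -Eo '[0-9]+\$')
echo "price field: $price"               # prints: price field: 10$
```

The word context is skipped because the letter before text is a word character, so there is no boundary at that position.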

Let’s assume we have this text file (called feb_groc.txt); all examples will refer to this file:

feb_groc.txt

In February 2021, I bought these groceries:
2-2-2021 apples 10$
3-2-2021 sugar 12$
3-2-2021 toast 20$
3-2-2021 apples 07$
3-2-2021 tomatoes 05$
4-2-2021 meat 35$
4-2-2021 toast 10$
…

Using the Awk Command

Being the most powerful tool of the three, awk is a text processing and scanning language (a scripting language). The name is derived from the surname initials of its authors (Aho, Weinberger, Kernighan).

You can use awk for simple tasks such as printing matches, and for complex ones such as arithmetic on numerals. You can set conditions under which commands are executed. Another strength of awk is that it can operate on multiple files in a single invocation, without help from other tools.

Search & Print All Occurrences of a Sequence of Characters

Search for and print every line that contains a sequence of characters in a text file (say apples in feb_groc.txt). The program is quoted so that the shell does not expand $0 itself:

awk '/apples/{print $0}' feb_groc.txt

Output

2-2-2021 apples   10$
3-2-2021 apples   07$

Find & Replace

Find and replace (replace apples with bananas and save the output to bfeb_groc.txt):

awk '{gsub(/apples/, "bananas")}{print}' feb_groc.txt > bfeb_groc.txt

Output

In February 2021, I bought these groceries:
2-2-2021 bananas   10$
3-2-2021 sugar    12$
3-2-2021 toast    20$
3-2-2021 bananas   07$
3-2-2021 tomatoes 05$
4-2-2021 meat     35$
4-2-2021 toast    10$
…

Sum a Column of Numerals

Sum a column of numerals (the cost column, ignoring the $ and skipping the first line):

awk -v sum=0 'NR>1{sum += int($3)} END {print sum}' feb_groc.txt

In this command:

-v sets a variable (here sum, initialised to 0)
NR>1 restricts the action to lines after the first (NR is the current line number)
$3 is the third field of the line
int() converts the field to an integer, which drops the trailing $
END marks a block that runs once, after all input lines have been processed
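
To make the summing command easy to check, here is a small self-contained sketch. It recreates only the rows of feb_groc.txt that are shown in this article (the rows hidden behind "…" are left out, so the total covers the visible rows only) in a temporary directory, which is an assumption of this example.

```shell
# Recreate the visible rows of feb_groc.txt in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/feb_groc.txt" <<'EOF'
In February 2021, I bought these groceries:
2-2-2021 apples 10$
3-2-2021 sugar 12$
3-2-2021 toast 20$
3-2-2021 apples 07$
3-2-2021 tomatoes 05$
4-2-2021 meat 35$
4-2-2021 toast 10$
EOF

# NR>1 skips the header line; int() keeps the leading digits of a field like "10$".
total=$(awk 'NR>1 {sum += int($3)} END {print sum}' "$workdir/feb_groc.txt")
echo "total of visible rows: $total"   # prints: total of visible rows: 99

rm -rf "$workdir"
```

The -v sum=0 part is left out here because an unset awk variable already counts as 0 in arithmetic; keeping it, as in the command above, simply makes the starting value explicit.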

Sum of Specific Numerals

Sum the cost of a specific grocery from a set of invoices (say toast):

awk -v sum=0 'NR>1 && /toast/ {sum += int($3)} END {print sum}' feb_groc.txt
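
If totals for several items are needed at once, an associative array keyed on the item name is one option. The sketch below is an illustration rather than part of the original command; it feeds awk the rows visible in this article through a pipe instead of reading the real file.

```shell
# Rows of the grocery list that are visible in the article.
rows='In February 2021, I bought these groceries:
2-2-2021 apples 10$
3-2-2021 sugar 12$
3-2-2021 toast 20$
3-2-2021 apples 07$
3-2-2021 tomatoes 05$
4-2-2021 meat 35$
4-2-2021 toast 10$'

# total[] is indexed by the item name in field 2; each row adds its cost.
report=$(printf '%s\n' "$rows" | awk '
  NR>1 { total[$2] += int($3) }
  END  { print "toast", total["toast"]; print "apples", total["apples"] }')
echo "$report"
# prints:
# toast 30
# apples 17
```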

Search & Print for Multiple Files

Search and print across multiple files (here the file was copied under a different extension and some changes were made to the copy):

awk '/\$/{print $0}' feb_groc*

Output

2-2-2021 apples   10$
3-2-2021 sugar    12$
3-2-2021 toast    20$
3-2-2021 apples   07$
3-2-2021 tomatoes 05$
4-2-2021 meat     35$
4-2-2021 toast    10$
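
When several files are involved, it can help to see which file each match came from. awk keeps the current file name in FILENAME and the line number within that file in FNR. The sketch below uses two small hypothetical files (feb_groc.txt and feb_groc.bak, created in a temporary directory) rather than the copies used for the output above.

```shell
# Two hypothetical files whose names both match the glob feb_groc*.
workdir=$(mktemp -d)
printf '%s\n' 'In February 2021, I bought these groceries:' '2-2-2021 apples 10$' > "$workdir/feb_groc.txt"
printf '%s\n' 'Copy with changes:' '2-2-2021 pears 11$' > "$workdir/feb_groc.bak"

# Run inside the directory (in a subshell) so FILENAME shows bare file names.
tagged=$(cd "$workdir" && awk '/\$/ {print FILENAME ":" FNR ": " $0}' feb_groc*)
echo "$tagged"
# prints:
# feb_groc.bak:2: 2-2-2021 pears 11$
# feb_groc.txt:2: 2-2-2021 apples 10$

rm -rf "$workdir"
```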

Add Header & Footer

Add header and footer (let’s take the previous command and edit it):

awk 'BEGIN{print "\n\nDate Item Cost\n-----------------------"} /\$/{print $0} END {print "-----------------------\n total cost = $$$"}' feb_groc*

Output

Date Item Cost
-----------------------
2-2-2021 apples   10$
3-2-2021 sugar    12$
3-2-2021 toast    20$
3-2-2021 apples   07$
3-2-2021 tomatoes 05$
4-2-2021 meat     35$
4-2-2021 toast    10$
-----------------------
 total cost = $$$
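
The footer above prints the placeholder $$$ literally. One possible variation, sketched below under the assumption that only the rows visible in this article are summed, accumulates the cost while printing each row and reports the real total in the END block.

```shell
# Rows of the grocery list that are visible in the article.
rows='In February 2021, I bought these groceries:
2-2-2021 apples 10$
3-2-2021 sugar 12$
3-2-2021 toast 20$
3-2-2021 apples 07$
3-2-2021 tomatoes 05$
4-2-2021 meat 35$
4-2-2021 toast 10$'

# BEGIN prints the header, the middle rule prints and sums each priced row,
# and END prints the footer with the accumulated total.
report=$(printf '%s\n' "$rows" | awk '
  BEGIN { print "Date Item Cost"; print "-----------------------" }
  /\$/  { print $0; sum += int($3) }
  END   { print "-----------------------"; printf "total cost = %d$\n", sum }')
echo "$report"   # last line printed: total cost = 99$
```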

These examples are only demos; awk has many more uses, such as:

input/output statements
getting the index of a substring within the text (index())
splitting text into an array on a separator (split())
changing the case of the characters of the text (toupper(), tolower())
math functions (sin(), cos(), rand(), exp(), … etc)
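
A few of these built-in functions can be tried on a single row of the grocery list. The sketch below is only a quick demonstration; the variable names are chosen for this example.

```shell
line='3-2-2021 tomatoes 05$'

# index(): 1-based position of a substring (0 when it is absent).
pos=$(printf '%s\n' "$line" | awk '{print index($0, "tomatoes")}')
# split(): break the date field on "-" into array d and return the element count.
parts=$(printf '%s\n' "$line" | awk '{n = split($1, d, "-"); print n, d[3]}')
# toupper(): change the case of the item name.
upper=$(printf '%s\n' "$line" | awk '{print toupper($2)}')

echo "$pos"     # prints: 10
echo "$parts"   # prints: 3 2021
echo "$upper"   # prints: TOMATOES
```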

To learn more, read the manual page: man awk

Using the Sed Command

The name is an abbreviation of stream editor. Sed is simpler to use than awk and is best suited for finding and substituting patterns, but it can perform other tasks as well.

Search & Print All Occurrences of a Sequence of Characters

Search and print all occurrences of a sequence of characters in a text file (let’s search for toast):

sed -n '/toast/ p' feb_groc.txt

Output

3-2-2021 toast    20$
4-2-2021 toast    10$

-n suppresses the default printing of every line, so only the lines we explicitly print are shown.
p is the print command.
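
The effect of -n is easiest to see by counting lines. In the sketch below (three rows of the grocery list fed through a pipe, which is an assumption of this example), sed without -n prints every line once and each matching line a second time, while sed with -n prints only the matches.

```shell
rows='3-2-2021 sugar 12$
3-2-2021 toast 20$
4-2-2021 toast 10$'

# Without -n: default output plus the explicit p, so toast lines appear twice.
without_n=$(printf '%s\n' "$rows" | sed '/toast/ p' | grep -c 'toast')
# With -n: only the explicitly printed toast lines appear.
with_n=$(printf '%s\n' "$rows" | sed -n '/toast/ p' | grep -c 'toast')

echo "without -n: $without_n toast lines"   # prints: without -n: 4 toast lines
echo "with -n: $with_n toast lines"         # prints: with -n: 2 toast lines
```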

Find & Replace

Sed uses the substitute command (s) to replace a given text with another:

sed 's/tomatoes/potatoes/' feb_groc.txt

Output

In February 2021, I bought these groceries:
2-2-2021 apples   10$
3-2-2021 sugar    12$
3-2-2021 toast    20$
3-2-2021 apples   07$
3-2-2021 potatoes 05$
4-2-2021 meat     35$
4-2-2021 toast    10$
…

Instead of redirecting the output to a file, we can use the in-place flag (-i), which modifies feb_groc.txt itself. A suffix given right after -i makes sed keep a backup of the original file under the old name plus that suffix (here the backup is feb_groc.txtpot):

sed -ipot 's/tomatoes/potatoes/' feb_groc.txt

Contents of feb_groc.txt after the edit

In February 2021, I bought these groceries:
2-2-2021 apples   10$
3-2-2021 sugar    12$
3-2-2021 toast    20$
3-2-2021 apples   07$
3-2-2021 potatoes 05$
4-2-2021 meat     35$
4-2-2021 toast    10$
…
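
To see which file ends up holding which content, the following sketch runs the same in-place substitution on a two-line stand-in for feb_groc.txt inside a temporary directory. The temporary directory and the shortened file are assumptions of this example, and the attached-suffix form -ipot is written with GNU sed in mind.

```shell
workdir=$(mktemp -d)
printf '%s\n' '3-2-2021 tomatoes 05$' '4-2-2021 meat 35$' > "$workdir/feb_groc.txt"

# -ipot edits feb_groc.txt itself and keeps the untouched original as feb_groc.txtpot.
sed -ipot 's/tomatoes/potatoes/' "$workdir/feb_groc.txt"

edited=$(head -n 1 "$workdir/feb_groc.txt")
backup=$(head -n 1 "$workdir/feb_groc.txtpot")
echo "edited: $edited"   # prints: edited: 3-2-2021 potatoes 05$
echo "backup: $backup"   # prints: backup: 3-2-2021 tomatoes 05$

rm -rf "$workdir"
```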

Limit Search/Substitution Within Specific Lines

You can limit your search or substitution to one line or a range of lines.

Changing the first a on every line to j (without the g flag, s replaces only the first match on each line):

sed 's/a/j/' feb_groc.txt

Output

In Februjry 2021, I bought these groceries:
2-2-2021 jpples   10$
3-2-2021 sugjr    12$
3-2-2021 tojst    20$
3-2-2021 jpples   07$
3-2-2021 potjtoes 05$
4-2-2021 mejt     35$
4-2-2021 tojst    10$
…
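
Because s/a/j/ on its own replaces only the first match on each line, the g flag is needed to replace every match. The sketch below shows the difference on a made-up row (banana is not in the article's file; it is used because it contains several a's).

```shell
# Hypothetical row with more than one "a" in it.
row='5-2-2021 banana 10$'

first_only=$(printf '%s\n' "$row" | sed 's/a/j/')    # first a on the line only
every_a=$(printf '%s\n' "$row" | sed 's/a/j/g')      # g: every a on the line

echo "$first_only"   # prints: 5-2-2021 bjnana 10$
echo "$every_a"      # prints: 5-2-2021 bjnjnj 10$
```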

Changing the first a on the 4th line to j:

sed '4s/a/j/' feb_groc.txt

Output

In February 2021, I bought these groceries:
2-2-2021 apples   10$
3-2-2021 sugar    12$
3-2-2021 tojst    20$
3-2-2021 apples   07$
3-2-2021 potatoes 05$
4-2-2021 meat     35$
4-2-2021 toast    10$
…

Changing the first a on each line in the range of lines 3 to 7 to j:

sed '3,7s/a/j/' feb_groc.txt

Output

In February 2021, I bought these groceries:
2-2-2021 apples   10$
3-2-2021 sugjr    12$
3-2-2021 tojst    20$
3-2-2021 jpples   07$
3-2-2021 potjtoes 05$
4-2-2021 mejt     35$
4-2-2021 toast    10$
…

Note: either end of a range can also be a regex instead of a line number. The command below prints from line 4 up to the next line that contains es:

sed -n '4,/es/ p' feb_groc.txt

Output

3-2-2021 toast    20$
3-2-2021 apples   07$
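
Both ends of the range can be patterns as well. As a small sketch (using six rows of the grocery list fed through a pipe, which is an assumption of this example), the command below prints from the first line containing sugar up to the next line containing tomatoes.

```shell
rows='2-2-2021 apples 10$
3-2-2021 sugar 12$
3-2-2021 toast 20$
3-2-2021 apples 07$
3-2-2021 tomatoes 05$
4-2-2021 meat 35$'

# /start/,/end/ selects from the first line matching start to the next line matching end.
range=$(printf '%s\n' "$rows" | sed -n '/sugar/,/tomatoes/ p')
echo "$range"
# prints:
# 3-2-2021 sugar 12$
# 3-2-2021 toast 20$
# 3-2-2021 apples 07$
# 3-2-2021 tomatoes 05$
```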

Using the Grep Command

Grep is the simplest of the three. Its name stands for Global Regular Expression Print.

Grep works simply: you pass it a pattern and a text file in which you want to find all occurrences of that pattern. Grep reads the file line by line into a buffer and checks each line for a match; if one is found, it prints the matching line. So the best use of grep is finding and printing pattern matches.

Search & Print All Occurrences of a Sequence of Characters

Search and print all occurrences of a sequence of characters in a text file (let’s search for every line containing an s in our file):

grep s feb_groc.txt

There are various flags that can be used with grep, such as:

-A [n_lines]: output n_lines after every match.
-B [n_lines]: output n_lines before every match.
-C [n_lines]: output n_lines before and n_lines after every match.
-F: search for literal strings instead of regular expressions.
-l: when searching in multiple files, show only the names of the files that contain a match.
-i: ignore case.
-n: prefix every output line with its line number.
-v: invert the match (output lines that don’t match the pattern).
-x: match only whole lines.
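
Several of these flags can be tried on the grocery list. The sketch below is a quick demonstration that feeds grep the rows visible in this article through a pipe; the variable names are chosen for this example.

```shell
rows='In February 2021, I bought these groceries:
2-2-2021 apples 10$
3-2-2021 sugar 12$
3-2-2021 toast 20$
3-2-2021 apples 07$
3-2-2021 tomatoes 05$
4-2-2021 meat 35$
4-2-2021 toast 10$'

numbered=$(printf '%s\n' "$rows" | grep -n 'toast')       # line numbers: 4 and 8
no_price=$(printf '%s\n' "$rows" | grep -v '\$')          # lines without a literal $
any_case=$(printf '%s\n' "$rows" | grep -i 'APPLES')      # case-insensitive match
with_next=$(printf '%s\n' "$rows" | grep -A 1 'sugar')    # match plus one line after
# -F treats the trailing $ as a literal character; -x requires the whole line to match.
exact=$(printf '%s\n' "$rows" | grep -Fx '4-2-2021 meat 35$')

echo "$numbered"
# prints:
# 4:3-2-2021 toast 20$
# 8:4-2-2021 toast 10$
echo "$no_price"   # prints: In February 2021, I bought these groceries:
echo "$exact"      # prints: 4-2-2021 meat 35$
```

Without -F, the trailing $ in the last pattern would be read as an end-of-line anchor rather than a dollar sign, so the row would not match.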

Conclusion

In this article we covered some examples of how Awk, Sed and Grep work with the aim of giving you an idea of how they are different. If you have any feedback or questions feel free to leave a comment and we’ll get back to you as soon as we can.

Awk vs Sed vs Grep in Linux – Differences Explained

Khalid Faiz
