In this article we’ll learn how to use grep’s options to exclude directories, skip files, and select non-matching lines while searching files.
What is Grep
Grep (short for global regular expression print) is a command line utility that searches its input for lines matching a pattern and prints them; the input can be data piped to it or files named on the command line.
Surprisingly, this simple tool is one of the most useful command line tools in Linux; it provides many additional features that make searches less time consuming and more effective.
We can achieve this by combining a wide range of command line options (flags) with regular expression patterns.
The search is efficient even across a wide scope of directories and files because grep does not store the lines it reads: it copies a line into a buffer, checks it for the search string, then prints the line if a match is found.
Although searching for a given string is the main purpose of this tool, it can also exclude lines that match the search string from the output while recursing through the directory tree. In other words, you can find things with grep or you can choose to ignore them.
Grep Options Overview
Grep syntax overview:
grep <options> <search pattern> <files> <directory path>
Used options in this article:
-w: match only whole words
-i: ignore case during search
-E: interpret patterns as extended regular expressions
-R: search recursively but follow all symlinks
-v: invert match
--exclude=: ignore files that match the pattern
--exclude-dir=: ignore directories that match the pattern
Excluding Matches With Grep
In grep we call this feature an invert match or negative result, which matches everything except the pattern given. We can use this action by including the -v flag.
To fully understand this topic, we are going to take a step-by-step approach. Let’s start by creating our new testing directory and .txt file with multiple lines.
mkdir grep_testing
cd grep_testing/
nano my-lines.txt
Our file’s content:
they said a very important check was overlooked.
he ignored his heart problems and... he paid the price.
that was the sound of my patience shattering into a billion little pieces.
Let’s search for the pattern billion in our file my-lines.txt.
grep billion my-lines.txt
that was the sound of my patience shattering into a billion little pieces.
As you can see, grep returned the line where the word billion resides.
Now let’s run the command using the -v flag.
grep -v billion my-lines.txt
they said a very important check was overlooked.
he ignored his heart problems and... he paid the price.
Notice the negative result: only the non-matching lines are printed.
Now let’s try excluding the word the from the output.
grep -v the my-lines.txt
Notice that grep printed no output. We expected the first line to be returned because it does not contain the word the. Why is that?
As in the first example, we ran a grep query that looks for a pattern, not an exact word. The first line does contain the pattern the, as part of “they”. To overcome this, we’ll use the -w flag (combined with -v for invert matching).
grep -wv the my-lines.txt
they said a very important check was overlooked.
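The difference between substring and whole-word matching can be checked by counting lines. The following is a minimal sketch that recreates the sample file in a temporary directory (the use of mktemp and of the -c counting option are additions for illustration, not part of the steps above):

```shell
# Work in a throwaway directory so no real files are touched.
tmp=$(mktemp -d)
printf '%s\n' \
  'they said a very important check was overlooked.' \
  'he ignored his heart problems and... he paid the price.' \
  'that was the sound of my patience shattering into a billion little pieces.' \
  > "$tmp/my-lines.txt"

# -c prints the number of selected lines instead of the lines themselves.
substring_hits=$(grep -c the "$tmp/my-lines.txt")    # "the" anywhere, including inside "they"
word_hits=$(grep -cw the "$tmp/my-lines.txt")        # "the" as a whole word only
inverted_word=$(grep -cwv the "$tmp/my-lines.txt")   # lines without the whole word "the"

echo "substring: $substring_hits, whole word: $word_hits, inverted whole word: $inverted_word"
rm -rf "$tmp"
```

All three lines contain the characters the, but only two contain it as a separate word, which is why -wv leaves exactly one line.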
If we want to exclude multiple patterns, grep lets us do so by repeating the -e flag:
grep -wv -e billion -e check my-lines.txt
he ignored his heart problems and... he paid the price.
We can achieve the same results with regular expressions using the -E flag.
grep -v -E 'billion|check' my-lines.txt
he ignored his heart problems and... he paid the price.
Note that the expression has to be enclosed in quotes so that the shell does not interpret it, and that the alternatives are separated by the | symbol (or).
By default, grep is case sensitive. To make the search case insensitive, we have to use the -i flag.
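As a rough illustration of how -i changes an inverted match, the sketch below uses a small made-up file (log.txt and its contents are hypothetical, chosen only to mix upper and lower case):

```shell
tmp=$(mktemp -d)
printf 'Error: disk full\nerror: retrying\nall good\n' > "$tmp/log.txt"

# Case sensitive: only the lowercase "error" line is excluded.
case_sensitive=$(grep -v error "$tmp/log.txt")

# Case insensitive: both "Error" and "error" lines are excluded.
case_insensitive=$(grep -vi error "$tmp/log.txt")

printf 'without -i:\n%s\n\nwith -i:\n%s\n' "$case_sensitive" "$case_insensitive"
rm -rf "$tmp"
```

Without -i the line starting with a capital E slips through the exclusion; with -i only the last line remains.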
Excluding Directories With Grep
Another feature of grep is searching recursively through multiple directories.
When we use the -R option, grep will read and process all files under each directory and subdirectory recursively. Let’s start by creating multiple directories and .txt files to use in our examples.
mkdir my-first-dir my-second-dir my-third-dir/
printf 'google\nSheets\ngCloud\n' > my-first-dir/my-lines1.txt
printf 'The\nHuman\nAnimal\n' > my-second-dir/my-lines2.txt
printf 'Bug\nhunting\nmethod\n' > my-third-dir/my-lines3.txt
Let’s start grasping this feature by running a grep query to return results from our newly created files.
grep -E -R 'google|Animal|hunting'
my-second-dir/my-lines2.txt:Animal
my-first-dir/my-lines1.txt:google
my-third-dir/my-lines3.txt:hunting
Note that <directory path> can be omitted if we are searching recursively in the current directory.
To exclude certain directories we can use the grep option --exclude-dir=PATTERN. This option is described as: directories that match PATTERN will be skipped.
So when we use the -R flag to search recursively, the --exclude-dir=PATTERN option will skip any directory whose base name matches PATTERN. In the next example we are going to grep results from two directories while excluding one. Hit the up arrow and add --exclude-dir=my-first-dir:
grep -E -R 'google|Animal|hunting' --exclude-dir=my-first-dir
my-second-dir/my-lines2.txt:Animal
my-third-dir/my-lines3.txt:hunting
Note that grep ignored the .txt file located in the excluded directory.
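Because the pattern is compared against the base name of each directory, a directory with that name is skipped at any depth, not just at the top level. A small sketch of this, assuming GNU grep and a hypothetical layout (the cache, project, and src names are invented for the example):

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/cache" "$tmp/project/cache" "$tmp/project/src"
printf 'needle\n' > "$tmp/cache/a.txt"
printf 'needle\n' > "$tmp/project/cache/b.txt"
printf 'needle\n' > "$tmp/project/src/c.txt"

# Both the top-level cache/ and the nested project/cache/ are skipped.
# The search root "." is given explicitly; sort only makes the order predictable.
found=$(cd "$tmp" && grep -R needle . --exclude-dir=cache | sort)

printf '%s\n' "$found"
rm -rf "$tmp"
```

Only the file under project/src is reported, since both cache directories share the excluded base name.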
If we want to exclude multiple directories, we can repeat the option multiple times, like so:
grep -E -R 'google|Animal|hunting' --exclude-dir=my-first-dir --exclude-dir=my-second-dir
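Or, in a shell that supports brace expansion such as bash, by listing the directories separated by commas inside curly braces; the shell expands this into one --exclude-dir option per directory before grep runs.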
grep -E -R 'google|Animal|hunting' --exclude-dir={my-first-dir,my-second-dir}
my-third-dir/my-lines3.txt:hunting
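The curly braces are handled by the shell rather than by grep. One way to see this is to print the arguments instead of running grep; the sketch below calls bash explicitly, on the assumption that it is installed, because plain sh implementations may not support brace expansion:

```shell
# Brace expansion is a bash feature, so bash is invoked explicitly here.
# printf shows each resulting argument on its own line.
expanded=$(bash -c 'printf "%s\n" --exclude-dir={my-first-dir,my-second-dir}')

printf '%s\n' "$expanded"
```

The output is two separate --exclude-dir options, which is exactly what the longer form of the command passes to grep.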
Suppose that you have a large number of folders. The first set has the following naming syntax: folder1, folder2 … folder100.
The second set has a different naming syntax: 1folder, 2folder … 100folder.
Now we want to search only the files in the first set of directories by excluding the second set. We could achieve that with one of the above methods, but that would be time consuming and tedious. A better solution is a wildcard pattern: --exclude-dir takes a shell-style glob rather than a regular expression, and for this example we will use the special character *, which in a glob matches any sequence of characters.
For comparison, these are some of the operators available in the regular expressions used for the search pattern itself:
?: The preceding item is optional and is matched at most once.
*: The preceding item is matched zero or more times.
+: The preceding item is matched one or more times.
[]: Bracket expression: matches a single character or range of characters.
|: The choice (or set union) operator.
Let’s type a grep query that excludes our second set of folders (pattern stands for whatever string we are searching for):
grep -R pattern --exclude-dir=*folder
Note that this query will exclude all directories that match the pattern {STUFF}folder, that is, any name ending in folder. If we want to do the opposite and exclude the first set, we have to change the --exclude-dir argument to folder*.
If we have to exclude multiple directory patterns, we can list the patterns separated by commas inside curly braces (pattern again stands for the search string):
grep -R pattern --exclude-dir={*folder,folder*}
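The wildcard exclusion can be tried on a scaled-down version of the two folder sets. This is a sketch under a few assumptions: GNU grep, two folders per set instead of a hundred, and a made-up search string (needle) and file name (notes.txt):

```shell
tmp=$(mktemp -d)
for d in folder1 folder2 1folder 2folder; do
  mkdir "$tmp/$d"
  printf 'needle in %s\n' "$d" > "$tmp/$d/notes.txt"
done

# Quoting the glob guarantees the shell hands it to grep unchanged.
# The search root "." is given explicitly; sort only makes the order predictable.
kept=$(cd "$tmp" && grep -R needle . --exclude-dir='*folder' | sort)

printf '%s\n' "$kept"
rm -rf "$tmp"
```

Only the matches under folder1 and folder2 are printed, because 1folder and 2folder end in folder and are skipped.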
Excluding Files With Grep
Now that we’ve seen how to exclude specific directories and print non-matching lines, we come to the last item in our article: skipping specific files while searching with grep. To exclude files we will be using the --exclude=PATTERN option. This option is described as “skip files matching PATTERN”. To understand this part, let’s create different files with different extensions.
printf 'Hello World Go\n' > my-first-dir/my-script.go
printf 'Hello World Cpp\n' > my-second-dir/my-script.cpp
printf 'Hello World Python\n' > my-third-dir/my-script.py
Now let’s run a query that excludes .cpp files from our search.
grep -R Hello --exclude=*.cpp
my-first-dir/my-script.go:Hello World Go
my-third-dir/my-script.py:Hello World Python
To skip multiple files with different extensions we’ll have to tweak our previous curly braces method.
grep -R Hello --exclude=*.{cpp,py}
my-first-dir/my-script.go:Hello World Go
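File and directory exclusions can also be combined in a single query. The following sketch assumes GNU grep and rebuilds the three script files in a temporary directory (the temporary directory is an addition so the example does not depend on earlier steps):

```shell
tmp=$(mktemp -d)
mkdir "$tmp/my-first-dir" "$tmp/my-second-dir" "$tmp/my-third-dir"
printf 'Hello World Go\n' > "$tmp/my-first-dir/my-script.go"
printf 'Hello World Cpp\n' > "$tmp/my-second-dir/my-script.cpp"
printf 'Hello World Python\n' > "$tmp/my-third-dir/my-script.py"

# Skip one whole directory and, in the remaining ones, every .py file.
# The glob is quoted so the shell passes it to grep unchanged.
remaining=$(cd "$tmp" && grep -R Hello . --exclude-dir=my-first-dir --exclude='*.py' | sort)

printf '%s\n' "$remaining"
rm -rf "$tmp"
```

With the Go file removed by the directory rule and the Python file removed by the file rule, only the .cpp match is left.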
Conclusion
In this article we’ve learned about grep and how to use some of its most important features to make searching through files and directories less time consuming and more effective. By excluding directories, skipping files, and printing non-matching results, grep makes searching in files a lot easier.