How To Exclude Matches, Directories Or Files with Grep

In this article we’ll learn about a wide range of grep features for excluding directories, skipping files and selecting non-matching lines while searching files.

What is Grep

Grep, short for global regular expression print, is a command line utility that searches its input for a pattern and prints the matching lines. It reads either the files named on the command line or, when no file is given, whatever is piped to it on standard input.
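
As a rough illustration of searching piped input versus a file, the sketch below runs the same search both ways. The file name sample.txt, its contents and the search word needle are made up for this example, and everything happens in a throwaway directory created with mktemp.

```shell
# Work in a throwaway directory so nothing in the current one is touched.
workdir=$(mktemp -d)
cd "$workdir"

# sample.txt is a hypothetical file created only for this illustration.
printf 'first line\nsecond line with needle\n' > sample.txt

# 1) Search text piped in on standard input.
from_pipe=$(printf 'first line\nsecond line with needle\n' | grep needle)

# 2) Search a file named on the command line.
from_file=$(grep needle sample.txt)

printf 'pipe: %s\nfile: %s\n' "$from_pipe" "$from_file"
```

Both searches select the same line, since only the source of the input differs.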

Despite its simplicity, this tool is one of the most useful command line tools in Linux; it provides various additional features that make searches less time consuming and more effective.

We can achieve this by utilizing a wide range of command line options (flag keywords) and Regular Expressions patterns.

The search process of this program is very efficient even when dealing with a wide scope of directories and files. grep does not store the lines it reads: it copies a line into a buffer, checks it for the search string, then prints the line if a match was found.

Although searching for a given string is the main objective of this tool, one of its features is excluding lines that match the search string from the output while recursing through the directory tree. In simpler words, you can find things with grep or you can choose to ignore them.

Grep Options Overview

Grep syntax overview:

grep <options> <search pattern> <files> <directory path>

Options used in this article:

  • -w : match only whole words
  • -i : ignore case during search
  • -E : interpret patterns as extended regular expressions
  • -R : search recursively, following all symlinks
  • -v : invert match
  • --exclude= : ignore files that match the pattern
  • --exclude-dir= : ignore directories that match the pattern

Excluding Matches With Grep

In grep this feature is called an inverted match (or negative result): grep selects every line except those that match the given pattern. We enable it with the -v flag.

To fully understand this topic, we are going to take a step-by-step approach. Let’s start by creating our new testing directory and .txt file with multiple lines.

mkdir grep_testing
cd grep_testing/
nano my-lines.txt

Our file’s content:

my-lines.txt

they said a very important check was overlooked.
he ignored his heart problems and... he paid the price.
that was the sound of my patience shattering into a billion little pieces.

Let’s search for the pattern billion in our file my-lines.txt.

grep billion my-lines.txt
Output
that was the sound of my patience shattering into a billion little pieces.

As you can see, the command returned the line where the word billion resides.

Now let’s run the command line using the -v flag.

grep -v billion my-lines.txt
Output
they said a very important check was overlooked.
he ignored his heart problems and... he paid the price.

Notice the negative result: only the non-matching lines are printed.

Now let’s try excluding the word the from the output.

grep -v the my-lines.txt

Notice that grep printed no output. We expected the first line to be returned because it does not contain the word the. Why is that?

As mentioned in the first example, grep looks for a pattern, not an exact word. The first line contains the pattern the as part of “they”, so it counts as a match and is excluded. To overcome this, we’ll use the -w flag, combined with -v for invert matching (-wv).

grep -wv the my-lines.txt
Output
they said a very important check was overlooked.
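
To see the difference between a pattern match and a whole-word match in isolation, here is a minimal sketch on piped input. The three sample words are chosen for illustration only, and the -c flag (not used elsewhere in this article) makes grep print the number of matching lines instead of the lines themselves.

```shell
# Three sample lines: only the second one contains "the" as a whole word.
sample=$(printf 'they\nthe\nother\n')

# Without -w, "the" matches as a substring of every line.
substring_hits=$(printf '%s\n' "$sample" | grep -c the)

# With -w, only the line where "the" stands alone as a word matches.
word_hits=$(printf '%s\n' "$sample" | grep -cw the)

printf 'substring matches: %s, whole-word matches: %s\n' "$substring_hits" "$word_hits"
```

The substring search counts all three lines, while the whole-word search counts one.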

In case we want to exclude multiple patterns, grep lets us repeat the -e flag once per pattern:

grep -wv -e billion -e check my-lines.txt
Output
he ignored his heart problems and... he paid the price.

We can achieve the same results with regular expressions using the -E flag.

grep -v -E 'billion|check' my-lines.txt
Output
he ignored his heart problems and... he paid the price.

Note that the expression has to be enclosed in quotes, with the alternatives separated by the | symbol (or). Without the quotes, the shell would interpret | as a pipe.

By default, grep is case sensitive. To make the search case insensitive, we have to use the -i flag.
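
For instance, combining -i with -v drops lines that contain the pattern in any letter case. The sketch below recreates my-lines.txt in a temporary directory so that it can be run on its own; the upper-case spelling BILLION is used only to demonstrate the effect.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Recreate the sample file from earlier in the article.
cat > my-lines.txt <<'EOF'
they said a very important check was overlooked.
he ignored his heart problems and... he paid the price.
that was the sound of my patience shattering into a billion little pieces.
EOF

# Case sensitive: "BILLION" matches nothing, so -v keeps all three lines.
sensitive=$(grep -v BILLION my-lines.txt)

# Case insensitive: "BILLION" matches "billion", so -v drops the third line.
insensitive=$(grep -vi BILLION my-lines.txt)

printf '%s\n' "$insensitive"
```

Only the first two lines are printed, because the third one now counts as a match and is excluded.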

Excluding Directories With Grep

Another feature of grep is searching recursively through multiple directories.

When we use the -R option, grep will read and process all files under each directory and subdirectory recursively. Let’s start by creating multiple directories and .txt files to use them in our examples.

mkdir my-first-dir my-second-dir my-third-dir/
printf 'google\nSheets\ngCloud\n' > my-first-dir/my-lines1.txt
printf 'The\nHuman\nAnimal\n' > my-second-dir/my-lines2.txt
printf 'Bug\nhunting\nmethod\n' > my-third-dir/my-lines3.txt

Let’s start grasping this feature by running a grep query to return results from our newly created files.

grep -E -R 'google|Animal|hunting'
Output
my-second-dir/my-lines2.txt:Animal
my-first-dir/my-lines1.txt:google
my-third-dir/my-lines3.txt:hunting

Note that <directory path> can be omitted if we are searching in the current directory: with -R and no path, GNU grep searches the current directory.

To exclude certain directories we can use the grep option --exclude-dir=PATTERN . This option is described as: directories that match PATTERN will be skipped.

So when we use the -R flag to search recursively, the --exclude-dir=PATTERN option will skip any directory whose base name matches PATTERN. In the next example we are going to grep results from two directories while excluding one. Hit the up arrow to recall the previous command and add --exclude-dir=my-first-dir:

grep -E -R 'google|Animal|hunting' --exclude-dir=my-first-dir
Output
my-second-dir/my-lines2.txt:Animal
my-third-dir/my-lines3.txt:hunting

Note that grep ignored the .txt file located in the excluded directory.

If we want to exclude multiple directories, we can repeat the option once per directory, like so:

grep -E -R 'google|Animal|hunting' --exclude-dir=my-first-dir --exclude-dir=my-second-dir

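Or we can list the directories separated by commas inside curly braces. The shell (bash or zsh) expands the braces into one --exclude-dir option per directory before grep runs.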

grep -E -R 'google|Animal|hunting' --exclude-dir={my-first-dir,my-second-dir}
Output
my-third-dir/my-lines3.txt:hunting
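
The curly braces are handled by the shell, not by grep. A small sketch, assuming bash is installed, makes the expansion visible by printing the arguments that grep would receive; printf stands in for grep here purely for illustration.

```shell
# Brace expansion is a shell feature (bash, zsh), not a grep feature.
# Running the expansion through bash explicitly keeps the sketch
# independent of whichever shell executes it.
expanded=$(bash -c 'printf "%s\n" --exclude-dir={my-first-dir,my-second-dir}')

# Each line printed is one argument grep would see.
printf '%s\n' "$expanded"
```

This prints --exclude-dir=my-first-dir and --exclude-dir=my-second-dir on separate lines, which is exactly the repeated-option form shown earlier. In a shell without brace expansion, the repeated form is the one to use.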

Suppose that you have a large number of folders. The first set has the following naming scheme: folder1, folder2 … folder100.

The second set has a different naming scheme: 1folder, 2folder… 100folder.

Now we want to search only the files in the first set of directories by excluding the second set. We could achieve that with one of the methods above, but that would be time consuming and tedious. A better solution is a wildcard pattern: --exclude-dir takes a shell-style glob, not a regular expression. For this example we will use the special character *, which in a glob matches any sequence of characters.

Regular expressions, by contrast, apply to the search pattern itself. Some common operators, as interpreted with -E:

  • ? : The preceding item is optional and is matched at most once.
  • * : The preceding item is matched zero or more times.
  • + : The preceding item is matched one or more times.
  • [] : Bracket expression: Matches a single character or range of characters
  • | : The choice (or set union) operator

Let’s type a grep query that excludes our second set of folders:

grep -R <search pattern> --exclude-dir='*folder'

Note that this query will exclude all directories that match the pattern {STUFF}folder. The quotes keep the shell from expanding the * before grep sees it. If we want to do the opposite and exclude the first set, we have to change the --exclude-dir argument to folder*.

In case we have to exclude multiple directory patterns, we should type our desired patterns separated by commas enclosed in curly braces.

grep -R <search pattern> --exclude-dir={*folder,folder*}
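
To try the wildcard patterns on a small scale, the following sketch builds two directories from each set and searches them. The file name notes.txt and the search word needle are invented for this example, and GNU grep is assumed.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Two directories from each naming scheme, each holding one matching file.
mkdir folder1 folder2 1folder 2folder
for d in folder1 folder2 1folder 2folder; do
  printf 'needle\n' > "$d/notes.txt"
done

# Skip every directory whose base name ends in "folder" (the second set).
# The quotes keep the shell from touching the glob before grep sees it.
first_set=$(grep -R needle . --exclude-dir='*folder' | sort)

# Skip every directory whose base name starts with "folder" (the first set).
second_set=$(grep -R needle . --exclude-dir='folder*' | sort)

printf 'first set only:\n%s\nsecond set only:\n%s\n' "$first_set" "$second_set"
```

The results are piped through sort only to make the order predictable, since grep reports files in whatever order it reads the directory.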

Excluding Files With Grep

Now that we’ve seen how to print non-matching lines and exclude specific directories, we’ll look at the last item in our article: skipping specific files while searching with grep. To exclude files we will be using the --exclude=PATTERN option, which is described as “skip files matching PATTERN”. Here PATTERN is a shell-style glob such as *.cpp. To understand this part, let’s create different files with different extensions.

printf 'Hello World Go\n' > my-first-dir/my-script.go
printf 'Hello World Cpp\n' > my-second-dir/my-script.cpp
printf 'Hello World Python\n' > my-third-dir/my-script.py

Now let’s run a query that excludes .cpp files from our search.

grep -R Hello --exclude=*.cpp
Output
my-first-dir/my-script.go:Hello World Go
my-third-dir/my-script.py:Hello World Python

To skip files with several different extensions, we can adapt the curly braces method from before.

grep -R Hello --exclude=*.{cpp,py}
Output
my-first-dir/my-script.go:Hello World Go
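
The options covered in this article can also be combined in a single command. The sketch below, assuming GNU grep, rebuilds the three script files in a temporary directory and then applies -i, --exclude-dir and --exclude together; the lower-case search word hello is chosen only to show -i at work.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Recreate the script files from the example above.
mkdir my-first-dir my-second-dir my-third-dir
printf 'Hello World Go\n' > my-first-dir/my-script.go
printf 'Hello World Cpp\n' > my-second-dir/my-script.cpp
printf 'Hello World Python\n' > my-third-dir/my-script.py

# Case-insensitive recursive search that skips one directory and one file type.
# Only the .cpp file remains after both filters are applied.
result=$(grep -R -i hello . --exclude-dir=my-first-dir --exclude='*.py')

printf '%s\n' "$result"
```

Because the search starts from an explicit ., the reported path begins with ./ here.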

Conclusion

In this article we’ve learned about grep and how to use some of its most important features, which make searching through files and directories less time consuming and more effective. By excluding directories, skipping files and printing non-matching lines, grep makes searching in files a lot easier.
