LPI 101-500 – 103.7: Searching text files with regular expressions
July 23, 2023

1. regex, grep, egrep, fgrep

This lesson is about so-called regular expressions, or regex for short. Regular expressions are search patterns that are used to search through the contents of files. You can roughly compare regex with file globbing, although regex is much more versatile and much more powerful. We already got to know file globbing in one of the last lessons. While file globbing is used to search for files and directories, regex is used to search the contents of files. If you think of a normal word processing program and use the find and replace function there, that works on the same principle as regex: a text is searched for certain characters and, if you want, they are replaced by something else.

Regex is used in various programming languages. The commands for using regex differ quite a bit from one language to the next, but the way the search patterns work is largely the same everywhere. We already know some things from file globbing, for example the asterisk wildcard, the question mark, or the square brackets. With regex, all of this is partly the same, partly with a slightly different meaning. There are two types of regular expressions, namely basic and extended regular expressions. Basic regular expressions can be used with grep. Extended regular expressions require egrep (or grep with the -E option, as we will see). The difference will become clear in this lesson. So let’s first look at grep.
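To make "partly with a slightly different meaning" concrete, here is a minimal sketch for the asterisk. The three sample words are made up for illustration and do not come from the lesson. The shell's case statement is used to test the globbing pattern, and grep is used to test the regex.

```shell
# Hypothetical sample words, chosen only to show the difference.
# Globbing: in a shell pattern, * stands for any string,
# so c*t fits all three words.
glob_hits=$(for w in ct cat coot; do
  case $w in
    (c*t) echo "$w" ;;
  esac
done)

# Regex: * repeats the character in front of it zero or more times,
# so co*t fits ct and coot, but not cat.
# The -x option makes grep match the whole line.
regex_hits=$(printf '%s\n' ct cat coot | grep -x 'co*t')

echo "glob matches:"
echo "$glob_hits"
echo "regex matches:"
echo "$regex_hits"
```

The same character, the asterisk, therefore selects different words depending on whether it is read as a globbing wildcard or as a regex.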

We have already used grep many times in this course, whenever we only wanted to output a small part of a large text or a large list, for example ls -la / and then grep tmp, that is: ls -la / | grep tmp. With ls -la / we display the entire content of the root directory, and we forward the result through the pipe to the grep command. grep, for its part, searches for tmp in that output and prints only the matching lines. grep can also be used on its own, so we don’t always have to pass the result of cat or ls to grep first. I did this in the course mainly to make it clearer what grep actually does. Alternatively, grep could be used like this: grep manuel /etc/passwd. So I look for the word manuel in the password file, and grep gives me the matching line.
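The two ways of calling grep described above can be sketched as follows. The temporary file and its three lines are assumptions that stand in for a real password file, so the example does not depend on the users of any particular system.

```shell
# Hypothetical sample data standing in for a real file such as /etc/passwd.
sample=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'manuel:x:1000:1000:Manuel:/home/manuel:/bin/bash' \
  'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' > "$sample"

# Variant 1: another command produces the text, grep filters it through the pipe.
via_pipe=$(cat "$sample" | grep manuel)

# Variant 2: grep reads the file itself, no cat needed.
direct=$(grep manuel "$sample")

echo "$via_pipe"
echo "$direct"

rm -f "$sample"
```

Both variants print the same line. The pipe is only needed when the text comes from another command rather than from a file.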

Of course, grep also has various options. For example, the option -v, which this time doesn’t mean verbose. Here -v inverts the search, that is, every line that does not contain manuel is output. We try this out with grep -v manuel /etc/passwd, and here we see a long list in which manuel appears nowhere. We can also add the option -n, which outputs line numbers as well. So grep -vn manuel /etc/passwd gives the same result as before, but this time with line numbers. With grep -o, only the characters, or only the word, that you searched for are output. So grep -o manuel /etc/passwd, and we get the word manuel printed twice. So it occurs twice in this file.
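A minimal sketch of the three options, again run against a made-up sample file instead of the real password file (the file contents are an assumption for illustration):

```shell
# Hypothetical sample data standing in for a real file such as /etc/passwd.
sample=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'manuel:x:1000:1000:Manuel:/home/manuel:/bin/bash' \
  'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' > "$sample"

# -v: print only the lines that do NOT contain the search term.
inverted=$(grep -v manuel "$sample")

# -vn: the same lines, each prefixed with its line number in the file.
numbered=$(grep -vn manuel "$sample")

# -o: print only the matched text itself, one match per output line.
only_match=$(grep -o manuel "$sample")

echo "$inverted"
echo "$numbered"
echo "$only_match"

rm -f "$sample"
```

In this sample, manuel appears twice in the same line (the user name and the home directory), so -o prints it twice. The capitalized Manuel is not matched because grep is case sensitive unless -i is given.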

So far, however, we haven’t used any regular expressions, because we’ve only used grep and a handful of its options. So let’s add a few regular expressions. But first I’ll show you the file: I have prepared a small text file, regex.txt, nothing really special, just to demonstrate the regular expressions. For example, the caret symbol (^) in front of a word, in this case the word this, means that the word must be at the beginning of a line. Without this symbol we would see other results. Let’s check that. With the caret we get one line of output: This is an example. Without the caret we would get three results.
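The effect of the caret can be reproduced with a small sketch. The contents of the lesson's regex.txt are not shown in the transcript, so the three lines below are invented for illustration.

```shell
# Hypothetical stand-in for the lesson's regex.txt.
sample=$(mktemp)
printf '%s\n' \
  'this is an example.' \
  'is this a line?' \
  'not like this' > "$sample"

# With the caret, only lines that START with "this" are printed.
anchored=$(grep '^this' "$sample")

# Without the caret, every line containing "this" anywhere counts.
# -c prints the number of matching lines instead of the lines themselves.
unanchored_count=$(grep -c 'this' "$sample")

echo "$anchored"
echo "$unanchored_count"

rm -f "$sample"
```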

There is of course also an expression that describes the end of a line. First we look for the word example. So grep example and then the file regex.txt, and we have five results here. Now we use the dollar sign ($) as a regular expression that says the word example must be at the end of a line in order to be displayed. So again, grep -i 'example$' regex.txt, and now we get no result at all. Why? The word example does appear near the end of several lines, as we can see here. But if we look closely, we see that there is always something after the word example: in this case an s, in this case a period, here it is not at the end at all, and here there is a bracket.

So accordingly, the word example never really comes at the very end of a line. If we change that a bit and say that after the word example there should be a period, we can write the period and then the dollar sign: grep -i 'example.$' regex.txt. In this case we even get two results. In regex, the period stands for any single character, which is why one of the results ends with a square bracket. How do we get grep to show us only the line that really ends with a period? We put a backslash in front of the period. So grep -i 'example\.$' regex.txt.
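The three end-of-line patterns can be compared side by side in a short sketch. The sample lines are assumptions made up to mirror the situation described, not the actual contents of regex.txt. The patterns are put in single quotes so that the shell passes the dollar sign and the backslash to grep unchanged.

```shell
# Hypothetical stand-in for the lesson's regex.txt.
sample=$(mktemp)
printf '%s\n' \
  'This is an example.' \
  'Two examples are shown here' \
  'One more [example]' > "$sample"

# "example" directly at the end of a line: nothing matches here.
# grep returns a non-zero exit status when it finds nothing, hence "|| true".
at_end=$(grep -i 'example$' "$sample" || true)

# "example" followed by ANY single character, then the end of the line.
any_char=$(grep -i 'example.$' "$sample")

# "example" followed by a LITERAL period, then the end of the line.
literal_dot=$(grep -i 'example\.$' "$sample")

echo "at_end: $at_end"
echo "any_char:"
echo "$any_char"
echo "literal_dot: $literal_dot"

rm -f "$sample"
```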

The backslash means that the period that follows is not interpreted as a special character, but is taken literally as a period. So when we run it, we get only this one result. Now let’s combine a few characters. For example grep "wonderful|Wonder Woman" regex.txt. This regular expression means that we are looking for the word wonderful or for the word Wonder Woman. And in this case we get no result, although both words can be found in the file. I’ll show you that again: here at the bottom we have the word wonderful and we have the word Wonder Woman. So a result should actually have been returned.

Why is that not the case? The pipe symbol here does not mean, in regex, that we are forwarding a result somewhere; it simply means or, so wonderful or Wonder Woman. And this or symbol can only be used with the so-called extended regular expressions. So we need to tell grep to switch to extended mode. Maybe you remember that at the beginning of this video I told you that basic regular expressions can be used with grep and extended regular expressions with egrep. So there are two options for us here: we could use grep with the -E option or the egrep command, which is ultimately one and the same.

So let’s first run the command with the -E option. The capital letter is important. So grep -E and then again "wonderful|Wonder Woman" regex.txt, and now the lines with the two words are shown to us. As a test, we do the whole thing again with the egrep command, so egrep "wonderful|Wonder Woman" regex.txt, and we have the same result as with grep and the -E option. So it’s exactly the same. Of course, when we use regular expressions we can still use the normal options of grep or egrep. For example, again the option -v, which says that everything except the lines with the words, in this case wonderful or Wonder Woman, is displayed. We do the test with egrep -v and then again "wonderful|Wonder Woman" regex.txt, and yes, the result is inverted.
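Here is a minimal sketch of the difference between basic and extended mode for the or symbol. The sample lines are invented for illustration. The sketch uses grep -E throughout, which the lesson describes as equivalent to egrep; recent GNU grep releases also print a warning when egrep is called, so grep -E is the quieter choice.

```shell
# Hypothetical stand-in for the lesson's regex.txt.
sample=$(mktemp)
printf '%s\n' \
  'What a wonderful day' \
  'Wonder Woman saves the day' \
  'Nothing to see here' > "$sample"

# Basic mode: the pipe is an ordinary character, so grep looks for the
# literal text "wonderful|Wonder Woman" and finds nothing.
basic=$(grep 'wonderful|Wonder Woman' "$sample" || true)

# Extended mode: the pipe means "or".
extended=$(grep -E 'wonderful|Wonder Woman' "$sample")

# Extended mode combined with -v: all lines that match neither alternative.
inverted=$(grep -Ev 'wonderful|Wonder Woman' "$sample")

echo "basic: $basic"
echo "extended:"
echo "$extended"
echo "inverted: $inverted"

rm -f "$sample"
```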

Here we see everything that is in the file except the lines with wonderful and Wonder Woman. Regular expressions are a huge and complicated field, and we neither want to nor can go that deep for the LPIC-1 exam. Below this video I have placed a small PDF file in which I have included a few important regex commands. I definitely recommend going through these and trying things out a bit for yourself. There are also various so-called cheat sheets for regex on the Internet with dozens of regular expressions. I recommend that you spend a little time with them.

I hope the difference between grep and egrep, or grep with the -E option, has become clear. We have another grep on the list, namely fgrep. fgrep does not interpret any characters; it only takes the characters that are passed to it and searches for exactly those. Let’s take the example we just had, using fgrep instead of egrep: fgrep "wonderful|Wonder Woman" regex.txt. We get no results here, because fgrep searches the regex.txt file for exactly the text that is between the quotation marks. It does not interpret anything, and this exact text isn’t found in regex.txt. So it’s correct that we have no result here.
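The literal search can be sketched with grep -F, the option form of fgrep. The sample file is an assumption; unlike the lesson's regex.txt, it deliberately contains one line with the literal text including the pipe, so that the fixed-string search has something to find.

```shell
# Hypothetical sample file; the third line contains the pattern text literally.
sample=$(mktemp)
printf '%s\n' \
  'What a wonderful day' \
  'Wonder Woman saves the day' \
  'The text wonderful|Wonder Woman with a literal pipe' > "$sample"

# -F: nothing is interpreted, the pipe is just a character.
# Only the line containing the exact text matches.
fixed=$(grep -F 'wonderful|Wonder Woman' "$sample")

# -E for comparison: the pipe means "or", so all three lines match.
# -c prints the number of matching lines.
extended_count=$(grep -Ec 'wonderful|Wonder Woman' "$sample")

echo "$fixed"
echo "$extended_count"

rm -f "$sample"
```

Without the third line, the -F search would come back empty, which is the result seen in the lesson.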
