Notes 3

Chapter 4

Section 4.5

The book starts out with a misleading statement: “The \( \) and \{ \}, however, are not allowed.” You can use them; just write them without the backslashes, as plain ( ) and { }.

Section 4.5.1

Looking at the lines found in Example 4.31 might make you think that egrep '3+' datafile means “one or more 3s anywhere on the line”; it really means “one or more consecutive 3s anywhere on the line.”
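
To see the "consecutive" part for yourself, here is a small sketch. It uses grep -E, which is equivalent to egrep, together with the -o option (supported by GNU grep, and not covered in this section), which prints each matched piece on its own line. The three input lines are made up for illustration; they are not from the book's datafile.

```shell
# Three sample lines: a single 3, two consecutive 3s, and two 3s separated by a 0.
# -o prints every match on its own line instead of printing the whole line.
runs=$(printf '3\n33\n303\n' | grep -E -o '3+')
printf '%s\n' "$runs"
```

This prints 3, 33, 3, and 3 on separate lines: the 33 is reported as one match, while the two 3s in 303 are reported as two separate matches because the 0 breaks the run.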

Similarly, the output from Example 4.33 doesn’t give you a good idea of what the plus sign does. As a more to-the-point example, type these lines into a file named temp.txt:

met
meat
mitt
mutt
maitre d'
muumuu

Then run this command to find all lines that contain an m, followed by one or more of the vowels a, e, i, or o, followed by a t:

egrep 'm[aeio]+t' temp.txt
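
If you would rather not type the file by hand, the following sketch builds the same word list and runs the same pattern. It is only an illustration: it writes to a scratch file created with mktemp instead of temp.txt (my assumption, so nothing in your directory gets overwritten), and it uses grep -E, which is equivalent to egrep.

```shell
# Recreate the word list from the notes in a scratch file.
wordfile=$(mktemp)
printf '%s\n' met meat mitt mutt "maitre d'" muumuu > "$wordfile"

# Same pattern as above: m, one or more of a/e/i/o, then t.
found=$(grep -E 'm[aeio]+t' "$wordfile")
printf '%s\n' "$found"

rm -f "$wordfile"
```

The output is met, meat, mitt, and maitre d'. The lines mutt and muumuu are skipped because u is not in the bracket list.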

Section 4.6

Why would you ever want to use fgrep? If you are doing a search for something that contains a lot of metacharacters (for example, a line like a[3]=b[0]+5 in a program), you can use fgrep to avoid having to put backslashes everywhere.
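
As a rough sketch of the difference, the commands below search a two-line scratch file three ways. The file contents and variable names are made up for illustration. The commands use grep -F and grep -E, which are equivalent to fgrep and egrep.

```shell
# Scratch file: the line we want, plus a decoy line without brackets.
src=$(mktemp)
printf '%s\n' 'a[3]=b[0]+5' 'a3=b005' > "$src"

# Fixed-string search: every character is taken literally, no backslashes needed.
fixed=$(grep -F 'a[3]=b[0]+5' "$src")

# The same search as a regular expression needs five backslashes.
escaped=$(grep -E 'a\[3\]=b\[0\]\+5' "$src")

# Without the backslashes, [3] and [0] are bracket lists and + means "one or more",
# so the pattern matches the decoy line instead of the one we wanted.
unescaped=$(grep -E 'a[3]=b[0]+5' "$src")

printf 'fixed:     %s\nescaped:   %s\nunescaped: %s\n' "$fixed" "$escaped" "$unescaped"
rm -f "$src"
```

The first two searches both find a[3]=b[0]+5; the unescaped regular expression finds only a3=b005.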

Section 4.7

Everything up to this point has been valid for grep as found on generic UNIX systems. This section tells you what has been added to the GNU version of grep, which is what you will find on Linux systems.

Section 4.7.1

There is an interaction between using ranges, character encoding, and egrep. Presume the following file, letters:

å
a
A

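The default character encoding on most Linux systems is Unicode (UTF-8), and in such locales ranges may be sorted into dictionary order, so that uppercase and lowercase letters fall into the same range. Depending on your locale settings, you will get these results from these commands: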

egrep '[abcdefghijklmnopqrstuvwxyz]' letters # finds only line two
egrep '[a-z]' letters # finds all three lines
egrep '[[:lower:]]' letters # finds the first two lines
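
Because these results depend on the locale in effect, one way to get predictable behavior is to force the C locale for a single command by setting LC_ALL=C. This is a suggestion that goes beyond the book, offered as a minimal sketch. In the C locale every byte is treated as its own character, so neither the range nor the character class matches å or A. The scratch file name is made up, and grep -E is equivalent to egrep.

```shell
# Build the letters file. \303\245 is the UTF-8 encoding of å, written as
# octal bytes so that this script itself stays plain ASCII.
letters=$(mktemp)
printf '\303\245\na\nA\n' > "$letters"

# -c counts matching lines. LC_ALL=C applies only to the one command it precedes.
range_hits=$(LC_ALL=C grep -E -c '[a-z]' "$letters")
class_hits=$(LC_ALL=C grep -E -c '[[:lower:]]' "$letters")

printf 'range: %s  class: %s\n' "$range_hits" "$class_hits"
rm -f "$letters"
```

Both counts come out as 1: only the plain a on line two matches.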

Which to use?

Unless you are searching for text that is itself full of metacharacters (where fgrep is the better choice), I recommend that you always use egrep, because it doesn’t require as many backslashes.

Section 4.11

Recursive grep (egrep -r) will indiscriminately search all files in the subdirectories. Presume you only wish to find things in files ending with .html. In that case, you would add the --include option, which lets you specify a shell pattern for files to include in the search.

egrep -r --include='*.html' thingToFind *

The *.html has to be placed in quote marks to prevent the shell from expanding the asterisk before egrep sees it. The last * means to search all files in the current directory; the --include option will filter out anything other than .html files.
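
Here is a small sketch you can run to watch the filter work. It builds a throwaway directory tree (the file and directory names are made up) in which every file contains the search text, then lists the files that a recursive search with --include reports. It uses grep -E, which is equivalent to egrep, adds -l to print file names only, and searches . (the current directory) where the command above uses *.

```shell
# Throwaway tree: two .html files and one .txt file, all containing the text.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'thingToFind\n' > "$demo/index.html"
printf 'thingToFind\n' > "$demo/notes.txt"
printf 'thingToFind\n' > "$demo/sub/page.html"

# -r searches subdirectories, -l lists matching file names,
# --include keeps only files whose names match the quoted shell pattern.
matches=$(cd "$demo" && grep -E -r -l --include='*.html' thingToFind . | sort)
printf '%s\n' "$matches"

rm -rf "$demo"
```

The listing shows ./index.html and ./sub/page.html. The file notes.txt contains the same text but is left out because its name does not match the pattern.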

Section 4.12

On page 114, there should not be two dashes after --help. On page 115, the long name for -v should be --invert-match.

Section 4.13

Example 4.67 may not work as advertised.