Thursday, June 4, 2009

Using Grep Command

Grep is a command-line text search utility originally written for Unix. The difference between grep and find is that grep searches for a string inside files, while find searches for files or directories.

The grep command searches files or standard input globally for lines matching a given regular expression, and prints them to the program's standard output.

sahab@sahab-desktop:~$ grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash

sahab@sahab-desktop:~$ grep -n root /etc/passwd
1:root:x:0:0:root:/root:/bin/bash
-n, --line-number
Prefix each line of output with the line number within its input
file.
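
To try -n without depending on what is in /etc/passwd, here is a minimal sketch that builds a throwaway file first. The file name sample.txt and its contents are made up for illustration.

```shell
# Throwaway sample file; the name and contents are invented for illustration.
tmpdir=$(mktemp -d)
printf 'alpha\nroot one\nbeta\nroot two\n' > "$tmpdir/sample.txt"

# -n prefixes every matching line with its line number in the input file.
numbered=$(grep -n root "$tmpdir/sample.txt")
printf '%s\n' "$numbered"
# 2:root one
# 4:root two

rm -rf "$tmpdir"
```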

sahab@sahab-desktop:~$ grep -v bash /etc/passwd | grep -v nologin
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
-v, --invert-match
Invert the sense of matching, to select non-matching lines.

sahab@sahab-desktop:~$ grep -c bash /etc/passwd
4

sahab@sahab-desktop:~$ grep -c nologin /etc/passwd
1
-c, --count
Suppress normal output; instead print a count of matching lines
for each input file. With the -v, --invert-match option (see
above), count non-matching lines.
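
The -v and -c options can be checked together on a small file. This is only a sketch; accounts.txt and its four lines are invented stand-ins for /etc/passwd.

```shell
# Invented stand-in for /etc/passwd with four accounts.
tmpdir=$(mktemp -d)
printf 'a:/bin/bash\nb:/bin/sh\nc:/bin/bash\nd:/usr/sbin/nologin\n' > "$tmpdir/accounts.txt"

# Chained -v filters: drop the bash lines, then drop the nologin lines.
others=$(grep -v bash "$tmpdir/accounts.txt" | grep -v nologin)

# -c counts matching lines; -v -c counts the non-matching ones instead.
with_bash=$(grep -c bash "$tmpdir/accounts.txt")
without_bash=$(grep -v -c bash "$tmpdir/accounts.txt")

printf '%s\n' "$others" "$with_bash" "$without_bash"
# b:/bin/sh
# 2
# 2

rm -rf "$tmpdir"
```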

sahab@sahab-desktop:~$ grep -i ps ~/.bash*
/home/sahab/.bash_history:ps -ax
/home/sahab/.bash_history:tops
/home/sahab/.bash_history:ps -ax | grep -i giis
/home/sahab/.bash_history:ps -ax
/home/sahab/.bashrc:[ -z "$PS1" ] && return
/home/sahab/.bashrc:export HISTCONTROL=$HISTCONTROL${HISTCONTROL+,}ignoredups
/home/sahab/.bashrc:# ... or force ignoredups and ignorespace

sahab@sahab-desktop:~$ grep -i ps ~/.bash* | grep -v history
/home/sahab/.bashrc:[ -z "$PS1" ] && return
-i, --ignore-case
Ignore case distinctions in both the PATTERN and the input
files.

To list the files under /home/sahab that contain "ar" as a whole word, searching recursively:

$ grep -rwl 'ar' /home/sahab/
grep: /home/sahab/.kde/socket-sahab-desktop: No such file or directory
grep: /home/sahab/.kde/tmp-sahab-desktop:
-R, -r, --recursive
Read all files under each directory, recursively; this is
equivalent to the -d recurse option.
-w, --word-regexp
Select only those lines containing matches that form whole
words.
-x, --line-regexp
Select only those matches that exactly match the whole line.
-l, --files-with-matches
Suppress normal output; instead print the name of each input
file from which output would normally have been printed. The
scanning will stop on the first match.
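
The home directory above produces mostly error messages, so here is a small sketch of the same options on a directory tree built for the purpose. All file names and contents are invented for illustration.

```shell
# Invented directory tree with three small files.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/sub"
printf 'ar is a tool\n' > "$tmpdir/a.txt"
printf 'a jar of jam\n' > "$tmpdir/sub/b.txt"
printf 'ar\n'           > "$tmpdir/sub/c.txt"

# -r recurses, -w requires a whole word, -l prints only file names.
# "jar" contains "ar" but not as a whole word, so b.txt is not listed.
word_files=$(grep -rwl 'ar' "$tmpdir" | sort)

# -x requires the whole line to match, so only c.txt is listed.
line_files=$(grep -rxl 'ar' "$tmpdir" | sort)

printf '%s\n' "$word_files" "$line_files"

expected_word_files="$tmpdir/a.txt
$tmpdir/sub/c.txt"
expected_line_files="$tmpdir/sub/c.txt"

rm -rf "$tmpdir"
```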

We now want to display only the lines starting with the string "root":

$ grep ^root /etc/passwd
root:x:0:0:root:/root:/bin/bash
The caret ^ and the dollar sign $ are meta-characters that respectively
match the empty string at the beginning and end of a line. The symbols
\< and \> respectively match the empty string at the beginning and end
of a word. The symbol \b matches the empty string at the edge of a
word, and \B matches the empty string provided it is not at the edge of
a word.
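
A short sketch of the anchors on an invented three-line file. It assumes GNU grep, because \B is a GNU extension.

```shell
# Invented sample lines; assumes GNU grep for the \B extension.
tmpdir=$(mktemp -d)
printf 'root:x:0\nchroot:x:1\nuser:/root\n' > "$tmpdir/sample.txt"

# ^ anchors the match at the beginning of the line.
at_start=$(grep '^root' "$tmpdir/sample.txt")

# $ anchors the match at the end of the line.
at_end=$(grep 'root$' "$tmpdir/sample.txt")

# \B requires that "root" does NOT begin at a word edge, as in "chroot".
inside_word=$(grep '\Broot' "$tmpdir/sample.txt")

printf '%s\n' "$at_start" "$at_end" "$inside_word"
# root:x:0
# user:/root
# chroot:x:1

rm -rf "$tmpdir"
```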

$ grep -w / /etc/fstab
UUID=8d7122c6-6549-4622-83b8-7855ad822edc / ext3 relatime,errors=remount-ro 0 1
$ grep / /etc/fstab
#/etc/fstab: static file system information.
proc /proc proc defaults 0 0
# /dev/sda6
UUID=8d7122c6-6549-4622-83b8-7855ad822edc / ext3 relatime,errors=remount-ro 0 1
# /dev/sda7
UUID=f9ca09b2-9a7d-472d-9ecd-a8d156737559 /data ext3 relatime 0 2
# /dev/sda3
/dev/scd0 /media/cdrom0 udf,iso9660 user,noauto,exec,utf8 0 0

Here is an example shell command that invokes GNU `grep':

grep -i 'hello.*world' menu.h main.c

This lists all lines in the files `menu.h' and `main.c' that contain the string `hello' followed by the string `world'; this is because `.*' matches zero or more characters within a line (see the "Regular Expressions" section of the GNU grep manual). The `-i' option causes `grep' to ignore case, causing it to match the line `Hello, world!', which it would not otherwise match. See the "Invoking" section of the manual for more details about how to invoke `grep'.
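
To see the effect of `-i' here, this sketch creates two tiny files with the same names as in the example. Their contents are invented for illustration.

```shell
# menu.h and main.c are created with invented contents for this example.
tmpdir=$(mktemp -d)
printf '%s\n' '#define GREETING "Hello, world!"' > "$tmpdir/menu.h"
printf '%s\n' 'puts("hello");' 'puts("hello, big world");' > "$tmpdir/main.c"

# With -i, "Hello, world!" in menu.h matches as well.
ignoring_case=$(cd "$tmpdir" && grep -i 'hello.*world' menu.h main.c)

# Without -i, only the all-lowercase line in main.c matches.
case_sensitive=$(cd "$tmpdir" && grep 'hello.*world' menu.h main.c)

printf '%s\n' "$ignoring_case"
# menu.h:#define GREETING "Hello, world!"
# main.c:puts("hello, big world");

rm -rf "$tmpdir"
```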

Here are some common questions and answers about `grep' usage.

1. How can I list just the names of matching files?

grep -l 'main' *.c

lists the names of all C files in the current directory whose
contents mention `main'.

2. How do I search directories recursively?

grep -r 'hello' /home/sahab

searches for `hello' in all files under the directory
`/home/sahab'. For more control of which files are searched, use
`find', `grep' and `xargs'. For example, the following command
searches only C files:

find /home/sahab -name '*.c' -print | xargs grep 'hello' /dev/null

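This differs from the command:

grep -r 'hello' *.c

which merely looks for `hello' in all files in the current directory
whose names end in `.c'. Here the `-r' is probably unnecessary, as
recursion occurs only in the unlikely event that one of the `.c' files
is a directory.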
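
One thing to watch with the find and xargs pipeline is that plain xargs splits its input on whitespace, so file names containing spaces get broken up. A possible alternative, sketched here on an invented directory tree, is to let find run grep itself with -exec ... {} +.

```shell
# Invented tree; note the space in the directory name.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/src dir"
printf 'hello from C\n'    > "$tmpdir/src dir/a.c"
printf 'hello from text\n' > "$tmpdir/notes.txt"

# find passes the matching file names straight to grep,
# so names with spaces arrive intact. -l prints only the file names.
found=$(find "$tmpdir" -name '*.c' -exec grep -l 'hello' {} +)
printf '%s\n' "$found"

expected_found="$tmpdir/src dir/a.c"

rm -rf "$tmpdir"
```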


3. What if a pattern has a leading `-'?

grep -e '--cut here--' *

searches for all lines matching `--cut here--'. Without `-e',
`grep' would attempt to parse `--cut here--' as a list of options.
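
Here is a small sketch of this on an invented file. It also shows `--', the usual end-of-options marker, which has the same effect as `-e' for a single pattern.

```shell
# Invented sample file containing a line that starts with dashes.
tmpdir=$(mktemp -d)
printf 'keep this\n--cut here--\nkeep that\n' > "$tmpdir/sample.txt"

# -e marks the next argument as the pattern, even if it starts with "-".
with_e=$(grep -e '--cut here--' "$tmpdir/sample.txt")

# "--" ends option parsing, so what follows is taken as the pattern.
with_ddash=$(grep -- '--cut here--' "$tmpdir/sample.txt")

printf '%s\n' "$with_e" "$with_ddash"

rm -rf "$tmpdir"
```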

4. Suppose I want to search for a whole word, not a part of a word?

grep -w 'hello' *

searches only for instances of `hello' that are entire words; it
does not match `Othello'. For more control, use `\<' and `\>' to
match the start and end of words. For example:

grep 'hello\>' *

searches only for words ending in `hello', so it matches the word
`Othello'.
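
The three variants can be compared side by side on an invented word list. This sketch assumes GNU grep, since `\<' and `\>' are extensions.

```shell
# Invented word list; assumes GNU grep for \< and \>.
tmpdir=$(mktemp -d)
printf 'hello\nOthello\nhellos\n' > "$tmpdir/words.txt"

# -w: "hello" must be a complete word.
whole_word=$(grep -w 'hello' "$tmpdir/words.txt")

# \> anchors only the end of the word, so "Othello" also matches.
word_end=$(grep 'hello\>' "$tmpdir/words.txt")

# \< anchors only the start of the word, so "hellos" also matches.
word_start=$(grep '\<hello' "$tmpdir/words.txt")

printf '%s\n' "$whole_word" "$word_end" "$word_start"

rm -rf "$tmpdir"
```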

5. How do I output context around the matching lines?

grep -C 2 'hello' *

prints two lines of context around each matching line.
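
A sketch of context output on an invented six-line file. It also uses `-A' and `-B', which print context only after or only before the match.

```shell
# Invented six-line sample file.
tmpdir=$(mktemp -d)
printf 'one\ntwo\nhello\nfour\nfive\nsix\n' > "$tmpdir/sample.txt"

# -C 1: one line of context on each side of the match.
around=$(grep -C 1 'hello' "$tmpdir/sample.txt")

# -A 1: one line after the match; -B 1: one line before it.
after=$(grep -A 1 'hello' "$tmpdir/sample.txt")
before=$(grep -B 1 'hello' "$tmpdir/sample.txt")

printf '%s\n' "$around"
# two
# hello
# four

rm -rf "$tmpdir"
```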

6. How do I force grep to print the name of the file?

Append `/dev/null':

grep 'test' /etc/passwd /dev/null

gets you:
/etc/passwd:test:x:1002:1002:,,,:/home/test:/bin/bash

7. Why do people use strange regular expressions on `ps' output?

ps -ef | grep '[c]ron'

If the pattern had been written without the square brackets, it
would have matched not only the `ps' output line for `cron', but
also the `ps' output line for `grep'. Note that on some platforms
`ps' limits the output to the width of the screen; grep does not
have any limit on the length of a line except the available memory.
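
Real `ps' output differs from machine to machine, so this sketch imitates it with two invented process listings. The second line of each listing stands for the grep process itself, whose command line contains the pattern exactly as it was typed.

```shell
# Simulated ps listings; the PIDs and command lines are invented.
# The last line of each listing is the grep process itself.
plain_listing='root 101 /usr/sbin/cron -f
sahab 202 grep cron'
bracket_listing='root 101 /usr/sbin/cron -f
sahab 202 grep [c]ron'

# The pattern "cron" also matches grep's own command line.
plain_count=$(printf '%s\n' "$plain_listing" | grep -c 'cron')

# "[c]ron" still matches "cron", but the literal text "[c]ron"
# in grep's command line does not contain "cron".
bracket_count=$(printf '%s\n' "$bracket_listing" | grep -c '[c]ron')

printf '%s\n' "$plain_count" "$bracket_count"
# 2
# 1
```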

8. Why does `grep' report "Binary file matches"?

If `grep' listed all matching "lines" from a binary file, it would
probably generate output that is not useful, and it might even
muck up your display. So GNU `grep' suppresses output from files
that appear to be binary files. To force GNU `grep' to output
lines even from files that appear to be binary, use the `-a' or
`--binary-files=text' option. To eliminate the "Binary file
matches" messages, use the `-I' or `--binary-files=without-match'
option.
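
A rough sketch of both options, assuming GNU grep. The file name blob is invented, and the NUL byte written into it is what makes grep treat it as binary.

```shell
# Invented file; the NUL byte makes GNU grep treat it as binary.
tmpdir=$(mktemp -d)
printf 'text\000binary\n' > "$tmpdir/blob"

# -a treats the file as text and prints the matching line.
# tr removes the NUL byte so the shell variable can hold the result.
as_text=$(grep -a 'binary' "$tmpdir/blob" | tr -d '\000')

# -I treats a binary file as if it contained no match,
# so grep reports failure and prints no message.
if grep -I -q 'binary' "$tmpdir/blob"; then
    skipped=no
else
    skipped=yes
fi

printf '%s\n' "$as_text" "$skipped"

rm -rf "$tmpdir"
```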

9. Why doesn't `grep -lv' print nonmatching file names?

`grep -lv' lists the names of all files containing one or more
lines that do not match. To list the names of all files that
contain no matching lines, use the `-L' or `--files-without-match'
option.
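
The difference is easy to see with three invented files: one where every line matches, one where some lines match, and one where none do.

```shell
# Three invented files: all lines match, some match, none match.
tmpdir=$(mktemp -d)
printf 'main\nmain\n'  > "$tmpdir/all.txt"
printf 'main\nother\n' > "$tmpdir/some.txt"
printf 'other\n'       > "$tmpdir/none.txt"

# -lv: files with at least one line that does NOT match.
lv_files=$(cd "$tmpdir" && grep -lv 'main' all.txt some.txt none.txt)

# -L: files with no matching line at all.
big_l_files=$(cd "$tmpdir" && grep -L 'main' all.txt some.txt none.txt)

printf '%s\n' "$lv_files"
# some.txt
# none.txt
printf '%s\n' "$big_l_files"
# none.txt

rm -rf "$tmpdir"
```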

10. I can do OR with `|', but what about AND?

grep 'sahab' /etc/motd | grep 'ubuntu,jaunty'

finds all lines that contain both `sahab' and `ubuntu,jaunty'.
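
A sketch of AND on an invented file. As an alternative to the pipe, it also writes both orders into a single extended regular expression with `-E'.

```shell
# Invented sample lines.
tmpdir=$(mktemp -d)
printf 'sahab uses ubuntu\nsahab uses debian\nroot uses ubuntu\n' > "$tmpdir/motd"

# AND as a pipeline: the second grep filters the output of the first.
piped=$(grep 'sahab' "$tmpdir/motd" | grep 'ubuntu')

# AND as one extended regex that allows either order of the two words.
single=$(grep -E 'sahab.*ubuntu|ubuntu.*sahab' "$tmpdir/motd")

printf '%s\n' "$piped" "$single"

rm -rf "$tmpdir"
```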

11. How can I search in both standard input and in files?

Use the special file name `-':

cat /etc/passwd | grep 'sahab' - /etc/motd
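
A sketch with invented input, assuming GNU grep. With more than one input, grep prefixes each line with its source, and GNU grep labels standard input as "(standard input)". The `-h' option suppresses those prefixes.

```shell
# Invented file standing in for /etc/motd.
tmpdir=$(mktemp -d)
printf 'sahab in file\n' > "$tmpdir/motd"

# "-" stands for standard input; each output line is prefixed with its source.
labelled=$(printf 'sahab on stdin\n' | grep 'sahab' - "$tmpdir/motd")

# -h drops the source prefixes.
plain=$(printf 'sahab on stdin\n' | grep -h 'sahab' - "$tmpdir/motd")

printf '%s\n' "$labelled" "$plain"

rm -rf "$tmpdir"
```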

12. How to express palindromes in a regular expression?

It can be done by using back-references; for example, a
palindrome of 5 characters can be written in BRE:

grep -w -e '\(.\)\(.\).\2\1' file

It matches the word "radar" or "civic".

Guglielmo Bondioni proposed a single RE that finds all the
palindromes up to 19 characters long.

egrep -e '^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?).?\9\8\7\6\5\4\3\2\1$' file

Note that this relies on GNU ERE extensions, so it might not be
portable to other greps.
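
The BRE pattern above has five positions: two captured characters, one middle character, then the two captures in reverse order. Dropping the middle `.' gives a four-character version. A sketch on an invented word list:

```shell
# Invented word list.
tmpdir=$(mktemp -d)
printf 'radar\ncivic\nlevel\nhello\nnoon\n' > "$tmpdir/words.txt"

# Five characters: \1 \2 (any character) \2 \1, as a whole word.
five=$(grep -w -e '\(.\)\(.\).\2\1' "$tmpdir/words.txt")

# Four characters: the same pattern without the middle character.
four=$(grep -w -e '\(.\)\(.\)\2\1' "$tmpdir/words.txt")

printf '%s\n' "$five"
# radar
# civic
# level
printf '%s\n' "$four"
# noon

rm -rf "$tmpdir"
```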
