Grep From List of Words
Grep is a very powerful Unix command-line tool for searching plain-text data using regular expressions. The basic usage of grep is as follows:
grep [pattern] [file]
Since Wikipedia already has an excellent introduction to the basic functionality, I will jump straight to a less common feature: searching from a list of words. I found it yesterday when I needed to clean up and analyse some CSV files.
Let's say we have this text file that we want to search.
cat fruitlist.txt
apple
Pineapple
banana
pear
PEACH
orange
Then we have another list of words that we want to grep for.
cat lookup.txt
apple
orange
watermelon
peach
grep -f lookup.txt fruitlist.txt
yields
apple
Pineapple
orange
The -f command line argument reads one or more patterns from the lookup.txt file, one per line.
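Because grep treats each line of the pattern file as a regular expression, a character such as . in a lookup word acts as a wildcard. The following is a minimal sketch of how the standard -F option makes grep read the patterns as literal strings instead. The files versions.txt and lookup_versions.txt and their contents are hypothetical, made up only for this illustration.

```shell
# Hypothetical sample data, created in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'v1.0' 'v1x0' 'v2.0' > versions.txt
printf '%s\n' 'v1.0' > lookup_versions.txt

# Without -F the "." in the pattern matches any character,
# so v1x0 is returned as well as v1.0.
regex_hits=$(grep -f lookup_versions.txt versions.txt)

# With -F every pattern line is taken literally, so only v1.0 matches.
fixed_hits=$(grep -Ff lookup_versions.txt versions.txt)

printf 'regex patterns:\n%s\n' "$regex_hits"
printf 'fixed strings:\n%s\n' "$fixed_hits"
```

This is probably worth considering when the lookup list comes from real data such as csv columns, where dots and other regex characters are common.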
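To match only whole words, add the -w argument before -f. The -f argument has to come last in the combined form because it takes the file name as its value. -w means each expression in the list is matched as a whole word.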
grep -wf lookup.txt fruitlist.txt
yields
apple
orange
Pineapple is no longer matched.
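A whole-word match does not mean a whole-line match. As a rough sketch, the example below runs the same lookup.txt against a hypothetical menu.txt (invented for this illustration) whose lines contain more than one word. A word boundary here is any character that is not a letter, digit, or underscore, so a space or a hyphen both count.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# lookup.txt as in the article; menu.txt is hypothetical sample data.
printf '%s\n' apple orange watermelon peach > lookup.txt
printf '%s\n' 'apple pie' 'pineapple juice' 'green-apple' 'apples' > menu.txt

# "apple pie" and "green-apple" contain apple as a separate word.
# "pineapple juice" and "apples" only contain it inside a longer word.
word_hits=$(grep -wf lookup.txt menu.txt)

printf '%s\n' "$word_hits"
# prints:
# apple pie
# green-apple
```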
To invert the search and find the lines that match nothing in the lookup file, use the -v argument.
grep -vwf lookup.txt fruitlist.txt
yields
Pineapple
banana
pear
PEACH
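The same inversion also works in the other direction. As a small sketch using the two files from this post, swapping their roles lists the lookup words that never appear as a whole word in fruitlist.txt, which can be handy when checking a word list against a data file.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The two files from the article.
printf '%s\n' apple Pineapple banana pear PEACH orange > fruitlist.txt
printf '%s\n' apple orange watermelon peach > lookup.txt

# Roles swapped: patterns come from fruitlist.txt, lookup.txt is searched.
# -v keeps only the lookup words with no whole-word match in fruitlist.txt.
missing=$(grep -vwf fruitlist.txt lookup.txt)

printf '%s\n' "$missing"
# prints:
# watermelon
# peach
```

peach is reported as missing because the search is still case-sensitive, so PEACH in fruitlist.txt does not count as a match.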
To ignore case when searching, add the -i argument.
grep -iwf lookup.txt fruitlist.txt
yields
apple
PEACH
orange
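The arguments can be combined freely as long as -f stays last. As an illustrative sketch with the same two files, adding -v to the case-insensitive search lists the fruits that are not in the lookup list regardless of case.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The two files from the article.
printf '%s\n' apple Pineapple banana pear PEACH orange > fruitlist.txt
printf '%s\n' apple orange watermelon peach > lookup.txt

# -i ignores case, -w matches whole words, -v inverts the result.
# PEACH now matches peach, so it is dropped from the inverted output.
not_listed=$(grep -viwf lookup.txt fruitlist.txt)

printf '%s\n' "$not_listed"
# prints:
# Pineapple
# banana
# pear
```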
That's all the examples from me. Indeed, man grep is not pleasant documentation to read, but the full explanation of every feature is there.