Friday, September 12, 2008

Delete lines based on another file - awk


$ cat main.txt
ID1:A:45
ID2:B:12
ID4:C:12
ID3:D:56
ID7:F:90
ID9:K:14
ID5:P:32

$ cat filter.txt
ID7:0
ID3:0
ID4:0

Required output: Delete the lines of "main.txt" whose ID field (the first field) matches an ID in "filter.txt". In other words, the output file, say rest.txt, is main.txt minus the lines whose IDs are listed in filter.txt.

The awk solution:

$ awk >rest.txt 'NR==FNR{arr[$1];next}!($1 in arr)' FS=":" filter.txt main.txt

or

$ awk >rest.txt 'NR==FNR{_[$1];next}!($1 in _)' FS=":" filter.txt main.txt

Result:

$ cat rest.txt
ID1:A:45
ID2:B:12
ID9:K:14
ID5:P:32
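
For readers who want to see how the one-liner works, here is the same logic written out with comments. This is only a sketch: the array name "seen" and the scratch directory are illustrative choices, and -F: is used in place of the trailing FS=":" assignment.

```shell
# Recreate the sample inputs from this post in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' ID1:A:45 ID2:B:12 ID4:C:12 ID3:D:56 ID7:F:90 ID9:K:14 ID5:P:32 > "$tmpdir/main.txt"
printf '%s\n' ID7:0 ID3:0 ID4:0 > "$tmpdir/filter.txt"

awk -F: '
    # NR counts records across all files, FNR restarts for each file,
    # so NR == FNR holds only while the first file (filter.txt) is read.
    # Referencing seen[$1] creates the key; next skips the second rule.
    NR == FNR { seen[$1]; next }

    # For the second file (main.txt): a pattern with no action prints
    # the line, here only when its ID was not recorded from filter.txt.
    !($1 in seen)
' "$tmpdir/filter.txt" "$tmpdir/main.txt" > "$tmpdir/rest.txt"

cat "$tmpdir/rest.txt"
```

One caveat with the NR==FNR idiom: if filter.txt is empty, NR==FNR also holds for main.txt, so every line of main.txt is treated as a filter entry and nothing is printed.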

3 comments:

豆豆 said...

hi, the code just doesn't work for me:

awk: syntax error near line 1
awk: bailing out near line 1

would you take a look at the problem please? thanks!

Jadu Kumar Saikia said...

Hi,
both of my one-liners work for me; I am using

GNU Awk 3.1.4

On Solaris, please use nawk, gawk, or /usr/xpg4/bin/awk instead of the default awk.

// Jadu

Zachary Young said...

Great, worked perfectly. Thank you :)

© Jadu Saikia www.UNIXCL.com