Tuesday, March 18, 2008

Performing "join" using AWK


$ cat file1.txt
Alex:ID23:How:23:2004
Aina:ID12:Thomas:14:2003
Ciam:ID13:Dev:23:2000
Alot:ID34:Ya:24:2004
Brian:ID64:Low:25:1999

$ cat file2.txt
ID13:12,300
ID12:34,300
ID64:50,000

#Join both the files based on common 1st filed of file2.txt and 2nd field of file1.txt
#i.e. required Output:

Ciam:ID13:Dev:23:2000:12,300
Aina:ID12:Thomas:14:2003:34,300
Brian:ID64:Low:25:1999:50,000

Solution1:
$ awk '
BEGIN {FS=OFS=":"}
NR==FNR{arr[$1]=$2;next}
$2 in arr{print $0,arr[$2]}
' file2.txt file1.txt


Aina:ID12:Thomas:14:2003:34,300
Ciam:ID13:Dev:23:2000:12,300
Brian:ID64:Low:25:1999:50,000

Solution2:
$ awk '
BEGIN {FS=":"}
NR==FNR{arr[$2]=$0;next}
$1 in arr && $0=arr[$1] FS $2
' file1.txt file2.txt


Ciam:ID13:Dev:23:2000:12,300
Aina:ID12:Thomas:14:2003:34,300
Brian:ID64:Low:25:1999:50,000

Using "join":
Same thing if I have to do with "join" command, here are the steps:

$ cat file1.txt
Alex:ID23:How:23:2004
Aina:ID12:Thomas:14:2003
Ciam:ID13:Dev:23:2000
Alot:ID34:Ya:24:2004
Brian:ID64:Low:25:1999

#Sort file1.txt based on 2nd field, store the sorted output on file1.txt.srt
$ sort -t ":" +1 -2 -o file1.txt.srt file1.txt

$ cat file1.txt.srt
Aina:ID12:Thomas:14:2003
Ciam:ID13:Dev:23:2000
Alex:ID23:How:23:2004
Alot:ID34:Ya:24:2004
Brian:ID64:Low:25:1999

$ cat file2.txt
ID13:12,300
ID12:34,300
ID64:50,000

#Sort file2.txt based on 1st field, store the sorted output on file2.txt.srt
$ sort -t ":" +0 -1 -o file2.txt.srt file2.txt

$ cat file2.txt.srt
ID12:34,300
ID13:12,300
ID64:50,000

#Now perform "join" operation.
$ join -t ":" -1 2 -2 1 -o "1.1 1.2 1.3 1.4 1.5 2.2" file1.txt.srt file2.txt.srt
Aina:ID12:Thomas:14:2003:34,300
Ciam:ID13:Dev:23:2000:12,300
Brian:ID64:Low:25:1999:50,000

More about "join" can be found here

No comments:

© Jadu Saikia www.UNIXCL.com