Wednesday, May 6, 2009

Replace from lookup file in awk - bash


Input files:
main.txt is the master file with some details in the order
country-code|epoch|name|value

And id.txt is a look-up table which contains the country-code to country-name mapping.


$ cat main.txt
512|1241503759|ax|90
234|1241503760|ay|10
122|1241503823|az|90
123|1241503947|at|80

$ cat id.txt
122|US
123|IN
125|NZ
234|HK
512|ZM
600|KR

Required:
Country codes(field 1) in main.txt has to be replaced with the corresponding country-name from look-up table id.txt

Solutions: Normal bash scripting would be:

$ cat main.txt | while read line
do
code=$(echo $line | awk '{print $1}' FS=\|)
cn=$(awk -v CID=$code '$1==CID {print $2}' FS=\| id.txt)
echo $line|awk -v CN=$cn 'BEGIN {FS=OFS="|"} {$1=CN} 1'
done


Awk solutions:

$ awk '
BEGIN {FS=OFS="|"}
FNR==NR{a[$1]=$2;next}
$1 in a{print a[$1],$2,$3,$4}
' id.txt main.txt

Output:
ZM|1241503759|ax|90
HK|1241503760|ay|10
US|1241503823|az|90
IN|1241503947|at|80

or

$ awk '
BEGIN {FS=OFS="|"}
FNR==NR{a[$1]=$2;next}
$1 in a{print a[$1]"("$1")",$2,$3,$4}
' id.txt main.txt

Output:
ZM(512)|1241503759|ax|90
HK(234)|1241503760|ay|10
US(122)|1241503823|az|90
IN(123)|1241503947|at|80

or

$ awk '
BEGIN {FS=OFS="|"}
FNR==NR{a[$1]=$2;next}
$1 in a{print a[$1],$0}
' id.txt main.txt

ZM|512|1241503759|ax|90
HK|234|1241503760|ay|10
US|122|1241503823|az|90
IN|123|1241503947|at|80


Related post:
- Performing join using awk here
- Update file based on another file in awk here
- Difference between awk NR and FNR variables here
- Replace column with column of another file in awk here
- Match words between two files in bash here
- Delete line based on another file here

5 comments:

Unknown said...

In case the delimiter in the 2nd file is different from the 1st one; something like

$ cat main.txt
512|1241503759|ax|90
234|1241503760|ay|10
122|1241503823|az|90
123|1241503947|at|80

$ cat id.txt
122,US
123,IN
125,NZ
234,HK
512,ZM
600,KR

$ awk '
FNR==NR{a[$1]=$2;next}
$1 in a{print a[$1],$2,$3,$4}
' FS=\, id.txt FS=\| OFS=\| main.txt
ZM|1241503759|ax|90
HK|1241503760|ay|10
US|1241503823|az|90
IN|1241503947|at|80

Rashika said...

Hi I tried to use your above awk solution by creating two files of same type as above ,but got the following error.

awk: syntax error near line 4
awk: bailing out near line 4

The awk solution used :
cat awk1.sh
#!/bin/bash
awk '
BEGIN {FS=OFS="|"}
FNR==NR{a[$1]=$2;next}
$1 in a{print a[$1],$2,$3,$4}
' id.txt main.txt

The files used are
cat main.txt
512|1241503759|ax|90
234|1241503760|ay|10
122|1241503823|az|90

cat id.txt
122|US
123|IN
234|HK
512|ZM

CSrohit said...

Need help on this.

Unknown said...

@Rashika, are you on Solaris ?

Could you please try gawk or nawk or /usr/xpg4/bin/awk and let me know. Thanks.

Maulik said...

thank you for the codes, but what if I want to replace more than 1 column (say also third and fourth column from id.txt)? how should we modify the codes, can you please elaborate? Thank you again,

With Regards,
Maulik

© Jadu Saikia www.UNIXCL.com