Monday, May 11, 2009

Modify file based on another - awk


Input file master-classVI.txt contains some details of class VI students.

$ cat master-classVI.txt
sachin:22:M
rupali:12:F
nilutpal:11:M
rohan:01:M
wasim:19:M
priya:08:F
monali:44:F
rashi:11:F


The results of the English unit test have come in as results-fall.txt (it contains the names of all students who passed).

$ cat results-fall.txt
priya
rashi
sachin
nilutpal


Required output:
Add the result (Pass/Fail) to each line of master-classVI.txt as the last column, i.e. the required output is:


sachin:22:M:Pass
rupali:12:F:Fail
nilutpal:11:M:Pass
rohan:01:M:Fail
wasim:19:M:Fail
priya:08:F:Pass
monali:44:F:Fail
rashi:11:F:Pass


Awk solution using the NR and FNR variables:

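$ awk 'BEGIN {FS=OFS=":"}      # use : as input and output field separator
{$4="Fail"}                     # default: mark every record as Fail
FNR==NR{_[$1]=$1;next}          # first file: remember the passed name, skip the rest
$1 in _{$4="Pass"} {print}      # second file: Pass if the name was remembered, then print
' results-fall.txt master-classVI.txt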


Some explanation:

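a) By default, set $4 to "Fail", so every student starts out as failed.
b) I have written several posts on awk NR and FNR. NR counts records across all input files, while FNR restarts at 1 for each file, so FNR==NR is true only while the first file is being read. For your understanding, print both: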

$ awk '
{print FNR,NR,"["$0"]"}' results-fall.txt master-classVI.txt

1 1 [priya]
2 2 [rashi]
3 3 [sachin]
4 4 [nilutpal]
1 5 [sachin:22:M]
2 6 [rupali:12:F]
3 7 [nilutpal:11:M]
4 8 [rohan:01:M]
5 9 [wasim:19:M]
6 10 [priya:08:F]
7 11 [monali:44:F]
8 12 [rashi:11:F]
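
One caveat with FNR==NR: if the first file is empty, the condition is also true for the second file, so the solution above would print nothing. A minimal sketch of a guard, assuming the results file is given as the first argument: compare FILENAME with ARGV[1] instead. The array name passed, the scratch directory and the two-line sample master file are only for illustration.

```shell
# Work in a scratch directory so no real files are touched.
dir=$(mktemp -d) && cd "$dir"

: > results-fall.txt                      # empty results file: nobody passed
printf '%s\n' 'sachin:22:M' 'rupali:12:F' > master-classVI.txt

# FILENAME == ARGV[1] is true only while awk reads the first file operand,
# so an empty first file cannot be confused with the second one.
out=$(awk 'BEGIN {FS=OFS=":"}
FILENAME == ARGV[1] {passed[$1]; next}
{$4 = ($1 in passed) ? "Pass" : "Fail"; print}
' results-fall.txt master-classVI.txt)

printf '%s\n' "$out"
```

With the empty results file, both students should come out as Fail rather than disappearing from the output.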


c) So the array _ is going to contain "priya", "rashi", "sachin" and "nilutpal".
The 'next' above skips the remaining rules, so records from "results-fall.txt" are never printed.
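
To check what ends up in the array, a small sketch like the following can help. It reads only the results file and lists the keys of _ in an END block. The loop variable name and the scratch directory are just for illustration; the order of a "for (name in _)" loop is unspecified in awk, so the output is piped through sort.

```shell
# Work in a scratch directory so no real files are touched.
dir=$(mktemp -d) && cd "$dir"
printf '%s\n' priya rashi sachin nilutpal > results-fall.txt

# Fill the array exactly as the solution does, then list its keys at the end.
names=$(awk '{_[$1]=$1} END {for (name in _) print name}' results-fall.txt | sort)

printf '%s\n' "$names"
```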

d) For every record of master-classVI.txt whose name ($1) is in the array _, set $4 to "Pass".
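
The awk command above only prints the result; it does not change master-classVI.txt itself. If the file really has to be modified, one possible sketch is to write to a temporary file and move it over the original once awk succeeds. The name master-classVI.tmp and the scratch directory are assumptions for this example, and the Fail/Pass rules are folded into one conditional expression.

```shell
# Work in a scratch directory so no real files are touched.
dir=$(mktemp -d) && cd "$dir"
printf '%s\n' priya rashi sachin nilutpal > results-fall.txt
printf '%s\n' 'sachin:22:M' 'rupali:12:F' 'nilutpal:11:M' 'rohan:01:M' \
    'wasim:19:M' 'priya:08:F' 'monali:44:F' 'rashi:11:F' > master-classVI.txt

# Redirecting straight onto master-classVI.txt would truncate it before awk
# reads it, so write to a temporary file and replace the original on success.
awk 'BEGIN {FS=OFS=":"}
FNR==NR {_[$1]=$1; next}
{$4 = ($1 in _) ? "Pass" : "Fail"; print}
' results-fall.txt master-classVI.txt > master-classVI.tmp &&
mv master-classVI.tmp master-classVI.txt

result=$(cat master-classVI.txt)
printf '%s\n' "$result"
```

Because $4 is assigned rather than appended, running the same command again on the updated file should leave it unchanged.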

© Jadu Saikia www.UNIXCL.com