Thursday, May 14, 2009

Compare numeric fields with quotes - awk


Input file:

$ cat file.txt
"12"|"w2432"|"awk"
"22"|"35435"|"sed"
"13.2"|"adad"|"awk"
"9"|""qqwq"|"perl"

Output required: print only those lines where the value of the first field is greater than 12.

Way1:
A plain $1 > 12 condition is not going to work, because $1 still contains the double quotes and is therefore compared as a string. So we need to remove the double quotes from the first field and then compare the result as a number.

$ awk -v c='"' 'BEGIN {
FS=OFS="|"
}
{
f=$1
gsub(c,"",f)
if (f + 0 > 12) print   # f is a string after gsub; f + 0 forces a numeric comparison
}
' file.txt

o/p:

"22"|"35435"|"sed"
"13.2"|"adad"|"awk"
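
To see why the quotes have to go before the comparison, here is a small sketch. It assumes a POSIX awk and rebuilds the sample data in a temporary file (the names tmp, raw and num are only for illustration). With the quotes left in, $1 is a plain string, so awk compares it with 12 character by character, and the leading " sorts before the digit 1 in the C locale.

```shell
# Rebuild the sample input in a temporary file
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
"12"|"w2432"|"awk"
"22"|"35435"|"sed"
"13.2"|"adad"|"awk"
"9"|""qqwq"|"perl"
EOF

# Quotes left in: string comparison, so no line matches
raw=$(LC_ALL=C awk -F'|' '$1 > 12' "$tmp")

# Quotes stripped and the value forced to a number with f + 0
num=$(awk -F'|' '{ f = $1; gsub(/"/, "", f) } f + 0 > 12' "$tmp")

printf 'quoted comparison: [%s]\n' "$raw"
printf '%s\n' "$num"
rm -f "$tmp"
```

The first command prints nothing; the second prints the same two lines as the output above.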

Way2:
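Another way is to let awk treat both the double quote and the pipe as field separators (FS). Every line starts with a double quote, so $1 is empty and the first value lands in $2: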


$ awk -F "[\",|]" '$2 > 12' file.txt

o/p:

"22"|"35435"|"sed"
"13.2"|"adad"|"awk"
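
A quick way to check where each value lands with this kind of separator is to print the fields one per line. The sketch below assumes a POSIX awk and runs the first sample line through the same bracket expression (the names line and fields are only for illustration).

```shell
line='"12"|"w2432"|"awk"'

# Print every field with its number; empty fields show up as []
fields=$(printf '%s\n' "$line" |
  awk -F '[",|]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
```

The line splits into 9 fields: the values sit in $2, $5 and $8, and every other field is empty, which is why the condition tests $2 rather than $1.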

3 comments:

Nathan said...

Another way, without awk:

#!/bin/bash

declare sz
declare -i status

while IFS= read -r LINE
do

sz=$(cut -d'|' -f1 <<<"$LINE" | sed -n 's/"//g;p')

status=$(bc -q <<< "$sz > 12.0")

[[ "$status" -eq 1 ]] && echo "$LINE"

done < txt

rattus said...

In this particular case you don't need two FS, just " is sufficient:

$ awk -F\" '$2 > 12' file.txt

In general, for cases like this I recommend converting to CSV and using CSV tools. Without them, things just get nasty:

$ cat file.txt
"a"|"12"|"w2432"|"awk"
"b"|"22"|"35435"|"sed"
c|"13.2"|"adad"|"awk"
|"9"|""qqwq"|"perl"

$ awk -F "[\",|]" '$5 > 12' file.txt
"b"|"22"|"35435"|"sed"

Note that your own example features a double-double quote ;) Something common with CSV files.
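
One possible middle ground, short of converting to real CSV, is to split on | alone and strip the quotes from just the column being tested, so the field number no longer depends on which columns happen to be quoted. A rough sketch, assuming a POSIX awk, with col as a hypothetical variable naming the column to test; it would still break if a value itself contained a |:

```shell
data='"a"|"12"|"w2432"|"awk"
"b"|"22"|"35435"|"sed"
c|"13.2"|"adad"|"awk"
|"9"|""qqwq"|"perl"'

# Split on | only, strip quotes from the chosen column, compare as a number
out=$(printf '%s\n' "$data" |
  awk -F'|' -v col=2 '{ f = $col; gsub(/"/, "", f) } f + 0 > 12')

printf '%s\n' "$out"
```

With this input it prints the "b" line and the c line, since 22 and 13.2 are both greater than 12.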

Jadu Saikia said...

@Nathan, thanks.
@Rattus: yeah, you are right, I never realized that, thanks for that.

© Jadu Saikia www.UNIXCL.com