Friday, October 22, 2010

Check equality of multiple numbers - awk


My input file 'file.txt' contains four values of a metric for each of the following continents:

$ cat file.txt
Continent Val1 Val2 Val3 Val4
AS 440518 440518 440516 440516
AF 253317 253317 253315 253317
EU 245397 245397 245397 245397
OC 226410 226410 226410 226410
NA 221961 221961 221962 221961

Required: I needed to find only those continents for which all four values are the same.

Solutions:

1) Using awk:

$ awk '
/^Continent/ {print $1; next}
$2==$3 && $3==$4 && $4==$5 {print $1}
' file.txt

Output:

Continent
EU
OC

2) I also wrote this Python program, which uses Python's built-in set type (an unordered collection of unique elements), to achieve the same result:

# printequal.py
for line in open("file.txt"):
    fields = line.split()
    if not fields:
        continue  # skip blank lines
    if line.startswith('Continent'):
        print(fields[0])  # header row: print its first field only
        continue
    # A set keeps only distinct elements, so a single element
    # means every value on the line is the same.
    if len(set(fields[1:])) == 1:
        print(fields[0])

Executing it:

$ python printequal.py
Continent
EU
OC
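
One difference between the two solutions is worth noting: awk compares numeric-looking fields as numbers, while the set in the Python program compares them as text, so "1" and "1.0" would count as different values there. Below is a minimal sketch of a numeric variant that also works for any number of value columns. The function name rows_with_equal_values and the inline sample data are my own choices for illustration, and converting with float() is an assumption that mirrors awk's floating-point comparison; it is not part of the original script.

```python
def rows_with_equal_values(lines):
    """Return the first field of each data row whose values are all numerically equal."""
    names = []
    for line in lines:
        fields = line.split()
        # Skip blank lines and the header row.
        if not fields or fields[0] == "Continent":
            continue
        # Assumption: every remaining field is a number that float() can parse.
        values = [float(v) for v in fields[1:]]
        # Compare every value with the first one.
        if values and all(v == values[0] for v in values):
            names.append(fields[0])
    return names


sample = """Continent Val1 Val2 Val3 Val4
AS 440518 440518 440516 440516
AF 253317 253317 253315 253317
EU 245397 245397 245397 245397
OC 226410 226410 226410 226410
NA 221961 221961 221962 221961""".splitlines()

print(rows_with_equal_values(sample))  # ['EU', 'OC']
```

For very large integers float() can lose precision, so int() may be the safer conversion when the values are known to be whole numbers.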

3) Is there any other solution using Bash, awk or another scripting language? Readers, please post your solutions in the comment section. Much appreciated.
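
One further possibility is a regular expression with a backreference: capture the first value, then require every later value to repeat it. The following is only a sketch using Python's re module. The names ALL_SAME and name_if_all_same are hypothetical, and the pattern assumes fields separated by single spaces and values made up of digits only, as in the sample file. It compares the values as text, not as numbers.

```python
import re

# A name, one captured value, then one or more repeats of that same value.
# The ^ and $ anchors stop a longer number such as 4405189 from matching
# as a mere prefix of the captured value.
ALL_SAME = re.compile(r"^(\S+) (\d+)(?: \2)+$")


def name_if_all_same(line):
    """Return the first field if all values on the line are identical, else None."""
    m = ALL_SAME.match(line.strip())
    return m.group(1) if m else None


print(name_if_all_same("EU 245397 245397 245397 245397"))  # EU
print(name_if_all_same("NA 221961 221961 221962 221961"))  # None
```

Because the header row contains no digit-only fields, it never matches and would have to be printed separately if it is wanted in the output.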

Related:
- Bash function to compare multiple numbers equality

3 comments:

Derek Evan Schrock said...

GNU AWK running in posix mode:

BEGIN { getline; if( $1 ~ /^Continent/ ) { print $1 } }
{ if( match( $0, "^[A-Z]{2} ("$2" ?)+$" ) ) { print $1 } }

Jadu Saikia said...

@Derek Evan Schrock, thanks for the solution:

when I tried this:

$ gawk 'BEGIN { getline; if( $1 ~ /^Continent/ ) { print $1 } }
{ if( match( $0, "^[A-Z]{2} ("$2" ?)+$" ) ) { print $1 } }' file.txt

It printed only the following:
Continent

Please assist. Thanks again.

Riv said...

grep -E '^[A-Z]+ ([0-9]+) \1 \1 \1' file.txt

© Jadu Saikia www.UNIXCL.com