## Friday, October 22, 2010

### Check equality of multiple numbers - awk

My input file 'file.txt' contains 4 values of a certain metric for each of the following 'Continents'.
    $ cat file.txt
    Continent Val1 Val2 Val3 Val4
    AS 440518 440518 440516 440516
    AF 253317 253317 253315 253317
    EU 245397 245397 245397 245397
    OC 226410 226410 226410 226410
    NA 221961 221961 221962 221961

Required: I needed to print only those 'Continents' for which all four values are the same.

Solutions:

1) Using awk:
    $ awk '
        /^Continent/ {print $1; next}
        $2==$3 && $3==$4 && $4==$5 {print $1}' file.txt

Output:
    Continent
    EU
    OC

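2) I wrote this Python program using a `set` (an unordered collection of unique elements) to achieve the same. If a row's values collapse to a single element, they are all equal. Something like:

    # Print the header label, then every continent whose values are all identical.
    with open("file.txt") as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            if line.startswith('Continent'):
                print(fields[0])
                continue
            # A set keeps only unique elements, so one element means all values match.
            if len(set(fields[1:])) == 1:
                print(fields[0])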

Executing it:
    $ python printequal.py
    Continent
    EU
    OC
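One difference between the two solutions: the set-based check compares the values as strings, while awk's `==` compares fields numerically when they look like numbers. Below is a minimal sketch of a variant that converts each value to a number first and works for any number of columns. The function name `rows_with_equal_values` and the inline `sample` string are assumptions made for illustration; they are not part of the original script.

```python
def rows_with_equal_values(lines):
    """Return the labels of rows whose numeric values are all equal.

    The first non-blank line is treated as a header and skipped.
    """
    labels = []
    header_seen = False
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if not header_seen:
            header_seen = True  # first non-blank line is the header
            continue
        label, values = fields[0], fields[1:]
        # Convert to float so that "10" and "10.0" count as equal.
        if values and len(set(float(v) for v in values)) == 1:
            labels.append(label)
    return labels


# Sample data copied from file.txt above (hypothetical inline copy).
sample = """Continent Val1 Val2 Val3 Val4
AS 440518 440518 440516 440516
AF 253317 253317 253315 253317
EU 245397 245397 245397 245397
OC 226410 226410 226410 226410
NA 221961 221961 221962 221961
"""

print(rows_with_equal_values(sample.splitlines()))  # prints ['EU', 'OC']
```

Unlike the script above, this sketch skips the header line rather than printing it, and it returns the matching labels as a list so the caller decides how to display them.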

3) Any other solutions using Bash, awk, or another scripting language? Readers, please post your solutions in the comments section. Much appreciated.

Related:
- Bash function to compare multiple numbers equality

Derek Evan Schrock said...

GNU AWK running in POSIX mode:

    BEGIN { getline; if( $1 ~ /^Continent/ ) { print $1 } }
    { if( match( $0, "^[A-Z]{2} ("$2" ?)+$" ) ) { print $1 } }

@Derek Evan Schrock, thanks for the solution:

When I tried this:

    $ gawk 'BEGIN { getline; if( $1 ~ /^Continent/ ) { print $1 } }
    { if( match( $0, "^[A-Z]{2} ("$2" ?)+$" ) ) { print $1 } }' file.txt

it printed only the following:

    Continent