Sunday, December 12, 2010

Awk - Sum of multiple columns of file


Input file:

$ cat file.txt
500:120:100
100:120:700
200:900:125
120:120:900

Required:
Compute the sum of each column of the above file, i.e. the required output is:

920:1260:1825

Awk solution - 1:

$ awk 'BEGIN {FS=OFS=":"}
NR == 1 { n1 = $1; n2 = $2; n3 = $3; next }
{ n1 += $1; n2 += $2; n3 += $3 }
END { print n1, n2, n3 }' file.txt

Output:

920:1260:1825

Awk solution - 2:

$ awk -F ":" '
{ for (i=1; i<=NF; ++i) sum[i] += $i; j=NF }
END { for (i=1; i <= j; ++i) printf "%s ", sum[i]; printf "\n"; }
' file.txt

Output:

920 1260 1825
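
Solution 2 prints the sums separated by spaces, with a trailing space, while the required output uses colons. A minimal sketch of one way to keep the loop-based approach and still print colon-separated sums is given here. It recreates the sample file.txt from the top of the post, and the variable names n and result are illustrative only:

```shell
# Recreate the sample input shown at the top of the post
cat > file.txt <<'EOF'
500:120:100
100:120:700
200:900:125
120:120:900
EOF

result=$(awk -F ':' '
{
    for (i = 1; i <= NF; ++i) sum[i] += $i    # accumulate each column
    if (NF > n) n = NF                        # remember the widest row seen
}
END {
    # print a colon between sums and a newline after the last one
    for (i = 1; i <= n; ++i)
        printf "%s%s", sum[i], (i < n ? ":" : "\n")
}' file.txt)

echo "$result"    # 920:1260:1825
```

Because the separator is chosen per field inside the loop, the same command should work unchanged for files with more or fewer columns.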

The solution for finding the sum of the numbers in each row of a file (i.e. the horizontal sum) is in a separate post.
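
For comparison, a row-wise sum can be sketched in the same style. This is only an illustration using the sample data from this post, not necessarily the approach taken in the other post. The names total, result and expected are assumptions made for the example:

```shell
# Recreate the sample input shown at the top of the post
cat > file.txt <<'EOF'
500:120:100
100:120:700
200:900:125
120:120:900
EOF

result=$(awk -F ':' '
{
    total = 0                                 # reset for every row
    for (i = 1; i <= NF; ++i) total += $i     # add up the fields of this row
    print total
}' file.txt)

echo "$result"
# 720
# 920
# 1225
# 1140
```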

5 comments:

Mike M. said...

I appreciate your posts on this and a similar post on 3/24/2010.

Here is my scenario, I have two columns. I need to add the first columns (the easy part). The second column is a percent that I need to subtract by 100 then multiply against $1 for each line, then sum them

Here is the input:

942330863 96
942172150 95
942099452 92

I need this as an output:

2826602465 160169797

I tried this code, but it's just doing the percentage work and only doing it for the first line and not summing every line.

awk '
BEGIN {FS=","}
NR == 1{ n1 = $1; x = substr((100-$2)/100,2,3); y = x*$2; next }
{ n1 += $1; y += $y }
END { printf ("%-15d%d\n",n1,y)
}'

Unknown said...

@Mike M:

I think this is the problem:

y += $y

should be

y += y


?
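
Neither `y += $y` nor `y += y` accumulates a per-line value: `$y` refers to field number y, and `y += y` simply doubles y. A minimal awk sketch for the scenario Mike M. describes follows. It makes three assumptions: the input is whitespace-separated, as in the sample (so no FS="," is needed), the percentage work means (100 - $2)/100 times $1, and each line's result is truncated to an integer before summing, which is what reproduces the requested 160169797. The file name pct.txt and the variable names are hypothetical:

```shell
# Sample input from the comment above, saved under a hypothetical name
cat > pct.txt <<'EOF'
942330863 96
942172150 95
942099452 92
EOF

result=$(awk '
{
    total += $1                            # running sum of column 1
    part  += int($1 * (100 - $2) / 100)    # (100 - percent)% of column 1, truncated
}
END {
    # %.0f is used because some awk builds mishandle %d above 2^31 - 1
    printf "%.0f %.0f\n", total, part
}' pct.txt)

echo "$result"    # 2826602465 160169797
```

No NR == 1 special case is needed, since awk treats uninitialized variables as 0 in arithmetic.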

Unknown said...


dc -e "
# code to update the sums stored in registers 'a', 'b', & 'c'
[lc+sc lb+sb la+sa z 3 !>d]sd

# load up the main stack
$(< file.txt tr ':-' ' _')

# initialize the registers 'a', 'b', & 'c' for holding the running sums
0sa 0sb 0sc

# compute the sums
ldx

# display results
lan[:]nlbn[:]nlcp
"

Regarding the problem posted by the OP "Mike M", it can be tackled using "dc" as:

dc -e "
# multiply by -1
[_1*]sm

# percentage calculations
[100-lmx]sn

# load up the main stack
$(< file.txt tr '-' '_')

## initialize the registers 'a' & 'b' for holding the running sums
0sa 0sb

# code to update the sums stored in registers 'a' & 'b'
[lnxscdlc*lb+sbla+sa z 2 !>r]sr

# compute the sums
lrx

# display result
lan[ ]nlb100/p
"

Unknown said...

dc -e "
# code to update the sums stored in array x[3], x[2], & x[1]
[3;x+3:x2;x+2:x1;x+1:xz3!>d]sd

# load the main stack
$(< file.txt tr ':-' ' _')

# compute sums
ldx3;x2;x1;x

# display result
n[:]nn[:]np
"


> Here is my scenario, I have two columns. I need to add the first columns (the easy part). The second column is a percent that I need to subtract by 100 then multiply against $1 for each line, then sum them

dc -e "
# multiply by -1
[_1*]sm

# compute 100 - column2 number
[100-lmx]sn

# load up the main stack
$(< file.txt tr '-' '_')

# initialize the registers 'a' & 'b' for holding the running sums
0sa 0sb

# compute code
[lnxscdlc*lb+sbla+saz2!>r]sr

# computations
lrx

# display result
lan[ ]nlb100/p
"

Unknown said...

< file.txt tr ':-' '\040_' | # colons -> spaces & minus -> underscore. Demand of dc
dc -e "
[lpxq]sq #action taken after all lines readin
[pq]se #special action on last field of line
[58a]s: #colon symbol
[z;x+ z:x z 0 <a]sa #accumulate the sums
[z1+;xd z3<e nl:xn lpx]sp #print sums. the 3 here reflects the num of fields
[? z 0 =q lax c l?x]s? #loop over lines
l?x #action
"

© Jadu Saikia www.UNIXCL.com