Friday, October 30, 2009

Bash while loop sum issue explained


In one of my directories I had a lot of log files, and I had to count the total number of lines that start with 's' (i.e. lines matching ^s).
My first approach was:

$ ls | xargs -I {} grep -c '^s' {} | awk '{sum+=$0} END {print sum}'
190978

And I got my result. Then I thought of doing the same thing in a bash script, first with a for loop and then with a while loop. This is what I tried.

#!/bin/bash
# (( )) arithmetic is a bash feature, so the script asks for bash rather than sh

sum=0
DIR=~/original
for file in $(ls "$DIR")
do
    Slines=$(grep -c '^s' "$DIR/$file")
    ((sum+=Slines))
    #You can also use
    #sum=$(expr $sum + $Slines)
    #sum=`expr $sum + $Slines`
done
echo "$sum"

Executing it:

$ ./usingfor.sh
190978

Cool, correct result.

Then I modified the above script to use a bash while loop:

#!/bin/bash
sum=0
DIR=~/original
ls "$DIR" | while read -r file
do
    Slines=$(grep -c '^s' "$DIR/$file")
    ((sum+=Slines))
done
echo "$sum"

Executing it:

$ ./usingwhile.sh
0


Oops! What went wrong?

In bash, each command in a pipeline runs in its own subshell, so piping directly into a while loop makes the loop run in a subshell.
In the above example, the loop therefore updates the subshell's copy of the 'sum' variable, and that copy is discarded when the loop exits. Back in the parent shell, sum still holds 0, the value we initialized it to at the beginning of the script.
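
The effect is easy to reproduce without any log files. Below is a minimal sketch, assuming bash with its default settings; the variable names (count, piped_count, redirected_count) and the three input lines are made up for the demonstration.

```shell
#!/bin/bash
# Case 1: the loop is the receiving end of a pipe, so it runs in a
# subshell and its updates to "count" are lost when the loop ends.
count=0
printf '%s\n' a b c | while read -r item
do
    count=$((count + 1))
done
piped_count=$count

# Case 2: the same loop fed by redirection (a here-document) runs in
# the current shell, so "count" keeps its final value.
count=0
while read -r item
do
    count=$((count + 1))
done <<EOF
a
b
c
EOF
redirected_count=$count

echo "piped: $piped_count, redirected: $redirected_count"
# bash with default settings prints: piped: 0, redirected: 3
```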

One solution to this variable scoping problem with while and direct piping is:

Remove the direct pipe and feed the list of file names under the '~/original' directory to the while loop as stdin, as shown below (basically, create a temp file holding the file names of the directory '~/original' and redirect it into the loop).

#!/bin/bash
sum=0
DIR=~/original
ls "$DIR" > /tmp/filelist

# Redirection, unlike a pipe, keeps the loop in the current shell
while read -r file
do
    Slines=$(grep -c '^s' "$DIR/$file")
    ((sum+=Slines))
done < /tmp/filelist
echo "$sum"

Executing it:

$ ./usingwhile_1.sh
190978

And the result is correct.
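
If you would rather keep the pipe and skip the temp file, another option is to group the loop and the final echo inside braces, so both run in the same subshell, and capture the output. This is only a sketch: the sample directory and file names are invented so it can run anywhere, and it uses the portable sum=$((...)) form instead of ((sum+=Slines)).

```shell
#!/bin/bash
# Throwaway sample data (hypothetical files, not the real ~/original).
DIR=$(mktemp -d)
printf 'start\nstop\nother\n' > "$DIR/a.log"   # 2 lines start with s
printf 'second\nthird\n'      > "$DIR/b.log"   # 1 line starts with s

# The braces put the loop and the echo in the same subshell, so echo
# still sees the updated sum; $(...) passes it back to the parent shell.
sum=$(ls "$DIR" | {
    sum=0
    while read -r file
    do
        Slines=$(grep -c '^s' "$DIR/$file")
        sum=$((sum + Slines))
    done
    echo "$sum"
}
)

echo "$sum"
rm -rf "$DIR"
```

With the two sample files above this prints 3.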

3 comments:

internetjanitor said...

#!/bin/bash
sum=0
DIR=~/original

declare -a sum
while read file
  do
    while read line
    do
      if [[ "$line" =~ ^s ]]; then
        ((sum+=1))
      fi
    done <$DIR/$file
done < <(ls $DIR)
echo $sum

etamme said...

umm... is there a reason not to use

grep ^s *.log | wc -l

Unknown said...

@Etamme, I forgot to mention that I went with xargs because the directory had so many files that using * gave "Argument list too long" for ls. But you are right, that should be the first solution to try for this requirement.
Also, my whole intention with the post was to explain the while loop behavior. Thanks for your comment and keep in touch.
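
For completeness, one way to avoid "Argument list too long" without xargs is to let find split the file names into batches itself. A rough sketch, assuming a find that supports -exec ... {} + and a grep that supports -h (GNU and BSD versions do); the sample directory and files are made up.

```shell
#!/bin/bash
# Throwaway sample data (hypothetical files).
DIR=$(mktemp -d)
printf 'start\nstop\nother\n' > "$DIR/a.log"
printf 'second\nthird\n'      > "$DIR/b.log"

# find passes the file names to grep in batches that fit the argument
# limit; -h drops the file name prefix so wc -l counts matching lines.
total=$(find "$DIR" -type f -exec grep -h '^s' {} + | wc -l)

echo "$total"
rm -rf "$DIR"
```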

© Jadu Saikia www.UNIXCL.com