Monday, January 19, 2009

Count non empty fields in a file - awk

Input file: "4w_prod_u_1.txt" is the attendance report for production unit1 of a factory for a particular week.
If the employee was present for a day there will be a "y" for that day, if absent nothing will be put.

$ cat 4w_prod_u_1.txt
#Name,Designation,attendance details
Naveen A,Unit Mngr,y,y,,y,y,y
S Puri,Driller,,,y,,y,y
N Nandan,Factory Mngr,y,y,y,y,y,
K Asif,Unit Mngr,,,y,y,y,y

We need to count total days an employee was present in that week, basically to count total number of non empty fields after the first two fields in each line (except the first hear line) or to be more specific counting total number of "y" for each line (after the 1st two fields)

The awk solution:

$ awk > 4w_prod_u_1.out '
BEGIN {print "Name[Num Days Present]"}
!/^#/ {cnt=0; for (i=3; i<=NF; i++)
{if ($i != "") {cnt++} } {print $1,"["cnt" days]"} }
' FS=, 4w_prod_u_1.txt

Output file:
$ cat 4w_prod_u_1.out
Name[Num Days Present]
Naveen A [5 days]
S Puri [3 days]
N Nandan [5 days]
K Asif [4 days]

or the

if ($i != "")

in the above solution can be

if ($i == "y")
will produce the same result.


JuanPablo said...

a awk question,
is posible use a bash command in awk
example, fila have epoch dates

cat file

$ date -d @1231256700
Tue Jan 6 12:45:00 CLST 2009
convert to human date

how convert every vaulue in file to human date?
cat file | awk 'a=$(date -d @$1) {print a}'


Jadu Saikia said...


One idea I have got

$ cat epochcnvrt
date -d @$1

$ awk '{ system("sh epochcnvrt "$1)}' file

Should work.
But there should be a good way of doing this. Let me give a thought. Thanks for commenting. Keep in touch.

Jadu Saikia said...

or directly you can put the command like this

$ awk '{ system("date -d @"$1)}' file

But the problem is

suppose your file is like this:

$ cat file.3
1231254900 f
1231255800 d
1231256700 n

$ awk '{ system("date -d @"$1)}{print $2}' file.3

Tue Jan 6 20:45:00 IST 2009
Tue Jan 6 21:00:00 IST 2009
Tue Jan 6 21:15:00 IST 2009

I am trying to see how we can get the following output:

Tue Jan 6 20:45:00 IST 2009 f
Tue Jan 6 21:00:00 IST 2009 d
Tue Jan 6 21:15:00 IST 2009 n

JuanPablo said...

awk '{system("echo -n $(date -d @" $1 " )")}{print "\t" $2}' file.3

thank you very much for your help

Jadu Saikia said...

@JuanPablo, thanks, I forgot of the echo -n , great. thanks for the comment. Its helpful for me.

Sumit said...


I have a text file "mytest.txt" where columns are separated by ^A (Ctrl-A) characters:


and it contains several rows with second column as blank (3^A^APresident). How can print/write to a file all those rows with second column as blank?


Jadu Saikia said...

@Sumit, you can something like this:

awk -F "^A" '$2==""' file.txt > somefile.txt

(^A as you know is CtrlV-CtrlA if you are copying above line)

Hope this helps.

Sumit said...

Hi Jadu,

I tried something like you suggested:

awk -F^A '{if ($2 == "") print $0 }' > somefile.txt

and it worked.


Anirudh said...

perl -F, -pale '
$_ = "$F[0] [" . ($a=join",",@F[2..$#F])=~tr/y//d . "days]";

© Jadu Saikia