Monday, January 19, 2009

Count non empty fields in a file - awk


Input file: "4w_prod_u_1.txt" is the attendance report for production unit1 of a factory for a particular week.
If the employee was present for a day there will be a "y" for that day, if absent nothing will be put.

$ cat 4w_prod_u_1.txt
#Name,Designation,attendance details
Naveen A,Unit Mngr,y,y,,y,y,y
S Puri,Driller,,,y,,y,y
N Nandan,Factory Mngr,y,y,y,y,y,
K Asif,Unit Mngr,,,y,y,y,y

Required:
We need to count total days an employee was present in that week, basically to count total number of non empty fields after the first two fields in each line (except the first hear line) or to be more specific counting total number of "y" for each line (after the 1st two fields)

The awk solution:

$ awk > 4w_prod_u_1.out '
BEGIN {print "Name[Num Days Present]"}
!/^#/ {cnt=0; for (i=3; i<=NF; i++)
{if ($i != "") {cnt++} } {print $1,"["cnt" days]"} }
' FS=, 4w_prod_u_1.txt

Output file:
$ cat 4w_prod_u_1.out
Name[Num Days Present]
Naveen A [5 days]
S Puri [3 days]
N Nandan [5 days]
K Asif [4 days]

or the

if ($i != "")

in the above solution can be

if ($i == "y")
will produce the same result.

8 comments:

JuanPablo said...

a awk question,
is posible use a bash command in awk
example, fila have epoch dates

cat file
1231254900
1231255800
1231256700

with
$ date -d @1231256700
Tue Jan 6 12:45:00 CLST 2009
convert to human date

how convert every vaulue in file to human date?
cat file | awk 'a=$(date -d @$1) {print a}'


thanks

Jadu Saikia said...

Juan,

One idea I have got

$ cat epochcnvrt
date -d @$1

$ awk '{ system("sh epochcnvrt "$1)}' file

Should work.
But there should be a good way of doing this. Let me give a thought. Thanks for commenting. Keep in touch.

Jadu Saikia said...

or directly you can put the command like this

$ awk '{ system("date -d @"$1)}' file

But the problem is

suppose your file is like this:

$ cat file.3
1231254900 f
1231255800 d
1231256700 n

$ awk '{ system("date -d @"$1)}{print $2}' file.3

Tue Jan 6 20:45:00 IST 2009
f
Tue Jan 6 21:00:00 IST 2009
d
Tue Jan 6 21:15:00 IST 2009
n

I am trying to see how we can get the following output:

Tue Jan 6 20:45:00 IST 2009 f
Tue Jan 6 21:00:00 IST 2009 d
Tue Jan 6 21:15:00 IST 2009 n

JuanPablo said...

awk '{system("echo -n $(date -d @" $1 " )")}{print "\t" $2}' file.3

thank you very much for your help

Jadu Saikia said...

@JuanPablo, thanks, I forgot of the echo -n , great. thanks for the comment. Its helpful for me.

Sumit said...

Hi,

I have a text file "mytest.txt" where columns are separated by ^A (Ctrl-A) characters:

ID^AName^ATitle
1^AJohn^AManager
2^APeter^AProgrammer
3^A^APresident
4^AMike^AProgrammer

and it contains several rows with second column as blank (3^A^APresident). How can print/write to a file all those rows with second column as blank?

Thanks,
Sumit

Jadu Saikia said...

@Sumit, you can something like this:

awk -F "^A" '$2==""' file.txt > somefile.txt

(^A as you know is CtrlV-CtrlA if you are copying above line)

Hope this helps.

Sumit said...

Hi Jadu,

I tried something like you suggested:

awk -F^A '{if ($2 == "") print $0 }' > somefile.txt

and it worked.

Thanks!
Sumit

© Jadu Saikia www.UNIXCL.com