Tuesday, August 26, 2008
Sort date in ddmmyyyy format - awk and bash script
The input file has a date in ddmmyyyy format as its first field; fields are separated by semicolons.
$ cat myf.dat
12082008;pull done;ret=34;Y
08072008;push hanged;s=3;N
15082008;pull done;ret=34;Y
01062008;psuh done;ret=23;Y
18082007;old entry;old;N
Required output: the file sorted chronologically on the ddmmyyyy date in the first field, so that the final output is:
18082007;old entry;old;N
01062008;psuh done;ret=23;Y
08072008;push hanged;s=3;N
12082008;pull done;ret=34;Y
15082008;pull done;ret=34;Y
The solution is divided into 3 steps:
1) Add a temporary field at the beginning of each line. This field is the date from the first field rewritten in yyyymmdd format, which sorts in chronological order.
$ awk '{
tempfield=sprintf("%s%s%s",substr($1,5),substr($1,3,2),substr($1,1,2))
print tempfield","$0
}' FS=";" myf.dat
20080812,12082008;pull done;ret=34;Y
20080708,08072008;push hanged;s=3;N
20080815,15082008;pull done;ret=34;Y
20080601,01062008;psuh done;ret=23;Y
20070818,18082007;old entry;old;N
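To make the substr() calls easier to follow, here is the same rearrangement applied to a single date. This is only a small sketch; the names d, key, yyyy, mm and dd are illustrative and not part of the script above.

```shell
# Rearrange one ddmmyyyy value into yyyymmdd (illustrative variable names).
d=12082008
key=$(echo "$d" | awk '{
    yyyy = substr($1, 5)      # from character 5 to the end: the year
    mm   = substr($1, 3, 2)   # characters 3-4: the month
    dd   = substr($1, 1, 2)   # characters 1-2: the day
    print yyyy mm dd          # concatenate as yyyymmdd
}')
echo "$key"
```

For 12082008 this prints 20080812, which matches the first line of the output above.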
2) Do a numeric sort. Each line now begins with the yyyymmdd value, so that value determines the order.
$ awk '{
tempfield=sprintf("%s%s%s",substr($1,5),substr($1,3,2),substr($1,1,2))
print tempfield","$0
}' FS=";" myf.dat | sort -n
20070818,18082007;old entry;old;N
20080601,01062008;psuh done;ret=23;Y
20080708,08072008;push hanged;s=3;N
20080812,12082008;pull done;ret=34;Y
20080815,15082008;pull done;ret=34;Y
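A possible variation, assuming you want the sort to look only at the temporary field and not at the rest of the line: give sort the comma as the field separator and restrict the key to field 1. The sketch below recreates the sample data in a temporary file (the names tmp and sorted are hypothetical) so that it can be run on its own.

```shell
# Recreate the sample data in a temporary file so the example is self-contained.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
12082008;pull done;ret=34;Y
08072008;push hanged;s=3;N
15082008;pull done;ret=34;Y
01062008;psuh done;ret=23;Y
18082007;old entry;old;N
EOF

# Prefix each line with the yyyymmdd key, then sort numerically on that key only:
# -t',' makes the comma the field separator, -k1,1n limits the key to field 1.
sorted=$(awk -F';' '{
    print substr($1,5) substr($1,3,2) substr($1,1,2) "," $0
}' "$tmp" | sort -t',' -k1,1n)

echo "$sorted"
rm -f "$tmp"
```

The result should be the same as with plain sort -n for this data; the explicit key just states the intent more precisely.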
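3) Remove the temporary field from the beginning. cut -f2- keeps everything after the first comma, so a record that itself contains a comma is not truncated.
$ awk '{
tempfield=sprintf("%s%s%s",substr($1,5),substr($1,3,2),substr($1,1,2))
print tempfield","$0
}' FS=";" myf.dat | sort -n | cut -d"," -f2-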
18082007;old entry;old;N
01062008;psuh done;ret=23;Y
08072008;push hanged;s=3;N
12082008;pull done;ret=34;Y
15082008;pull done;ret=34;Y
The above is the required output.
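As an alternative to the three steps above (a different technique, not the awk approach used in this post), sort can be given keys defined by character positions inside a field, so the temporary field may not be needed at all. A minimal sketch, again using a temporary copy of the sample data under the hypothetical names tmp and sorted:

```shell
# Recreate the sample data in a temporary file so the example is self-contained.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
12082008;pull done;ret=34;Y
08072008;push hanged;s=3;N
15082008;pull done;ret=34;Y
01062008;psuh done;ret=23;Y
18082007;old entry;old;N
EOF

# Sort on parts of field 1, in order of significance:
#   -k1.5,1.8n  characters 5-8 (year)
#   -k1.3,1.4n  characters 3-4 (month)
#   -k1.1,1.2n  characters 1-2 (day)
sorted=$(sort -t';' -k1.5,1.8n -k1.3,1.4n -k1.1,1.2n "$tmp")

echo "$sorted"
rm -f "$tmp"
```

This relies on every date being exactly eight digits; if the first field can vary in width, the awk version above is the safer choice.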
© Jadu Saikia www.UNIXCL.com