Thursday, August 21, 2008

Split a file, add headers - awk and bash script


Input file:

$ cat master.dat
h1|323|0|v1
l3|2121|MOD
k|1|53453
k|2|312312
k|3|12121
k|4|76577
k|5|76577
k|6|96557
k|7|76577
k|8|26579
k|9|96532
k|10|76577
k|11|6577
k|12|96577
k|13|16577

Output required: Split the above file into sub-files such that each sub-file starts with the first two lines of the input (the header lines beginning with h1 and l3), followed by at most 4 data lines.

The script:

$ cat split.sh

#!/bin/sh

NOARG=64
# Require a file name as the first argument.
if [ -z "$1" ]; then
    echo "one file please"
    exit $NOARG
fi
FILE=$1
numlines=4
echo "Operation on: $FILE"

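# Lines 1 and 2 are the headers. Every L data lines, close the current
# sub-file, start the next one and write both headers into it first.
awk -v base="$FILE" '
NR==1 {h=$0}
NR==2 {t=$0}
NR==L*(n+1)+3 {close(F); n++; print h RS t > (base (n+1) ".sub")}
{print > (F = base (n+1) ".sub")}' L="$numlines" "$FILE"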

# Move the original out of the way only if the backup directory exists.
mkdir -p backup && mv "$FILE" backup/
echo "Done for: $FILE"
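
The condition NR==L*(n+1)+3 in split.sh is true on the first data line of every chunk after the first one; with L=4 that is input lines 7, 11, 15 and so on. The same idea can be written by computing the chunk number directly from NR. The following is only a sketch of that variant: the function name split_with_headers and the variable names inside it are illustrative and not part of the original script.

```shell
#!/bin/sh
# Sketch: split FILE into sub-files of LINES data lines each,
# repeating the first two (header) lines at the top of every sub-file.
# split_with_headers is a hypothetical helper name.
split_with_headers() {
    file=$1
    lines=$2
    awk -v base="$file" -v L="$lines" '
    NR == 1 { h = $0; next }          # first header line
    NR == 2 { t = $0; next }          # second header line
    {
        # Data lines start at NR == 3; chunk numbers start at 1.
        c = int((NR - 3) / L) + 1
        out = base c ".sub"
        if (out != F) {               # first data line of a new chunk
            if (F != "") close(F)     # release the previous sub-file
            F = out
            print h RS t > F          # write both headers first
        }
        print > F                     # then the data line itself
    }' "$file"
}
```

Calling split_with_headers master.dat 4 should give the same master.dat1.sub to master.dat4.sub files as split.sh. One difference: if the input holds nothing but the two header lines, this variant writes no sub-file at all, whereas split.sh writes a master.dat1.sub containing just the headers.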

Executing:

$ ./split.sh master.dat
Operation on: master.dat
Done for: master.dat

Output:

$ ls
backup master.dat1.sub master.dat2.sub master.dat3.sub master.dat4.sub split.sh

$ cat master.dat1.sub
h1|323|0|v1
l3|2121|MOD
k|1|53453
k|2|312312
k|3|12121
k|4|76577

$ cat master.dat2.sub
h1|323|0|v1
l3|2121|MOD
k|5|76577
k|6|96557
k|7|76577
k|8|26579

$ cat master.dat3.sub
h1|323|0|v1
l3|2121|MOD
k|9|96532
k|10|76577
k|11|6577
k|12|96577

$ cat master.dat4.sub
h1|323|0|v1
l3|2121|MOD
k|13|16577
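
Instead of running cat on each sub-file, the result can be checked in one pass. The sketch below assumes, as in the sample data, that the header lines begin with h1 and l3; check_subs is a hypothetical helper name, not something from the original script.

```shell
#!/bin/sh
# Sketch: for every sub-file of the given base name, report how many
# data lines it holds and whether both header lines are in place.
# check_subs is a hypothetical helper name.
check_subs() {
    for f in "$1"*.sub; do
        [ -e "$f" ] || continue       # glob matched nothing
        awk -v name="$f" '
        FNR == 1 && !/^h1/ { bad = 1 }    # line 1 must be the h1 header
        FNR == 2 && !/^l3/ { bad = 1 }    # line 2 must be the l3 header
        FNR > 2 { d++ }                   # everything else is data
        END {
            printf "%s: %d data lines, headers %s\n", name, d, (bad ? "MISSING" : "ok")
        }' "$f"
    done
}
```

For the four files above, check_subs master.dat should report 4 data lines for master.dat1.sub, master.dat2.sub and master.dat3.sub, and 1 data line for master.dat4.sub, each with "headers ok".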


© Jadu Saikia www.UNIXCL.com