Thursday, October 15, 2009

Awk - split file vertically on columns


I have already put a post on - how we can split a file into multiple sub-files based on different conditions (that was basically a horizontal splitting of file); lets see how we can split a file vertically.

Input file 'file.txt' is a csv file:

$ cat file.txt
A,B,C,D,E,F,G,H,I
1,2,3,4,5,6,7,8,9
I,II,III,IV,V,VII,VIII,IX
a,b,c,d,e,f,g,h,i

Required:

Split the above file into two sub-files such that 1st 3 columns are written to sub-file1 and rest of the columns to sub-file2.
i.e.

sub-file1 content will be

A,B,C
1,2,3
I,II,III
a,b,c

And sub-file2 content will be

D,E,F,G,H,I
4,5,6,7,8,9
IV,V,VII,VIII,IX
d,e,f,g,h,i

Well, this is a pretty simple task using Linux/UNIX cut command

#Printing first 3 columns of 'file.txt'
$ cut -d"," -f1-3 file.txt
or
$ cut -d"," -f-3 file.txt

and

#Printing from 4th column till end
$ cut -d"," -f4-9 file.txt
or
$ cut -d"," -f4- file.txt

Awk solution:

$ awk -F "," '
{
for(i=1;i<=NF;i++) {
if(i <= 3) {
printf "%s,", $i >> "sub-file1"
if(i==3){
printf "\n" >> "sub-file1"
}
} else {
printf "%s,", $i >> "sub-file2"
if(i==NF){
printf "\n" >> "sub-file2"
}
}
}
}' file.txt

Sub-files generated after running the above awk script:

$ cat sub-file1
A,B,C,
1,2,3,
I,II,III,
a,b,c,

$ cat sub-file2
D,E,F,G,H,I,
4,5,6,7,8,9,
IV,V,VII,VIII,IX,
d,e,f,g,h,i,

2 comments:

Joey said...

That's a cool trick. What do I need to change if I am starting with a space delimited file and not a csv file?

Jadu Saikia said...

@Joey, you have to remove the -F "," and replace the commas with space in two printf s.


awk '{
for(i=1;i<=NF;i++) {
if(i <= 3) {
printf "%s ", $i >> "sub-file1"
if(i==3){
printf "\n" >> "sub-file1"
}
} else {
printf "%s ", $i >> "sub-file2"
if(i==NF){
printf "\n" >> "sub-file2"
}
}
}
}' file.txt

© Jadu Saikia www.UNIXCL.com