Friday, July 26, 2013

Unix - merge multiple consecutive lines


Input file:
$ cat infile.txt 
aid=33
pw=3
nn=90
aid=32
pw=30
nn=70
aid=56
pw=3
nn=93
Required:
Combine or merge every three consecutive lines of the above file so that the output becomes:
aid=33,pw=3,nn=90
aid=32,pw=30,nn=70
aid=56,pw=3,nn=93
Awk solution: print each line followed by a comma, except when the line number is divisible by 3, in which case print a newline (\n) instead, i.e.
$ awk '{printf("%s%s", $0, (NR%3 ? "," : "\n"))}' infile.txt 
aid=33,pw=3,nn=90
aid=32,pw=30,nn=70
aid=56,pw=3,nn=93
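The same idea extends to any group size. The sketch below is not from the original post: `merge_n` is a hypothetical helper name, and the group size is passed through an awk variable `n`. It writes the separator before each line rather than after it, so an incomplete final group does not end in a stray comma.

```shell
# merge_n (hypothetical helper): join every N lines of stdin with commas.
merge_n() {
    awk -v n="$1" '
        {
            # First line of a group: start a new row (no newline before row 1).
            # Any other line: separate it from the previous one with a comma.
            sep = ((NR - 1) % n ? "," : (NR > 1 ? "\n" : ""))
            printf "%s%s", sep, $0
        }
        END { if (NR) print "" }   # terminate the last row
    '
}

# Four input lines, groups of three: the last row holds the leftover line.
printf 'a=1\nb=2\nc=3\nd=4\n' | merge_n 3
# a=1,b=2,c=3
# d=4
```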
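Another way using Awk: for lines whose number is not divisible by 3, print the line followed by a comma and skip to the next line; the trailing 1 prints every third line as-is (the line is passed as an argument to printf, not as the format string, so a literal % in the data is safe):
$ awk 'NR%3{printf "%s,", $0; next}1' infile.txt 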
aid=33,pw=3,nn=90
aid=32,pw=30,nn=70
aid=56,pw=3,nn=93
Using the UNIX paste command:
$ paste -d"," - - - < infile.txt 
aid=33,pw=3,nn=90
aid=32,pw=30,nn=70
aid=56,pw=3,nn=93
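paste reads one line from standard input for each `-` operand, which is why three dashes give three fields per row. A related variant, offered as a sketch rather than something from the original post, uses serial mode (`-s`) with a delimiter list that paste cycles through: comma, comma, newline.

```shell
# -s joins all input lines serially; the delimiter list ",,\n" is reused
# cyclically, so every third delimiter is a newline.
printf 'aid=33\npw=3\nnn=90\naid=32\npw=30\nnn=70\n' | paste -s -d ',,\n' -
# aid=33,pw=3,nn=90
# aid=32,pw=30,nn=70
```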
A bash command line solution (IFS= and -r keep leading whitespace and backslashes in the lines intact):
$ while IFS= read -r line1; do IFS= read -r line2; IFS= read -r line3; echo "$line1,$line2,$line3"; done < infile.txt 
aid=33,pw=3,nn=90
aid=32,pw=30,nn=70
aid=56,pw=3,nn=93
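The loop above hard-codes three reads, so a file whose line count is not a multiple of 3 ends with empty fields. A minimal sketch of a more general shell loop follows; `merge_lines` and its variables are hypothetical names, not part of the original post.

```shell
# merge_lines (hypothetical helper): join every N lines of stdin with commas,
# using only POSIX shell built-ins.
merge_lines() {
    n=$1
    count=0
    row=
    # "|| [ -n "$line" ]" also picks up a last line that lacks a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        if [ "$count" -eq 0 ]; then
            row=$line
        else
            row="$row,$line"
        fi
        count=$((count + 1))
        if [ "$count" -eq "$n" ]; then
            printf '%s\n' "$row"
            count=0
        fi
    done
    # Flush an incomplete final group, if any.
    if [ "$count" -gt 0 ]; then
        printf '%s\n' "$row"
    fi
}

printf 'aid=33\npw=3\nnn=90\naid=32\n' | merge_lines 3
# aid=33,pw=3,nn=90
# aid=32
```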
Related posts:
  1. Join multiple lines using Awk
  2. Combine related consecutive lines using Awk
  3. Merging lines in UNIX

6 comments:

awkseeker said...

Wow, how great Unix is. I do believe this one-liner will help make our text-processing code simple and effective.
Thanks, Jadu.

awkseeker said...

Dear Jadu, I have been looking for a trick to handle my data:
name=john
age=31
class=rich
name=bill
age=40
name=linda
age=40
class=rich

output:
john|31|rich
bill|40|
linda|40|rich

Please advise me.

Thank you

Unknown said...

@awkseeker, a not very efficient solution would be:

$ cat file.txt
name=john
age=31
class=rich
name=bill
age=40
name=linda
age=40
class=rich

$ awk '$0=(/^name=/) ? RS$0 : "|"$0' ORS= file.txt

name=john|age=31|class=rich
name=bill|age=40
name=linda|age=40|class=rich

$ awk '$0=(/^name=/) ? RS$0 : "|"$0' ORS= file.txt | grep .
name=john|age=31|class=rich
name=bill|age=40
name=linda|age=40|class=rich

$ awk '$0=(/^name=/) ? RS$0 : "|"$0' ORS= file.txt | grep . | awk -F "[=,|]" 'BEGIN {OFS="|"}{print $2,$4,$6}'
john|31|rich
bill|40|
linda|40|rich

(You can read about using multiple field separators here: http://www.unixcl.com/2008/03/multiple-fs-in-awk.html )
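
A single-pass alternative, as a sketch only: it assumes every record starts with a name= line and that the keys are exactly name, age and class. The helper name `to_rows` and the awk variable names are hypothetical.

```shell
# to_rows (hypothetical helper): turn name=/age=/class= blocks on stdin
# into name|age|class rows, leaving missing fields empty.
to_rows() {
    awk -F= '
        function emit() { if (seen) print name "|" age "|" cls }
        $1 == "name"  { emit(); name = $2; age = cls = ""; seen = 1; next }
        $1 == "age"   { age = $2 }
        $1 == "class" { cls = $2 }
        END { emit() }   # print the last record
    '
}

to_rows <<'EOF'
name=john
age=31
class=rich
name=bill
age=40
name=linda
age=40
class=rich
EOF
# john|31|rich
# bill|40|
# linda|40|rich
```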

Hope this helps. Thanks !

Regards,
Jadu

awkseeker said...

Thanks Jadu. I tried my own code and need you to explain it.
awk '{if((NR%3==0)&&($0!~/class/)) print "class\n"$0, NR=NR+1; else print $0}' file.txt

output:
name=john
age=31
class=rich
name=bill
age=40
class
name=linda 7
age=40
class=rich

question:
1. Why does the value of NR=NR+1 appear after "name=linda" even though I only set the variable and did not print it?
2. Please show me another style of "if" condition; I think my code is ugly.

Many thanks

awkseeker said...

Hi Jadu, I know it is very basic for a programmer to be aware of variable++ vs ++variable, but it is not for me. So may I bother you to explain more clearly how that concept applies to my code?
I asked you about the code below:
input:
name=john
age=31
class=rich
name=bill
age=40
name=linda
age=40
class=rich

$ awk '{if((NR%3==0)&&($0!~/class/)) print NR"\tclass=\n"NR+1"\t"$0,NR=NR+1;else print NR"\t"$0}' file.txt

output:
1 name=john
2 age=31
3 class=rich
4 name=bill
5 age=40
6 class=
7 name=linda 7
8 age=40
9 class=rich

If I change it to NR++:
$ awk '{if((NR%3==0)&&($0!~/class/)) print NR"\tclass=\n"NR++"\t"$0;else print NR"\t"$0}' file.txt
1 name=john
2 age=31
3 class=rich
4 name=bill
5 age=40
6 class=
6 name=linda
8 age=40
9 class=rich

Line 6 is repeated, so I changed it to ++NR:
$ awk '{if((NR%3==0)&&($0!~/class/)) print NR"\tclass=\n"++NR"\t"$0;else print NR"\t"$0}' file.txt
1 name=john
2 age=31
3 class=rich
4 name=bill
5 age=40
6 class=
7 name=linda
8 age=40
9 class=rich

Everything seems fine now, but I am not sure it will hold for a huge file, or why the row sequence is correct now.

Thank you
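
A short sketch of what is probably going on in the outputs above (an explanatory note, not a reply from the author). In awk, a comma in a print statement separates arguments, and an assignment is itself an expression whose value is the assigned value, so `print $0, NR=NR+1` prints that value as a second field. The post-increment `i++` yields the old value and then increments, while the pre-increment `++i` increments first and yields the new value. The variable names `n`, `i` and `j` are illustrative.

```shell
# An assignment used as a print argument is printed like any other value.
printf 'x\ny\n' | awk '{ print $0, n = NR + 1 }'
# x 2
# y 3

# Post-increment yields the old value; pre-increment yields the new one.
awk 'BEGIN { i = 6; print i++; print i; j = 6; print ++j; print j }'
# 6
# 7
# 7
# 7
```

To change a variable without printing it, the assignment can go in its own statement, for example `{ print $0; n = NR + 1 }`.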

awkseeker said...

Hi Jadu, may I bother you with another question?
$ cat infile.txt
30,5
30,6
30,2
31,3
31,6
31,3
32,0
32,0
32,0

I need awk to get the min, max and count of the second field for each value of the first field:
30,2,6,3
31,3,6,3
32,0,0,3

Thanks
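
One possible approach to this question, as a sketch rather than a reply from the author: keep per-key minimum, maximum and count in awk arrays, and remember the order in which keys first appear. The helper name `min_max_count` and the array names are hypothetical.

```shell
# min_max_count (hypothetical helper): for "key,value" lines on stdin,
# print key,min,max,count in order of first appearance of each key.
min_max_count() {
    awk -F, '
        !($1 in cnt) { order[++n] = $1; min[$1] = max[$1] = $2 }
        {
            # "+ 0" forces a numeric rather than a string comparison.
            if ($2 + 0 < min[$1] + 0) min[$1] = $2
            if ($2 + 0 > max[$1] + 0) max[$1] = $2
            cnt[$1]++
        }
        END {
            for (i = 1; i <= n; i++) {
                k = order[i]
                print k "," min[k] "," max[k] "," cnt[k]
            }
        }
    '
}

printf '30,5\n30,6\n30,2\n31,3\n31,6\n31,3\n32,0\n32,0\n32,0\n' | min_max_count
# 30,2,6,3
# 31,3,6,3
# 32,0,0,3
```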

© Jadu Saikia www.UNIXCL.com