Tuesday, May 20, 2008

Remove or replace newlines using sed, awk, tr - BASH

Input file:

$ cat week.txt
Su
Mo
Tu
We
Th
Fr
Sa


Output Required:


- a) Remove the newlines, i.e.
SuMoTuWeThFrSa

- b) Replace the newlines with "|", i.e.
Su|Mo|Tu|We|Th|Fr|Sa



Remove/Replace newlines with sed


a)
$ sed -e :a -e '$!N;s/\n//;ta' week.txt
SuMoTuWeThFrSa

b)
$ sed -e :a -e '$!N;s/\n/|/;ta' week.txt
Su|Mo|Tu|We|Th|Fr|Sa
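
The loop in the one-liner above is easier to follow when the script is spread over several lines with comments. The following is only a sketch of the same idea for variant b). It recreates week.txt first so that it can be run on its own, and the variable name "joined" is just an illustrative choice.

```shell
# Recreate the sample file so the example is self-contained.
printf '%s\n' Su Mo Tu We Th Fr Sa > week.txt

joined=$(sed '
# Label "a": the top of the loop.
:a
# Unless this is the last line, append the next input line
# to the pattern space (separated by a newline).
$!N
# Replace the first embedded newline with "|".
s/\n/|/
# If the substitution succeeded, jump back to label "a".
ta
' week.txt)

printf '%s\n' "$joined"
```

On the last line N is skipped, the substitution finds no newline, the branch is not taken, and sed prints the accumulated pattern space once.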


Another way is to chain the N commands explicitly. This is not suitable for files with a large number of records, because the number of N's has to be one less than the number of lines in the file.


a)
$ sed 'N;N;N;N;N;N;s/\n//g' week.txt
SuMoTuWeThFrSa

b)
$ sed 'N;N;N;N;N;N;s/\n/|/g' week.txt
Su|Mo|Tu|We|Th|Fr|Sa


Remove/Replace newlines with awk

a)
$ awk '{printf "%s",$0} END {print ""}' week.txt
SuMoTuWeThFrSa

b)
$ awk '{printf "%s|",$0} END {print ""}' week.txt
Su|Mo|Tu|We|Th|Fr|Sa|

The output ends with an extra "|", so we need to remove it:

$ awk '{printf "%s|",$0} END {print ""}' week.txt | awk '{sub(/\|$/,"");print}'
Su|Mo|Tu|We|Th|Fr|Sa
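
The second awk process can probably be avoided by printing the separator before every line except the first, so that no trailing "|" is produced at all. A minimal sketch, assuming the same week.txt; the variable names "sep" and "joined" are arbitrary.

```shell
# Recreate the sample file so the example is self-contained.
printf '%s\n' Su Mo Tu We Th Fr Sa > week.txt

# sep is empty for the first record and "|" afterwards,
# so the separator only ever appears between records.
joined=$(awk '{ printf "%s%s", sep, $0; sep = "|" } END { print "" }' week.txt)

printf '%s\n' "$joined"
```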


Remove/Replace newlines with tr

a)
$ tr -d '\n' < week.txt
SuMoTuWeThFrSa

b)
$ tr '\n' '|' < week.txt
Su|Mo|Tu|We|Th|Fr|Sa|

Similarly, we need to remove the trailing "|" from the above output:
$ tr '\n' '|' < week.txt | sed 's/|$//'
Su|Mo|Tu|We|Th|Fr|Sa
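
Since tr also converts the final newline, its output does not end with a newline. One possible alternative to the extra sed process is to capture the result in a shell variable and strip the trailing "|" with parameter expansion. This is a sketch under the assumption that the whole result fits comfortably in a variable; the name "joined" is illustrative.

```shell
# Recreate the sample file so the example is self-contained.
printf '%s\n' Su Mo Tu We Th Fr Sa > week.txt

joined=$(tr '\n' '|' < week.txt)   # every newline becomes "|", including the last
joined=${joined%'|'}               # drop one trailing "|", if present
printf '%s\n' "$joined"            # print with a final newline
```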

3 comments:

Unknown said...

One more:

$ jot -c 10 A
A
B
C
D
E
F
G
H
I
J

$ jot -c 10 A | paste -sd,
A,B,C,D,E,F,G,H,I,J

Moreover,

$ jot -c 10 A | tr '\n' ',' | sed -e "s/[A-Z]*/'&'/g" -e "s/,''/\n/g"
'A','B','C','D','E','F','G','H','I','J'


$ jot -c 10 A | awk -v x="'" '{ s=s sprintf(x "%s" x ",", $0) } END { sub(",$", "", s); print(s) }'
'A','B','C','D','E','F','G','H','I','J'

Joseph E Edwards VIII said...

Neato. Thanks for the tips. Worth noting that using printf can be problematic if your string also contains certain other escape codes (like \u, which triggers a Unicode escape but is common in, say, $PS1 as a code with an entirely different meaning). An awk method without printf is the best bet for parsing weird strings.

Joseph E Edwards VIII said...

Follow up: to fix the \u unicode escape problem with weird strings and printf, one can instead use:

echo -e "blah\u\nblah" | tr '\n' ' '

© Jadu Saikia www.UNIXCL.com