Monday, November 2, 2009

Bash - numbering lines in a file using awk


The input file 'file.txt' contains the names of a few students.

$ cat file.txt
Sam G
Ashok Niak
Rosy M
Peter K
Sid Thom
Rasi Yad
Papu S
Niaraj J
Aloh N K
Nipu H
Quam L

Required output:

For the entries of the above file,
- add a serial number to each line
- also add a 'House' number such that the students are grouped into a total of 4 houses, assigned in rotation as follows:

Sl No,Name,House
1,Sam G,House1
2,Ashok Niak,House2
3,Rosy M,House3
4,Peter K,House4
5,Sid Thom,House1
6,Rasi Yad,House2
7,Papu S,House3
8,Niaraj J,House4
9,Aloh N K,House1
10,Nipu H,House2
11,Quam L,House3

The awk solution, using the built-in NR variable (the number of the current input record):

$ awk '
BEGIN {OFS=","; print "Sl No,Name,House"}
{print NR, $0, "House" ((NR-1)%4 + 1)}
' file.txt
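
The expression (NR-1)%4 + 1 maps the line numbers 1, 2, 3, ... onto the repeating cycle 1, 2, 3, 4. Here is a minimal sketch that makes the number of houses a parameter. The function name assign_houses and the awk variable houses are made up for illustration and are not part of the solution above:

```shell
# assign_houses N: read names on stdin, print "serial,name,HouseK" with K cycling 1..N.
# The function name and the awk variable "houses" are assumptions for illustration.
assign_houses() {
    awk -v houses="$1" '
    BEGIN { OFS = ","; print "Sl No,Name,House" }
    # (NR-1) % houses cycles 0..houses-1; adding 1 shifts it to 1..houses.
    { print NR, $0, "House" ((NR - 1) % houses + 1) }
    '
}

# Example: five names spread over 3 houses.
printf '%s\n' "Sam G" "Ashok Niak" "Rosy M" "Peter K" "Sid Thom" | assign_houses 3
```

With 3 houses the fourth student wraps around to House1 and the fifth goes to House2.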

Let's format the output for a better look:

$ awk '
BEGIN {
    # Left-justify the serial number in 8 columns and the name in 18
    FORMAT = "%-8s%-18s%s\n"
    printf FORMAT, "Sl No", "Name", "House"
}
{ printf FORMAT, NR, $0, "House" ((NR-1)%4 + 1) }
' file.txt

Output:

Sl No   Name              House
1       Sam G             House1
2       Ashok Niak        House2
3       Rosy M            House3
4       Peter K           House4
5       Sid Thom          House1
6       Rasi Yad          House2
7       Papu S            House3
8       Niaraj J          House4
9       Aloh N K          House1
10      Nipu H            House2
11      Quam L            House3

In the format string, "%-8s" left-justifies a value in a field 8 characters wide; without the "-", awk printf right-justifies it.
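
The width of 18 for the Name column is hard-coded. As a rough sketch, the width can instead be taken from the longest name by reading the file twice. The helper name print_table and the awk variables w and fmt are assumptions for illustration:

```shell
# print_table FILE: hypothetical helper that sizes the Name column from the data.
# The file is passed to awk twice: pass 1 (NR == FNR) finds the longest line,
# pass 2 prints the table using that length plus 2 as the column width.
print_table() {
    awk '
    NR == FNR { if (length($0) > w) w = length($0); next }
    FNR == 1  { fmt = "%-8s%-" (w + 2) "s%s\n"; printf fmt, "Sl No", "Name", "House" }
              { printf fmt, FNR, $0, "House" ((FNR - 1) % 4 + 1) }
    ' "$1" "$1"
}

# Example with a small temporary file.
names=$(mktemp)
printf '%s\n' "Sam G" "Ashok Niak" "Rosy M" > "$names"
print_table "$names"
rm -f "$names"
```

FNR is used in the second pass because NR keeps counting across both reads of the file.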

A Bash script that does the same would look something like this:

#!/bin/bash
# Needs bash: read without a variable name and i++ are not POSIX sh features
i=0
while read -r
do
echo "$((i+1)),$REPLY,House$((i++ % 4 + 1))"
done < file.txt

Output:

$ bash numbering.sh
1,Sam G,House1
2,Ashok Niak,House2
3,Rosy M,House3
4,Peter K,House4
5,Sid Thom,House1
6,Rasi Yad,House2
7,Papu S,House3
8,Niaraj J,House4
9,Aloh N K,House1
10,Nipu H,House2
11,Quam L,House3

Now a question:
What is '$REPLY' in the above script?

Answer: REPLY is the default variable of the bash read builtin. When no variable name is supplied to read, the line is stored in '$REPLY'.

So, for this input, the above script behaves the same as:

#!/bin/bash
# Needs bash: i++ inside $(( )) is not a POSIX sh feature
i=0
while read -r line
do
echo "$((i+1)),$line,House$((i++ % 4 + 1))"
done < file.txt
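
The two forms differ in one detail: with a variable name, read trims leading and trailing whitespace from the line, while REPLY receives the line untouched. A small sketch to show the difference, assuming bash; show_read_variants is a hypothetical helper name:

```shell
#!/bin/bash
# Sketch: how read treats surrounding whitespace (requires bash for REPLY).
# show_read_variants is a hypothetical helper name used only for this demo.
show_read_variants() {
    local line
    # No variable name: the whole line, spaces included, lands in REPLY.
    read -r <<< "$1"
    printf 'REPLY=[%s]\n' "$REPLY"
    # A variable name: leading and trailing IFS whitespace is trimmed.
    read -r line <<< "$1"
    printf 'line=[%s]\n' "$line"
    # An empty IFS for this one command keeps the whitespace.
    IFS= read -r line <<< "$1"
    printf 'raw=[%s]\n' "$line"
}

show_read_variants "  Sam G  "
```

This prints REPLY=[  Sam G  ], then line=[Sam G], then raw=[  Sam G  ]. The names in file.txt have no surrounding spaces, so both scripts give the same result there.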


In general, the lines of a file can be numbered in several ways:

Using the UNIX/Linux nl(1) command, which numbers the lines of files:

$ nl file.txt
     1  Sam G
     2  Ashok Niak
     3  Rosy M
     4  Peter K
     5  Sid Thom
     6  Rasi Yad
     7  Papu S
     8  Niaraj J
     9  Aloh N K
    10  Nipu H
    11  Quam L
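
nl can also produce the comma-separated serial numbers directly. A minimal sketch using the standard -b, -w and -s options; the wrapper name number_csv is made up for illustration:

```shell
# number_csv: hypothetical wrapper around nl for "number,text" output.
#   -b a  number all lines (by default nl does not number empty lines)
#   -w 1  number width of 1, so single-digit numbers get no left padding
#   -s ,  use a comma instead of a tab between number and text
number_csv() {
    nl -b a -w 1 -s , "$@"
}

# The empty second line is numbered too because of -b a.
printf '%s\n' "Sam G" "" "Rosy M" | number_csv
```

The three output lines are "1,Sam G", "2," and "3,Rosy M".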

Using awk NR:

$ awk '{print "\t"NR"\t"$0}' file.txt
        1       Sam G
        2       Ashok Niak
        3       Rosy M
        4       Peter K
        5       Sid Thom
        6       Rasi Yad
        7       Papu S
        8       Niaraj J
        9       Aloh N K
        10      Nipu H
        11      Quam L
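
With print, the numbers come out left-aligned after the leading tab. As a sketch, printf can right-align them instead: "%6d" followed by a tab is the same layout that nl uses by default. The function name number_like_nl is an assumption for illustration:

```shell
# number_like_nl: hypothetical helper that right-aligns NR in 6 columns,
# then prints a tab and the original line.
number_like_nl() {
    awk '{ printf "%6d\t%s\n", NR, $0 }' "$@"
}

printf '%s\n' "Sam G" "Ashok Niak" | number_like_nl
```

Unlike nl with its default options, this sketch numbers empty lines as well.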

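Using sed ('sed =' prints each line number on a line of its own, and the second sed joins it to the line that follows):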

$ sed = file.txt | sed 'N;s/\n/\t/'
1       Sam G
2       Ashok Niak
3       Rosy M
4       Peter K
5       Sid Thom
6       Rasi Yad
7       Papu S
8       Niaraj J
9       Aloh N K
10      Nipu H
11      Quam L
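
The '\t' escape in the replacement part of the s command is a GNU sed extension and is not specified by POSIX. A sketch of a more portable variant, which lets paste do the joining; the function name number_with_paste is made up for illustration:

```shell
# number_with_paste: hypothetical helper that avoids \t in the sed replacement.
# "sed =" prints each line number on its own line before the line itself;
# "paste - -" then merges every two input lines, separated by a tab.
number_with_paste() {
    sed = "$@" | paste - -
}

printf '%s\n' "Sam G" "Ashok Niak" | number_with_paste
```

Each output line is the number, a tab, and the original text.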


© Jadu Saikia www.UNIXCL.com