In an earlier post I discussed numbering the lines of a file using awk. This is a similar post: it numbers lines while ignoring any blank lines present.
The input file 'file.txt' has two fixed HEADER lines at the top, followed by a number of record lines (lines starting with k, i.e. ^k).
$ cat file.txt
h1|456|v1|1
h2|190|-|5
k|rn|90.67|12|90
k|rn|90.43|22|35
k|rn|90.62|71|90
k|rn|90.51|16|96
k|rn|90.37|18|71
Required: in the above file, replace the 2nd field of each record line (i.e. ^k) with its serial record number, starting from zero.
Required output:
h1|456|v1|1
h2|190|-|5
k|0|90.67|12|90
k|1|90.43|22|35
k|2|90.62|71|90
k|3|90.51|16|96
k|4|90.37|18|71
A solution using the awk NR variable:
$ awk '
BEGIN {FS=OFS="|"}
$1=="k" {$2=NR-3} {print}
' file.txt
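i.e. for each "|"-delimited line whose first field is "k", replace the 2nd field with NR-3, then print every line, modified or not. The offset is 3 because the first record line is the 3rd line of the file (NR=3) and the numbering starts at 0.
NR is the ordinal number of the current record. An earlier post describes the awk NR variable in more detail.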
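The NR-3 offset can be checked by printing NR next to each line. This is a minimal sketch, assuming a shortened copy of the sample input written to a hypothetical file named sample.txt:

```shell
# Recreate a shortened sample input (sample.txt is a hypothetical file name).
printf '%s\n' 'h1|456|v1|1' 'h2|190|-|5' \
  'k|rn|90.67|12|90' 'k|rn|90.43|22|35' > sample.txt

# Prefix every line with its record number NR.
nr_out=$(awk '{print NR ": " $0}' sample.txt)
printf '%s\n' "$nr_out"
```

The first record line is reported with NR equal to 3, so NR-3 gives 0 for it, 1 for the next record line, and so on.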
Now suppose the input file contains some blank lines, here two after the second record line and one after the third:
$ cat file.txt
h1|456|v1|1
h2|190|-|5
k|rn|90.67|12|90
k|rn|90.43|22|35


k|rn|90.62|71|90

k|rn|90.51|16|96
k|rn|90.37|18|71
Executing the same awk one-liner:
$ awk '
BEGIN {FS=OFS="|"}
$1=="k" {$2=NR-3} {print}
' file.txt
Output:
h1|456|v1|1
h2|190|-|5
k|0|90.67|12|90
k|1|90.43|22|35


k|4|90.62|71|90

k|6|90.51|16|96
k|7|90.37|18|71
As you can see, the record numbers are no longer consecutive, because NR also counts the blank lines.
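To see which record numbers the blank lines consume, awk can report NR for every empty line. This is only a sketch: the file name blanks.txt is hypothetical, and the positions of the blank lines are an assumption inferred from the gaps in the output above (0, 1, 4, 6, 7).

```shell
# Sample input with blank lines (blanks.txt is a hypothetical file name;
# blank-line positions are assumed from the gaps in the numbering).
printf '%s\n' 'h1|456|v1|1' 'h2|190|-|5' \
  'k|rn|90.67|12|90' 'k|rn|90.43|22|35' '' '' \
  'k|rn|90.62|71|90' '' 'k|rn|90.51|16|96' 'k|rn|90.37|18|71' > blanks.txt

# Report the record number of every empty line.
blank_nrs=$(awk '/^$/ {print "blank line at NR=" NR}' blanks.txt)
printf '%s\n' "$blank_nrs"
```

Each blank line takes up one value of NR, which is why the record line that follows it is numbered too high.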
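A different solution is to keep a separate counter c that is incremented only for non-blank lines, and to use it in place of NR:
$ awk '
BEGIN {FS=OFS="|"}        # read and write "|"-delimited fields
!/^$/ {++c}               # count only the non-blank lines
$1=="k" {$2=c-3} {print}  # number record lines from 0, print every line
' file.txt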
Output:
h1|456|v1|1
h2|190|-|5
k|0|90.67|12|90
k|1|90.43|22|35


k|2|90.62|71|90

k|3|90.51|16|96
k|4|90.37|18|71
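A variation on the same idea, not the approach used above, is to increment a counter only on the record lines themselves. That removes the hard-coded offset of 3, so it would keep working if the number of header lines changed. A minimal sketch, assuming the same blank-line layout as before and a hypothetical file name records.txt (the counter name n is also just for illustration):

```shell
# Sample input with blank lines (records.txt is a hypothetical file name).
printf '%s\n' 'h1|456|v1|1' 'h2|190|-|5' \
  'k|rn|90.67|12|90' 'k|rn|90.43|22|35' '' '' \
  'k|rn|90.62|71|90' '' 'k|rn|90.51|16|96' 'k|rn|90.37|18|71' > records.txt

# n starts out as 0 and is incremented only when a record line is seen,
# so header lines and blank lines never affect the numbering.
numbered=$(awk 'BEGIN {FS=OFS="|"} $1=="k" {$2=n++} {print}' records.txt)
printf '%s\n' "$numbered"
```

The header lines and blank lines pass through unchanged, and the record lines are numbered 0 to 4.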