Saturday, May 24, 2008

Replace digits with serial numbers : awk


Some reference:

substr(s,p,l) -- The substring of s starting at position p (counted from 1) and continuing for l characters
RSTART -- The start position of the text matched by the match built-in function (0 if nothing matched)
RLENGTH -- The length of the text matched by the match built-in function (-1 if nothing matched)


This is a good page for all awk built-in functions and variables.
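
To see how the three built-ins listed above fit together, here is a minimal sketch. The helper name show_match is made up for this illustration; it prints RSTART, RLENGTH and the substring they describe for the first run of digits in a line.

```shell
# show_match is a hypothetical helper, not part of the original post.
# It reports where the first run of digits starts, how long it is,
# and the digits themselves.
show_match() {
    printf '%s\n' "$1" | awk '{
        if (match($0, /[0-9]+/)) {
            # RSTART is the 1-based start of the match, RLENGTH its length
            print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
        }
    }'
}

show_match 'AA=[98345];'    # prints: 5 5 98345
```

The digits begin at the fifth character of the line and are five characters long, so substr returns 98345.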

Input file:

$ cat myrec.dat
AA=[98345];
BB=[23333];
CC=[593503];
DD=[32445];
EE=[249];


Requirement:
- Replace the digits inside [] on each line with serial numbers starting from 5001 (one per line, in order)

i.e. Required Output:

AA=[5001];
BB=[5002];
CC=[5003];
DD=[5004];
EE=[5005];


Awk solution:

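$ awk 'BEGIN { rval = 5000 }
{
    rval += 1                                   # next serial number
    # Find the digits that sit just before the closing "];"
    if (match($0, /([0-9]+)];$/)) {
        aval = substr($0, RSTART, RLENGTH - 2)  # matched text minus "];"
        gsub(aval, rval)                        # swap the digits for the serial number
    }
    print
}' myrec.dat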
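
One thing to keep in mind with the solution above: gsub treats aval as a pattern and replaces every occurrence of those digits on the line, not just the one inside the brackets. A possible alternative, sketched here under the assumption that the digits always sit just before a closing "];", is to rebuild the line by position using RSTART and RLENGTH. The function name renumber and the sample line A249=[249]; are made up for this illustration.

```shell
# renumber is a hypothetical wrapper; it reads the named files or stdin.
renumber() {
    awk 'BEGIN { rval = 5000 }
    {
        rval += 1
        if (match($0, /[0-9]+];$/)) {
            # Keep everything before the digits, insert the serial number,
            # then re-attach the last two matched characters, "];".
            $0 = substr($0, 1, RSTART - 1) rval substr($0, RSTART + RLENGTH - 2)
        }
        print
    }' "$@"
}

# The first line repeats its digits in the label; only the bracketed copy changes.
printf '%s\n' 'A249=[249];' 'BB=[23333];' | renumber
# prints:
# A249=[5001];
# BB=[5002];
```

Lines that do not match the pattern are printed unchanged, although they still use up a serial number, as in the original script.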


Adding some debug:

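$ awk 'BEGIN { rval = 5000 }
{
    rval += 1                                   # next serial number
    if (match($0, /([0-9]+)];$/)) {
        aval = substr($0, RSTART, RLENGTH - 2)  # matched text minus "];"
        # Show the value found and the value that will replace it
        print "actual=" aval, "replacewith=" rval
        gsub(aval, rval)
    }
    print
}' myrec.dat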


Output:

actual=98345 replacewith=5001
AA=[5001];
actual=23333 replacewith=5002
BB=[5002];
actual=593503 replacewith=5003
CC=[5003];
actual=32445 replacewith=5004
DD=[5004];
actual=249 replacewith=5005
EE=[5005];
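
Since every line consumes exactly one serial number, the counter always equals 5000 plus the current line number. A more compact variant, offered as a sketch rather than as the method used above, could therefore lean on awk's built-in NR variable and a single sub call anchored at the end of the line. The function name renumber_nr is made up for this illustration.

```shell
# renumber_nr is a hypothetical wrapper; it reads the named files or stdin.
renumber_nr() {
    # NR is the current line number, so 5000 + NR yields 5001, 5002, ...
    awk '{ sub(/[0-9]+];$/, (5000 + NR) "];"); print }' "$@"
}

printf '%s\n' 'AA=[98345];' 'BB=[23333];' 'CC=[593503];' | renumber_nr
# prints:
# AA=[5001];
# BB=[5002];
# CC=[5003];
```

Because the pattern is anchored with $ and sub replaces only the first match, digits elsewhere on the line are left alone.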


© Jadu Saikia www.UNIXCL.com