Saturday, January 26, 2008

Add a conditional 4th field - AWK & BASH


The input file is something like this.

$ cat thefile.out

ID1,06-11-1983,21
ID1,07-11-1983,21
ID1,08-11-1983,21
ID2,12-12-1982,21
ID2,12-12-1982,21
ID3,14-06-2007,12

Output required:
Add a 4th field to the above file. It starts at 1 and increments each time the ID (first field) changes.

i.e.


ID1,06-11-1983,21,1
ID1,07-11-1983,21,1
ID1,08-11-1983,21,1
ID2,12-12-1982,21,2
ID2,12-12-1982,21,2
ID3,14-06-2007,12,3

This is how it can be done using AWK.

$ awk 'BEGIN{OFS=FS=","}{if ($1 != var ){var=$1;count++};print $0,count}' thefile.out > thefile.tmp
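The one-liner above packs everything into a single block. As a rough sketch of the same logic spread out with comments (the variable name prev is my own choice here, and the here-document only recreates the sample input shown earlier):

```shell
# Recreate the sample input from the post
cat > thefile.out <<'EOF'
ID1,06-11-1983,21
ID1,07-11-1983,21
ID1,08-11-1983,21
ID2,12-12-1982,21
ID2,12-12-1982,21
ID3,14-06-2007,12
EOF

awk '
BEGIN { OFS = FS = "," }      # comma-separated in, comma-separated out
$1 != prev {                  # first field differs from the previous line
    prev = $1                 # remember the new ID
    count++                   # move on to the next group number
}
{ print $0, count }           # append the counter as the 4th field
' thefile.out > thefile.tmp

cat thefile.tmp
```

Since prev is empty before the first line is read, the first ID always differs from it, so the counter starts at 1.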

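In the absence of "awk", a traditional BASH script would look like this:

#!/bin/bash

c=0
id2="fakeid"    # previous ID; a placeholder that matches no real ID

# Read line by line so that lines containing spaces are not split
while IFS= read -r line
do
    id=$(echo "$line" | cut -d "," -f1)
    # New ID: bump the counter and remember the ID
    if [ "$id" != "$id2" ]; then
        c=$((c+1))
        id2=$id
    fi
    echo "$line,$c"
done < thefile.out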
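
Both versions number the groups by watching for a change in the first field, so they assume that lines with the same ID sit next to each other, as they do in thefile.out. If an ID could reappear later in the file, one possible variant is to remember the number given to each ID in an awk array. This is only a sketch: the three input lines and the array name seen are made up for illustration.

```shell
# Hypothetical unsorted input: ID1 appears again after ID2
out=$(printf '%s\n' 'ID1,06-11-1983,21' 'ID2,12-12-1982,21' 'ID1,07-11-1983,21' |
awk 'BEGIN { OFS = FS = "," }
     !($1 in seen) { seen[$1] = ++count }   # first time this ID is seen: give it the next number
     { print $0, seen[$1] }')                # reuse the stored number on later lines

printf '%s\n' "$out"
```

With this approach the third line gets 1 again, because ID1 was already numbered on the first line.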

2 comments:

Thomas Fabian Dcunha said...

Hi,

Do you know how column 3 can be updated?
For example, if we have 21 in column 3, replace it with 1.

Regards
Thomas

Jadu Saikia said...

@Thomas, something like this?

e.g.

$ cat file.txt
chrom:index:forward:reverse
chr01:13:1:2
chr03:12:1:4
chr01:3445:1:6
chr02:2311:3:1
chr13:23432:4:7
chr01:212:5:2
chr02:345:12:6
chr01:45:45:0

If we have 12 in column 2, replace it with 100.

$ awk -F ":" 'BEGIN {OFS=":"} $2==12 {$2=100} {print}' file.txt

chrom:index:forward:reverse
chr01:13:1:2
chr03:100:1:4
chr01:3445:1:6
chr02:2311:3:1
chr13:23432:4:7
chr01:212:5:2
chr02:345:12:6
chr01:45:45:0
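
Applied to thefile.out from the post, the same pattern would look roughly like this. It is a sketch that assumes column 3 should become 1 only when it is exactly 21; the here-document just recreates the sample input.

```shell
# Recreate the sample input from the post
cat > thefile.out <<'EOF'
ID1,06-11-1983,21
ID1,07-11-1983,21
ID1,08-11-1983,21
ID2,12-12-1982,21
ID2,12-12-1982,21
ID3,14-06-2007,12
EOF

# Replace 21 in column 3 with 1; every other line is printed unchanged
out=$(awk -F "," 'BEGIN {OFS=","} $3==21 {$3=1} {print}' thefile.out)

printf '%s\n' "$out"
```

The last line (ID3, with 12 in column 3) does not match the condition, so it passes through untouched.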

Please let me know if this is fine. Thanks.

© Jadu Saikia www.UNIXCL.com