$ cat file.txt
DD:12:A
AA:11:N
EE:13:B
AA:11:F
BB:09:K
DD:13:X
# Remove duplicates based on the first field. The duplicated keys are DD and AA.
$ awk '!x[$1]++' FS=":" file.txt
DD:12:A
AA:11:N
EE:13:B
BB:09:K
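The one-liner is terse, so here is a longer sketch of what it appears to do: keep a counter per first-field value and print a line only while that counter is still zero. The function name dedup_first_field and the array name seen are illustrative and not part of the original command.

```shell
# Hypothetical helper (name is an assumption, not from the post).
# Prints only the first line seen for each value of field 1, using ":" as separator.
dedup_first_field() {
  awk -F: '
    {
      # An unset array element compares equal to 0, so this is true
      # only the first time a given first field appears.
      if (seen[$1] == 0) print
      # Count the occurrence so later lines with the same key are skipped.
      seen[$1]++
    }
  ' "$@"
}
```

It can be called as `dedup_first_field file.txt`, or fed from a pipe when no file name is given. Input order is preserved, which is the main difference from `sort -u`.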
$ cat file.txt
DD:12:A
AA:11:N
EE:13:B
AA:11:F
BB:09:K
DD:13:X
# This time, based on the first and second fields. The only duplicate combination is (AA:11).
$ awk '!x[$1,$2]++' FS=":" file.txt
DD:12:A
AA:11:N
EE:13:B
BB:09:K
DD:13:X
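In awk, a comma-separated subscript such as x[$1,$2] is stored under a single string key, with the parts joined by the built-in SUBSEP variable. A minimal sketch that spells this out follows; the function name dedup_first_two_fields and the variable names are illustrative only.

```shell
# Hypothetical helper (name is an assumption, not from the post).
# Prints only the first line seen for each (field 1, field 2) pair.
dedup_first_two_fields() {
  awk -F: '
    {
      # Same key that awk builds internally for seen[$1,$2].
      key = $1 SUBSEP $2
      # "in" tests membership without creating the element.
      if (!(key in seen)) {
        seen[key] = 1
        print
      }
    }
  ' "$@"
}
```

Because SUBSEP defaults to a non-printing character, keys built from ordinary text fields are unlikely to collide the way a plain concatenation like $1 $2 could.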
4 comments:
Hello there,
Any idea how to replace duplicate lines with blank lines instead of deleting them?
e.g.
Input:
test1
test1
test2
test2
test2
test3
Output:
test1

test2


test3
Thanks in advance.
@Abdel: you can do something like this:
$ awk 'x[$0]++ {$0=""} {print}' new.txt
Please let me know if that works. Thanks.
Hi,
Do you know how I can exclude duplicate cases based on another column?
The duplicates are in column 4, and I would like to exclude the duplicate cases where column 2 and column 3 are not the same.
Input:
file 10 --- rs11511647 NA 62766
file 10 10 rs11511647 NA 62766
file 5 --- rs22334455 NA 63767
file 5 --- rs12354678 NA 63768
file 5 5 rs12354678 NA 63768
Desired output:
file 10 10 rs11511647 NA 62766
file 5 --- rs22334455 NA 63767
file 5 5 rs12354678 NA 63768
Thanks,
Thais
Hi there,
I would like to exclude duplicates based on column 4. The duplicates are in column 4, and I would like to keep the duplicate cases where column 2 equals column 3.
Input file
file 10 --- rs11511647 NA 62766
file 10 10 rs11511647 NA 62766
file 5 --- rs22334455 NA 63767
file 5 --- rs12354678 NA 63768
file 5 5 rs12354678 NA 63768
Desired output file
file 10 10 rs11511647 NA 62766
file 5 --- rs22334455 NA 63767
file 5 5 rs12354678 NA 63768
Thanks in advance,
Thais
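One possible approach to the question above, offered as a sketch rather than a tested answer from the post: read the file twice, count how often each ID in the fourth whitespace-separated field (the rs... values in the sample) occurs, then print a line if its ID is unique or if its second and third fields match. The function name keep_matching_duplicates is hypothetical. It assumes the input is a regular file (it is read twice) and that a duplicated ID with no matching row should be dropped entirely.

```shell
# Hypothetical helper (name is an assumption, not from the post).
# Usage: keep_matching_duplicates input.txt
keep_matching_duplicates() {
  awk '
    # First pass over the file: count occurrences of each ID in field 4.
    NR == FNR { count[$4]++; next }
    # Second pass: keep lines whose ID is unique, or whose fields 2 and 3 agree.
    count[$4] == 1 || $2 == $3
  ' "$1" "$1"
}
```

On the sample input this keeps the second, third and fifth lines, in their original order.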