Thursday, July 9, 2009
Print all duplicate lines using awk
Input file:
$ cat cnr.txt
HIDDENHAUSEN|99.60
FIEBERBRUNN|99.07
MELLENDORF|99.04
HERBSTEIN|99.02
ACHTERWEHR|98.82
GOLM|98.82
PARA|98.82
BOGEN|98.61
SAINTANDRE|98.55
OLSZTYN|98.61
HYDERABAD|99.02
Required output: print those lines whose 2nd field occurs more than once in the file, i.e.:
HERBSTEIN|99.02
ACHTERWEHR|98.82
GOLM|98.82
PARA|98.82
BOGEN|98.61
OLSZTYN|98.61
HYDERABAD|99.02
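Awk solution (the file is read twice: the first pass records in b every 2nd-field value that shows up again, and the second pass prints the lines whose 2nd field is in b):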
$ awk 'NR==FNR && a[$2]++ {b[$2];next} $2 in b' FS="|" cnr.txt cnr.txt
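The same two-pass idea can be spelled out with an explicit counter, which some readers find easier to follow. The sketch below is not the one-liner from the post but a variant of it; the array names count, line and key and the shell variables two_pass and one_pass are made up for illustration. A single-pass form is included as well, assuming the whole input fits in memory, which is handy when the data arrives on a pipe and cannot be read twice.

```shell
# Recreate the sample input from the post.
cat > cnr.txt <<'EOF'
HIDDENHAUSEN|99.60
FIEBERBRUNN|99.07
MELLENDORF|99.04
HERBSTEIN|99.02
ACHTERWEHR|98.82
GOLM|98.82
PARA|98.82
BOGEN|98.61
SAINTANDRE|98.55
OLSZTYN|98.61
HYDERABAD|99.02
EOF

# Two passes over the same file:
#   pass 1 (NR==FNR): count how often each 2nd field occurs, print nothing
#   pass 2: print the lines whose 2nd field was counted more than once
two_pass=$(awk -F'|' 'NR==FNR { count[$2]++; next } count[$2] > 1' cnr.txt cnr.txt)

# One pass: buffer every line and its key, then filter in END.
# Assumes the input is small enough to hold in memory.
one_pass=$(awk -F'|' '
    { line[NR] = $0; key[NR] = $2; count[$2]++ }
    END {
        for (i = 1; i <= NR; i++)
            if (count[key[i]] > 1)
                print line[i]
    }
' cnr.txt)

printf '%s\n' "$two_pass"
```

Both forms keep the original line order. The two-pass form only stores one counter per distinct 2nd-field value, while the single-pass form stores every line, so the choice is mostly a trade between memory and being able to read from standard input.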
© Jadu Saikia www.UNIXCL.com
1 comment:
sed -e '
$!{
N;s/^/\n/;D
}
/\n$/!G
/^\n$/d
/^[^|]*|m/{
s/^\([^|]*|\)m/\1/
P;D
}
/^\([^|]*|\)\([0-9][0-9]*\.[0-9][0-9]*\)\(\n.*|\)\(\2\n\)/!D
:mark
s//\1\2\3m\4/
tmark
P;D
' cnr.txt