From SORT(1) man page:
-n, --numeric-sort
compare according to string numerical value
-k, --key=POS1[,POS2]
start a key at POS1, end it at POS2 (origin 1)
-t, --field-separator=SEP
use SEP instead of non-blank to blank transition
1) File with same number of fields in each line
$ cat file.txt
IN,90,453
US,12,120
NZ,89,200
WI,20,500
TS,12,124
$ sort -t, -n -k3 file.txt
US,12,120
TS,12,124
NZ,89,200
IN,90,453
WI,20,500
2) File with uneven number of fields in each line
$ cat file.txt
IN,90,453
US,12,1,120
NZ,89,200
WI,500
TS,12,124
Solution : The idea is join the last field(column) of each line at the beginning of each of the lines (with a different delimiter, in this case I am using pipe "|")
$ awk 'BEGIN {FS=","; OFS="|"}{print $NF,$0}' file.txt
453|IN,90,453
120|US,12,1,120
200|NZ,89,200
500|WI,500
124|TS,12,124
Then sort on 1st field with pipe ("|") delimited.
$ awk 'BEGIN {FS=","; OFS="|"}{print $NF,$0}' file.txt | sort -n -t"|" -k1
120|US,12,1,120
124|TS,12,124
200|NZ,89,200
453|IN,90,453
500|WI,500
Then discard the first field i.e. print the 2nd field with pipe ("|") delimited.
$ awk 'BEGIN {FS=","; OFS="|"}{print $NF,$0}' file.txt | sort -n -t"|" -k1 | awk -F "|" '{print $NF}'
US,12,1,120
TS,12,124
NZ,89,200
IN,90,453
WI,500
Which is same as:
$ awk 'BEGIN {FS=","; OFS="|"}{print $NF,$0}' file.txt | sort -n -t"|" -k1 | awk -F "|" '{print $2}'
or
$ awk 'BEGIN {FS=","; OFS="|"}{print $NF,$0}' file.txt | sort -n -t"|" -k1 | cut -d"|" -f2
A question to unstableme readers:
- Is there a better way to directly mention the last field number (as counting field number incase line has huge number of fields is never going to be easy) ? Thanks in advance, much appreciated.
Related posts:
- UNIX sort strings by string length using awk in Bash
- Changing temporary directory for UNIX sort command
- Sort date in ddmmyyyy format using awk
4 comments:
Gawk's asorti function comes in handy here.
gawk -F "," {arr[$NF]=$0}END{j = asorti(arr, sorted); for(i=1; i<=j; i++){print arr[sorted[i]]}} test.txt
This fails if there is more than one line with identical sort fields.
@djohngo, thanks for the awk asorti solution. Its useful.
Tried the same with python http://pythonstarter.blogspot.com/2010/11/python-sort-file-based-on-last-field.html
This is the Schwartz transform in Perl, wholly analogus to what you were doing...
perl -aln0777e '
print for
map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [$_,/,(\d+)$/] }
@F
' file.txt
Post a Comment