Continuing the theme of a few of my recent posts, here is another one based on awk arrays.
Input file:
$ cat details.txt
Manager1|sw1
Manager3|sw5
Manager1|sw4
Manager2|sw9
Manager2|sw12
Manager1|sw2
Manager1|sw0
Output required:
Group the second fields ($2) that share the same first field ($1), i.e. group the engineers who report to the same manager. Required output:
Manager1|sw1,sw4,sw2,sw0
Manager2|sw9,sw12
Manager3|sw5
Awk solution:
$ awk '
BEGIN {FS=OFS="|"}
!A[$1] {A[$1] = $0; next}
{A[$1] = A[$1] "," $2}
END {for(i in A) {print A[i]}
}' details.txt
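One thing to keep in mind: awk does not define the order in which `for (i in A)` visits the keys, so the groups may be printed in a different order than shown above. Here is a minimal sketch that prints the managers in the order they first appear in the file. The function name `group_by_manager` and the helper arrays `order` and `list` are illustrative names, not part of the solution above.

```shell
group_by_manager() {
    awk '
    BEGIN { FS = OFS = "|" }
    # First time we see a manager: remember its position and start its list.
    !($1 in list) { order[++n] = $1; list[$1] = $2; next }
    # Later occurrences: append the engineer with a comma.
    { list[$1] = list[$1] "," $2 }
    END {
        # Walk the managers in the order they first appeared.
        for (k = 1; k <= n; k++)
            print order[k], list[order[k]]
    }' "$@"
}
```

Call it as `group_by_manager details.txt`. If sorted output is preferred instead, piping the original one-liner through `sort` should also work.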
Let's add one more field, the "team" field. For example, "Manager1" manages engineer "sw1", who is on team1.
$ cat details1.txt
Manager1|team1|sw1
Manager3|team4|sw5
Manager1|team2|sw4
Manager2|team5|sw9
Manager2|team5|sw12
Manager1|team3|sw2
Manager1|team2|sw0
Now let's group the engineers who are on the same team and are managed by the same manager. The awk solution would be:
$ awk '
BEGIN {FS=OFS="|"}
!A[$1$2] {A[$1$2] = $0; next}
{A[$1$2] = A[$1$2] "," $3}
END {for(i in A) {print A[i]}
}' details1.txt
Output:
Manager1|team1|sw1
Manager1|team2|sw4,sw0
Manager1|team3|sw2
Manager3|team4|sw5
Manager2|team5|sw9,sw12
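A small caveat on the key: building it by plain concatenation (`$1$2`) can make two different manager/team pairs collide, e.g. `a|bc` and `ab|c` both give the key `abc`. The sketch below uses awk's multi-subscript form `A[$1,$2]`, which joins the parts with SUBSEP, and pipes the result through `sort` so the output order is predictable. The function name `group_by_manager_team` is only an illustrative name.

```shell
group_by_manager_team() {
    awk '
    BEGIN { FS = OFS = "|" }
    # ($1,$2) is joined with SUBSEP, so "a|bc" and "ab|c" get different keys.
    !(($1,$2) in A) { A[$1,$2] = $0; next }
    # Same manager and team seen before: append the engineer.
    { A[$1,$2] = A[$1,$2] "," $3 }
    END { for (i in A) print A[i] }' "$@" | LC_ALL=C sort
}
```

Call it as `group_by_manager_team details1.txt`. For the sample data the sample keys never collide, so the grouping is the same as above; only the line order differs because of the sort.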
4 comments:
#!/bin/bash
#####################################################################################################
# Note: this solution rescans the input file once per distinct manager, so it costs about O(m*n)
# for m managers and n input lines (up to O(n^2) in the worst case).
# While this solution works, I wouldn't use it over the awk solution.
#####################################################################################################
declare -a ARR_MNGRS
# Distinct managers, taken from the first field.
ARR_MNGRS=($(cut -d'|' -f1 details.txt | sort -u))
for MNGR in "${ARR_MNGRS[@]}"
do
    printf "%s|" "$MNGR"
    SEP=""    # empty before the first engineer, "," afterwards
    while read -r LINE
    do
        if [[ ${LINE%%|*} == "$MNGR" ]]
        then
            printf "%s%s" "$SEP" "${LINE##*|}"
            SEP=","
        fi
    done < details.txt
    echo ""    # needed for the newline
done
Complexity: about O(m*n) for m distinct managers and n input lines, i.e. up to O(n^2) in the worst case, so it's not so great.
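The nested loop above rescans the file once per manager. If you want to stay in the shell without awk, one way to avoid the rescans is to sort the file by manager first and then read it once, starting a new output line whenever the manager changes. This is only a sketch: the function name `group_sorted` is made up for illustration, and it assumes your `sort` supports the `-s` (stable) option so that engineers keep their original relative order.

```shell
group_sorted() {
    # Stable sort on the manager field so each manager's lines are adjacent
    # and keep their original relative order (-s is assumed to be available).
    sort -s -t'|' -k1,1 "$@" | {
        prev=""
        first=1
        while IFS='|' read -r mngr eng
        do
            if [ "$first" -eq 1 ]; then
                # Very first line: open the first group.
                printf '%s|%s' "$mngr" "$eng"
                first=0
            elif [ "$mngr" = "$prev" ]; then
                # Same manager as the previous line: append the engineer.
                printf ',%s' "$eng"
            else
                # New manager: close the previous group and open a new one.
                printf '\n%s|%s' "$mngr" "$eng"
            fi
            prev=$mngr
        done
        # Terminate the last group, but print nothing for empty input.
        [ "$first" -eq 1 ] || printf '\n'
    }
}
```

Call it as `group_sorted details.txt`. The cost is then dominated by the sort, roughly O(n log n), followed by a single pass over the sorted lines.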
@Nathan, thanks for the pure bash solution. Really useful.
Another alternative using awk:
$ awk 'BEGIN {FS=OFS="|"} {
arr[$1] = ($1 in arr) ? arr[$1] "," $2 : $2
}
END {
for (i in arr)
print i, arr[i]
}' details.txt
Manager1|sw1,sw4,sw2,sw0
Manager2|sw9,sw12
Manager3|sw5
Here's a pure sed-based solution to the managers problem.
Note: I am using the y/// command since POSIX sed doesn't support the [^\n] syntax.
cat - <<__DATA__ |
Manager1|team1|sw1
Manager3|team4|sw5
Manager1|team2|sw4
Manager2|team5|sw9
Manager2|team5|sw12
Manager1|team3|sw2
Manager1|team2|sw0
__DATA__
sed -e '
G
y/\n_/_\n/
s/^\([^|_]*\)|\([^|_]*\)|\([^|_]*\)_\1|\2|\([^|_]*\)/\1|\2|\4,\3/;ta
s/^\([^|_]*\)|\([^|_]*\)|\([^|_]*\)_\(.*_\)\1|\2|\([^|_]*\)/\4\1|\2|\5,\3/;ta
s/_$//
s/^\([^_]*\)_\(.*\)/\2_\1/
:a
y/\n_/_\n/
h;$!d
'