Sunday, March 30, 2008

Creating menus using select - BASH


The select construct is adopted from the Korn Shell and is a good tool for building menus.

The basic usage:

select variable [in list]
do
command...
break
done

select prints the items in list as a numbered menu and prompts the user to choose one; the chosen item is stored in variable.

Note: select displays the PS3 prompt (bash's default is "#? ").

A simple script to start with.

$ cat se1.sh
#!/bin/bash

PS3='Your Favourite Shell ?' # Set the prompt string
echo
select shell in ksh bash sh zsh csh
do
echo
echo "You have selected: $shell"
echo
break # Required: without it, select keeps prompting
done
exit 0

Executing:
$ ./se1.sh

1) ksh
2) bash
3) sh
4) zsh
5) csh
Your Favourite Shell ?2

You have selected: bash

List is omitted:
If "in list" is omitted, select uses the positional parameters ("$@") passed to the script, or to the function in which the select construct is used. See the "se3.sh" script below:

$ cat se3.sh
#!/bin/bash

PS3='Your Favourite Shell ?' # Set the prompt string
echo
select shell
do
echo
echo "You have selected: $shell"
echo
break # Required: without it, select keeps prompting
done
exit 0

Executing:
$ ./se3.sh bash ksh zsh

1) bash
2) ksh
3) zsh
Your Favourite Shell ?1

You have selected: bash


Using select in a function:

$ cat se6.sh
#!/bin/bash

PS3='Your Favourite Shell ?' # Set the prompt string
echo

f_choose () {
select shell
do
echo
echo "You have selected: $shell"
echo
break # Required: without it, select keeps prompting
done
}

f_choose "bash" "ksh" "csh" "zsh"
exit 0

Executing:
$ ./se6.sh

1) bash
2) ksh
3) csh
4) zsh
Your Favourite Shell ?3

You have selected: csh


A practical example:

$ cat sel.sh
#!/bin/bash

f_Add () {
echo "Inside Add Function"
}

f_Delete () {
echo "Inside Delete Function"
}

f_Update () {
echo "Inside Update Function"
}

PS3='Your Menu ? ' # Set the prompt string
echo
select item in "Add" "Delete" "Update" "quit"
do

#Handling default case in select
if ! test "$item"; then
echo "Wrong entry, try again"
continue
fi

[ "$item" = "quit" ] && exit 0

echo
echo "You have selected: $item"
f_$item # Calling the function
echo
done
exit 0

Executing:
$ ./sel.sh

1) Add
2) Delete
3) Update
4) quit
Your Menu ? 1

You have selected: Add
Inside Add Function

Your Menu ? 2

You have selected: Delete
Inside Delete Function

Your Menu ? 3

You have selected: Update
Inside Update Function

Your Menu ? 5
Wrong entry, try again
Your Menu ? 4
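
The default-case handling above works because select leaves the variable empty when the input is not a valid menu number, while the raw input stays available in $REPLY. A minimal sketch of that behaviour, assuming a hypothetical helper function choose_shell (not part of the scripts above) that prints the chosen item:

```shell
#!/bin/bash
# choose_shell is a hypothetical helper, not part of the original scripts.
# select writes the menu and the PS3 prompt to stderr and reads from stdin.
choose_shell() {
    local PS3='Your Favourite Shell ? '
    local shell
    select shell in ksh bash sh zsh csh
    do
        if [ -n "$shell" ]; then
            echo "$shell"          # valid number: print the chosen item
            break
        fi
        # invalid input: $shell is empty, $REPLY holds what was typed
        echo "Wrong entry ($REPLY), try again" >&2
    done
}
```

Because the menu goes to stderr, something like choice=$(choose_shell) captures only the selected item.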

Monday, March 24, 2008

Compute simple average using AWK

Sample files:

$ cat file1.txt

AA 33
BB 21
KK 99
CC 14
DD 13

$ cat file2.txt
EE 32
FF 45
FG 56
KK 99
FF 67

Objective:
- Item "KK" is erroneous, so it should be excluded from the average
- Compute the average of the 2nd field of the remaining items across both files, file1.txt and file2.txt.

The script:
$ awk '
!/^KK/ {sum+=$2; ++n; print $1,$2 }
END { print "-----------"
print "Tot="sum"("n")";print "Avg="sum"/"n"="sum/n}
' file1.txt file2.txt

Output:
AA 33
BB 21
CC 14
DD 13
EE 32
FF 45
FG 56
FF 67
-----------
Tot=281(8)
Avg=281/8=35.125
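
A variation on the same idea, as a rough sketch: the function name avg_excluding is made up for this example, it compares the first field exactly instead of using the /^KK/ regex, and it guards against dividing by zero when no line qualifies.

```shell
# avg_excluding is a hypothetical helper name used only for this sketch.
# Usage: avg_excluding ITEM file...
avg_excluding() {
    skip=$1
    shift
    awk -v skip="$skip" '
        $1 != skip { sum += $2; n++ }      # exact match on the 1st field
        END {
            if (n) printf "%.3f\n", sum / n
            else   print "no data"         # avoid division by zero
        }
    ' "$@"
}
```

With the two sample files, avg_excluding KK file1.txt file2.txt should give the same average as the script above.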

Sunday, March 23, 2008

Print field based on header - AWK

$ cat student.vi.out
FName:DOB:Rank:Result
Jack:1982:A+:Pass
Bin:1981:B-:Fail
Nin:1982:A:Pass
Lenin:1983:A+:Pass
Hope:1982:B+:Pass

We can print the 3rd field i.e. "Rank" field as below:

$ awk '{print $3}' FS=: student.vi.out

But suppose you are not sure whether the "Rank" field is in the 3rd, 4th or 5th position (as can happen in a big file with many fields), and you only know that its heading is "Rank". This is how you can print that particular field:

$ awk -v header="Rank" '
BEGIN { FS=":"; c=0 }
NR == 1 { for (i=1;i<=NF;i++) { if ($i==header) { c=i }} }
NR > 1 && c>0 { print $c }
' student.vi.out

Output:
A+
B-
A
A+
B+

Saturday, March 22, 2008

Multiple FS in AWK

Sample file:

$ cat summary.txt

A|Jan|clerk|02:45
B|Jan|Salesman|02:12
C|Jan|Accountant|03:12
A|Feb|clerk|01:10
B|Feb|Salesman|11:10
B|March|Salesman|3:10
C|Feb|Accountant|3:34

Output Required:

(First field)|(last field converted to minutes)

i.e.

A|165
B|132
C|192
A|70
B|670
B|190
C|214

This is how we can specify two field separators (| and :) with FS in awk:

$ awk 'BEGIN{FS="[|:]"; OFS="|"} {print $1,$(NF-1)*60+$NF}' summary.txt
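
An alternative sketch that keeps "|" as the only field separator and breaks up the time field with split(). The function name to_minutes is an assumption made for this example.

```shell
# to_minutes is a hypothetical helper name used only for this sketch.
# Reads the named files, or stdin when no file is given.
to_minutes() {
    awk -F'|' -v OFS='|' '{
        split($NF, t, ":")              # t[1] = hours, t[2] = minutes
        print $1, t[1] * 60 + t[2]
    }' "$@"
}
```

This may be easier to follow when other fields could themselves contain a colon, since only the last field is split on ":".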

Friday, March 21, 2008

Print/Remove the first few characters of a string

#The string
$ var="unixbashscripting"


#To print the first 8 characters of the string "unixbashscripting"

1)
$ echo ${var:0:8}
unixbash

2)
$ echo $var | sed 's/\(.\{8\}\).*/\1/'
unixbash

3)
$ echo $var | awk '{print substr($0,1,8)}'
unixbash

4)
$ echo $var | cut -c1-8
unixbash

5)
$ printf "%.8s\n" "$var"
unixbash

6)
$ echo "${var%${var#????????}}"
unixbash


#To remove the first 8 characters of the string "unixbashscripting"

1)
$ echo $var | cut -c9-
scripting

2)
$ echo "${var#????????}"
scripting

3)
$ echo ${var:8}
scripting

4)
$ echo $var | awk '{print substr($0,9)}'
scripting
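
The ${var:0:8} and ${var:8} forms are bash/ksh extensions. As a small sketch for scripts that must also run under a plain POSIX sh, the cut approach from above can be wrapped in two helpers (first_n and drop_n are names invented for this example):

```shell
# first_n and drop_n are hypothetical helper names used only for this sketch.
# first_n STRING N : print the first N characters of STRING
first_n() { printf '%s\n' "$1" | cut -c "1-$2"; }

# drop_n STRING N  : print STRING without its first N characters
drop_n()  { printf '%s\n' "$1" | cut -c "$(( $2 + 1 ))-"; }
```

For example, first_n "$var" 8 and drop_n "$var" 8 correspond to the two groups of commands above.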

Tuesday, March 18, 2008

Performing "join" using AWK

$ cat file1.txt
Alex:ID23:How:23:2004
Aina:ID12:Thomas:14:2003
Ciam:ID13:Dev:23:2000
Alot:ID34:Ya:24:2004
Brian:ID64:Low:25:1999

$ cat file2.txt
ID13:12,300
ID12:34,300
ID64:50,000

#Join both the files on their common field: the 1st field of file2.txt and the 2nd field of file1.txt
#i.e. required Output:

Ciam:ID13:Dev:23:2000:12,300
Aina:ID12:Thomas:14:2003:34,300
Brian:ID64:Low:25:1999:50,000

Solution1:
$ awk '
BEGIN {FS=OFS=":"}
NR==FNR{arr[$1]=$2;next}
$2 in arr{print $0,arr[$2]}
' file2.txt file1.txt


Aina:ID12:Thomas:14:2003:34,300
Ciam:ID13:Dev:23:2000:12,300
Brian:ID64:Low:25:1999:50,000

Solution2:
$ awk '
BEGIN {FS=":"}
NR==FNR{arr[$2]=$0;next}
$1 in arr && $0=arr[$1] FS $2
' file1.txt file2.txt


Ciam:ID13:Dev:23:2000:12,300
Aina:ID12:Thomas:14:2003:34,300
Brian:ID64:Low:25:1999:50,000

Using "join":
To do the same thing with the "join" command, here are the steps:

$ cat file1.txt
Alex:ID23:How:23:2004
Aina:ID12:Thomas:14:2003
Ciam:ID13:Dev:23:2000
Alot:ID34:Ya:24:2004
Brian:ID64:Low:25:1999

#Sort file1.txt based on 2nd field, store the sorted output on file1.txt.srt
$ sort -t ":" -k 2,2 -o file1.txt.srt file1.txt

$ cat file1.txt.srt
Aina:ID12:Thomas:14:2003
Ciam:ID13:Dev:23:2000
Alex:ID23:How:23:2004
Alot:ID34:Ya:24:2004
Brian:ID64:Low:25:1999

$ cat file2.txt
ID13:12,300
ID12:34,300
ID64:50,000

#Sort file2.txt based on 1st field, store the sorted output on file2.txt.srt
$ sort -t ":" -k 1,1 -o file2.txt.srt file2.txt

$ cat file2.txt.srt
ID12:34,300
ID13:12,300
ID64:50,000

#Now perform "join" operation.
$ join -t ":" -1 2 -2 1 -o "1.1 1.2 1.3 1.4 1.5 2.2" file1.txt.srt file2.txt.srt
Aina:ID12:Thomas:14:2003:34,300
Ciam:ID13:Dev:23:2000:12,300
Brian:ID64:Low:25:1999:50,000
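
The sort and join steps can also be combined without intermediate .srt files. This is only a sketch: it assumes bash (for process substitution), the function name join_by_id is invented, and LC_ALL=C is set so that sort and join agree on the ordering.

```shell
#!/bin/bash
# join_by_id is a hypothetical helper name used only for this sketch.
# Usage: join_by_id file1.txt file2.txt
join_by_id() {
    LC_ALL=C join -t ":" -1 2 -2 1 -o "1.1 1.2 1.3 1.4 1.5 2.2" \
        <(LC_ALL=C sort -t ":" -k 2,2 "$1") \
        <(LC_ALL=C sort -t ":" -k 1,1 "$2")
}
```

The output is ordered by the join field (the ID), the same as in the step-by-step version.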


Monday, March 17, 2008

Remove duplicates based on fields - AWK

$ cat file.txt
DD:12:A
AA:11:N
EE:13:B
AA:11:F
BB:09:K
DD:13:X

#Based on first field. Duplicates are DD,AA
$ awk '!x[$1]++' FS=":" file.txt
DD:12:A
AA:11:N
EE:13:B
BB:09:K

Again,

$ cat file.txt
DD:12:A
AA:11:N
EE:13:B
AA:11:F
BB:09:K
DD:13:X


#This time, based on first and 2nd field.Only duplicate combination is (AA:11)
$ awk '!x[$1,$2]++' FS=":" file.txt
DD:12:A
AA:11:N
EE:13:B
BB:09:K
DD:13:X
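
The '!x[$1]++' idiom is compact, so here is a longer equivalent as a sketch. The function name dedupe_by_first_field and the array name seen are chosen only for this illustration.

```shell
# dedupe_by_first_field is a hypothetical helper name used only for this sketch.
dedupe_by_first_field() {
    awk -F ":" '{
        if (seen[$1] == 0)   # first time this key appears: print the line
            print
        seen[$1]++           # remember that the key has been seen
    }' "$@"
}
```

The one-liner does the same thing: x[$1]++ evaluates to 0 (false) the first time a key is seen, so !x[$1]++ is true only for the first occurrence, and a true pattern with no action prints the line.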

Related post:

Remove duplicates without sorting file.

Send alternate lines to separate files - AWK

I explained how to subdivide a file in a previous post: http://unstableme.blogspot.com/2008/01/subdividing-file-bash-newbie.html

Here are a few more useful variations.

$ cat file1
one
two
three
four
five
six
seven

Purpose1:

Send alternate lines to 2 different files. i.e. 1st,3rd,5th,7th line should go to newfile1 and 2nd,4th,6th line should go to newfile2. i.e odd lines to newfile1 and even lines to newfile2


$ awk 'NR%2 {print > "newfile1"}' file1

or

$ sed -n 'p;n' file1 > newfile1


$ cat newfile1
one
three
five
seven

$ awk '(NR+1)%2 {print > "newfile2"}' file1

or

$ sed -n 'n;p' file1 > newfile2


$ cat newfile2
two
four
six

Purpose2:

Send every consecutive 3 lines into separate files myfile_1,myfile_2..

$ awk '{print >("myfile_" int((NR+2)/3))}' file1

$ cat myfile_1
one
two
three

$ cat myfile_2
four
five
six

$ cat myfile_3
seven

Purpose3:

Send every consecutive 2 lines into separate files myfile_1,myfile_2..

$ awk '{print >("myfile_" int((NR+1)/2))}' file1
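
Purpose2 and Purpose3 follow the same pattern: for chunks of n lines the file index is int((NR+n-1)/n). A generalised sketch, assuming a made-up function name split_every; it also closes each output file once its chunk is finished, which can matter for large inputs where awk may run out of open file handles.

```shell
# split_every is a hypothetical helper name used only for this sketch.
# Usage: split_every N PREFIX file
split_every() {
    awk -v n="$1" -v prefix="$2" '{
        out = prefix int((NR + n - 1) / n)   # 1..n -> 1, n+1..2n -> 2, ...
        if (out != prev) {
            if (prev != "") close(prev)      # finished with the previous chunk
            prev = out
        }
        print > out
    }' "$3"
}
```

For example, split_every 3 myfile_ file1 should produce the same myfile_1, myfile_2 and myfile_3 as Purpose2.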

Sunday, March 16, 2008

Renaming files lowercase to UPPERCASE to Titlecase - BASH

Suppose we have three files in a directory.

$ ls
abc.txt mno.pyc xyz.cpp

A) Renaming the files to Titlecase

$ ls | while read file
> do
> mv $file `echo $file | sed 's/\<./\u&/'`
> done


They became:
$ ls
Abc.txt Mno.pyc Xyz.cpp

B) Renaming above files to UPPERCASE

$ ls | while read file
> do
> mv $file `echo $file | sed 's/.*/\U&/'`
> done


Just confirm,
$ ls
ABC.TXT MNO.PYC XYZ.CPP

C) Now reverting back to lowercase, renaming them to lowercase

$ ls | while read file
> do
> mv $file `echo $file | sed 's/.*/\L&/'`
> done


Confirmed!
$ ls
abc.txt mno.pyc xyz.cpp


The same sequence of operations in a different way

**A1) To Titlecase

$ ls | while read file
> do
> mv $file `echo $file| awk 'BEGIN{OFS=FS=""}{$1=toupper($1);print}'`
> done


$ ls
Abc.txt Mno.pyc Xyz.cpp

**B1) To UPPERCASE
$ ls | while read file
> do
> mv $file `echo $file|awk '{$1=toupper($1);print}'`
> done


$ ls
ABC.TXT MNO.PYC XYZ.CPP

C1) To lowercase
$ ls | while read file
> do
> mv $file `echo $file|awk '{$1=tolower($1);print}'`
> done


$ ls
abc.txt mno.pyc xyz.cpp

We can also use ‘tr’ command for renaming files to UPPERCASE and lowercase, as
$ echo abc.txt | tr 'a-z' 'A-Z'
ABC.TXT

$ echo "ABC.TXT" | tr 'A-Z' 'a-z'
abc.txt

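** Reference for A1) and B1) above: with an empty FS, gawk treats every character as a separate field (POSIX leaves this behaviour unspecified), so $1 is only the first character. Compare: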

$ echo "UNIX" | awk '{print $1}'
UNIX

$ echo "UNIX" | awk 'BEGIN{OFS=FS=""} {print $1}'
U
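
The loops above parse the output of ls and use unquoted variables, which is fine for the sample files but can misbehave for names containing spaces. A more defensive sketch of the UPPERCASE rename using a glob and tr (the function name to_upper_names is an assumption made for this example):

```shell
# to_upper_names is a hypothetical helper name used only for this sketch.
# Renames every regular file in the current directory to UPPERCASE.
to_upper_names() {
    for f in *; do
        [ -f "$f" ] || continue                      # skip directories etc.
        new=$(printf '%s' "$f" | tr '[:lower:]' '[:upper:]')
        [ "$f" = "$new" ] || mv -- "$f" "$new"       # skip names already in upper case
    done
}
```

Swapping the two tr classes gives the lowercase version.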

Accessing external variable in AWK and SED

$ echo "unix scripting"
unix scripting

In SED:

This is a general substitution. I am trying to replace "unix" with "BASH", so "unix scripting" will become "BASH scripting"
$ echo "unix scripting" | sed 's/unix/BASH/'
BASH scripting

Suppose the text "BASH" is assigned to a variable called "var". If I try to replace "unix" with "$var" inside single quotes, it is not going to work: the shell does not expand variables inside single quotes, so sed receives the literal text $var.
$ var="BASH"; echo "unix scripting" | sed 's/unix/$var/'
$var scripting

Try the same above with double quotes, this will work.
$ var="BASH"; echo "unix scripting" | sed "s/unix/$var/"
BASH scripting

In AWK

General substitution of "unix" with "BASH", will work. "unix scripting" will become "BASH scripting"
$ echo "unix scripting" | awk '{gsub(/unix/,"BASH")}; 1'
BASH scripting

"BASH" is assigned in variable "var". So the following substitution is not going to work.
$ var="BASH"; echo "unix scripting" | awk '{gsub(/unix/,"$var")}; 1'
$var scripting

Method1: See the "'" (double quote-single quote-double quote) before and after the variable var.
$ var="BASH"; echo "unix scripting" | awk '{gsub(/unix/,"'"$var"'")}; 1'
BASH scripting

Method2: Use awk -v flag this way.
$ var="BASH"; echo "unix scripting" | awk -v v="$var" '{sub(/unix/,v)}1'
BASH scripting
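
A third possibility, sketched here as an assumption rather than as part of the original methods, is to pass the value through the environment and read it with awk's ENVIRON array. Unlike -v, the value is not subject to backslash-escape processing on assignment. The names replace_unix and REPL are invented for this example.

```shell
# replace_unix and REPL are hypothetical names used only for this sketch.
# Usage: echo "unix scripting" | replace_unix BASH
replace_unix() {
    # REPL is exported only to this awk process
    REPL="$1" awk '{ sub(/unix/, ENVIRON["REPL"]) } 1'
}
```

Note that "&" in the replacement text still has its special meaning for sub(), whichever way the value is passed in.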

Find longest string in a field - AWK

names.txt is file with the following format:

FirstName|LastName|DOB|Location

Intention: Find the longest LastName (2nd field) among all the entries.

$ cat names.txt
Anish|Saikia|1982|India
Jack|King|1978|London
Rajesh|Gupta|1980|India
Steve|Spitzer|1977|Scotland
Lias|Hazarik|1983|India

The Code:


$ awk ' BEGIN { OFS=FS="|"; cur=max=0; seen=""}
{
cur = length($2)
if (cur > max ) {
seen = $2 "(" $1" "$2 " from " $NF")"
max = cur
} else if (cur == max) {
seen = seen "\n" $2 "(" $1" "$2 " from " $NF")"
}
}
END { print seen }' names.txt

Output:
Spitzer(Steve Spitzer from Scotland)
Hazarik(Lias Hazarik from India)

Saturday, March 15, 2008

Execute a program periodically - BASH

Imagine your application generates log files at very short intervals in a particular log directory. Every 3 seconds you have to check the number of files in the log directory, so that you can terminate the application once it has generated 300 files. You also want to see how long it takes to generate those logs (say, for testing purposes).

You can use the “watch” command, which executes a program periodically.

$ watch -n 3 "date; ls | wc -l"

Also you can use “yes” command which basically outputs a string repeatedly until killed.

$ yes "date;ls |wc -l; sleep 3" | sh

And if you are interested in writing a BASH one liner, here it is using while.

$ while :
> do
> date
> ls |wc -l
> sleep 3
> done

* "while :" is the same as "while true"; the ":" builtin always succeeds.
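
The loop above runs until it is killed. As a sketch of the scenario described at the start (stop once the directory holds 300 files), the loop can be given an exit condition. The function name wait_for_files and its arguments are assumptions made for this example; find -maxdepth is available in GNU and BSD find.

```shell
# wait_for_files is a hypothetical helper name used only for this sketch.
# Usage: wait_for_files DIR LIMIT INTERVAL_SECONDS
wait_for_files() {
    dir=$1 limit=$2 interval=$3
    while :
    do
        # $(( ... )) normalises any padding that wc may add
        count=$(( $(find "$dir" -maxdepth 1 -type f | wc -l) ))
        echo "$(date '+%H:%M:%S') $count files"
        [ "$count" -ge "$limit" ] && break   # limit reached: stop polling
        sleep "$interval"
    done
}
```

For the case described above this would be called as wait_for_files /path/to/logdir 300 3, followed by whatever command stops the application.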

Thursday, March 13, 2008

Pattern not found - AWK

Sample file "cscbatch04" is the details of students of Computer Science 2004 batch of JEC.


$ cat cscbatch04.txt
Mr A,26,C1,C2
Mr B,25,C1,C1
Mr C,25,C3,C1
Mr D,25,C4,C2

Purpose: Search a pattern as the 3rd field in all the lines of the file cscbatch04.txt.
- If the pattern is found, list the lines where it is present as the 3rd field
- If the pattern is not found as 3rd field in any of the lines, echo a message telling "not found"

Here lies the AWK solution:

Searching the pattern "C1" which is present as 3rd field in cscbatch04.txt.
$ awk 'END { if (!found) print "pattern not found" }
$3 == "C1" { print "pattern found in line:", NR; found++ }
' FS="," cscbatch04.txt

pattern found in line: 1
pattern found in line: 2


Searching the pattern "C2" which is not present as "3rd field" in cscbatch04.txt.
$ awk 'END { if (!found) print "pattern not found" }
$3 == "C2" { print "pattern found in line:", NR; found++ }
' FS="," cscbatch04.txt

pattern not found
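
If a calling script needs to act on the result, the same search can also report it through awk's exit status. A sketch under that assumption (find_field3 is a made-up function name):

```shell
# find_field3 is a hypothetical helper name used only for this sketch.
# Usage: find_field3 PATTERN file ; returns 0 if found, 1 otherwise
find_field3() {
    awk -F "," -v pat="$1" '
        $3 == pat { print "pattern found in line:", NR; found = 1 }
        END { if (!found) { print "pattern not found"; exit 1 } }
    ' "$2"
}
```

This allows constructs such as: if find_field3 C1 cscbatch04.txt; then ...; fi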

Wednesday, March 12, 2008

Report Generation with AWK

"Gyaan" International school uses a UNIX-based "attendance keeping software" called "regul", which generates attendance data in a flat file like the one below.
For Class: VI, the attendance of Month: June, 2007 is something like this.

$ cat attendancesum.txt
Month:June, 2007
Teacher: Mr. JKS
Class: VI

begin
Roll No:1
Days:20
end
begin
Roll No:2
Days:17
end
begin
Roll No:3
Days:16
end
begin
Roll No:4
Days:22
end
begin
Roll No:5
Days:20
end
begin
Roll No:6
Days:12
end

The purpose is to build a report of the students who attended more than 19 days in that particular month.

$ awk ' BEGIN {
FS=":"
printf ("Roll\tDays\n")
printf ("------------\n")
}
END {print "------------\n"}
$1~/Roll No/ { x=$2 ; next ; }
($1~/Days/) && ($2 >= 20 ) { printf "%s\t%s\n", x, $2 }
' attendancesum.txt


The output would be:

Roll Days
------------
1 20
4 22
5 20
------------

Tuesday, March 11, 2008

Case insensitive search and replace:sed,awk

$ cat os.txt
unix
UNIX
Unix
uNiX

Intention: To replace every "unix" in os.txt above with the word "BEST". As you can see, the 4 occurrences of the word "unix" are in different cases (upper, lower, mixed).

By default, sed search and replace is case sensitive. The replacement below only affects the 1st line, not all of them.
$ sed 's/unix/BEST/' os.txt
BEST
UNIX
Unix
uNiX

GNU sed supports the "I" (or "i") flag for case-insensitive search and replacement
$ sed 's/unix/BEST/i' os.txt
BEST
BEST
BEST
BEST

$ sed 's/unix/BEST/I' os.txt
BEST
BEST
BEST
BEST


This is one more way:
$ sed 's/[uU][nN][iI][xX]/BEST/' os.txt
BEST
BEST
BEST
BEST

With gawk, set IGNORECASE=1 in the BEGIN section (IGNORECASE is a gawk extension)
$ awk 'BEGIN{IGNORECASE=1} {sub(/unix/,"BEST");print}' os.txt
BEST
BEST
BEST
BEST
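
For awk implementations without IGNORECASE, one possible workaround (a sketch, not the method used above) is to match against a lowercased copy of the line and splice the replacement into the original. The function name replace_unix_nocase is invented for this example, and like sub() it replaces only the first occurrence on each line.

```shell
# replace_unix_nocase is a hypothetical helper name used only for this sketch.
replace_unix_nocase() {
    awk '{
        # match() sets RSTART and RLENGTH for the lowercased copy;
        # the positions are the same in the original line
        if (match(tolower($0), /unix/))
            $0 = substr($0, 1, RSTART - 1) "BEST" substr($0, RSTART + RLENGTH)
        print
    }' "$@"
}
```

Running replace_unix_nocase os.txt should give the same four BEST lines as the commands above.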

Monday, March 10, 2008

Sort based on two fields, merge 3rd fields:AWK

$ cat out3.txt
id1|/usr|user1
id2|/root|user5
id1|/usr|user2
id2|/root|user9
id3|/root|user8
id1|/usr|user3

Output required:
-------------------
id1|/usr|user1,user2,user3
id3|/root|user8
id2|/root|user5,user9

i.e.

- sort based on values of 1st and 2nd fields
- get all the corresponding 3rd fields together (comma separated)


$ awk -F "|" '{Arr[$1"|"$2]=sprintf("%s,%s",Arr[$1"|"$2],$3)} END {for ( i in Arr) {printf("%s|%s\n",i,Arr[i])}}' out3.txt | sed 's/|,/|/' > out3.txt.tmp

$ cat out3.txt.tmp
id3|/root|user8
id2|/root|user5,user9
id1|/usr|user1,user2,user3


Another solution:

$ awk 'BEGIN {FS = OFS = "|"}
!arr[$1$2] {arr[$1$2] = $0; next}
{arr[$1$2] = arr[$1$2] "," $3}
END {for(i in arr) {print arr[i]}}
' out3.txt > out3.txt.tmp


$ cat out3.txt.tmp
id1|/usr|user1,user2,user3
id3|/root|user8
id2|/root|user5,user9
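
Both solutions print the groups with "for (i in arr)", and awk does not define the order of that loop, which is why the two runs above list the groups differently. A sketch that prints the groups in the order they first appear in the input (merge_third_field, users and order are names chosen for this example):

```shell
# merge_third_field is a hypothetical helper name used only for this sketch.
merge_third_field() {
    awk -F "|" '{
        key = $1 FS $2
        if (!(key in users)) {
            order[++n] = key                  # remember first-seen order
            users[key] = $3
        } else {
            users[key] = users[key] "," $3    # append to the existing group
        }
    }
    END {
        for (i = 1; i <= n; i++)
            print order[i] FS users[order[i]]
    }' "$@"
}
```

For out3.txt this lists id1 first, then id2, then id3; pipe the result through sort if a sorted order is wanted.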

Merge alternate lines of files - BASH,AWK

$ cat file1
AAAA
BBBB
CCCC
DDDD
EEEE
FFFF


$ cat file2
1111
2222
3333
4444
5555
6666


Required Output:
AAAA
1111
BBBB
2222
CCCC
3333
DDDD
4444
EEEE
5555
FFFF
6666

$ cat file2 | paste - | paste file1 - | tr "\t" "\n"
$ paste file1 file2 | tr '\t' '\n'


Now suppose,
$ cat file1
AAAA
BBBB
CCCC


and file2 is as mentioned above.
and you want to merge file1 and file2 such that for every line of file1, two lines of file2 should follow.
i.e.

AAAA
1111
2222

BBBB
3333
4444

CCCC
5555
6666


This can be achieved:

$ cat file2 | paste - - | paste file1 - | tr "\t" "\n"

And using AWK:

$ awk 'FNR==NR{ a[FNR]=$0;next }
{
print $0
print a[FNR+l]
l++
print a[FNR+l]
}
' file2 file1


Note: FNR is the record number within the current file. It is incremented each time a new record is read, and it is reset to zero each time a new input file is started.

Friday, March 7, 2008

Remove duplicates without sorting file - BASH

Usually whenever we have to remove duplicate entries from a file, we do a sort of the entries and then eliminate the duplicates using "uniq" command.


But if we have to remove the duplicates and preserve the same order of occurrence of the entries, here is the way:

Sample file:
$ cat file3
AAAA
FFFF
BBBB
BBBB
CCCC
AAAA
FFFF
DDDD

- uniq without sorting will not remove all duplicates.
$ uniq file3
AAAA
FFFF
BBBB
CCCC
AAAA
FFFF
DDDD

- sort has an option (-u) to sort and remove duplicates in one step, but we lose the original order of the entries
$ sort -u file3
AAAA
BBBB
CCCC
DDDD
FFFF

- sort and then uniq, removed the duplicates, but order ?? (Same as above)
$ sort file3 | uniq
AAAA
BBBB
CCCC
DDDD
FFFF

- here is the solution using AWK:
$ awk ' !x[$0]++' file3
AAAA
FFFF
BBBB
CCCC
DDDD

Find String Length : BASH

Suppose:

$ VAR="Bash Scripting"

Now to find the length of the above string, I have found 3 different ways:

$ echo "${#VAR}"
14

$ expr length "$VAR"
14

$ echo $VAR | awk '{print length}'
14

setsid: Keep a Linux program running after you log out

From man pages:

NAME
setsid - run a program in a new session

SYNOPSIS
setsid program [ arg ... ]

DESCRIPTION
setsid runs a program in a new session.


Because setsid runs the program in a new session, the program keeps running even after you log out of the session that started it.

This is similar to the "nohup" command, which makes a command ignore the HUP (hangup) signal so that it keeps running after the user has logged out.


"screen" is another good utility to achieve this.


e.g.

$ setsid ./memmonitor.sh '172.22.22.124'

so "memmonitor.sh" (an example script with “172.22.22.124” as its first argument) will keep running until it finishes its work, even if you log out of your session.

I will post a new post for "nohup" and "screen" soon.

Tuesday, March 4, 2008

Delete the line where pattern is found - AWK

The sample file is like this:

$ cat myfile.out
CAN1:23:1970:2006:A
CAN5:45:1999:2007:D
CAN2:14:2006:2008:A
CAN3:43:1982:2000:E


Purpose:
Delete the lines whose (4th field is either 2006 or 2000) or (3rd field is 1999)

Note: we can't simply use "grep -v" here, as the search pattern "2006" is also the 3rd field of the 3rd line, but we are only interested in 2006 in the 4th field.

$ awk -F ":" '$4 ~ /2006|2000/ || $3 ~ /1999/ {next}1' myfile.out
CAN2:14:2006:2008:A

or

$ awk -F ":" '$4 !~ /2006|2000/ && $3 !~ /1999/ ' myfile.out
CAN2:14:2006:2008:A
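
The "~" operator does a regular-expression match, so $4 ~ /2006/ would also match a field such as 12006. When the fields must be exactly equal to the values, a comparison with == is stricter. A sketch (the function name drop_matching is an assumption for this example):

```shell
# drop_matching is a hypothetical helper name used only for this sketch.
# Prints the lines that do NOT have 2006/2000 as 4th field or 1999 as 3rd field.
drop_matching() {
    awk -F ":" '!($4 == "2006" || $4 == "2000" || $3 == "1999")' "$@"
}
```

For myfile.out this gives the same single line as the two commands above.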

Ftp automation in BASH, without using Expect

This is how we can automate a ftp session in BASH without taking the help of expect.
As ftp provides "quote" option to send username and password, we can use that to send the same.

# FTP function to get a file from REMOTE host
# TEMPDIR and DESTDIR must be set by the calling script
f_GET_FTP(){
local HOST=$1
local USER=$2
local PASSWD=$3
local FILE=$4

ftp -n "$HOST" > "$TEMPDIR/ftpsh.worked.$$" 2> "$TEMPDIR/ftpsh.failed.$$" << EOF
quote USER $USER
quote PASS $PASSWD
cd $DESTDIR
bin
get $FILE
quit
EOF
}

#Calling the function
f_GET_FTP 172.23.0.12 user5 passuser5 input.out

Compare two directories using diff - BASH

$ ls dir1/
bin.sh convert.pl hex.cpp top.py

$ ls dir2/
bin.sh convert.pl touch.cpp


$ diff -r --brief dir1 dir2
Only in dir1: hex.cpp
Only in dir1: top.py
Only in dir2: touch.cpp

To copy the contents of dir1 into dir2 without re-copying files that already exist and are identical, use rsync (a faster, more flexible replacement for rcp):

$ rsync -a dir1/ dir2/

Now comparing the two directories:

$ diff -r --brief dir1 dir2
Only in dir2: touch.cpp

Now doing a rsync of dir2 to dir1
$ rsync -a dir2/ dir1/

Now both the directories are the same, no difference.
$ diff -r --brief dir1 dir2
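
In a script, the exit status of diff can be tested instead of reading its output: diff returns 0 when it finds no differences. A small sketch using -q, the short form of --brief (the function name same_dirs is invented for this example):

```shell
# same_dirs is a hypothetical helper name used only for this sketch.
# Returns 0 if the two directory trees have the same files and contents.
same_dirs() {
    diff -r -q "$1" "$2" > /dev/null
}
```

For example: same_dirs dir1 dir2 && echo "directories are in sync"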

Sunday, March 2, 2008

Understand your code's performance using time command - BASH

time - run programs and summarize system resource usage

The time command is useful to measure/understand your code's performance. The usual output consists of real, user and system time.
Real time is the amount of time between when the code started and when it exited.
User time and system time are the amount of time spent executing application code versus kernel code, respectively.

Two types of time command are available.

Shell's built-in time: Gives only the basic timing information (real, user, sys)
/usr/bin/time: Gives more information, also allows formatting the output

To save the "time" details in a file:
$ /usr/bin/time -p ./chess.sh 2> time.out

Illustrating the use of "time" command with an example below:

I have two scripts for the same purpose, written in two different ways: subdividing a file by sending each line to a new file named after its line number, adding a line called HEADER as the first line of each file and a blank line as the last (3rd) line.
Now to compare the performance of the two scripts:

$ cat way1.sh
#!/bin/sh
awk '{print "HEADER" > "file_" NR} {print > "file_" NR} {printf("\n") > "file_" NR}' thefile.out


$ cat way2.sh
#!/bin/sh
cat thefile.out | awk '{ print "HEADER"
printf("%s\n",$0)
print "" }' | awk '{print >("file_" int((NR+2)/3))}'


$ time ./way1.sh
real 0m0.130s
user 0m0.060s
sys 0m0.107s

$ time ./way2.sh
real 0m0.178s
user 0m0.152s
sys 0m0.091s

So way2.sh is a bit more expensive, performance-wise, than way1.sh.

Saturday, March 1, 2008

OR and AND in grep

OR

**Search "one" or "eleven" in the file "file.out"

$ cat file.out
1 one
11 eleven
2 two
3 three
13 thirteen
4 four
1 one
40 forty


Using grep:
$ grep "one\|eleven" file.out
1 one
11 eleven
1 one

Using AWK:
$ awk '/one|eleven/' file.out
1 one
11 eleven
1 one

Using SED:
$ sed -e '/one/b' -e '/eleven/b' -e d file.out
1 one
11 eleven
1 one


AND

$ cat experience.out
Good judgment comes from experience,
and often experience comes from bad judgment.
Experience is simply the name we give our mistakes.
Experience keeps a dear school, but fools will learn in no other.



**Search for lines containing pattern "comes" and "from" and "judgment" (in any order)

Using SED:
$ sed '/comes/!d; /from/!d; /judgment/!d' experience.out
Good judgment comes from experience,
and often experience comes from bad judgment.


**Search for lines containing pattern "comes" and "from" and "judgment" (in that order)

Using SED:
$ sed '/comes.*from.*judgment/!d' experience.out
and often experience comes from bad judgment.

Using AWK:
$ awk '/comes.*from.*judgment/' experience.out
and often experience comes from bad judgment.
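
The AND examples above use sed and awk only. With grep itself, an "any order" AND can be sketched by chaining one grep per pattern; awk can do the same by combining patterns with &&. The function names grep_and and awk_and are made up for this example.

```shell
# grep_and and awk_and are hypothetical helper names used only for this sketch.
# Both print the lines of the given file containing all three words, in any order.
grep_and() {
    grep "comes" "$1" | grep "from" | grep "judgment"
}

awk_and() {
    awk '/comes/ && /from/ && /judgment/' "$1"
}
```

Each grep in the chain only sees the lines that passed the previous one, so a line must contain every pattern to reach the output.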

© Jadu Saikia www.UNIXCL.com