Wednesday, April 7, 2010

Simple bash script to parse log file


A simple bash script that may be useful for bash newbies.

The input file 'log.txt' is a log file in the following format:

$ cat log.txt
2010-04-06 08:06:01 INFO Start ....
2010-04-06 08:06:02 INFO Consuming file trp.1270540317.0.in
2010-04-06 08:06:02 INFO Consuming file trp.1270540326.2.in
2010-04-06 08:06:03 INFO Consuming file trp.1270540341.6.in
2010-04-06 08:06:03 INFO Consuming file trp.1270540367.0.in
2010-04-06 08:06:04 INFO End ....

Required: for each file entry (trp.<epoch>.<id>.in) in the above log, calculate the difference in seconds between the time the file was processed (the time-stamp at the start of the line) and the "epoch" time-stamp in the file name.
For example, for this entry:

2010-04-06 08:06:02 INFO Consuming file trp.1270540317.0.in

Calculate this:

(2010-04-06 08:06:02 converted to epoch seconds) - 1270540317
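
As a rough worked example of that subtraction for the single entry above, the sketch below converts the time-stamp to epoch seconds and subtracts the epoch from the file name. It assumes GNU date (for the -d option) and reads the time-stamp as UTC; the variable names stamp, processed and arrived are only illustrative.

```shell
#!/bin/bash
# Worked example for one log entry (assumes GNU date for the -d option).
line='2010-04-06 08:06:02 INFO Consuming file trp.1270540317.0.in'

# Everything before "INFO" is the processing time-stamp.
stamp="${line%INFO*}"

# Convert it to epoch seconds, reading the time-stamp as UTC.
processed=$(date -u -d "$stamp" +%s)

# The epoch embedded in the file name of this entry.
arrived=1270540317

diff_sec=$(( processed - arrived ))
echo "$diff_sec"    # prints 845
```

Here the time-stamp converts to 1270541162, so the difference is 1270541162 - 1270540317 = 845 seconds.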

The bash script:

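#!/bin/bash
# Needs bash for (( )) and GNU date for the -d option.

QFILE=log.txt

grep "INFO Consuming file" "$QFILE" | while read -r line
do
    # File name is the last space-separated field.
    filename=$(echo "$line" | awk '{print $NF}')
    # Processing time: the text before "INFO", as UTC epoch seconds.
    PT=$(date -u -d "${line%INFO*}" +%s)
    # Arrival time: the <epoch> part, third field from the end when split on ".".
    AT=$(echo "$line" | awk -F. '{print $(NF-2)}')
    ((diff_sec=PT-AT))
    echo "$filename $diff_sec"
done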


Output:

$ ./show-diff.sh
trp.1270540317.0.in 845
trp.1270540326.2.in 836
trp.1270540341.6.in 822
trp.1270540367.0.in 796
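
The script above starts awk twice for every log line. As a possible alternative (not the method used in the post), the same fields can be pulled out with shell parameter expansion alone. The sketch below assumes GNU date and file names of exactly the form trp.<epoch>.<id>.in; the helper name diff_for_line and the variable sample_log are hypothetical and used only for illustration.

```shell
#!/bin/bash
# Sketch: same calculation, using parameter expansion instead of awk.
# Assumes GNU date (-d) and file names of the form trp.<epoch>.<id>.in.

# diff_for_line is a hypothetical helper name, not part of the original script.
diff_for_line() {
    local line="$1"
    local filename="${line##* }"     # last space-separated field
    local rest="${filename#*.}"      # drop the leading "trp."
    local arrived="${rest%%.*}"      # keep only "<epoch>"
    local processed
    processed=$(date -u -d "${line%INFO*}" +%s)
    echo "$filename $(( processed - arrived ))"
}

# Sample input: four lines copied from log.txt.
sample_log='2010-04-06 08:06:01 INFO Start ....
2010-04-06 08:06:02 INFO Consuming file trp.1270540317.0.in
2010-04-06 08:06:03 INFO Consuming file trp.1270540367.0.in
2010-04-06 08:06:04 INFO End ....'

result=$(printf '%s\n' "$sample_log" | grep "INFO Consuming file" |
    while IFS= read -r line; do diff_for_line "$line"; done)
echo "$result"
```

For the two "Consuming file" lines in the sample this prints the same values as the original script, 845 and 796. Whether it is worth the change depends on the log size; for a handful of lines the awk version is just as good.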

7 comments:

kasehDANpercaya said...

Jadu,

I need your help on something. Since I can't find any other way to contact you, I'll ask you here.

I need to add a fixed header and a trailer with the line count to a text file. Is it possible to do it in just one awk script?

Thanks,

Zaiman Noris, Kuala Lumpur.

Jadu Saikia said...

@kasehDANpercaya, thanks for writing to me. I will definitely help you with this. Could you please post a sample of your requirement? Keep in touch. // Jadu

Kedar said...

Hi

I'm using this tutorial to learn how to parse a file. However, when I try to run this small bit of code, I get errors with the %s. Can you explain what I'm doing wrong?

Thanks.

Jadu Saikia said...

@Kedar, could you please post the error message here? That will help. Also, could you please confirm whether the following command works on your machine:

date +%s -u

Also, which OS are you on?

K said...

Hi Jadu,

I'm very new to this so please bear with me.

I am using Windows 7. In Notepad I made a file called parse.sh with this bit of code:

#!/bin/bash

QFILE=log.txt

grep "INFO Consuming file" $QFILE | while read line
do
filename=$(echo "$line" | awk '{print $NF}')
PT=$(date +%s -u -d "${line%INFO*}")
AT=$(echo "$line" | awk '{print $(NF-2)}' FS=\.)
((diff_sec=PT-AT))
echo "$filename $diff_sec"
done

I tried to run it in an SSH client and this is what I got:

bash parse.sh
: command not found
: command not found
'arse.sh: line 10: syntax error near unexpected token `
'arse.sh: line 10: ` ((diff_sec=PT-AT))

K said...

Also, Jadu, when I type

date +%s -u

it just prints %s

K said...

Sorry to disturb you again, but I'm learning bash because my CS professor gave us an assignment where we have to learn bash on our own and do a project on CPU scheduling using 2 out of 4 algorithms (I'm going to do first-come, first-served and round robin).

We are given a processes.txt file that we first have to read and gather information from. After that, I have to show the process ID and current time as output. In addition, I have to show turnaround time, response time, throughput, and processor utilization.

We are only going to be taught the scheduling algorithms, but it's up to us to learn bash and implement this.

© Jadu Saikia www.UNIXCL.com