Friday, October 17, 2008

system in awk - external command execution

system(cmd-line) : Execute the command cmd-line, and return the exit status.

system() in awk returns the exit status of the command, not its output. The command runs and writes its output directly to standard output; what system() returns (and what gets stored if you assign the result to a variable) is only the exit status, which is 0 on success.
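
A minimal sketch of the difference, using the standard true and false utilities as stand-in commands (they are placeholders chosen for illustration, not part of the example that follows). The last part shows the usual alternative when the output itself is wanted: reading it through a pipe with getline.

```shell
# system() hands back only the exit status; the command's own output
# would go straight to standard output.
status_ok=$(awk 'BEGIN { rc = system("true");  print rc }')
status_fail=$(awk 'BEGIN { rc = system("false"); print rc }')
echo "true  returned $status_ok"
echo "false returned $status_fail"

# To capture the output instead, read it through a pipe with getline.
captured=$(awk 'BEGIN { cmd = "echo hello"; cmd | getline out; close(cmd); print out }')
echo "captured: $captured"
```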

Input file: My input file contains unix epoch time as the first field with geo_continent as the 2nd field.

$ cat g_details.txt

Required: Convert the unix epoch times (1st field) to human readable date time format.

I had a one-line script (epochcnvrt) for converting epoch time to a human readable date time format.

$ cat epochcnvrt
date --date '1970-01-01 UTC '$1' seconds'

So this is how I can execute my script epochcnvrt on the first field of each line of the above file.

$ awk '{ system("sh epochcnvrt "$1)} {print $2}' FS="|" g_details.txt

The output:

Mon Aug 18 15:00:00 UTC 2008
Tue Aug 19 15:00:00 UTC 2008
Sun Sep 7 15:00:00 UTC 2008
Mon Sep 8 15:00:00 UTC 2008
Tue Sep 9 15:00:00 UTC 2008
Wed Sep 10 15:00:00 UTC 2008


知者行者 said...

it is an interesting example. I have an opposite problem, in my file I have date in human readable format, and I would like to convert it to seconds since epoch time, see the second field. adapting your script didn't work out. Any suggestions?
41538,2000-06-12 13:25:38.000,1.470000

Also, in epochcnvrt, the usage of '$1' is confusing. shouldn't we use "$1" to let the shell expand the parameter? but your usage works, so I am kind of lost. any explanation? thanks!

Unknown said...

Michael Chen,

Thanks for commenting.

For the opposite one (i.e. human readable to epoch seconds) you need to have your convert script like this:

$ cat myep
date +%s -d"$1"


$ date +%s -d"2000-06-12 13:25:38"

But it does not support the date format you have mentioned (as it has .000); probably you need to remove this extra piece.
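
One way to drop the fractional part inside awk, before the field is handed to date, is sub() with a regular expression. This is only a sketch: the sample line is the one quoted in the comment above, and it assumes the fraction always appears as a dot followed by digits at the end of the field.

```shell
line='41538,2000-06-12 13:25:38.000,1.470000'
# Remove a trailing ".digits" from the second field, then print the field.
cleaned=$(printf '%s\n' "$line" | awk -F, '{ sub(/\.[0-9]+$/, "", $2); print $2 }')
echo "$cleaned"
```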

Please let me know if you need any help on this.

And the other question: your confusion is valid. In this case the outer quotes are single quotes, so a double-quoted "$1" placed inside them would stay literal and never be expanded. The inner single quotes close and reopen the quoted string, which leaves $1 outside any quotes, where the shell expands it.


$ VAR=1221058800

$ echo $VAR

$ echo "$VAR"

$ echo '$VAR'

$ echo 'my var is '$VAR''
my var is 1221058800

$ echo 'my var is "$VAR"'
my var is "$VAR"
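
One caveat with the 'text '$VAR'' form: because $VAR ends up outside any quotes, the shell also splits its value on whitespace. The sketch below counts the arguments a command receives; count_args and ts are illustrative names, not part of the scripts above.

```shell
count_args() { echo "$#"; }      # prints how many arguments it was given
ts='2000-06-12 23:01:06'         # a value containing a space

# $ts sits outside the quotes: it is expanded, then split at the space.
unquoted=$(count_args 'date is '$ts'')
# Closing the single quotes and switching to double quotes keeps it whole.
quoted=$(count_args 'date is '"$ts")

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"
```

For a value without spaces, such as an epoch number, both forms behave the same.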

Please let me know for any doubt or queries; I would love to answer them, thank you.


知者行者 said...

thanks. however even if I delete the extra .000, the code doesn't work as expected. the calculation is not what I expected, i.e., 2000-06-12 23:01:06 should be 960865266 seconds from Epoch. However the script would give me 960782400
seconds from Epoch, which actually corresponds to 2000-06-12 00:00:00. it seems that in awk the usage
is not going to expand as we expected if $3 itself contains a space. in this case, $3="2000-06-12 23:01:06" in awk, but the system command only gets the first part of it due to the space. the usage
will simply be wrong again, since awk is prevented from expanding $3.

my conclusion is that awk can only pass a field to the system command if the field doesn't contain spaces or other characters that are special to the shell. Within shell programming, we can use quotes to escape these special characters, however using quotes inside awk, such as "$3", prevents awk from expanding the field.

By the way, it seems that single quotes within single quotes are better read as two consecutive pairs of single quotes. in your example
'my variable is '$1''
is equivalent to
'my variable is '$1

all the best,

Anonymous said...

@Michael Chen: I've managed to pass strings with spaces to the command line by escaping the quotes:
That is, the variable needs to be quoted in the shell, so you'll have to pass the quotes from awk, too
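
The idea can be sketched as follows: the double quotes are written as \" inside the awk string so that they reach the shell and keep the field together as one argument. The "set -- ...; echo $#" command is only a stand-in that reports how many arguments the shell saw, and the sample line is the one quoted earlier with the .000 removed.

```shell
line='41538,2000-06-12 13:25:38,1.470000'

# Without quotes the shell splits the field at the space: two arguments.
plain=$(printf '%s\n' "$line" | awk -F, '{ system("set -- " $2 "; echo $#") }')
# With \" on both sides of the field the shell receives one argument.
escaped=$(printf '%s\n' "$line" | awk -F, '{ system("set -- \"" $2 "\"; echo $#") }')

echo "plain: $plain, escaped: $escaped"
```

Applied to the myep script above, the call would look like system("sh myep \"" $2 "\""). This still breaks if the field itself contains a double quote, a $ or a backtick, so it is best kept to trusted input.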

newfather888 said...

Excellent Jaidu
I have used your example to a certain extent but with getline and
my epochcvd converts to yyyymmddhh24miss format.

I have a csv and this works fine with one epoch field:

awk -F, '{ system("epochcvd "$1 | getline $1) close(epochcvd)} {print $1","$2"}'

Input of

returns this Output:

But now I have a third field which is also an epoch timestamp

My input of

turns out like this:

i.e. only the first record gets all the fields converted

My new awk statement is
awk -F, '{ system("epochcvd "$1 | getline $1 ) }
{ system("epochcvd "$3 | getline $3 ) close(epochvd) }
{print $1","$2","$3"}'

Can you see anything wrong?
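
One likely culprit is close(): it has to be given exactly the same command string that was used with getline, whereas close(epochcvd) passes an unset awk variable, so no pipe is ever closed, and a command string that repeats on a later record may then return nothing. system() is also not needed when the output is read with getline. A sketch under those assumptions, with GNU date standing in for epochcvd (which is not shown here) and made-up sample records:

```shell
convert_fields() {
  awk -F, -v OFS=, '
    # Run the converter for one epoch value and return its output.
    function conv(epoch,    cmd, out) {
      if (epoch !~ /^[0-9]+$/) return epoch      # leave non-numeric values alone
      cmd = "date -u -d @" epoch " +%Y%m%d%H%M%S"
      if ((cmd | getline out) <= 0) out = epoch  # fall back if nothing was read
      close(cmd)                                 # same string as used with getline
      return out
    }
    { $1 = conv($1); $3 = conv($3); print }
  '
}

# Hypothetical records: both timestamps appear twice across the two lines.
result=$(printf '%s\n' '1221058800,a,1220972400' '1220972400,b,1221058800' | convert_fields)
printf '%s\n' "$result"
```

Keeping the command in a variable (cmd) makes it easy to pass the identical string to both getline and close(), and setting OFS keeps the commas when awk rebuilds the record after a field is assigned.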

© Jadu Saikia