Tuesday, July 1, 2008

Convert date format using sed, awk, perl


Here we will be discussing two examples:

1) Convert mmddyy to dd-mm-yy format using sed and perl
2) Convert yyyymmdd format to mm/dd/yyyy using awk and sed


1) Convert mmddyy to dd-mm-yy format

Input file:


$ cat file2.txt
090607:awk sed expect
091207:bash ksh csh zsh
091207:bash perl python


Output Required:
- Convert the first field(delimiter :) which is in "mmddyy" format to "dd-mm-yy format". i.e.


06-09-07:awk sed expect
12-09-07:bash ksh csh zsh
12-09-07:bash perl python


sed solution:


$ sed -e "s/^\(..\)\(..\)\(.*\)/\2-\1-\3/g" file2.txt



Perl solution:


$ cat frmt.pl
#!/usr/bin/perl
open(FILE, "$ARGV[0]");

while(<FILE>){
$_=~s/^(\d{2})(\d{2})(\d{2})/$2-$1-$3/;
print $_;
}


Executing the perl script:


$ ./frmt.pl file2.txt
06-09-07:awk sed expect
12-09-07:bash ksh csh zsh
12-09-07:bash perl python



2) Convert yyyymmdd format to mm/dd/yyyy


$ echo "YYYYMMDD" | awk '
BEGIN {OFS="/"}
{print substr($1,5,2),substr($1,7,2),substr($1,1,4)}'


Output:
MM/DD/YYYY


or


$ echo "YYYYMMDD"|sed -n -e "s_\(....\)\(..\)\(..\)_\2/\3/\1_p"


Output:
MM/DD/YYYY

12 comments:

SwarnaLatha said...

I have a file with these type of rows. There are many rows, around 4000 with different dates and numbers.

MyS : MD : abc cde efg,20100715044454,s : 4946274201
MyS : MD : abc cde efg,20100715044453,s : 5946274201
MyS : MD : abc cde efg,20100715044457,s : 6946274201
MyS : MD : abc cde efg,20100715044458,s : 7946274201
MyS : MD : abc cde efg,20100715044459,s : 9946274201
MyS : MD : abc cde efg,20100715044450,s : 8946274201


I want to modify these rows in file with this format to load it into a table through teradata.

Example output for first row:
2010-07-15 04:44:55,4946274201

Can anyone help me out in this.

If anyone has any solution, please mail me to swarna.appala@gmail.com

Jadu Saikia said...

@SwarnaLatha,

Something like this ?

$ cat file.txt
MyS : MD : abc cde efg,20100715044454,s : 4946274201
MyS : MD : abc cde efg,20100715044453,s : 5946274201
MyS : MD : abc cde efg,20101115044457,s : 6946274201
MyS : MD : abc cde efg,20100715054858,s : 7946274201
MyS : MD : abc cde efg,20100715044459,s : 9946274201
MyS : MD : abc cde efg,20100715044450,s : 8946274201

$ awk -F "," '{print $2,$3}' file.txt

20100715044454 s : 4946274201
20100715044453 s : 5946274201
20101115044457 s : 6946274201
20100715054858 s : 7946274201
20100715044459 s : 9946274201
20100715044450 s : 8946274201

$ awk -F "," '{print $2,$3}' file.txt | awk '{print substr($1,0,4)"-"substr($1,5,2)"-"substr($1,7,2),substr($1,9,2)":"substr($1,11,2)":"substr($1,13,2)","$NF}'

2010-07-15 04:44:54,4946274201
2010-07-15 04:44:53,5946274201
2010-11-15 04:44:57,6946274201
2010-07-15 05:48:58,7946274201
2010-07-15 04:44:59,9946274201
2010-07-15 04:44:50,8946274201

SwarnaLatha said...

Thanks Jadu,
your code has been of great help to me.
i do have a small doublt in it,
there are some rows like

MyS : MD : abc cde efg,20100715044454,s : 12345
MyS : MD : abc cde efg,20100715044453,23456
MyS : MD : abc cde efg,20101115044457,s : 3456744235
MyS : MD : abc cde efg,20100715054858,s : 9946254202

please see the first and second row, how can we handle those to get the same output as what you have given, thanks

SwarnaLatha said...

Thanks Jadu,

your code has been of great help to me.

there are some rows like

MyS : MD : abc cde efg,20100715044454,s : 1234578
MyS : MD : abc cde efg,20100715044453,2345665
MyS : MD : abc cde efg,20101115044457,
MyS : MD : abc cde efg,20101115044457,s : 3456744235
MyS : MD : abc cde efg,20100715054858,s : 9946254202.

please see the first three rows, how can we handle those to get the same output as what you have given,

the third row has no value for the last field, we also need to discard those kind of rows and also delete complete duplicate rows from the file.

Sorry, if iam asking too much from you. Please help me on this.

thanks

Jadu Saikia said...

@SwarnaLatha,
Thanks for the question.

Something like this ?

awk -F ":" 'NF==4' file.txt | awk -F "," '{print $2,$3}' | awk '{print substr($1,0,4)"-"substr($1,5,2)"-"substr($1,7,2),substr($1,9,2)":"substr($1,11,2)":"substr($1,13,2)","$NF}'

We can optimize multiple awk statements to one. Once you confirm the above, I will optimize it.

PH said...

very helpful. thanks!

Unknown said...

I have file which type of rows :

6289693505455 8/18/2012 23:59 Plan_DAIL_30D_AA

and i want to modify became :
DDD PSSS: URDR="6289693505455", NAME="Plan_DAIL_30D_AA", SRVENDDATETIME=2012&08&18&23&59&59, STATE=Normal;

can you help me to do ?, thanks before.

Jadu Saikia said...

@unknown, thanks for the question.

A not so efficient tip would be :

$ cat file.txt
6289693505455 8/18/2012 23:59 Plan_DAIL_30D_AA

$ awk '{print "DDD PSSS: URDR=""\""$1"\""",NAME=""\""$NF"\""",SRVENDDATETIME="$2,$3",STATE=Normal;"}' file.txt

o/p:
DDD PSSS: URDR="6289693505455",NAME="Plan_DAIL_30D_AA",SRVENDDATETIME=8/18/2012 23:59,STATE=Normal;


$ awk '{print "DDD PSSS: URDR=""\""$1"\""",NAME=""\""$NF"\""",SRVENDDATETIME="$2,$3",STATE=Normal;"}' file.txt | sed 's/\(.*,SRVENDDATETIME=\)\(.*\)\/\(.*\)\/\(.*\) \(.*\):\(.*\)\(,STATE=.*\)/\1\4\&\2\&\3\&\5\&\6\7/'

o/p:
DDD PSSS: URDR="6289693505455",NAME="Plan_DAIL_30D_AA",SRVENDDATETIME=2012&8&18&23&59,STATE=Normal;

Let me try one efficient one :-)

tri ms said...

hi...

how i could convert "08/10/2012 5:05 PM" to "08/10/2012 17:05" using awk?

Jadu Saikia said...

@tri ms:

with date(1) format options:

$ date "+%D %R %p" -d "08/10/2012 5:05 PM"

08/10/12 17:05 PM

Unknown said...

Could you tell me how to convert the following dates?

If I have m/d/yyyy, I want to have
0m/0d/yyyy.

In other words, I want a 0 inserted whenever the month or day is a single digit.
Thanks,
cnurb

Jadu Saikia said...

@Unknown, something like this ?

$ echo "3/4/2013" | awk -F "/" '{
month=$1; day=$2; if (length(day) == 1) day="0"day; if (length(month) == 1) month="0"month
print month"/"day"/"$3
}'

03/04/2013


$ echo "3/14/2013" | awk -F "/" '{
month=$1; day=$2; if (length(day) == 1) day="0"day; if (length(month) == 1) month="0"month
print month"/"day"/"$3
}'

03/14/2013

$ echo "12/4/2013" | awk -F "/" '{
month=$1; day=$2; if (length(day) == 1) day="0"day; if (length(month) == 1) month="0"month
print month"/"day"/"$3
}'

12/04/2013

© Jadu Saikia www.UNIXCL.com