Monday, April 12, 2010

Remove leading zeros from each line - awk, sed, bash


Input file:

$ cat ids.txt
00009
01902
34390
00190
00001

Required: Remove the leading zeros from each line of the above file.

Using sed:

$ sed 's/^[0]*//' ids.txt

Output:
9
1902
34390
190
1

In vi ex mode, the equivalent command is:

:1,$ s/^[0]*//
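
One caveat with the substitution above: a line made up only of zeros (say 0000, a hypothetical value that does not appear in ids.txt) is reduced to an empty line. A minimal sketch that keeps one digit in that case, assuming the lines contain only digits:

```shell
# Strip leading zeros but always keep the final digit,
# so an all-zero line becomes "0" instead of an empty line.
# The input values are made up for illustration.
printf '%s\n' 00009 0000 01902 | sed 's/^0*\([0-9]\)/\1/'
```

This should print 9, 0 and 1902, one per line.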

Other alternatives:

$ awk '{printf "%d\n",$0}' ids.txt
$ awk '{print $1 + 0}' ids.txt
$ bc < ids.txt
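
The awk and bc one-liners above convert each line to a number, so they only suit input that is purely numeric. If the IDs could contain letters (an assumption; ids.txt above is all digits), `print $1 + 0` would turn a value like 00A12 into 0. A string-based variant using awk's sub() is sketched here with made-up input:

```shell
# sub() edits $0 in place; /^0+/ matches one or more zeros at the line start.
# "00A12" is a hypothetical non-numeric ID used only for illustration.
printf '%s\n' 00009 00A12 34390 | awk '{ sub(/^0+/, ""); print }'
```

This should print 9, A12 and 34390, one per line.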

Using bash parameter substitution:

$ shopt | grep extglob
extglob off

# Set this option on
$ shopt -s extglob
$ i=000100
$ echo "${i##+(0)}"
100

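The extglob shell option in bash:
If set, the extended pattern matching features are enabled. Here +(0) matches one or more zeros, and ## removes the longest such match from the start of the value.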
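
If enabling extglob is not an option, a similar result can be had with plain POSIX parameter expansion. The sketch below is one possible approach and not part of the original post; the function name strip_zeros is made up for illustration:

```shell
# ${1%%[!0]*} drops everything from the first non-zero character onward,
# leaving only the run of leading zeros; ${1#...} then removes that run.
strip_zeros() {
    s=${1#"${1%%[!0]*}"}
    # An all-zero input would come out empty, so fall back to a single 0.
    printf '%s\n' "${s:-0}"
}

strip_zeros 000100
strip_zeros 0000
```

The two calls should print 100 and 0. Unlike the extglob version, this one deliberately keeps a single 0 for an all-zero value.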

If you are wondering how to add zeros at the beginning of a bash variable, here is a way using printf:

$ printf "%010d\n" 00005
0000000005
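
One thing to watch for: bash's printf reads a %d argument with a leading zero as octal, so the command above works for 00005, but a value such as 00009 is rejected as an invalid octal number. A hedged sketch that forces base 10 first with bash's 10# arithmetic prefix follows; the function name pad is hypothetical, and the input is assumed to be a non-negative string of digits:

```shell
#!/usr/bin/env bash
# pad VALUE WIDTH: print VALUE zero-padded to WIDTH digits (bash only).
pad() {
    local n=$((10#$1))          # 10# makes bash read the digits as decimal
    printf '%0*d\n' "$2" "$n"   # * takes the field width from the argument
}

pad 00009 10
```

This should print 0000000009. The 10# prefix is a bash arithmetic feature, so the sketch is not expected to work in a plain POSIX sh.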

You might also look at this post, which shows how to make a set of numbers equal width by padding them with leading zeros.

Some related posts:
Replace leading zero's with blank using sed
Remove white space in vi editor

14 comments:

Nate said...

I appreciate all your posts. It really has helped educate me in scripting, especially with awk/sed.

Jadu Saikia said...

@Nate, thanks.

Taufik Zukhan F said...

Hi Jadu,

How can I replace the "," with "|" in this text?

ab,cd,ef,gh
gh,jg,gj,yu

To

ab|cd|ef|gh
gh|jg|gj|yu

Many thanks, Jadu

Taufik Zukhan F said...

Hi Jadu,

How do I convert this text?

ab,cd,ef,gh
ij,kl,mn,op


to become

ab|cd|ef|gh
ij|kl|mn|op


Thanks, Jadu

Jadu Saikia said...

@Taufik,

This should be a sed replacement like this:

sed 's/,/|/g' file > file.tmp
mv file.tmp file

or, if your sed supports -i:

sed -i 's/,/|/g' file

You can also have a look at this post:

http://unstableme.blogspot.com/2008/01/awk-change-field-separator-add-line.html

Hope this helps. Keep in touch.

Mahesh Kharvi said...

It is not necessary to enclose zero inside square brackets.

sed 's/^[0]*//' ids.txt

can be

sed 's/^0*//' ids.txt

Jadu Saikia said...

@Mahesh, true. Thanks

Retagi said...

Hi all, sorry, I don't know if this is the right place to put my query. I have a file with lots of characters, and I want to remove all the characters after the 4th character on every line, e.g.
1234456
7654328
1209873
so remove characters to become
1234
7654
1209


please can you help?

FR

Jadu Saikia said...

@Retagi,

Something like this?

$ sed 's/^\(....\).*$/\1/' file.txt

Output:
1234
7654
1209

or, with an awk that splits on an empty FS (gawk does):

$ awk 'BEGIN {FS=OFS=""} {print $1,$2,$3,$4}' file.txt

Adithya Kiran said...

Situation:

Delete all directories except one.
Tip:

# mkdir a b c
# shopt -s extglob
# rm -rf !(c)

(This removes directories a and b, along with anything else in the current directory that is not named c, and keeps c.)

Jadu Saikia said...

@Ana, sorry I could not reply to you yesterday. Assuming you want to split the master file into subfiles, each starting at a line that begins with "DESCRIPTION", here is a solution using awk:

$ awk '/^DESCRIPTION/{close("sub_"f);f++}{print $0 > "sub_"f}' file.txt

Please let me know if your requirement was different. Thanks for your comment.

Agung Kurniawan Faisol said...

how to remove the 0 in front of the comma ..
example:
0.4
but I got out just .4
how to remove it 0

Jadu Saikia said...

@Agung Kurniawan Faisol, do you mean how to get .4 from 0.4? Please confirm. Thanks.

Steve Benz said...

@Retagi,

You can use 'cut' to extract substrings...

per your example...
cut -c 1-4 file.txt

© Jadu Saikia www.UNIXCL.com