Tuesday, February 17, 2009

Handling "Argument list too long" - bash


I have nearly 200,000 files in one of my log directories, of which about 120,000 match the name pattern "ka.log.*". So whenever I try to apply a command such as rm, ls or cp to that big set of "ka.log.*" files, I get:

$ ls ka.log.*
bash: /bin/ls: Argument list too long

$ cp ka.log.* new/
bash: /bin/cp: Argument list too long

$ mv ka.log.* new/
bash: /bin/mv: Argument list too long

$ rm ka.log.*
bash: /bin/rm: Argument list too long

The "Argument list too long" error for the above commands is not a limitation of the commands themselves (rm, mv, ls, cp). It comes from the kernel, which caps the total size of the argument list (plus environment) that can be passed when a new program is executed (the ARG_MAX limit). The shell expands "ka.log.*" to 120,000 file names before the command even runs, and that expanded list exceeds the limit.
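You can inspect the limit yourself; a quick check (the exact value varies from system to system):

```shell
# ARG_MAX is the kernel's cap on the combined size of the argument
# list and environment handed to a newly executed program.  A glob
# that expands to 120,000 file names easily blows past it.
getconf ARG_MAX
```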

The Linux 'find' command is useful for performing these operations (ls, cp, mv, rm etc.) on such a big set of files, because find hands the names to the command itself instead of building one huge argument list in the shell.

e.g.

To copy those "ka.log.*" files to the directory /somedir:

$ find . -name "ka.log.*" -exec cp {} /somedir/ \;
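The `\;` form above forks one cp per file. find's `-exec ... {} +` form instead batches as many names per invocation as fit under the limit; a sketch assuming GNU cp, whose -t option names the destination directory up front so the sources can come last:

```shell
# '{} +' appends many file names to a single cp call, staying under
# ARG_MAX; 'cp -t DIR' (GNU coreutils) takes the target directory
# first, which is what makes the trailing file list possible.
find . -name "ka.log.*" -exec cp -t /somedir/ {} +
```

This runs far fewer cp processes than the per-file `\;` variant, so it is also much faster on 120,000 files.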

Looping with while (use IFS= read -r and quote "$FILE" so names containing spaces survive):

find . -name "ka.log.*" | while IFS= read -r FILE
do
...
<some operation on "$FILE">
...
done
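For file names that might contain any odd characters at all, a null-delimited variant of the loop above is safer; a sketch using bash and find's -print0 (the operation on each file is a placeholder):

```shell
# -print0 separates names with NUL bytes, which cannot occur inside a
# file name; 'read -r -d ""' consumes them, and IFS= keeps leading or
# trailing whitespace in the name intact.
find . -name "ka.log.*" -print0 |
while IFS= read -r -d '' FILE
do
    printf 'processing %s\n' "$FILE"   # replace with the real operation
done
```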


Another way is to assign the file names to a variable, e.g.

FILES=$(echo /mydir/ka.log.*)

for FILE in $FILES
do
...
<some operation on $FILE>
...
done

11 comments:

housetier said...

maybe xargs could also help with so many files. Just read about it somewhere on nixCraft.tld so it's still fresh in my memory

Ian Kelling said...

I gotta call you out:

FILES=$(echo /mydir/ka.log.*)

instead do

FILES=(/mydir/ka.log.*)

good stuff though.

Jadu Saikia said...

@housetier, yes, xargs can be used with find for operations like rm etc. Thanks for commenting.
e.g.
$ find . -name "ka.log.*" | xargs rm

Jadu Saikia said...

@Ian, thanks for commenting.

FILES=(/mydir/ka.log.*)
is not going to work (even for a smaller set of files); the variable FILES is going to hold only one file name here.

FILES=$(echo /mydir/ka.log.*)
is going to work here.

Jadu Saikia said...

One interesting article on the same topic
http://www.linuxjournal.com/article/6060

Ian Kelling said...

your echo thing does not handle whitespace, etc. correctly. Mine puts them all in an array, which does handle whitespace etc. correctly. Saying it only puts 1 file? I'm disappointed in you.

Jadu Saikia said...

@Ian, please don't be disappointed with me :-)
I feel I have to prove it with an example:

$ ls new10/ka.log.*
new10/ka.log.1 new10/ka.log.2 new10/ka.log.3

$ FILES=(./new10/ka.log.*)
$ echo $FILES
./new10/ka.log.1

$ FILES1=$(echo ./new10/ka.log.*)
$ echo $FILES1
./new10/ka.log.1 ./new10/ka.log.2 ./new10/ka.log.3

leprasmurf said...

find . -name "ka.log.*" -exec cp {} /somedir/ \;

wouldn't handle whitespace either.

find . -name "ka.log.*" -exec cp '{}' /somedir/ \;

Jadu Saikia said...

@leprasmurf, thanks for your comment. Keep in touch.

navaho said...

Hello Jadu,

I stumbled on your blog through google looking for something else, but was browsing through your "bash tricks" posts because, well, an old dog can always learn some new ones... ;-)

Anyway, I just wanted to let you know that the method Ian suggests works perfectly. Perhaps you used it incorrectly, yielding only one result -- the for..in has to be slightly modified.

Try this:

FILES=(/tmp/*)
for FILE in "${FILES[@]}"; do
cp "$FILE" /somedir/
done;

(Note: the quotes around ${FILES[@]} are necessary for whitespace to be handled correctly.)

IMHO this method is much cleaner than using "echo".

Jadu Saikia said...

My bad. Thanks Navaho, Ian.

Here is where I went wrong:

$ FILES=(/root/demo/*)

I was trying this (and I just saw one file)

$ echo $FILES
/root/demo/log.1

I could have tried:

$ echo "${FILES[@]}"

And this works perfectly:

$ for FILE in "${FILES[@]}"; do ls "$FILE"; done

Thank you so much for pointing this.

© Jadu Saikia www.UNIXCL.com