Friday, January 28, 2005

Geeky Friday: find, xargs, grep, and spaces

On Linux/Unix systems, the find command is often used along with grep to search for text in files anywhere in a multi-level directory hierarchy. Two ways to do this are:
find . -type f -exec grep "blah" {} \;
find . -type f -print | xargs grep "blah"
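The second method is usually better: xargs packs as many file names as will fit onto one command line, so grep typically runs once (or a handful of times for a very large tree), whereas the first method starts a new grep process for every file.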
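To see the difference in how often the command is started, here is a small sketch. It assumes a throwaway directory with three hypothetical files (a.txt, b.txt, c.txt), and it uses sh -c 'echo run' as a stand-in for grep so that each invocation prints exactly one line.

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
touch a.txt b.txt c.txt

# -exec ... \; starts the command once per file found.
per_file_runs=$(find . -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# xargs appends all the names it can to a single command line.
batched_runs=$(find . -type f -print | xargs sh -c 'echo run' sh | wc -l | tr -d ' ')

echo "-exec started the command $per_file_runs times, xargs $batched_runs time(s)"
```

With only three files the xargs version starts the command once; with many thousands of files it may need a few batches, but still far fewer than one per file.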

The second method breaks down if find finds filenames containing spaces. Suppose, for example, there's a file named "How To Conquer The World.txt". By default xargs splits its input on whitespace, so the xargs grep part will try to search five files: "How", "To", "Conquer", "The", and "World.txt", and grep will complain that these files don't exist.
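The failure is easy to reproduce. This is a rough sketch in a throwaway directory; the sh -c 'echo "$#"' trick is only there to count how many arguments xargs hands over, and it is my addition, not part of the original commands. Note that because find prints "./" in front of each name, the first piece is actually "./How".

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

# One file whose name contains spaces, holding the text we search for.
printf 'blah\n' > "How To Conquer The World.txt"

# Count the arguments xargs builds from that single file name.
arg_count=$(find . -type f -print | xargs sh -c 'echo "$#"' sh)
echo "xargs turned one file name into $arg_count arguments"

# The real pipeline: grep is handed five names that do not exist.
# 2>&1 captures grep's complaints; || true ignores the failing exit status.
broken_output=$(find . -type f -print | xargs grep "blah" 2>&1 || true)
printf '%s\n' "$broken_output"
```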

But wait, it's 2005. Surely the geeks of the world have resigned themselves to the fact that some people like to put spaces in file names, not out of malice towards Linux geeks, but simply because they don't realize the trouble it can cause. The good news is that yes, in fact, the geeks working on find and xargs have done something about this.

Their solution is to provide an alternate file name delimiter. The way to handle file names with spaces is:

find . -type f -print0 | xargs -0 grep blah
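The -print0 option tells find to terminate each file name with a NUL character (ASCII 0) instead of a newline. Similarly, the -0 option tells xargs to expect NUL-separated input and to treat spaces, newlines, and quotes as ordinary characters. Since a file name can never contain NUL, no name gets split.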
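Here is the same experiment with the NUL-delimited pipeline, again as a sketch in a throwaway directory. It assumes your find and xargs support -print0 and -0 (the GNU and BSD versions do); plain.txt is a hypothetical second file added so that grep prefixes each match with the file name.

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

# A file with spaces in its name that contains the text, and one that does not.
printf 'blah\n' > "How To Conquer The World.txt"
printf 'nothing here\n' > "plain.txt"

# NUL-delimited names survive the trip through xargs intact.
matches=$(find . -type f -print0 | xargs -0 grep "blah")
printf '%s\n' "$matches"
# prints: ./How To Conquer The World.txt:blah
```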
