Programming in find(1)

You know find(1). It's that useful UN*X utility for locating files based on various criteria and doing things to them (usually in conjunction with xargs). But did you know you can write programs in it?

Example Programs

Here is a selection of find(1) programs I have written:

Hello World

The traditional first program, this one is trivial enough to include here:

find . -path . -printf 'Hello, World\n' -prune

99 Bottles Of Beer

This was the first program I wrote, as an entry for the 99 Bottles of Beer on the Wall site (which gives listings of programs in an enormous number of different languages, all of which print the "lyrics" to "99 Bottles of Beer"). find(1) was the most obscure thing I could think of...

View the script.

10 Green Bottles Sig Hack

This is a modification of the beer hack, which fits into a four-line signature suitable for use in mail or news. The lyrics have been heavily abbreviated to make it fit, and as a result the singular/plural distinction has disappeared.

View the script.

Finding Primes

This one accomplishes the more significant programming task of printing all the primes under 100. The invocation of a second find as a sort of subroutine is quite neat IMHO...

View the script.

Since this one looks more like line noise than most, you might want to try a version with comments that attempt to explain how it works...

Some Explanatory Notes

The programs here follow a certain pattern. They are written as /bin/sh scripts to provide an easy way to run them and to ensure that the shell handles quoting in the way we expect. There's also a little boilerplate to create a directory to work in and delete it after the find script has run. We usually need expr to do the arithmetic, but the flow control is entirely due to find.
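As a rough illustration of that boilerplate (a sketch, not taken from the scripts themselves), the wrapper below creates a scratch directory, runs a command inside it and removes the directory afterwards. The function name run_in_scratch_dir and the mktemp template are my own assumptions; the find command is the Hello World program from above, which needs a find that supports -printf (GNU find does).

```shell
#!/bin/sh
# Hypothetical helper: run a command in a fresh scratch directory,
# then delete the directory again whatever the command did.
run_in_scratch_dir() {
    scratch=$(mktemp -d "${TMPDIR:-/tmp}/findprog.XXXXXX") || return 1
    ( cd "$scratch" && "$@" )       # subshell, so the caller's cwd is untouched
    status=$?
    rm -rf "$scratch"
    return $status
}

# The Hello World program, run inside the scratch directory.
run_in_scratch_dir find . -path . -printf 'Hello, World\n' -prune

# Arithmetic is delegated to expr, for example:
expr 99 - 1
```

Running the find program in an empty directory of its own means the only files it ever sees are the ones the program itself creates.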

The basic flow control structure with find programs works like this:

find . -noleaf          # boilerplate code
  -path . [initialisation code here] -o
  -name [x]             # or some other condition selecting on the filename
  [code] -o
  -name [y]
  [code]
  ...

where the [code] sections mkdir a subdirectory of the directory currently being worked on. find will then obligingly recurse into that directory and re-execute the commands. The loop ends when a pass creates no new directory.
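To make the pattern concrete, here is a minimal countdown sketch of my own (it is not one of the scripts listed above). Each directory is named after the current counter value; the loop body prints that name and creates a child directory named one less, and find's recursion does the rest. The function name countdown is hypothetical, and for brevity the loop body is a small sh -c snippet rather than a chain of pure find actions.

```shell
#!/bin/sh
# Hypothetical example: count down from 3 using find's directory
# recursion as the loop.
countdown() {
    work=$(mktemp -d) || return 1
    (
        cd "$work" || exit 1
        # Clause 1: at the top level, initialise by creating directory "3".
        # Clause 2: a directory called "0" ends the loop (nothing is created).
        # Clause 3: otherwise print the directory's name and create a child
        #           named one less, which find then recurses into.
        find . -noleaf \
            -path . -exec mkdir 3 \; -o \
            -name 0 -o \
            -exec sh -c 'n=${1##*/}; echo "$n"; mkdir "$1/$(expr "$n" - 1)"' sh {} \;
    )
    rm -rf "$work"
}

countdown
```

This prints 3, 2 and 1 on separate lines. The clauses are joined with -o so that exactly one of them runs for each directory find visits.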

The expr expression "{} : '.*/\(.*\)'" is an idiomatic way of getting the basename of the current directory using expr's pattern matching operator.
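A small sketch of that idiom, first on its own and then inside find, where {} is replaced by the path being visited. The helper name basename_via_expr and the alpha/beta directories are just for illustration.

```shell
#!/bin/sh
# Hypothetical helper wrapping the idiom: ".*/" greedily eats everything
# up to the last slash, and the \(...\) group captures what follows it.
basename_via_expr() {
    expr "$1" : '.*/\(.*\)'
}

basename_via_expr ./99/98/97    # prints 97

# The same expression used directly as a find action.
demo=$(mktemp -d) || exit 1
mkdir -p "$demo/alpha/beta"
( cd "$demo" && find . -path . -o -exec expr {} : '.*/\(.*\)' \; )
rm -rf "$demo"
```

The find command prints alpha and then beta. One thing to keep in mind when using this as a find action: expr exits with status 1 when its result is empty or 0, so -exec counts as false for a directory named 0.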

The -noleaf option tells find not to use an optimisation where it works out the number of subdirectories of a directory without having to actually read that directory. That optimisation doesn't take account of subdirectories created while find is running, so with it enabled none of these scripts work. If your find doesn't have -noleaf it probably doesn't have the optimisation either; try simply omitting it.

This page written by Peter Maydell (pmaydell@chiark.greenend.org.uk).