Colin Watson
Sun, 03 Oct 2010

When I took over man-db in 2001, one of the major problems that became evident after maintaining it for a while was the way it handled subprocesses. The nature of man and friends means that it spends a lot of time calling sequences of programs such as zsoelim, tbl, and nroff.

In higher-level languages, there are usually standard constructs which are safer than just passing a command line to the shell. For example, in Perl you can use [...]

I wrote a couple of thousand lines of library code in man-db to address this problem, loosely and now quite distantly based on code in groff. In the following examples, function names starting with [...]

Constructing the simplified example pipeline from my first paragraph using this library looks like this:
pipeline *p;
int status;

p = pipeline_new ();
p->want_infile = "input-file";
pipeline_command_args (p, "zsoelim", NULL);
pipeline_command_args (p, "tbl", NULL);
pipeline_command_args (p, "nroff", "-mandoc", "-Tutf8", NULL);
pipeline_start (p);
status = pipeline_wait (p);
pipeline_free (p);
You might want to construct a command more dynamically:
command *manconv = command_new_args ("manconv", "-f", from_code,
                                     "-t", "UTF-8", NULL);
if (quiet)
    command_arg (manconv, "-q");
pipeline_command (p, manconv);
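To make the idea of shell-free, dynamically built commands concrete, here is a minimal sketch using only POSIX calls. It is not the man-db code: `struct arglist`, `arglist_add`, and `arglist_run` are hypothetical names invented for this illustration. The point is that the arguments travel as a vector straight to `execvp`, so no shell ever gets a chance to interpret metacharacters.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical growable argument list; not part of man-db. */
struct arglist {
    char **argv;   /* NULL-terminated after the first arglist_add */
    size_t argc;   /* number of arguments, excluding the NULL */
};

/* Append one argument, keeping the vector NULL-terminated.
 * The string is borrowed, not copied; the cast only satisfies
 * execvp's prototype. */
static void arglist_add (struct arglist *a, const char *arg)
{
    char **grown = realloc (a->argv, (a->argc + 2) * sizeof *grown);
    assert (grown != NULL);
    a->argv = grown;
    a->argv[a->argc++] = (char *) arg;
    a->argv[a->argc] = NULL;
}

/* Run the command directly, with no shell in between.
 * Returns the exit status, or -1 on failure or abnormal exit. */
static int arglist_run (const struct arglist *a)
{
    int status;
    pid_t pid;

    assert (a->argc > 0);
    pid = fork ();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp (a->argv[0], a->argv);
        _exit (127);   /* exec failed */
    }
    if (waitpid (pid, &status, 0) < 0)
        return -1;
    return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

A caller would start from `struct arglist a = { NULL, 0 };`, add arguments conditionally in the same way as the `quiet` check above, call `arglist_run (&a)`, and finally `free (a.argv)`.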
Perhaps you want an environment variable set only while running a certain command:
command *less = command_new ("less");
command_setenv (less, "LESSCHARSET", lesscharset);
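One plausible way to get a per-command environment with plain POSIX calls is to change the environment in the child between `fork` and `exec`, which leaves the parent untouched. The sketch below assumes exactly that approach; `run_with_env` is a hypothetical helper for illustration, not a function from man-db.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv with NAME=VALUE set only in the child.
 * Returns the exit status, or -1 on failure or abnormal exit. */
static int run_with_env (char *const argv[], const char *name,
                         const char *value)
{
    int status;
    pid_t pid;

    assert (argv != NULL && argv[0] != NULL);
    pid = fork ();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Only the child's copy of the environment is modified. */
        if (setenv (name, value, 1) != 0)
            _exit (126);
        execvp (argv[0], argv);
        _exit (127);   /* exec failed */
    }
    if (waitpid (pid, &status, 0) < 0)
        return -1;
    return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

After the call returns, `getenv (name)` in the parent still gives whatever it gave before, which is the property that matters here.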
You might find yourself needing to pass the output of one pipeline to several other pipelines, in a "tee" arrangement:
pipeline *source, *sink1, *sink2;

source = make_source ();
sink1 = make_sink1 ();
sink2 = make_sink2 ();
pipeline_connect (source, sink1, sink2, NULL);
/* Pump data among these pipelines until there's nothing left. */
pipeline_pump (source, sink1, sink2, NULL);
pipeline_free (sink2);
pipeline_free (sink1);
pipeline_free (source);
Maybe one of your commands is actually an in-process function, rather than an external program:
command *inproc = command_new_function ("in-process", &func, NULL, NULL);
pipeline_command (p, inproc);
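As a rough illustration of how a function can stand in for an external program, the sketch below forks a child, points its standard output at a pipe, and calls the function there instead of calling `exec`. This is an assumption about one workable design, not a description of how man-db does it; `stage_func`, `spawn_function`, and `finish_function` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef void stage_func (void *data);

/* Example stage: writes its argument as one line on stdout. */
static void shout (void *data)
{
    printf ("%s\n", (const char *) data);
}

/* Run func in a child whose stdout feeds a pipe.
 * Returns the read end of the pipe, or -1 on error. */
static int spawn_function (stage_func *func, void *data, pid_t *pid_out)
{
    int fds[2];
    pid_t pid;

    assert (func != NULL && pid_out != NULL);
    if (pipe (fds) < 0)
        return -1;
    fflush (NULL);   /* keep buffered output from being duplicated */
    pid = fork ();
    if (pid < 0) {
        close (fds[0]);
        close (fds[1]);
        return -1;
    }
    if (pid == 0) {
        close (fds[0]);
        if (fds[1] != STDOUT_FILENO) {
            if (dup2 (fds[1], STDOUT_FILENO) < 0)
                _exit (126);
            close (fds[1]);
        }
        func (data);
        fflush (stdout);
        _exit (0);   /* skip the parent's atexit handlers */
    }
    close (fds[1]);
    *pid_out = pid;
    return fds[0];
}

/* Reap the child; returns its exit status, or -1. */
static int finish_function (pid_t pid)
{
    int status;

    if (waitpid (pid, &status, 0) < 0)
        return -1;
    return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

The parent reads from the returned descriptor as it would from any other pipeline stage, then closes it and calls `finish_function`.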
Sometimes your program needs to consume the output of a pipeline, rather than sending it all to some other subprocess:
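pipeline *p = make_pipeline ();
const char *line;

/* Look at the first line without consuming it; guard against empty output. */
line = pipeline_peekline (p);
if (line && !strstr (line, "coding: UTF-8"))
    printf ("Unicode text follows:\n");
/* Extra parentheses mark the assignment as intentional. */
while ((line = pipeline_readline (p)))
    printf (" %s", line);
pipeline_free (p);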
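The peek-then-read pattern is easy to get subtly wrong, so here is a small sketch of one way to implement it on top of an ordinary `FILE *` with POSIX `getline`. It is only an illustration of the technique: `struct line_reader`, `reader_peekline`, and `reader_readline` are hypothetical names, and the real library reads from a pipeline rather than a stdio stream.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical one-line lookahead reader. */
struct line_reader {
    FILE *fp;
    char *line;    /* buffer owned by getline; free it when done */
    size_t cap;    /* allocated size of line */
    int peeked;    /* nonzero if line holds an unconsumed line */
};

/* Return the next line without consuming it, or NULL at end of input.
 * The pointer stays valid until the next line is fetched. */
static const char *reader_peekline (struct line_reader *r)
{
    assert (r != NULL && r->fp != NULL);
    if (!r->peeked) {
        if (getline (&r->line, &r->cap, r->fp) < 0)
            return NULL;
        r->peeked = 1;
    }
    return r->line;
}

/* Return the next line and consume it, or NULL at end of input. */
static const char *reader_readline (struct line_reader *r)
{
    const char *line = reader_peekline (r);

    r->peeked = 0;
    return line;
}
```

Initialise the struct as `{ fp, NULL, 0, 0 }`. Repeated peeks return the same line, and the first read after a peek returns that line too, so no input is lost or seen twice.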
man-db deals with compressed files a lot, so I wrote an add-on library for opening compressed files (which is somewhat man-db-specific, but the implementation wasn't difficult given the underlying library):
pipeline *decomp_file = decompress_open (compressed_filename);
pipeline *decomp_stdin = decompress_fdopen (fileno (stdin));

This library has been in production in man-db for over five years now. The very careful signal handling code has been reviewed independently and the whole thing has been run through multiple static analysis tools, although I would always welcome more review; in particular I have no idea what it would take to make it safe for use in threaded programs since I generally avoid threading wherever possible. There have been a handful of bugs, which I've fixed promptly, and I've added various new features to support particular requirements of man-db (though in as general a way as possible).

Every so often I see somebody asking about subprocess handling in C, and I wonder if I should split this library out into a standalone package so that it can be used elsewhere. Web searches for things like "pipeline library" and "libpipeline" don't reveal anything that's a particularly close match for what I have. The licensing would be GPLv2 or later; this isn't likely to be negotiable since some of the original code wasn't mine and in any case I don't feel particularly bad about giving an advantage to GPLed programs. For more details on the interface, the header file is well-commented.

Is there enough interest in this to make the effort of producing a separate library package worthwhile? As well as the general effort of creating a new package, I'd need to do some work to disentangle it from a few bits and pieces specific to man-db. If you maintain a specific package that could use this and you're interested, please contact me with details, mentioning any extensions you think you'd need.
I intentionally haven't enabled comments on my blog for various reasons, but you can e-mail me at cjwatson at debian.org or man-db-devel at nongnu.org.