Python SIGPIPE handling

Enrico writes about creating pipelines with Python’s subprocess module, and notes that you need to take care to close stdout in non-final subprocesses so that subprocesses get SIGPIPE correctly. This is correct as far as it goes (and true in any language, although there’s a Python bug report requesting that subprocess be able to do this itself), but there’s an additional gotcha with Python that he missed.

Python ignores SIGPIPE on startup, because it prefers to check every write and raise an IOError exception rather than taking the signal. This is all well and good for Python itself, but most Unix subprocesses don’t expect to work this way. Thus, when you are creating subprocesses from Python, it is very important to set SIGPIPE back to the default action. Before I realised this was necessary, I wrote code that caused serious data loss due to a child process carrying on out of control after its parent process died!
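To see what the interpreter has done to SIGPIPE, you can ask it directly. This is only a small sketch, and the helper name write_to_closed_pipe is made up for illustration. It checks the current disposition and then writes to a pipe that has no reader. On Python 3 the failure surfaces as BrokenPipeError, which is a subclass of OSError, so catching OSError covers both old and new interpreters.

```python
import errno
import os
import signal

# CPython sets SIGPIPE to "ignore" during startup, so this is normally True.
sigpipe_ignored = signal.getsignal(signal.SIGPIPE) == signal.SIG_IGN

def write_to_closed_pipe():
    """Write to a pipe with no reader and return the errno of the failure.

    Hypothetical helper for illustration. With SIGPIPE ignored, the write
    fails with EPIPE instead of killing the process.
    """
    read_end, write_end = os.pipe()
    os.close(read_end)  # no reader is left on the pipe
    try:
        os.write(write_end, b"x")
    except OSError as exc:  # BrokenPipeError on Python 3
        return exc.errno
    finally:
        os.close(write_end)
    return None

print("SIGPIPE ignored:", sigpipe_ignored)
print("got EPIPE:", write_to_closed_pipe() == errno.EPIPE)
```

A child process started from here inherits the ignored disposition across exec, which is why a program that relies on being killed by SIGPIPE can keep running instead.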

import signal
import subprocess

def subprocess_setup():
    # Python sets SIGPIPE to be ignored (SIG_IGN) at startup, and children
    # inherit that. This is usually not what non-Python subprocesses expect.
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)

# command is the argument list (or string) for the program you want to run.
subprocess.Popen(command, preexec_fn=subprocess_setup)
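Putting this together with the advice about closing stdout, a two-stage pipeline might look something like the sketch below. It assumes the standard yes and head utilities are on your PATH, and run_pipeline is just an illustrative name. Treat it as a rough equivalent of "yes | head -n 1", not a definitive recipe.

```python
import signal
import subprocess

def subprocess_setup():
    # Runs in the child between fork() and exec(): put SIGPIPE back to its
    # default action so the child dies when its reader goes away.
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)

def run_pipeline():
    """Rough equivalent of `yes | head -n 1` (illustrative helper)."""
    producer = subprocess.Popen(
        ["yes"], stdout=subprocess.PIPE, preexec_fn=subprocess_setup)
    consumer = subprocess.Popen(
        ["head", "-n", "1"], stdin=producer.stdout,
        stdout=subprocess.PIPE, preexec_fn=subprocess_setup)
    # Close the parent's copy of the pipe's read end. Otherwise the pipe
    # still has a reader after head exits and yes never sees SIGPIPE.
    producer.stdout.close()
    output = consumer.communicate()[0]
    producer_status = producer.wait()
    return output, producer_status

output, producer_status = run_pipeline()
print(output, producer_status)
```

A negative return code from wait() means the child was killed by that signal number, so the producer's status here should be the negative of signal.SIGPIPE. Bear in mind that the subprocess documentation warns that preexec_fn is not safe to use in programs that also run threads.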

I filed a patch a while back to add a restore_sigpipe option to subprocess.Popen, which would take care of this. As I say in that bug report, in a future release I think this ought to be made the default, as it’s very easy to get things dangerously wrong right now.
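As a point of comparison on newer interpreters: subprocess.Popen in Python 3.2 and later accepts a restore_signals argument, enabled by default, which resets SIGPIPE (along with SIGXFZ and SIGXFSZ) to the default action in the child before exec. That is not the restore_sigpipe name from the patch, so take the following only as a sketch of the same idea. It assumes yes and head are available, and producer_exit_status is a hypothetical helper.

```python
import signal
import subprocess

def producer_exit_status(restore_signals):
    """Run `yes | head -n 1` and return the exit status of yes.

    Illustrative helper: restore_signals is passed straight through to
    Popen for the producer.
    """
    producer = subprocess.Popen(
        ["yes"], stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
        restore_signals=restore_signals)
    consumer = subprocess.Popen(
        ["head", "-n", "1"], stdin=producer.stdout,
        stdout=subprocess.DEVNULL)
    producer.stdout.close()  # leave head as the only reader
    consumer.wait()
    return producer.wait()

# Default behaviour on Python 3.2+: yes is killed by SIGPIPE.
print(producer_exit_status(True))
# With the signal left ignored, yes sees a write error and exits on its own.
print(producer_exit_status(False))
```

With restore_signals=False the child behaves as described earlier in this post: it only stops if it happens to check the result of its writes.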