Aug 30 2017 env —chdir

I was recently asked to sort things out so that snap builds on Launchpad could themselves install snaps as build-dependencies. To make this work we need to start doing builds in LXD containers rather than in chroots. As a result I’ve been doing some quite extensive refactoring of launchpad-buildd: it previously had the assumption that it was going to use a chroot for everything baked into lots of untested helper shell scripts, and I’ve been rewriting those in Python with unit tests and with a single Backend abstraction that isolates the high-level logic from the details of where each build is being performed.

This is all interesting work in its own right, but it’s not what I want to talk about here. While I was doing all this refactoring, I ran across a couple of methods I wrote a while back which looked something like this:

def chroot(self, args, echo=False):
    """Run a command in the chroot.

    :param args: the command and arguments to run.
    """
    args = set_personality(
        args, self.options.arch, series=self.options.series)
    if echo:
        print("Running in chroot: %s" %
              ' '.join("'%s'" % arg for arg in args))
        sys.stdout.flush()
    subprocess.check_call([
        "/usr/bin/sudo", "/usr/sbin/chroot", self.chroot_path] + args)

def run_build_command(self, args, env=None, echo=False):
    """Run a build command in the chroot.

    This is unpleasant because we need to run it in /build under sudo
    chroot, and there's no way to do this without either a helper
    program in the chroot or unpleasant quoting.  We go for the
    unpleasant quoting.

    :param args: the command and arguments to run.
    :param env: dictionary of additional environment variables to set.
    """
    args = [shell_escape(arg) for arg in args]
    if env:
        args = ["env"] + [
            "%s=%s" % (key, shell_escape(value))
            for key, value in env.items()] + args
    command = "cd /build && %s" % " ".join(args)
    self.chroot(["/bin/sh", "-c", command], echo=echo)

(I’ve already replaced the chroot method with a call to Backend.run, but it’s easier to see what I’m talking about in the original form.)

One thing to notice about this code is that it uses several adverbial commands: that is, commands that run another command in a different way. For example, sudo runs another command as another user, while chroot runs another command with a different root directory, and env runs another command with different environment variables set. These commands chain neatly, and they also have the useful property that they take the subsidiary command and its arguments as a list of arguments. coreutils has several other commands that behave this way, and adverbio is another useful example.

By contrast, su -c is something you might call a “quasi-adverbial” command: it does modify the behaviour of another command, but it takes it as a single argument which it then passes to sh -c. Every time you have something that’s passed to a shell like this, you need a corresponding layer of shell quoting to escape any shell metacharacters that should be interpreted literally. This is often cumbersome and is easy to get wrong. My Python implementation is as follows, and I wouldn’t be totally surprised to discover that it contained a bug:

import re

non_meta_re = re.compile(r'^[a-zA-Z0-9+,./:=@_-]+$')

def shell_escape(arg):
    if non_meta_re.match(arg):
        return arg
    else:
        return "'%s'" % arg.replace("'", "'\\''")

Python >= 3.3 has shlex.quote, which is an improvement and we should probably use that instead, but it’s still another thing to forget to call. This is why process-spawning libraries such as Python’s subprocess, Perl’s system and open, and my own libpipeline for C encourage programmers to use a list syntax and to avoid involving the shell entirely wherever possible.

One thing that the standard Unix tools don’t let you do in an adverbial way is to change your working directory, and I’ve run into this annoying limitation several times. This means that it’s difficult to chain that operation together with other adverbs, for example to run a command in a particular working directory inside a chroot. The workaround I used above was to invoke a shell that runs cd /build && ..., but that’s another command that’s only quasi-adverbial, since the extra shell means an extra layer of shell quoting.

(Ian Jackson rightly observes that you can in fact write the necessary adverb as something like sh -ec 'cd "$1"; shift; exec "$@"' chdir. I think that’s a bit uglier than I ideally want to use in production code, but you might reasonably think that it’s worth it to avoid the extra layer of shell quoting.)

I therefore decided that this was a feature that belonged in coreutils, and after a bit of mailing list discussion we felt it was best implemented as a new option to env(1). I sent a patch for this which has been accepted. This means that we have a new composable adverb, env --chdir=NEWDIR, which will allow the run_build_command method above to be rewritten as something like this:

def run_build_command(self, args, env=None, echo=False):
    """Run a build command in the chroot.

    :param args: the command and arguments to run.
    :param env: dictionary of additional environment variables to set.
    """
    env_args = ["env", "--chdir=/build"]
    if env:
        for key, value in env.items():
            env_args.append("%s=%s" % (key, value))
    self.chroot(env_args + args, echo=echo)

The env --chdir option will be in coreutils 8.28. We won’t be able to use it in launchpad-buildd until that’s available in all Ubuntu series we might want to build for, so in this particular application that’s going to take a few years; but other applications may well be able to make use of it sooner.