From: Colin Watson Date: Tue, 29 Aug 2017 23:56:07 +0000 (+0100) Subject: env --chdir X-Git-Url: https://www.chiark.greenend.org.uk/ucgi/~cjwatson/git?a=commitdiff_plain;h=0bd11a287829f59221a5a0eb2a3aafe1f8a74f39;p=blog.git env --chdir --- diff --git a/content/env-chdir.md b/content/env-chdir.md new file mode 100644 index 00000000..848ec896 --- /dev/null +++ b/content/env-chdir.md @@ -0,0 +1,147 @@ +Title: env --chdir +Slug: env-chdir +Date: 2017-08-30 00:54:42 +0100 +Tags: coreutils, launchpad, planet-debian, planet-ubuntu + +I was recently asked to sort things out so that +[snap](https://snapcraft.io/) builds on [Launchpad](https://launchpad.net/) +could themselves install snaps as build-dependencies. To make this work we +need to start doing builds in [LXD +containers](https://linuxcontainers.org/lxd/) rather than in chroots. As a +result I've been doing some quite extensive refactoring of +[launchpad-buildd](https://launchpad.net/launchpad-buildd): it previously +had the assumption that it was going to use a chroot for everything baked +into lots of untested helper shell scripts, and I've been rewriting those in +Python with unit tests and with a single `Backend` abstraction that isolates +the high-level logic from the details of where each build is being +performed. + +This is all interesting work in its own right, but it's not what I want to +talk about here. While I was doing all this refactoring, I ran across a +couple of methods I wrote a while back which looked something like this: + + :::python + def chroot(self, args, echo=False): + """Run a command in the chroot. + + :param args: the command and arguments to run. + """ + args = set_personality( + args, self.options.arch, series=self.options.series) + if echo: + print("Running in chroot: %s" % + ' '.join("'%s'" % arg for arg in args)) + sys.stdout.flush() + subprocess.check_call([ + "/usr/bin/sudo", "/usr/sbin/chroot", self.chroot_path] + args) + + def run_build_command(self, args, env=None, echo=False): + """Run a build command in the chroot. + + This is unpleasant because we need to run it in /build under sudo + chroot, and there's no way to do this without either a helper + program in the chroot or unpleasant quoting. We go for the + unpleasant quoting. + + :param args: the command and arguments to run. + :param env: dictionary of additional environment variables to set. + """ + args = [shell_escape(arg) for arg in args] + if env: + args = ["env"] + [ + "%s=%s" % (key, shell_escape(value)) + for key, value in env.items()] + args + command = "cd /build && %s" % " ".join(args) + self.chroot(["/bin/sh", "-c", command], echo=echo) + +(I've already replaced the `chroot` method with a call to `Backend.run`, but +it's easier to see what I'm talking about in the original form.) + +One thing to notice about this code is that it uses several *adverbial* +commands: that is, commands that run another command in a different way. +For example, `sudo` runs another command as another user, while `chroot` +runs another command with a different root directory, and `env` runs another +command with different environment variables set. These commands chain +neatly, and they also have the useful property that they take the subsidiary +command and its arguments as a list of arguments. coreutils has [several +other +commands](https://www.gnu.org/software/coreutils/manual/html_node/Modified-command-invocation.html) +that behave this way, and +[adverbio](http://www.greenend.org.uk/rjk/sw/adverbio.html) is another +useful example. + +By contrast, `su -c` is something you might call a "quasi-adverbial" +command: it does modify the behaviour of another command, but it takes it as +a single argument which it then passes to `sh -c`. Every time you have +something that's passed to a shell like this, you need a corresponding layer +of shell quoting to escape any shell metacharacters that should be +interpreted literally. This is often cumbersome and is easy to get wrong. +My Python implementation is as follows, and I wouldn't be totally surprised +to discover that it contained a bug: + + :::python + import re + + non_meta_re = re.compile(r'^[a-zA-Z0-9+,./:=@_-]+$') + + def shell_escape(arg): + if non_meta_re.match(arg): + return arg + else: + return "'%s'" % arg.replace("'", "'\\''") + +Python >= 3.3 has +[shlex.quote](https://docs.python.org/3/library/shlex#shlex.quote), which is +an improvement and we should probably use that instead, but it's still +another thing to forget to call. This is why process-spawning libraries +such as Python's [subprocess](https://docs.python.org/3/library/subprocess), +Perl's [system](http://perldoc.perl.org/functions/system.html) and +[open](http://perldoc.perl.org/functions/open.html), and my own +[libpipeline](http://libpipeline.nongnu.org/) for C encourage programmers to +use a list syntax and to avoid involving the shell entirely wherever +possible. + +One thing that the standard Unix tools don't let you do in an adverbial way +is to change your working directory, and I've run into this annoying +limitation several times. This means that it's difficult to chain that +operation together with other adverbs, for example to run a command in a +particular working directory inside a chroot. The workaround I used above +was to invoke a shell that runs `cd /build && ...`, but that's another +command that's only quasi-adverbial, since the extra shell means an extra +layer of shell quoting. + +(Ian Jackson rightly observes that you can in fact write the necessary +adverb as something like `sh -ec 'cd "$1"; shift; exec "$@"' chdir`. I +think that's a bit uglier than I ideally want to use in production code, but +you might reasonably think that it's worth it to avoid the extra layer of +shell quoting.) + +I therefore decided that this was a feature that belonged in +[coreutils](https://www.gnu.org/software/coreutils/), and after [a bit of +mailing list +discussion](https://lists.gnu.org/archive/html/coreutils/2017-08/msg00053.html) +we felt it was best implemented as a new option to +[env(1)](https://www.gnu.org/software/coreutils/manual/html_node/env-invocation.html). +I sent a patch for this which has been +[accepted](https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=57dea5ed07471b2192cc5edf08993e663a3f6802). +This means that we have a new composable adverb, `env --chdir=NEWDIR`, which +will allow the `run_build_command` method above to be rewritten as something +like this: + + :::python + def run_build_command(self, args, env=None, echo=False): + """Run a build command in the chroot. + + :param args: the command and arguments to run. + :param env: dictionary of additional environment variables to set. + """ + env_args = ["env", "--chdir=/build"] + if env: + for key, value in env.items(): + env_args.append("%s=%s" % (key, value)) + self.chroot(env_args + args, echo=echo) + +The `env --chdir` option will be in coreutils 8.28. We won't be able to use +it in launchpad-buildd until that's available in all Ubuntu series we might +want to build for, so in this particular application that's going to take a +few years; but other applications may well be able to make use of it sooner.