Re-signing PPAs
Julian has written about their efforts to strengthen security in APT, and shortly before that notified us that Launchpad’s signatures on PPAs use weak SHA-1 digests. Unfortunately we hadn’t noticed that before; GnuPG’s defaults tend to result in weak digests unless carefully tweaked, which is a shame.
I started on the necessary fixes as soon as we heard of the problem, but it’s taken a little while to get everything in place, and I thought I’d explain why, since some of the problems uncovered are interesting in their own right.
Firstly, there was the relatively trivial matter of using SHA-512 digests on new signatures. This was mostly a matter of adjusting our configuration, although writing the test was a bit tricky since PyGPGME isn’t as helpful as it could be. (Simpler repository implementations that call gpg from the command line should probably just add the --digest-algo SHA512 option instead of imitating this.)
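For the simpler command-line case mentioned in the parenthesis, a minimal sketch might look like the following. The function names, key ID, and file paths are hypothetical; the gpg options themselves (--digest-algo, --local-user, --armor, --detach-sign, --output, --batch, --yes) are standard GnuPG ones. This is not how Launchpad does it, since Launchpad goes through PyGPGME.

```python
import subprocess


def build_sign_command(key_id, release_path, signature_path):
    """Return a gpg argv that writes a detached, armoured signature
    over release_path using a SHA-512 digest."""
    return [
        "gpg",
        "--batch", "--yes",            # non-interactive; overwrite old output
        "--digest-algo", "SHA512",     # the option the text recommends
        "--local-user", key_id,        # which secret key to sign with
        "--armor", "--detach-sign",
        "--output", signature_path,
        release_path,
    ]


def sign_release(key_id, release_path, signature_path):
    """Run gpg; raises subprocess.CalledProcessError if signing fails."""
    subprocess.run(
        build_sign_command(key_id, release_path, signature_path),
        check=True)
```

Keeping the argv construction separate from the subprocess call makes it possible to check that the digest option is present without needing a keyring in the test environment.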
After getting that in place, any change to a suite in a PPA will result in it being re-signed with SHA-512, which is good as far as it goes, but we also want to re-sign PPAs that haven’t been modified. Launchpad hosts more than 50000 active PPAs, though, a significant percentage of which include packages for sufficiently recent Ubuntu releases that we’d want to re-sign them for this. We can’t expect everyone to push new uploads, and we need to run this through at least some part of our usual publication machinery rather than just writing a hacky shell script to do the job (which would have no idea which keys to sign with, to start with); but forcing full reprocessing of all those PPAs would take a prohibitively long time, and at the moment we need to interrupt normal PPA publication to do this kind of work. I therefore had to spend some quality time working out how to make things go fast enough.
The first couple of changes (1, 2) were to add options to our publisher script to let us run just the one step we need in “careful” mode: that is, forcibly re-run the Release file processing step even if it thinks nothing has changed, and entirely disable the other steps such as generating Packages and Sources files. Then last week I finally got around to timing things on one of our staging systems so that we could estimate how long a full run would take. It was taking a little over two seconds per archive, which meant that if we were to re-sign all published PPAs then that would take more than 33 hours! Obviously this wasn’t viable; even just re-signing xenial would be prohibitively slow.
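The 33-hour estimate is plain multiplication. The text doesn’t state the archive count behind it, so this sketch works backwards from the two figures it does give. The function names are made up for illustration.

```python
def hours_for_run(archive_count, seconds_per_archive):
    """Estimated wall-clock hours to process every archive serially."""
    return archive_count * seconds_per_archive / 3600.0


def implied_archive_count(hours, seconds_per_archive):
    """Archive count that would produce the given run time."""
    return hours * 3600.0 / seconds_per_archive


# Figures quoted in the text: about 2 s per archive, more than 33 hours.
archives = implied_archive_count(33, 2.0)    # 59400.0
lower_bound_hours = hours_for_run(50000, 2.0)
```

So the estimate corresponds to something like 60000 archives, and even at the “more than 50000” lower bound given earlier, the same rate works out to nearly 28 hours.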
The next question was where all that time was going. I thought perhaps that
the actual signing might be slow for some reason, but it was taking about
half a second per archive: not great, but not enough to account for most of
the slowness. The main part of the delay was in fact when we committed the
database transaction after processing each archive, but not in the actual
PostgreSQL commit, rather in the ORM invalidate
method called to prepare for a commit.
Launchpad uses the excellent Storm for all of its database interactions. One property of this ORM (and possibly of others; I’ll cheerfully admit to not having spent much time with other ORMs) is that it uses a WeakValueDictionary to keep track of the objects it’s populated with database results. Before it commits a transaction, it iterates over all those “alive” objects to note that if they’re used in future then information needs to be reloaded from the database first. Usually this is a very good thing: it saves us from having to think too hard about data consistency at the application layer. But in this case, one of the things we did at the start of the publisher script was:
    def getPPAs(self, distribution):
        """Find private package archives for the selected distribution."""
        if (self.isCareful(self.options.careful_publishing) or
                self.options.include_non_pending):
            return distribution.getAllPPAs()
        else:
            return distribution.getPendingPublicationPPAs()

    def getTargetArchives(self, distribution):
        """Find the archive(s) selected by the script's options."""
        if self.options.partner:
            return [distribution.getArchiveByComponent('partner')]
        elif self.options.ppa:
            return filter(is_ppa_public, self.getPPAs(distribution))
        elif self.options.private_ppa:
            return filter(is_ppa_private, self.getPPAs(distribution))
        elif self.options.copy_archive:
            return self.getCopyArchives(distribution)
        else:
            return [distribution.main_archive]
That innocuous-looking filter means that we do all the public/private filtering of PPAs up-front and return a list of all the PPAs we intend to operate on. This means that all those objects are alive as far as Storm is concerned and need to be considered for invalidation on every commit, and the time required for that stacks up when many thousands of objects are involved: this is essentially accidentally quadratic behaviour, because all archives are considered when committing changes to each archive in turn. Normally this isn’t too bad because only a few hundred PPAs need to be processed in any given run; but if we’re running in a mode where we’re processing all PPAs rather than just ones that are pending publication, then suddenly this balloons to the point where it takes a couple of seconds. The fix is very simple, using an iterator instead so that we don’t need to keep all the objects alive:
    # Python 2: filter() builds a full list, while itertools.ifilter() is lazy.
    # (On Python 3 the built-in filter() is already lazy.)
    from itertools import ifilter

    def getTargetArchives(self, distribution):
        """Find the archive(s) selected by the script's options."""
        if self.options.partner:
            return [distribution.getArchiveByComponent('partner')]
        elif self.options.ppa:
            return ifilter(is_ppa_public, self.getPPAs(distribution))
        elif self.options.private_ppa:
            return ifilter(is_ppa_private, self.getPPAs(distribution))
        elif self.options.copy_archive:
            return self.getCopyArchives(distribution)
        else:
            return [distribution.main_archive]
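To see why holding a list is so much worse than iterating lazily, here is a toy model in Python 3. It is not Storm’s implementation: ToyStore, Archive, and the counting logic are all invented for illustration. It only mimics the behaviour described above, where a commit walks every object still alive in a WeakValueDictionary.

```python
import weakref


class Archive:
    """Stand-in for a database-backed object (not Launchpad's Archive)."""

    def __init__(self, archive_id):
        self.id = archive_id
        self.needs_reload = False


class ToyStore:
    """Toy ORM store: tracks live objects weakly, invalidates on commit."""

    def __init__(self):
        self._alive = weakref.WeakValueDictionary()
        self.invalidations = 0

    def load(self, archive_id):
        obj = Archive(archive_id)
        self._alive[archive_id] = obj
        return obj

    def load_all(self, count):
        # Generator: each object is only created when the caller asks for it.
        for archive_id in range(count):
            yield self.load(archive_id)

    def commit(self):
        # Mark every object that is still referenced somewhere as stale.
        for obj in list(self._alive.values()):
            obj.needs_reload = True
            self.invalidations += 1


def count_invalidations(count, eager):
    """Process `count` archives, committing after each one."""
    store = ToyStore()
    archives = store.load_all(count)
    if eager:
        archives = list(archives)   # like Python 2's filter(): keeps all alive
    for archive in archives:
        store.commit()
    return store.invalidations


eager_work = count_invalidations(100, eager=True)
lazy_work = count_invalidations(100, eager=False)
```

On CPython, where an object is freed as soon as its last reference goes away, the eager version invalidates all 100 objects on each of 100 commits, while the lazy version only ever has the current archive alive at commit time. The gap grows with the square of the number of archives.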
After that, I turned to that half a second for signing. A good chunk of that was accounted for by the signContent method taking a fingerprint rather than a key, despite the fact that we normally already had the key in hand; this caused us to have to ask GPGME to reload the key, which requires two subprocess calls. Converting this to take a key rather than a fingerprint gets the per-archive time down to about a quarter of a second on our staging system, about eight times faster than where we started.
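The shape of that change can be sketched with a toy example. Everything here is hypothetical: ToyKeyStore stands in for whatever performs the expensive key lookup, and toy_sign is an HMAC placeholder rather than an OpenPGP signature. The point is only that an API taking a fingerprint forces a reload on every call, whereas one taking the key lets the caller reuse what it already has.

```python
import hashlib
import hmac


class ToyKeyStore:
    """Hypothetical key store that counts (expensive) key loads."""

    def __init__(self, keys):
        self._keys = dict(keys)    # fingerprint -> key material (bytes)
        self.loads = 0

    def load(self, fingerprint):
        # In a real system this is where the costly lookup would happen.
        self.loads += 1
        return self._keys[fingerprint]


def toy_sign(key, content):
    """Placeholder 'signature': an HMAC-SHA512, not an OpenPGP signature."""
    return hmac.new(key, content, hashlib.sha512).hexdigest()


def sign_by_fingerprint(store, fingerprint, content):
    """Fingerprint-based API: must reload the key on every call."""
    return toy_sign(store.load(fingerprint), content)


def sign_with_key(key, content):
    """Key-based API: the caller passes the key it already holds."""
    return toy_sign(key, content)
```

A caller that loads the key once and then signs several files with sign_with_key performs one load in total, while sign_by_fingerprint performs one per file.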
Using this, we’ve now re-signed all xenial Release files in PPAs using SHA-512 digests. On production, this took about 80 minutes to iterate over around 70000 archives, of which 1761 were modified. Most of the time appears to have been spent skipping over unmodified archives; even a few hundredths of a second per archive adds up quickly there. The remaining time comes out to around 0.4 seconds per modified archive. There’s certainly still room for speeding this up a bit.
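The breakdown can be checked with a little arithmetic on the rounded figures quoted above. The function name is made up, and the result is only as precise as “about 80 minutes” and “around 70000” allow.

```python
def split_run_time(total_seconds, total_archives, modified, secs_per_modified):
    """Split a run into time spent re-signing and time spent skipping.

    Returns (seconds on modified archives, seconds on skipped archives,
    seconds per skipped archive).
    """
    modified_seconds = modified * secs_per_modified
    skipped = total_archives - modified
    skipped_seconds = total_seconds - modified_seconds
    return modified_seconds, skipped_seconds, skipped_seconds / skipped


# Figures from the text: 80 minutes, 70000 archives, 1761 modified, 0.4 s each.
modified_s, skipped_s, per_skipped = split_run_time(80 * 60, 70000, 1761, 0.4)
```

That gives roughly 704 seconds (under 12 minutes) for the modified archives, leaving about 68 minutes spent on the 68239 unmodified ones, or about 0.06 seconds each, which matches the “few hundredths of a second” observation.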
We wouldn’t want to do this procedure every day, but it’s acceptable for occasional tasks like this. I expect that we’ll similarly re-sign wily, vivid, and trusty Release files soon in the same way.