--- /dev/null
+Title: Porting Launchpad to Python 3: progress report
+Slug: lp-python3-progress
+Date: 2020-09-25 12:01:40 +01:00
+Category: launchpad
+Tags: launchpad, planet-debian, planet-ubuntu
+
+[Launchpad](https://launchpad.net/) still requires Python 2, which in 2020
+is [a bit of a problem](https://www.python.org/doc/sunset-python-2/).
+Unlike a lot of the rest of 2020, though, there's good reason to be
+optimistic about progress.
+
+I've been porting Python 2 code to Python 3 on and off for a long time, from
+back when I was on the Ubuntu Foundations team and maintaining things like
+the [Ubiquity installer](https://launchpad.net/ubiquity). When I moved to
+Launchpad in 2015 it was certainly on my mind that this was a large body of
+code still stuck on Python 2. One option would have been to just accept
+that and leave it as it is, maybe doing more backporting work over time as
+support for Python 2 fades away. I've long been of the opinion that this
+would doom Launchpad to being unmaintainable in the long run, and since I
+genuinely love working on Launchpad - I find it an incredibly rewarding
+project - this wasn't something I was willing to accept. We're already
+seeing some of our important dependencies dropping support for Python 2,
+which is perfectly reasonable on their terms but is becoming a genuine
+obstacle to delivering features that depend on newer versions of those
+dependencies. It also looks as though
+it may be difficult for us to run on Ubuntu 20.04 LTS (we're currently on
+16.04, with an upgrade to 18.04 in progress) as long as we still require
+Python 2, since we have some system dependencies that 20.04 no longer
+provides. And then there are exciting new features like [type
+hints](https://docs.python.org/3/library/typing.html) and
+[async/await](https://docs.python.org/3/library/asyncio.html) that we'd like
+to be able to use.
+
+However, until last year there were so many blockers that a port was
+barely conceivable. What changed in 2019 was sorting out a
+trifecta of core dependencies. We [ported]({filename}/storm-py3.md) our
+database layer, [Storm](https://storm.canonical.com/). We
+[upgraded](https://code.launchpad.net/~cjwatson/launchpad/+git/launchpad/+merge/376781)
+to modern versions of our [Zope](https://www.zope.org/) Toolkit dependencies
+(after contributing various fixes upstream, including some substantial
+changes to Zope's [test runner](https://pypi.org/project/zope.testrunner/)
+that we'd carried as local patches for some years). And we
+[ported](https://code.launchpad.net/~cjwatson/launchpad/+git/launchpad/+merge/373805)
+our Bazaar code hosting infrastructure to
+[Breezy](https://www.breezy-vcs.org/). With all that in place, a port
+seemed more of a realistic possibility.
+
+Still, even with this, it was never going to be a matter of just following
+some [standard porting advice](http://python3porting.com/) and calling it
+good. Launchpad has almost a million lines of Python code in its [main git
+tree](https://git.launchpad.net/launchpad), and around 250 dependencies of
+which a number are quite Launchpad-specific. In a project that size, not
+only is following standard porting advice an extremely time-consuming task
+in its own right, but just about every strange corner case is going to show
+up somewhere. (Did you know that `StringIO.StringIO(None)` and
+`io.StringIO(None)` do different things even after you account for the
+native string vs. Unicode text difference? How about [the behaviour of
+`.union()` on a subclass of
+`frozenset`](https://code.launchpad.net/~cjwatson/launchpad/+git/launchpad/+merge/385711)?)
+Launchpad's test suite is fortunately extremely thorough, but even just
+starting it up involves importing most of the data model code. Before you
+can take advantage of it, a large fraction of the codebase has to be at
+least syntactically valid Python 3 and use only modules that exist in
+Python 3, while still working on Python 2. In a project this size that
+turns out to be a large effort on its own, and can be quite
+[risky](https://blog.launchpad.net/general/login-regression-for-users-with-non-ascii-names)
+in places.
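+
+To make the two corner cases mentioned above concrete, here is a minimal
+sketch, assuming it is run on Python 3. The `Tags` class is a hypothetical
+name used only for illustration and is not taken from Launchpad. The
+comments about Python 2 describe its behaviour as best understood from the
+Python 2 standard library source, and are not exercised by this code.
+
+```python
+import io
+
+# Python 3: an initial value of None means "no initial text".
+empty_text = io.StringIO(None).getvalue()
+assert empty_text == ""
+# Python 2's StringIO.StringIO(None) instead passed a non-string argument
+# through str(), so the buffer started out containing the text "None".
+
+
+class Tags(frozenset):
+    """Hypothetical frozenset subclass, used only for illustration."""
+
+
+merged = Tags({"a"}).union({"b"})
+assert merged == {"a", "b"}
+# Python 3 returns the base type here, so anything the subclass adds is
+# lost unless the result is wrapped again.  Python 2 appears to have
+# returned an instance of the subclass instead.
+assert type(merged) is frozenset
+rewrapped = Tags(merged)
+assert type(rewrapped) is Tags
+```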
+
+Canonical's product engineering teams work on a six-month cycle, but it just
+isn't possible to cram this sort of thing into six months unless you do
+literally nothing else, and "please can we put all feature development on
+hold while we run to stand still" is a pretty tough sell to even the most
+understanding management. Fortunately, we've been able to grow the
+[Launchpad team](https://launchpad.net/~launchpad) in the last year or so,
+and so it's been possible to put "Python 3" on our roadmap on the
+understanding that we aren't going to get all the way there in one cycle,
+while still being able to do other substantial feature development work as
+well.
+
+So, with all that preamble, what have we done this cycle? We've taken a
+two-pronged approach. From one end, we identified 147 classes that needed
+to be ported away from some compatibility code in our database layer that
+was substantially less friendly to Python 3: we've ported 38 of those, so
+there's clearly a fair bit more to do, but we were able to distribute this
+work among the team quite effectively. From the other end, it was clear
+that general porting work would be very inefficient while any attempt to
+even run the test suite ran straight into the same crashes in the same
+order. So I set myself a target of getting the test suite to start up, and
+started hacking on an enormous git branch that I never expected to try to
+land directly. I felt free to commit just about anything that looked
+reasonable and moved things forward, even if it was very rough; every so
+often I went back to tidy things up and cherry-pick individual commits into
+a form that included some kind of explanation and passed existing tests, so
+that I could propose them for review.
+
+This strategy has been dramatically more successful than anything I've tried
+before at this scale. So far this cycle, considering only Launchpad's main
+git tree, we've landed 137 Python-3-relevant merge proposals for a total of
+39552 lines of `git diff` output, keeping our existing tests passing along
+the way and deploying incrementally to production. We have about 27000 more
+lines of patch at varying degrees of quality to tidy up and merge. Our main
+development branch is only perhaps 10 or 20 more patches away from the test
+suite being able to start up, at which point we'll be able to get a buildbot
+running so that multiple developers can work on this much more easily and
+see the effect of their work. With the full unlanded patch stack, about 75%
+of the test suite passes on Python 3! This still leaves a long tail of
+several thousand tests to figure out and fix, but it's a much more
+incrementally-tractable kind of problem than where we started.
+
+Finally: the funniest (to me) bug I've encountered in this effort was one
+in the test runner, which I fixed in
+[zopefoundation/zope.testrunner#106](https://github.com/zopefoundation/zope.testrunner/pull/106):
+IDs of failing tests were written to a pipe, so if you had a test suite
+that was large enough and broken enough then eventually that pipe would
+reach its capacity and your test runner would just give up and hang. Pretty
+annoying when it meant an overnight test run didn't give useful results, but
+also eloquent commentary of sorts.
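+
+The underlying mechanism is easy to reproduce in isolation. The sketch
+below is not the zope.testrunner code; it only illustrates that a pipe
+nobody reads from accepts a bounded number of bytes, after which a normal
+blocking write would wait forever. It assumes a POSIX system and Python 3.5
+or later (for `os.set_blocking`), and `measure_pipe_capacity` is a
+hypothetical helper name.
+
+```python
+import os
+
+
+def measure_pipe_capacity(chunk_size=4096):
+    """Fill a pipe that nobody reads; return how many bytes it accepted."""
+    read_fd, write_fd = os.pipe()
+    try:
+        # Non-blocking mode turns "would hang here" into an exception.
+        os.set_blocking(write_fd, False)
+        written = 0
+        while True:
+            try:
+                written += os.write(write_fd, b"x" * chunk_size)
+            except BlockingIOError:
+                # The pipe is full.  A blocking writer would now be stuck
+                # until some reader drained it.
+                return written
+    finally:
+        os.close(read_fd)
+        os.close(write_fd)
+
+
+capacity = measure_pipe_capacity()
+print("pipe accepted %d bytes before filling up" % capacity)
+```
+
+The exact figure depends on the operating system and its configuration, but
+it is finite, so a sufficiently long list of failing test IDs will
+eventually exceed it unless something reads the other end as it goes.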