From: Colin Watson
Date: Sun, 22 Sep 2019 07:57:47 +0000 (+0100)
Subject: Porting Storm to Python 3
X-Git-Url: https://www.chiark.greenend.org.uk/ucgi/~cjwatson/git?a=commitdiff_plain;h=a149a339deb81d19894f16980182ec2fbea202cd;p=blog.git

Title: Porting Storm to Python 3
Slug: storm-py3
Date: 2019-09-22 08:56:42 +0100
Tags: launchpad, planet-debian, planet-ubuntu, storm

We released [Storm](https://storm.canonical.com/) 0.21 on Friday (the
release announcement seems to be stuck in moderation, but you can look at
the [NEWS](https://bazaar.launchpad.net/+branch/storm/view/head:/NEWS) file
directly). For me, the biggest part of this release was adding Python 3
support.

Storm is a really nice and lightweight ORM (object-relational mapper) for
Python, developed by Canonical. We use it for some major products
([Launchpad](https://launchpad.net/) and
[Landscape](https://landscape.canonical.com/) are the ones I know of), and
it's also free software and used by some other folks as well. Other popular
ORMs for Python include [SQLObject](http://sqlobject.org/),
[SQLAlchemy](https://www.sqlalchemy.org/) and the
[Django](https://www.djangoproject.com/) ORM; we use those in various places
too depending on the context, but personally I've always preferred Storm for
the readability of code that uses it and for how easy it is to debug and
extend it.

It's been a problem for a while that Storm only worked with Python 2. It's
one of a handful of major blockers to getting Launchpad running on Python 3,
which we definitely want to do; [stoq](https://github.com/stoq/stoq) ended
up with a local fork of Storm to cope with this; and it was recently
[removed from Debian](https://bugs.debian.org/933983) for this and other
reasons. None of that was great.
So, with significant assistance from a
large patch contributed by Thiago Bellini, and with patient code review from
Simon Poirier and some of my other colleagues, we finally managed to get
that sorted out in this release.

In many ways, Storm was in fairly good shape already for a project that
hadn't yet been ported to Python 3. Its internal idea of which strings were
bytes and which were text required quite a bit of untangling, in the way
that Python 2 code usually does. However, its normal class used for text
database columns was already `Unicode`, which only accepted text input
(`unicode` in Python 2), so it could have been a lot worse; this also means
that applications that use Storm tend to get at least this part right even
in Python 2. Aside from the bytes/text thing, many of the required changes
were just the usual largely-mechanical ones that anyone who's done 2-to-3
porting will be familiar with. But there were some areas that required
non-trivial thought, and I'd like to talk about some of those here.

## Exception types

Concrete database implementations such as
[psycopg2](http://initd.org/psycopg/) raise implementation-specific
exception types. The inheritance hierarchy for these is defined by the
[Python Database API](https://www.python.org/dev/peps/pep-0249/) (DB-API),
but the actual exception classes aren't in a common place; rather, you might
get an instance of `psycopg2.errors.IntegrityError` when using PostgreSQL
but an instance of `sqlite3.IntegrityError` when using SQLite. To make
things easier for applications that don't have a strict requirement for a
particular database backend, Storm arranged to inject its own virtual
exception types as additional base classes of these concrete exceptions by
patching their `__bases__` attribute, so for example, you could import
`IntegrityError` from `storm.exceptions` and catch that rather than having
to catch each backend-specific possibility.
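
To make the underlying problem concrete, here is a minimal sketch that uses
only the standard library's `sqlite3` module. The `insert_duplicate` helper
and the `person` table are hypothetical names invented for illustration; the
point is just that the exception class you have to name in the `except`
clause belongs to the backend module, even though DB-API fixes the shape of
the hierarchy.

```python
import sqlite3

# DB-API (PEP 249) fixes the shape of the hierarchy, but each backend
# module defines its own concrete classes.
assert issubclass(sqlite3.IntegrityError, sqlite3.DatabaseError)
assert issubclass(sqlite3.DatabaseError, sqlite3.Error)


def insert_duplicate():
    """Trigger a constraint violation and return the exception raised.

    Hypothetical helper, not part of Storm or of sqlite3.
    """
    connection = sqlite3.connect(":memory:")
    try:
        connection.execute("CREATE TABLE person (id INTEGER PRIMARY KEY)")
        connection.execute("INSERT INTO person (id) VALUES (1)")
        try:
            # Same primary key again: violates the uniqueness constraint.
            connection.execute("INSERT INTO person (id) VALUES (1)")
        except sqlite3.IntegrityError as error:
            # Backend-agnostic code would need one clause like this per
            # supported backend module.
            return error
    finally:
        connection.close()
    return None


error = insert_duplicate()
print(type(error).__module__, type(error).__name__)
```

Code that also had to support PostgreSQL would need a second `except` clause
naming psycopg2's class, which is the duplication that a shared
`IntegrityError` in `storm.exceptions` avoids.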

Although this was always a bit of a cheat, it worked well in practice for a
while, but the first sign of trouble, even before porting to Python 3, was
with psycopg2 2.5. This release started implementing its DB-API exception
types in a C extension, which meant that it was no longer possible to patch
`__bases__`. To get around that, a few years ago I landed a
[patch](https://code.launchpad.net/~cjwatson/storm/psycopg-2.5/+merge/278330)
to Storm to use `abc.ABCMeta.register` instead to register the DB-API
exceptions as virtual subclasses of Storm's exceptions, which solved the
problem for Python 2. However, even at the time I landed that, I knew that
it would be a porting obstacle due to [Python issue
12029](https://bugs.python.org/issue12029) (on Python 3, `except` clauses
don't honour virtual subclasses registered this way); Django ran into that
as well.

In the end, I opted to
[refactor](https://code.launchpad.net/~cjwatson/storm/refactor-exception-wrapping/+merge/369319)
how Storm handles exceptions. It now wraps cursor and connection objects in
such a way as to catch DB-API exceptions raised by their methods and
properties and re-raise them using wrapper exception types that inherit from
both the appropriate subclass of `StormError` and the original DB-API
exception type; with some care I even managed to avoid this being
painfully repetitive. Out-of-tree database backends will need to make some
minor adjustments (removing `install_exceptions`, adding an
`_exception_module` property to their `Database` subclass, adjusting the
`raw_connect` method of their `Database` subclass to do exception wrapping,
and possibly implementing `_make_combined_exception_type` and/or
`_wrap_exception` if they need to add extra attributes to the wrapper
exceptions). Applications that follow the usual Storm idiom of catching
`StormError` or any of its subclasses should continue to work without
needing any changes.
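
The wrapping idea can be sketched in a few lines. This is an illustration of
the general technique under stated assumptions, not Storm's actual code: the
`Common*` classes, `COMMON_TYPES`, `combined_type`, `wrap_exceptions` and
`insert_twice` are all hypothetical names, and the sketch uses a context
manager around `sqlite3` calls rather than wrapping cursor and connection
objects as Storm does.

```python
import sqlite3
from contextlib import contextmanager


# Hypothetical backend-independent hierarchy (stand-ins, not Storm's classes).
class CommonError(Exception):
    pass


class CommonDatabaseError(CommonError):
    pass


class CommonIntegrityError(CommonDatabaseError):
    pass


# DB-API exception name -> common type, most specific first.
COMMON_TYPES = [
    ("IntegrityError", CommonIntegrityError),
    ("DatabaseError", CommonDatabaseError),
    ("Error", CommonError),
]

_combined_types = {}


def combined_type(common_type, backend_type):
    """Return a cached class that inherits from both exception types."""
    key = (common_type, backend_type)
    if key not in _combined_types:
        _combined_types[key] = type(
            backend_type.__name__, (common_type, backend_type), {})
    return _combined_types[key]


@contextmanager
def wrap_exceptions(module):
    """Re-raise DB-API errors from `module` as combined exception types."""
    try:
        yield
    except module.Error as error:
        for name, common_type in COMMON_TYPES:
            backend_type = getattr(module, name)
            if isinstance(error, backend_type):
                wrapper = combined_type(common_type, backend_type)
                # Keep the original exception reachable via __cause__.
                raise wrapper(*error.args) from error
        raise


def insert_twice():
    """Violate a primary key constraint inside the wrapper."""
    connection = sqlite3.connect(":memory:")
    try:
        with wrap_exceptions(sqlite3):
            connection.execute("CREATE TABLE person (id INTEGER PRIMARY KEY)")
            connection.execute("INSERT INTO person (id) VALUES (1)")
            connection.execute("INSERT INTO person (id) VALUES (1)")
    finally:
        connection.close()


try:
    insert_twice()
except CommonIntegrityError as error:
    # The same object would also be caught by `except sqlite3.IntegrityError`.
    caught = error
```

Because the wrapper class really does inherit from both types, ordinary
`except` matching works on Python 2 and 3 alike, with no need for `__bases__`
patching or ABC registration.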

## SQLObject compatibility

Storm includes some API compatibility with SQLObject; this was from before
my time, but I believe it was mainly because Launchpad and possibly
Landscape previously used SQLObject and this made the port to Storm very
much easier. It still works fine for the parts of Launchpad that haven't
been ported to Storm, but I wouldn't be surprised if there were newer
features of SQLObject that it doesn't support.

The main question here was what to do with `StringCol` and its associated
`AutoUnicodeVariable`. I opted to make these explicitly only accept text on
Python 3, since the main reason for them to accept bytes was to allow using
them with Python 2 native strings (i.e. `str`), and on Python 3 `str` is
already text so there's much less need for the porting affordance in that
case.

Since releasing 0.21 I realised that the `StringCol` implementation in
SQLObject itself in fact accepts both bytes and text even on Python 3, so
it's possible that we'll need to change this in the future, although we
haven't yet found any real code using Storm's SQLObject compatibility layer
that might rely on this. Still, it's much easier for Storm to start out on
the stricter side and perhaps become more lenient than it is to go the other
way round.

## inspect.getargspec

Storm had some fairly complicated use of `inspect.getargspec` on Python 2 as
part of its test mocking arrangements. This didn't work in Python 3 due to
some subtleties relating to bound methods. I
[switched](https://code.launchpad.net/~cjwatson/storm/py3-mocker-inspect/+merge/371174)
to the modern `inspect.signature` API in Python 3 to fix this, which in any
case is rather simpler with the exception of a wrinkle in how method
descriptors work.

(It's possible that these mocking arrangements could be simplified nowadays
by using some more off-the-shelf mocking library; I haven't looked into that
in any detail.)

## What's next?

I'm [working on getting Storm back into
Debian](https://bugs.debian.org/940876) now, which will be with Python 3
support only since Debian is in the process of gradually removing Python 2
module support. Other than that I don't really have any particular plans
for Storm at the moment (although of course I'm not the only person with an
interest in it), aside from ideally avoiding leaving six years between
releases again. I expect we can go back into bug-fixing mode there for a
while.

From the Launchpad side, I've recently made progress on one of the other
major Python 3 blockers (porting Bazaar code hosting to
[Breezy](https://www.breezy-vcs.org/), coming soon). There are still some
other significant blockers, the largest being migrating to Mailman 3,
subvertpy fixes so that we can port code importing to Breezy as well, and
porting the lazr.restful stack; but we may soon be able to reach the point
where it's possible to start running interesting subsets of the test suite
using Python 3 and categorising the failures, at which point we'll be able
to get a much better idea of how far we still have to go. Porting a project
with the best part of a million lines of code and around three hundred
dependencies is always going to take a while, but I'm happy to be making
progress there, both due to Python 2's impending end of upstream support and
so that eventually we can start using new language facilities.