chiark / gitweb /
An odd test failure
authorColin Watson <cjwatson@debian.org>
Tue, 19 Dec 2017 13:53:15 +0000 (13:53 +0000)
committerColin Watson <cjwatson@debian.org>
Tue, 19 Dec 2017 13:53:15 +0000 (13:53 +0000)
content/odd-test-failure.md [new file with mode: 0644]

diff --git a/content/odd-test-failure.md b/content/odd-test-failure.md
new file mode 100644 (file)
index 0000000..80d43c6
--- /dev/null
@@ -0,0 +1,131 @@
+Title: An odd test failure
+Slug: odd-test-failure
+Date: 2017-12-19 13:52:52 +00:00
+Category: launchpad
+Tags: launchpad, planet-debian, planet-ubuntu
+
+Weird test failures are great at teaching you things that you didn't realise
+you might need to know.
+
+[As previously
+mentioned]({filename}/mysterious-bug-with-twisted-plugins.md), I've been
+working on converting Launchpad from [Buildout](http://www.buildout.org/) to
+[virtualenv](https://virtualenv.pypa.io/en/stable/) and
+[pip](https://pip.pypa.io/en/stable/), and I finally landed that change on
+our development branch today.  The final landing was mostly quite smooth,
+except for one test failure on our buildbot that I hadn't seen before:
+
+    :::pytb
+    ERROR: lp.codehosting.codeimport.tests.test_worker.TestBzrSvnImport.test_stacked
+    worker ID: unknown worker (bug in our subunit output?)
+    ----------------------------------------------------------------------
+    Traceback (most recent call last):
+    _StringException: log: {{{
+    36.384  creating repository in file:///tmp/testbzr-6CwSLV.tmp/lp.codehosting.codeimport.tests.test_worker.TestBzrSvnImport.test_stacked/work/stacked-on/.bzr/.
+    36.388  creating branch <bzrlib.branch.BzrBranchFormat7 object at 0xeb85b36c> in file:///tmp/testbzr-6CwSLV.tmp/lp.codehosting.codeimport.tests.test_worker.TestBzrSvnImport.test_stacked/work/stacked-on/
+    }}}
+    
+    Traceback (most recent call last):
+      File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/lib/lp/codehosting/codeimport/tests/test_worker.py", line 1108, in test_stacked
+        stacked_on.fetch(Branch.open(source_details.url))
+      File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/branch.py", line 186, in open
+        possible_transports=possible_transports, _unsupported=_unsupported)
+      File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/controldir.py", line 689, in open
+        _unsupported=_unsupported)
+      File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/controldir.py", line 718, in open_from_transport
+        find_format, transport, redirected)
+      File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/transport/__init__.py", line 1719, in do_catching_redirections
+        return action(transport)
+      File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/controldir.py", line 706, in find_format
+        probers=probers)
+      File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/controldir.py", line 1155, in find_format
+        raise errors.NotBranchError(path=transport.base)
+    NotBranchError: Not a branch: "/tmp/tmpdwqrc6/trunk/".
+
+When I investigated this locally, I found that I could reproduce it if I ran
+just that test on its own, but not if I ran it together with the other tests
+in the same class.  That's certainly my favourite way round for test
+isolation failures to present themselves (it's more usual to find state from
+one test leaking out and causing another one to fail, which can make for a
+very time-consuming exercise of trying to find the critical combination),
+but it's still pretty odd.
+
+I stepped through the `Branch.open` call in each case in the hope of some
+enlightenment.  The interesting difference was that the custom probers
+installed by the `bzr-svn` plugin weren't installed when I ran that one test
+on its own, so it was trying to open a branch as a Bazaar branch rather than
+using the foreign-branch logic for Subversion, and this presumably depended
+on some configuration that only some tests put in place.  I was on the verge
+of just explicitly setting up that plugin in the test suite's `setUp`
+method, but I was still curious about exactly what was breaking this.
+
+Launchpad installs several Bazaar plugins, and
+`lib/lp/codehosting/__init__.py` is responsible for putting most of these in
+place: anything in Launchpad itself that uses Bazaar is generally supposed
+to do something like `import lp.codehosting` to set everything up.  I
+therefore put a breakpoint at the top of `lp.codehosting` and stepped
+through it to see whether anything was going wrong in the initial setup.
+Sure enough, I found that `bzrlib.plugins.svn` was failing to import due to
+an exception raised by `bzrlib.i18n.load_plugin_translations`, which was
+being swallowed silently but meant that its custom probers weren't being
+installed.  Here's what that function looks like:
+
+    :::python
+    def load_plugin_translations(domain):
+        """Load the translations for a specific plugin.
+    
+        :param domain: Gettext domain name (usually 'bzr-PLUGINNAME')
+        """
+        locale_base = os.path.dirname(
+            unicode(__file__, sys.getfilesystemencoding()))
+        translation = install_translations(domain=domain,
+            locale_base=locale_base)
+        add_fallback(translation)
+        return translation
+
+In this case, `sys.getfilesystemencoding` was returning `None`, which isn't
+a valid `encoding` argument to `unicode`.  But why would that be?  It gave
+me a sensible result when I ran it from a Python shell in this environment.
+A bit of head-scratching later and it occurred to me to look at a backtrace:
+
+    :::text
+    (Pdb) bt
+      /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/env/lib/python2.7/site.py(703)<module>()
+    -> main()
+      /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/env/lib/python2.7/site.py(694)main()
+    -> execsitecustomize()
+      /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/env/lib/python2.7/site.py(548)execsitecustomize()
+    -> import sitecustomize
+      /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/env/lib/python2.7/sitecustomize.py(7)<module>()
+    -> lp_sitecustomize.main()
+      /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/lib/lp_sitecustomize.py(193)main()
+    -> dont_wrap_bzr_branch_classes()
+      /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/lib/lp_sitecustomize.py(139)dont_wrap_bzr_branch_classes()
+    -> import lp.codehosting
+    > /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/lib/lp/codehosting/__init__.py(54)<module>()
+    -> load_plugins([_get_bzr_plugins_path()])
+
+I wonder if there's something interesting about being imported from a
+`sitecustomize` hook?  Sure enough, when I went to look at Python for where
+`sys.getfilesystemencoding` is set up, I found this in `Py_InitializeEx`:
+
+    :::c
+        if (!Py_NoSiteFlag)
+            initsite(); /* Module site */
+        ...
+    #if defined(Py_USING_UNICODE) && defined(HAVE_LANGINFO_H) && defined(CODESET)
+        /* On Unix, set the file system encoding according to the
+           user's preference, if the CODESET names a well-known
+           Python codec, and Py_FileSystemDefaultEncoding isn't
+           initialized by other means. Also set the encoding of
+           stdin and stdout if these are terminals, unless overridden.  */
+    
+        if (!overridden || !Py_FileSystemDefaultEncoding) {
+            ...
+        }
+
+I [moved this out of
+sitecustomize](https://code.launchpad.net/~cjwatson/launchpad/avoid-importing-bzr-plugins-from-site/+merge/335379),
+and it's working better now.  But did you know that a `sitecustomize` hook
+can't safely use anything that depends on `sys.getfilesystemencoding`?  I
+certainly didn't, until it bit me.