From 76768116254f22bcd11deb14c8c2d7e1f524a556 Mon Sep 17 00:00:00 2001 From: Colin Watson Date: Tue, 19 Dec 2017 13:53:15 +0000 Subject: [PATCH] An odd test failure --- content/odd-test-failure.md | 131 ++++++++++++++++++++++++++++++++++++ 1 file changed, 131 insertions(+) create mode 100644 content/odd-test-failure.md diff --git a/content/odd-test-failure.md b/content/odd-test-failure.md new file mode 100644 index 00000000..80d43c68 --- /dev/null +++ b/content/odd-test-failure.md @@ -0,0 +1,131 @@ +Title: An odd test failure +Slug: odd-test-failure +Date: 2017-12-19 13:52:52 +00:00 +Category: launchpad +Tags: launchpad, planet-debian, planet-ubuntu + +Weird test failures are great at teaching you things that you didn't realise +you might need to know. + +[As previously +mentioned]({filename}/mysterious-bug-with-twisted-plugins.md), I've been +working on converting Launchpad from [Buildout](http://www.buildout.org/) to +[virtualenv](https://virtualenv.pypa.io/en/stable/) and +[pip](https://pip.pypa.io/en/stable/), and I finally landed that change on +our development branch today. The final landing was mostly quite smooth, +except for one test failure on our buildbot that I hadn't seen before: + + :::pytb + ERROR: lp.codehosting.codeimport.tests.test_worker.TestBzrSvnImport.test_stacked + worker ID: unknown worker (bug in our subunit output?) + ---------------------------------------------------------------------- + Traceback (most recent call last): + _StringException: log: {{{ + 36.384 creating repository in file:///tmp/testbzr-6CwSLV.tmp/lp.codehosting.codeimport.tests.test_worker.TestBzrSvnImport.test_stacked/work/stacked-on/.bzr/. + 36.388 creating branch in file:///tmp/testbzr-6CwSLV.tmp/lp.codehosting.codeimport.tests.test_worker.TestBzrSvnImport.test_stacked/work/stacked-on/ + }}} + + Traceback (most recent call last): + File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/lib/lp/codehosting/codeimport/tests/test_worker.py", line 1108, in test_stacked + stacked_on.fetch(Branch.open(source_details.url)) + File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/branch.py", line 186, in open + possible_transports=possible_transports, _unsupported=_unsupported) + File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/controldir.py", line 689, in open + _unsupported=_unsupported) + File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/controldir.py", line 718, in open_from_transport + find_format, transport, redirected) + File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/transport/__init__.py", line 1719, in do_catching_redirections + return action(transport) + File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/controldir.py", line 706, in find_format + probers=probers) + File "/srv/buildbot/lpbuildbot/lp-devel-xenial/build/env/local/lib/python2.7/site-packages/bzrlib/controldir.py", line 1155, in find_format + raise errors.NotBranchError(path=transport.base) + NotBranchError: Not a branch: "/tmp/tmpdwqrc6/trunk/". + +When I investigated this locally, I found that I could reproduce it if I ran +just that test on its own, but not if I ran it together with the other tests +in the same class. That's certainly my favourite way round for test +isolation failures to present themselves (it's more usual to find state from +one test leaking out and causing another one to fail, which can make for a +very time-consuming exercise of trying to find the critical combination), +but it's still pretty odd. + +I stepped through the `Branch.open` call in each case in the hope of some +enlightenment. The interesting difference was that the custom probers +installed by the `bzr-svn` plugin weren't installed when I ran that one test +on its own, so it was trying to open a branch as a Bazaar branch rather than +using the foreign-branch logic for Subversion, and this presumably depended +on some configuration that only some tests put in place. I was on the verge +of just explicitly setting up that plugin in the test suite's `setUp` +method, but I was still curious about exactly what was breaking this. + +Launchpad installs several Bazaar plugins, and +`lib/lp/codehosting/__init__.py` is responsible for putting most of these in +place: anything in Launchpad itself that uses Bazaar is generally supposed +to do something like `import lp.codehosting` to set everything up. I +therefore put a breakpoint at the top of `lp.codehosting` and stepped +through it to see whether anything was going wrong in the initial setup. +Sure enough, I found that `bzrlib.plugins.svn` was failing to import due to +an exception raised by `bzrlib.i18n.load_plugin_translations`, which was +being swallowed silently but meant that its custom probers weren't being +installed. Here's what that function looks like: + + :::python + def load_plugin_translations(domain): + """Load the translations for a specific plugin. + + :param domain: Gettext domain name (usually 'bzr-PLUGINNAME') + """ + locale_base = os.path.dirname( + unicode(__file__, sys.getfilesystemencoding())) + translation = install_translations(domain=domain, + locale_base=locale_base) + add_fallback(translation) + return translation + +In this case, `sys.getfilesystemencoding` was returning `None`, which isn't +a valid `encoding` argument to `unicode`. But why would that be? It gave +me a sensible result when I ran it from a Python shell in this environment. +A bit of head-scratching later and it occurred to me to look at a backtrace: + + :::text + (Pdb) bt + /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/env/lib/python2.7/site.py(703)() + -> main() + /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/env/lib/python2.7/site.py(694)main() + -> execsitecustomize() + /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/env/lib/python2.7/site.py(548)execsitecustomize() + -> import sitecustomize + /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/env/lib/python2.7/sitecustomize.py(7)() + -> lp_sitecustomize.main() + /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/lib/lp_sitecustomize.py(193)main() + -> dont_wrap_bzr_branch_classes() + /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/lib/lp_sitecustomize.py(139)dont_wrap_bzr_branch_classes() + -> import lp.codehosting + > /home/cjwatson/src/canonical/launchpad/lp-branches/testfix/lib/lp/codehosting/__init__.py(54)() + -> load_plugins([_get_bzr_plugins_path()]) + +I wonder if there's something interesting about being imported from a +`sitecustomize` hook? Sure enough, when I went to look at Python for where +`sys.getfilesystemencoding` is set up, I found this in `Py_InitializeEx`: + + :::c + if (!Py_NoSiteFlag) + initsite(); /* Module site */ + ... + #if defined(Py_USING_UNICODE) && defined(HAVE_LANGINFO_H) && defined(CODESET) + /* On Unix, set the file system encoding according to the + user's preference, if the CODESET names a well-known + Python codec, and Py_FileSystemDefaultEncoding isn't + initialized by other means. Also set the encoding of + stdin and stdout if these are terminals, unless overridden. */ + + if (!overridden || !Py_FileSystemDefaultEncoding) { + ... + } + +I [moved this out of +sitecustomize](https://code.launchpad.net/~cjwatson/launchpad/avoid-importing-bzr-plugins-from-site/+merge/335379), +and it's working better now. But did you know that a `sitecustomize` hook +can't safely use anything that depends on `sys.getfilesystemencoding`? I +certainly didn't, until it bit me. -- 2.30.2