Tue, 15 Apr 2014
We had some requests to get GHC (the Glasgow Haskell Compiler) up and running on two new Ubuntu architectures: arm64, added in 13.10, and ppc64el, added in 14.04. This has been something of a saga, and has involved rather more late-night hacking than is probably good for me.
Book the First: Recalled to a life of strange build systems
You might not know it from the sheer bulk of uploads I do sometimes, but I actually don't speak a word of Haskell and it's not very high up my list of things to learn. But I am a pretty experienced build engineer, and I enjoy porting things to new architectures: I'm firmly of the belief that breadth of architecture support is a good way to shake out certain categories of issues in code, that it's worth doing aggressively across an entire distribution, and that, even if you don't think you need something now, new requirements have a habit of coming along when you least expect them and you might as well be prepared in advance. Furthermore, it annoys me when we have excessive noise in our build failure and proposed-migration output and I often put bits and pieces of spare time into gardening miscellaneous problems there, and at one point there was a lot of Haskell stuff on the list and it got a bit annoying to have to keep sending patches rather than just fixing things myself, and ... well, I ended up as probably the only non-Haskell-programmer on the Debian Haskell team and found myself fixing problems there in my free time. Life is a bit weird sometimes.
Bootstrapping packages on a new architecture is a bit of a black art that only a fairly small number of relatively bitter and twisted people know very much about. Doing it in Ubuntu is specifically painful because we've always forbidden direct binary uploads: all binaries have to come from a build daemon. Compilers in particular often tend to be written in the language they compile, and it's not uncommon for them to build-depend on themselves: that is, you need a previous version of the compiler to build the compiler, stretching back to the dawn of time where somebody put things together with a big magnet or something. So how do you get started on a new architecture? Well, what we do in this case is we construct a binary somehow (usually involving cross-compilation) and insert it as a build-dependency for a proper build in Launchpad. The ability to do this is restricted to a small group of Canonical employees, partly because it's very easy to make mistakes and partly because things like the classic "Reflections on Trusting Trust" are in the backs of our minds somewhere. We have an iron rule for our own sanity that the injected build-dependencies must themselves have been built from the unmodified source package in Ubuntu, although there can be source modifications further back in the chain. Fortunately, we don't need to do this very often, but it does mean that as somebody who can do it I feel an obligation to try and unblock other people where I can.
As far as constructing those build-dependencies goes, sometimes we look for binaries built by other distributions (particularly Debian), and that's pretty straightforward. In this case, though, these two architectures are pretty new and the Debian ports are only just getting going, and as far as I can tell none of the other distributions with active arm64 or ppc64el ports (or trivial name variants) has got as far as porting GHC yet. Well, OK. This was somewhere around the Christmas holidays and I had some time. Muggins here cracks his knuckles and decides to have a go at bootstrapping it from scratch. It can't be that hard, right? Not to mention that it was a blocker for over 600 entries on that build failure list I mentioned, which is definitely enough to make me sit up and take notice; we'd even had the odd customer request for it.
Several attempts later and I was starting to doubt my sanity, not least for trying in the first place. We ship GHC 7.6, and upgrading to 7.8 is not a project I'd like to tackle until the much more experienced Haskell folks in Debian have switched to it in unstable. The porting documentation for 7.6 has bitrotted more or less beyond usability, and the corresponding documentation for 7.8 really isn't backportable to 7.6. I tried building 7.8 for ppc64el anyway, picking that on the basis that we had quicker hardware for it and it didn't seem likely to be particularly more arduous than arm64 (ho ho), and I even got to the point of having a cross-built stage2 compiler (stage1, in the cross-building case, is a GHC binary that runs on your starting architecture and generates code for your target architecture) that I could copy over to a ppc64el box and try to use as the base for a fully-native build, but it segfaulted incomprehensibly just after spawning any child process. Compilers tend to do rather a lot, especially when they're built to use GCC to generate object code, so this was a pretty serious problem, and it resisted analysis. I poked at it for a while but didn't get anywhere, and I had other things to do so declared it a write-off and gave up.
Book the Second: The golden thread of progress
In March, another mailing list conversation prodded me into finding a blog entry by Karel Gardas on building GHC for arm64. This was enough to be worth another look, and indeed it turned out that (with some help from Karel in private mail) I was able to cross-build a compiler that actually worked and could be used to run a fully-native build that also worked. Of course this was 7.8, since as I mentioned cross-building 7.6 is unrealistically difficult unless you're considerably more of an expert on GHC's labyrinthine build system than I am. OK, no problem, right? Getting a GHC at all is the hard bit, and 7.8 must be at least as capable as 7.6, so it should be able to build 7.6 easily enough ...
Not so much. What I'd missed here was that compiler engineers generally only care very much about building the compiler with older versions of itself, and if the language in question has any kind of deprecation cycle then the compiler itself is likely to be behind on various things compared to more typical code since it has to be buildable with older versions. This means that the removal of some deprecated interfaces from 7.8 posed a problem, as did some changes in certain primops that had gained an associated compatibility layer in 7.8 but nobody had gone back to put the corresponding compatibility layer into 7.6. GHC supports running Haskell code through the C preprocessor, and there's a __GLASGOW_HASKELL__ macro giving the compiler's version number, so it was possible to patch 7.6 with compatibility code guarded on the compiler version until it could build itself again.
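Version-gated compatibility code of that sort looks something like this sketch (the module and function names here are made up for illustration; the real incompatibilities were in deprecated interfaces and primops):

```haskell
{-# LANGUAGE CPP #-}
-- Illustrative only: guard version-specific code on the compiler version
-- that GHC's C preprocessor pass exposes as __GLASGOW_HASKELL__
-- (706 for 7.6, 708 for 7.8). The names below are hypothetical.
module Compat (newStyleOp) where

#if __GLASGOW_HASKELL__ >= 708
-- On 7.8 and later, use the newer interface directly.
newStyleOp :: Int -> Int
newStyleOp = succ
#else
-- On older compilers, fall back to an equivalent spelling.
newStyleOp :: Int -> Int
newStyleOp n = n + 1
#endif
```

Code written this way builds unchanged with both the old and the new compiler, which is exactly the property a self-hosting compiler needs.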
This was all interesting, and finally all that work was actually paying off in terms of getting to watch a slew of several hundred build failures vanish from arm64 (the final count was something like 640, I think). The fly in the ointment was that ppc64el was still blocked, as the problem there wasn't building 7.6, it was getting a working 7.8. But now I really did have other much more urgent things to do, so I figured I just wouldn't get to this by release time and stuck it on the figurative shelf.
Book the Third: The track of a bug
Then, last Friday, I cleared out my urgent pile and thought I'd have another quick look. (I get a bit obsessive about things like this that smell of "interesting intellectual puzzle".) slyfox on the #ghc IRC channel gave me some general debugging advice and, particularly usefully, a reduced example program that I could use to debug just the process-spawning problem without having to wade through noise from running the rest of the compiler. I reproduced the same problem there, and then found that the program crashed at an even earlier point than I'd realised, which narrowed things down usefully.
You see, the vast majority of porting bugs come down to what I might call gross properties of the architecture. You have things like whether it's 32-bit or 64-bit, big-endian or little-endian, whether char is signed or unsigned by default, and so on.
From some painstaking gdb work, one thing I eventually noticed was that the crash always happened at the same place, which at least gave me a fixed point to work backwards from.
There's still more to do. I gather there may be a Google Summer of Code project in Linaro to write proper native code generation for GHC on arm64: this would make things a good deal faster, but also enable GHCi (the interpreter) and Template Haskell, and thus clear quite a few more build failures. Since there's already native code generation for ppc64 in GHC, getting it going for ppc64el would probably only be a couple of days' work at this point. But these are niceties by comparison, and I'm more than happy with what I got working for 14.04.
The upshot of all of this is that I may be the first non-Haskell-programmer to ever port GHC to two entirely new architectures. I'm not sure if I gain much from that personally aside from a lot of lost sleep and being considered extremely strange. It has, however, been by far the most challenging set of packages I've ported, and a fascinating trip through some odd corners of build systems and undefined behaviour that I don't normally need to touch.

Sat, 18 Jan 2014
This is mostly a repost of my ubuntu-devel mail for a wider audience, but see below for some additions.
I'd like to upgrade to GRUB 2.02 for Ubuntu 14.04; it's currently in beta. This represents a year and a half of upstream development, and contains many new features, which you can see in the NEWS file.
Obviously I want to be very careful with substantial upgrades to the default boot loader. So, I've put this in trusty-proposed, and filed a blocking bug to ensure that it doesn't reach trusty proper until it's had a reasonable amount of manual testing. If you are already using trusty and have some time to try this out, it would be very helpful to me. I suggest that you only attempt this if you're comfortable driving apt directly and recovering your system from a broken boot loader if the worst happens.
Please ensure that you have rescue media to hand before starting testing. The simplest way to upgrade is to enable trusty-proposed, upgrade ONLY packages whose names start with "grub" (e.g. use "apt-get install" with an explicit list of those packages rather than a blanket upgrade against -proposed), and then reboot.
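A hypothetical session might look like the following; the install commands are shown commented out since they modify a real system, and the final pipeline merely illustrates the idea of selecting only packages whose names start with "grub":

```shell
# Enable trusty-proposed and upgrade only the GRUB packages (package names
# as in the Ubuntu archive; adjust the list to match what you have installed):
#
#   sudo add-apt-repository 'deb http://archive.ubuntu.com/ubuntu trusty-proposed main'
#   sudo apt-get update
#   sudo apt-get install grub-pc grub-pc-bin grub-common grub2-common
#
# The point is an explicit package list, not a blanket dist-upgrade.
# Illustration of the name filter over a sample package list:
printf '%s\n' grub-pc grub-pc-bin grub-common grub2-common libc6 | grep '^grub'
```

Anything else pulled in from -proposed would muddy the test results, which is why the explicit list matters.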
Please report your experiences (positive and negative) with this upgrade in the tracking bug. I'm particularly interested in systems that are complex in any way: UEFI Secure Boot, non-trivial disk setups, manual configuration, that kind of thing. If any of the problems you see are also ones you saw with earlier versions of GRUB, please identify those clearly, as I want to prioritise handling regressions over anything else. I've assigned myself to that bug to ensure that messages to it are filtered directly into my inbox.
I'll add a couple of things that weren't in my ubuntu-devel mail. Firstly, this is all in Debian experimental as well (I do all the work in Debian and sync it across, so the grub2 source package in Ubuntu is a verbatim copy of the one in Debian these days). There are some configuration differences applied at build time, but a large fraction of test cases will apply equally well to both. I don't have a definite schedule for pushing this into jessie yet - I only just finished getting 2.00 in place there, and the release schedule gives me a bit more time - but I certainly want to ship jessie with 2.02 or newer, and any test feedback would be welcome. It's probably best to just e-mail feedback to me directly for now, or to the pkg-grub-devel list.
Secondly, a couple of news sites have picked this up and run it as "Canonical intends to ship Ubuntu 14.04 LTS with a beta version of GRUB". This isn't in fact my intent at all. I'm doing this now because I think GRUB 2.02 will be ready in non-beta form in time for Ubuntu 14.04, and indeed that putting it in our development release will help to stabilise it; I'm an upstream GRUB developer too and I find the exposure of widely-used packages very helpful in that context. It will certainly be much easier to upgrade to a beta now and a final release later than it would be to try to jump from 2.00 to 2.02 in a month or two's time.
Even if there's some unforeseen delay and 2.02 isn't released in time, though, I think nearly three months of stabilisation is still plenty to yield a boot loader that I'm comfortable with shipping in an LTS. I've been backporting a lot of changes to 2.00 and even 1.99, and, as ever for an actively-developed codebase, it gets harder and harder over time (in particular, I've spent longer than I'd like hunting down and backporting fixes for non-512-byte sector disks). While I can still manage it, I don't want to be supporting 2.00 for five more years after upstream has moved on; I don't think that would be in anyone's best interests. And I definitely want some of the new features which aren't sensibly backportable, such as several of the new platforms (ARM, ARM64, Xen) and various networking improvements; I can imagine a number of our users being interested in things like optional signature verification of files GRUB reads from disk, improved Mac support, and the TrueCrypt ISO loader, just to name a few. This should be a much stronger base for five-year support.

Fri, 26 Oct 2012
I've just finished deploying automatic installability checking for Ubuntu's development release, which is more or less equivalent to the way that uploads are promoted from Debian unstable to testing. See my ubuntu-devel post and my ubuntu-devel-announce post for details. This now means that we'll be opening the archive for general development once glibc 2.16 packages are ready.
I'm very excited about this because it's something I've wanted to do for a long, long time. In fact, back in 2004 when I had my very first telephone conversation with a certain spaceman about this crazy Debian-based project he wanted me to work on, I remember talking about Debian's testing migration system and some ways I thought it could be improved. I don't remember the details of that conversation any more and what I just deployed may well bear very little resemblance to it, but it should transform the extent to which our development release is continuously usable.
The next step is to hook in autopkgtest results. This will allow us to do a degree of automatic testing of reverse-dependencies when we upgrade low-level libraries.

Sun, 27 May 2012
OpenSSH 6.0p1 was released a little while back; this weekend I belatedly got round to uploading packages of it to Debian unstable and Ubuntu quantal.
I was a bit delayed by needing to put together an improvement to privsep sandbox selection that particularly matters in the context of distributions. One of the experts on seccomp_filter has commented favourably on it, but I haven't yet had a comment from upstream themselves, so I may need to refine this depending on what they say.
(This is a good example of how it matters that software is often not built on the system that it's going to run on, and in particular that the kernel version is rather likely to be different. Where possible it's always best to detect kernel capabilities at run-time rather than at build-time.)
I didn't make it very clear in the changelog, but using the new seccomp_filter sandbox currently requires explicitly setting "UsePrivilegeSeparation sandbox" in /etc/ssh/sshd_config, as well as a kernel with seccomp filter support; the default remains the existing chroot-based privilege separation.
I've released libpipeline 1.2.1, and uploaded it to Debian unstable. This is a bug-fix release.
I've managed to go for eleven years working on Debian and nearly eight on Ubuntu without ever needing to teach myself how APT's resolver works. I get the impression that there's a certain mystique about it in general (alternatively, I'm just the last person to figure this out). Recently, though, I had a couple of Ubuntu upgrade bugs to fix that turned out to be bugs in the resolver, and I thought it might be interesting to walk through the process of fixing them based on the apt.log files attached to those bugs.
Breakage with Breaks
The first was Ubuntu bug #922485 (apt.log). To understand the log, you first need to know that APT makes up to ten passes of the resolver to attempt to fix broken dependencies by upgrading, removing, or holding back packages; if there are still broken packages after this point, it's generally because it's got itself stuck in some kind of loop, and it bails out rather than carrying on forever. The current pass number is shown in each "Investigating" log entry, so they start with "Investigating (0)" and carry on up to at most "Investigating (9)". Any packages that you see still being investigated on the tenth pass are probably something to do with whatever's going wrong.
In this case, most packages have been resolved by the end of the fourth pass, but xserver-xorg-core:i386 is still broken on the tenth:
Broken xserver-xorg-core:i386 Breaks on xserver-xorg-video-6 [ i386 ] < none > ( none )
This is a Breaks on a virtual package: xserver-xorg-video-6 is provided by all the old X video driver packages, each of which must be upgraded or removed before the new xserver-xorg-core can be installed. Looking back through the log, we can see the resolver dealing with them:
Investigating (0) xserver-xorg-core [ i386 ] < 2:1.7.6-2ubuntu7.10 -> 2:1.11.3-0ubuntu8 > ( x11 )
Fixing xserver-xorg-core:i386 via remove of xserver-xorg-video-tseng:i386
Investigating (1) xserver-xorg-core [ i386 ] < 2:1.7.6-2ubuntu7.10 -> 2:1.11.3-0ubuntu8 > ( x11 )
Fixing xserver-xorg-core:i386 via remove of xserver-xorg-video-i740:i386
Investigating (2) xserver-xorg-core [ i386 ] < 2:1.7.6-2ubuntu7.10 -> 2:1.11.3-0ubuntu8 > ( x11 )
Fixing xserver-xorg-core:i386 via remove of xserver-xorg-video-nv:i386
OK, so that makes sense - presumably upgrading those packages didn't help at the time. But look at the pass numbers. Rather than just fixing all the packages that provide xserver-xorg-video-6 in a single pass, the resolver was only fixing one of them per pass; with a dozen or so old drivers installed, it ran out of passes before it ran out of drivers, leaving the upgrade broken.
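The arithmetic of the failure can be sketched with a toy model (entirely illustrative; nothing like APT's actual data structures, and the driver names beyond the three in the log are just sample data):

```python
# Toy model: a broken package can only be fixed by dealing with providers of
# a virtual package. If the resolver fixes just one provider per pass, it can
# exhaust its pass budget; fixing all of them in one pass converges at once.

MAX_PASSES = 10

def passes_needed(providers, one_per_pass):
    """Return the pass count needed to converge, or None if we run out."""
    remaining = list(providers)
    for pass_number in range(MAX_PASSES):
        if not remaining:
            return pass_number   # converged: nothing left to fix
        if one_per_pass:
            remaining.pop()      # fix a single provider this pass
        else:
            remaining.clear()    # fix every broken provider this pass

    return None                  # still broken after the pass limit

drivers = ["tseng", "i740", "nv", "s3", "trident", "ark",
           "sis", "glint", "mach64", "chips", "apm", "i128"]
print(passes_needed(drivers, one_per_pass=True))   # None: ran out of passes
print(passes_needed(drivers, one_per_pass=False))  # 1: fixed in one pass
```

With twelve providers and a budget of ten passes, the one-per-pass strategy is guaranteed to fail even though every individual step it takes is correct.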
My cup overfloweth
The second bug I looked at was Ubuntu bug #917173 (apt.log). Just as in the previous case, we can see the resolver "running out of time" by reaching the end of the tenth pass with some dependencies still broken. This one is a lot less obvious, though. The last few entries clearly indicate that the resolver is stuck in a loop:
Investigating (8) dpkg [ i386 ] < 1.15.5.6ubuntu4.5 -> 1.16.1.2ubuntu5 > ( admin )
Broken dpkg:i386 Breaks on dpkg-dev [ i386 ] < 1.15.5.6ubuntu4.5 -> 1.16.1.2ubuntu5 > ( utils ) (< 1.15.8)
Considering dpkg-dev:i386 29 as a solution to dpkg:i386 7205
Upgrading dpkg-dev:i386 due to Breaks field in dpkg:i386
Investigating (8) dpkg-dev [ i386 ] < 1.15.5.6ubuntu4.5 -> 1.16.1.2ubuntu5 > ( utils )
Broken dpkg-dev:i386 Depends on libdpkg-perl [ i386 ] < none -> 1.16.1.2ubuntu5 > ( perl ) (= 1.16.1.2ubuntu5)
Considering libdpkg-perl:i386 12 as a solution to dpkg-dev:i386 29
Holding Back dpkg-dev:i386 rather than change libdpkg-perl:i386
Investigating (9) dpkg [ i386 ] < 1.15.5.6ubuntu4.5 -> 1.16.1.2ubuntu5 > ( admin )
Broken dpkg:i386 Breaks on dpkg-dev [ i386 ] < 1.15.5.6ubuntu4.5 -> 1.16.1.2ubuntu5 > ( utils ) (< 1.15.8)
Considering dpkg-dev:i386 29 as a solution to dpkg:i386 7205
Upgrading dpkg-dev:i386 due to Breaks field in dpkg:i386
Investigating (9) dpkg-dev [ i386 ] < 1.15.5.6ubuntu4.5 -> 1.16.1.2ubuntu5 > ( utils )
Broken dpkg-dev:i386 Depends on libdpkg-perl [ i386 ] < none -> 1.16.1.2ubuntu5 > ( perl ) (= 1.16.1.2ubuntu5)
Considering libdpkg-perl:i386 12 as a solution to dpkg-dev:i386 29
Holding Back dpkg-dev:i386 rather than change libdpkg-perl:i386
The new version of dpkg breaks old versions of dpkg-dev, so the resolver tries to upgrade dpkg-dev; but the new dpkg-dev depends on exactly the new version of libdpkg-perl, and the resolver refuses to touch libdpkg-perl, so it holds dpkg-dev back instead and goes round the loop again. libdpkg-perl is a brand-new package with nothing installed depending on it yet, so why is the resolver so reluctant to install it? The answer is further back in the log:
Investigating (1) libdpkg-perl [ i386 ] < none -> 1.16.1.2ubuntu5 > ( perl )
Broken libdpkg-perl:i386 Depends on perl [ i386 ] < 5.10.1-8ubuntu2.1 -> 5.14.2-6ubuntu1 > ( perl )
Considering perl:i386 1472 as a solution to libdpkg-perl:i386 12
Holding Back libdpkg-perl:i386 rather than change perl:i386

Investigating (1) perl [ i386 ] < 5.10.1-8ubuntu2.1 -> 5.14.2-6ubuntu1 > ( perl )
Broken perl:i386 Depends on perl-base [ i386 ] < 5.10.1-8ubuntu2.1 -> 5.14.2-6ubuntu1 > ( perl ) (= 5.14.2-6ubuntu1)
Considering perl-base:i386 5806 as a solution to perl:i386 1472
Removing perl:i386 rather than change perl-base:i386

Investigating (1) perl-base [ i386 ] < 5.10.1-8ubuntu2.1 -> 5.14.2-6ubuntu1 > ( perl )
Broken perl-base:i386 PreDepends on libc6 [ i386 ] < 2.11.1-0ubuntu7.8 -> 2.13-24ubuntu2 > ( libs ) (>= 2.11)
Considering libc6:i386 -17473 as a solution to perl-base:i386 5806
Added libc6:i386 to the remove list

Investigating (0) libc6 [ i386 ] < 2.11.1-0ubuntu7.8 -> 2.13-24ubuntu2 > ( libs )
Broken libc6:i386 Depends on libc-bin [ i386 ] < 2.11.1-0ubuntu7.8 -> 2.13-24ubuntu2 > ( libs ) (= 2.11.1-0ubuntu7.8)
Considering libc-bin:i386 10358 as a solution to libc6:i386 -17473
Removing libc6:i386 rather than change libc-bin:i386
So ultimately the problem is something to do with libc6; but what? As Steve Langasek said in the bug, libc6's dependencies have been very carefully structured, and surely we would have seen some hint of it elsewhere if they were wrong. At this point ideally I wanted to break out GDB or at the very least experiment a bit with the code.
Eventually I noticed something. The numbers after the package names in the third line of each of these log entries are "scores": roughly, the more important a package is, the higher its score should be. The function that calculates these is pkgProblemResolver::MakeScores, in apt-pkg/algorithms.cc. Most of it is as you'd expect, awarding points for things like priority and being essential; on top of that, every package picks up the scores of the packages that depend on it:
Scores[I->ID] += abs(OldScores[D.ParentPkg()->ID]);
The only exceptions are an initial -1 or -2 points for packages at optional or extra priority. So with practically the whole system depending on libc6 directly or indirectly, how on earth did it end up at -17473?
Oh. This is computer programming, not mathematics ... and each score is stored in a signed short. libc6 has so many reverse dependencies feeding points into it that its score overflowed and wrapped around to a large negative number, at which point the resolver concluded that removing it was cheap. Widening the score type made the whole problem evaporate.
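The failure mode can be sketched in a few lines (a toy model in Python; the function names are mine, not APT's, and the point values are invented):

```python
# Toy model of a resolver score accumulating in a signed 16-bit "short".

def clamp_to_short(value):
    """Wrap an integer the way a signed 16-bit C short would."""
    value &= 0xFFFF
    return value - 0x10000 if value >= 0x8000 else value

def score_package(reverse_dependency_points):
    """Sum points from reverse dependencies into a 16-bit accumulator."""
    total = 0
    for points in reverse_dependency_points:
        total = clamp_to_short(total + abs(points))
    return total

# A few thousand reverse dependencies at ~10 points each is enough to wrap:
score = score_package([10] * 5000)
print(score)  # negative, despite every contribution being positive
```

Every individual addition is positive, yet the accumulated total sails past 32767 and comes out hugely negative, which is exactly the shape of libc6's -17473.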
I'd expected this to be a pretty challenging pair of bugs. While I certainly haven't lost any respect for the APT maintainers for dealing with this stuff regularly, it wasn't as bad as I thought. I'd expected to have to figure out how to retune some slightly out-of-balance heuristics and not really know whether I'd broken anything else in the process; but in the end both patches were very straightforward.

Mon, 24 Oct 2011
As is natural for an LTS cycle, lots of people are thinking and talking about work focused on quality rather than features. With Canonical extending LTS support to five years on the desktop for 12.04, much of this is quite rightly focused on the desktop. I'm really not a desktop hacker in any way, shape, or form, though. I spent my first few years in Ubuntu working mainly on the installer - I still do, although I do some other things now too - and I used to say only half-jokingly that my job was done once X started. Of course there are plenty of bugs I can fix, but I wanted to see if I could do something with a bit more structure, so I got to thinking about projects we could work on at the foundations level that would make a big difference.
Image build pipeline
One difficulty we have is that quite a few of our bugs - especially installer bugs, although this goes for some other things too - are only really caught when people are doing coordinated image testing just before a milestone release. Now, it takes a while to do all the builds and then it takes a while to test them. The excellent work of the QA team has meant that testing is much quicker now than it used to be, and a certain amount of smoke-testing is automated (particularly for server images). On the other hand, the build phase has only got longer as we've added more flavours and architectures, particularly as some parts of the process are still serialised per architecture or subarchitecture so ARM builds in particular take a very long time indeed. Exact timings are a bit difficult to get for various reasons, but I think the minimum time between a developer uploading a fix and us having a full set of candidate images on all architectures including that fix is currently somewhere north of eight hours, and that's with people cutting corners and pulling strings which is a suboptimal thing to have to do around release time. This obviously makes us reluctant to respin for anything short of showstopper bugs. If we could get things down to something closer to two hours, respins would be a much less horrible proposition and so we might be able to fix a few bugs that are serious but not showstoppers, not to mention that the release team would feel less burned out.
We discussed this problem at the release sprint, and came up with a laundry list of improvements; I've scheduled this for discussion at UDS in case we can think of any more. Please come along if you're interested!
One thing in particular that I'm working on is refactoring Germinate, a tool which dates right back to our first meeting before Ubuntu was even called Ubuntu and whose job is to expand dependencies starting from our lists of "seed" packages; we use this, among other things, to generate the Task fields in the archive and the package lists used when building our images.
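At its core that job is a transitive closure over dependencies, seeded from a small list. A minimal sketch (hypothetical package data; Germinate's real inputs are seed branches and archive Packages files, and it handles Recommends, Build-Depends, and much more):

```python
# Minimal seed expansion: start from seed packages and pull in everything
# they transitively depend on. The dependency data here is invented.

DEPENDS = {
    "ubuntu-minimal": ["apt", "util-linux"],
    "apt": ["libc6", "gpgv"],
    "util-linux": ["libc6"],
    "gpgv": ["libc6"],
    "libc6": [],
}

def germinate(seeds, depends):
    """Expand seed package names to the full transitive dependency set."""
    expanded = set()
    frontier = list(seeds)
    while frontier:
        pkg = frontier.pop()
        if pkg in expanded:
            continue
        expanded.add(pkg)
        frontier.extend(depends.get(pkg, []))
    return expanded

print(sorted(germinate(["ubuntu-minimal"], DEPENDS)))
# → ['apt', 'gpgv', 'libc6', 'ubuntu-minimal', 'util-linux']
```

The interesting engineering is in everything around this loop: alternatives, virtual packages, per-seed output, and making it fast enough to run on every publisher cycle.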
Maintaining the development release
Our release cycle always starts with syncing and merging packages from Debian unstable (or testing in the case of LTS cycles). The vast majority of packages in Ubuntu arrive this way, and generally speaking if we didn't do this we would fall behind in ways that would be difficult to recover from later. However, this does mean that we get a "big bang" of changes at the start of the cycle, and it takes a while for the archive to be usable again. Furthermore, even once we've taken care of this, we have a long-established rhythm where the first part of the cycle is mainly about feature development and the second part of the cycle is mainly about stabilisation. As a result, we've got used to the archive being fairly broken for the first few months, and we even tell people that they shouldn't expect things to work reliably until somewhere approaching beta.
This makes some kind of sense from the inside. But how are you supposed to do feature development that relies on other things in the development release?
In the first few years of Ubuntu, this question didn't matter very much. Nearly all the people doing serious feature development were themselves serious Ubuntu developers; they were capable of fixing problems in the development release as they went along, and while it got in their way a little bit it wasn't all that big a deal. Now, though, we have people focusing on things like Unity development, and we shouldn't assume that just because somebody is (say) an OpenGL expert or a window management expert that they should be able to recover from arbitrary failures in development release upgrades. One of the best things we could do to help the 12.04 desktop be more stable is to have the entire system be less unstable as we go along, so that developers further up the stack don't have to be distracted by things wobbling underneath them. Plus, it's just good software engineering to keep the basics working as you go along: it should always build, it should always install, it should always upgrade. Ubuntu is too big to do something like having everyone stop any time the build breaks, the way you might do in a smaller project, but we shouldn't let things slide for months either.
I've been talking to Rick Spencer and the other Ubuntu engineering leads at Canonical about this. Canonical has a system of "rotations", where you can go off to another team for a while if you're in need of a change or want to branch out a bit; so I proposed that we allow our engineers to spend a month or two at a time on what I'm calling the +1 Maintenance Team, whose job is simply to keep the development release buildable, installable, and upgradeable at all times. Rick has been very receptive to this, and we're going to be running this as a trial throughout the 12.04 cycle, with probably about three people at a time. As well as being professional archive gardeners, these people will also work on developing infrastructure to help us keep better track of what we need to do. For instance, we could deploy better tools from Debian QA to help us track uninstallable packages, or we could enhance some of our many existing reports to have bug links and/or comment facilities, or we could spruce up the weather report; there are lots of things we could do to make our own lives easier.
By 12.04, I would like the development release to be continuously buildable, installable, and upgradeable, and for us to have better infrastructure for noticing quickly when it isn't.
Of course, this overlaps to a certain extent with the kinds of things that the MOTU team have been doing for years, not to mention with what all developers should be doing to keep their own houses in reasonable order, and I'd like us to work together on this; we're trying to provide some extra hands here to make Ubuntu better for everyone, not take over! I would love this to be an opportunity to re-energise MOTU and bring some new people on board.
I've registered a couple of blueprints (priorities, infrastructure) for discussion at UDS. These are deliberately open-ended skeleton sessions, and I'll try to make sure they're scheduled fairly early in the week, so that we have time for break-out sessions later on. If you're interested, please come along and give your feedback!

Thu, 06 Oct 2011
The Ubuntu Technical Board conducts a regular review of the most popular Ubuntu Brainstorm ideas (previous reviews conducted by Matt Zimmerman and Martin Pitt). This time it was my turn. Apologies for the late arrival of this review.
Contact lens in the Unity Dash (#27584)
Unity supports Lenses, which provide a consistent way for users to quickly search for information via the Dash. Current lenses include Applications, Files, and Music, but a number of people have asked for contacts to be accessible using the same interface.
While Canonical's DX team isn't currently working on this for Ubuntu 11.10 or 12.04, we'd love somebody who's interested in this to get involved. Allison Randal explains how to get started, including some skeleton example code and several useful links.
Displaying Ubuntu version information (#27460)
Several people have asked for it to be more obvious what Ubuntu version they're running, as well as other general information about their system.
John Lea, user experience architect on the Unity team, responds that in Ubuntu 11.10 the new LightDM greeter shows the Ubuntu version number, making that basic information very easily visible. For more detail, System Settings -> System Info provides a simple summary.
Volume adjustments for headphone use (#27275)
People often find that they need to adjust their sound volume when plugging in or removing headphones. It seems as though the computer ought to be able to remember this kind of thing and do it automatically; after all, a major goal of Ubuntu is to make the desktop Just Work.
David Henningson, a member of Canonical's OEM Services group and an Ubuntu audio developer, responds on his blog with a summary of how PulseAudio jack detection has improved matters in Ubuntu 11.10, and what's left to do.
Making it easier to find software to handle a file (#28148)
Ubuntu is not always as helpful as it could be when you don't have the right software installed to handle a particular file.
Michael Vogt, one of the developers of the Ubuntu Software Center, responded to this. It seems that most of the pieces to make this work nicely are in place, but there are a few more bits of glue required.
Show pop-up alert on low battery (#28037)
Some users have reported on Brainstorm that they are not alerted frequently enough when their laptop's battery is low, as they clearly ought to be.
This is an odd one, because there are already several power alert levels and this has been working well for us for some time. Nevertheless, enough people have voted for this idea that there must be something behind it, perhaps a bug that only affects certain systems. Martin Pitt, technical lead of the Ubuntu desktop team, has responded directly to the Brainstorm idea with a description of the current system and how to file a bug when it does not work as intended.

Sat, 09 Apr 2011
I've released man-db 2.6.0 (announcement, NEWS, ChangeLog), and uploaded it to Debian unstable. Ubuntu is rapidly approaching beta freeze so I'm not going to try to cram this into 11.04; it'll be in 11.10.

Tue, 15 Mar 2011
I spent most of last week working on Ubuntu bug 693671 ("wubi install will not boot - phase 2 stops with: Try (hd0,0): NTFS5"), which was quite a challenge to debug since it involved digging into parts of the Wubi boot process I'd never really touched before. Since I don't think much of this is very well-documented, I'd like to spend a bit of time explaining what was involved, in the hope that it will help other developers in the future.
Wubi is a system for installing Ubuntu into a file in a Windows filesystem, so that it doesn't require separate partitions and can be uninstalled like any other Windows application. The purpose of this is to make it easy for Windows users to try out Ubuntu without the need to worry about repartitioning, before they commit to a full installation. Wubi started out as an external project, and initially patched the installer on the fly to do all the rather unconventional things it needed to do; we integrated it into Ubuntu 8.04 LTS, which involved turning these patches into proper installer facilities that could be accessed using preseeding, so that Wubi only needs to handle the Windows user interface and other Windows-specific tasks.
Anyone familiar with a GNU/Linux system's boot process will immediately see that this isn't as simple as it sounds. Of course, ntfs-3g is a pretty solid piece of software so we can handle the Windows filesystem without too much trouble, and loopback mounts are well-understood so we can just have the initramfs loop-mount the root filesystem. Where are you going to get the kernel and initramfs from, though? Well, we used to copy them out to the NTFS filesystem so that GRUB could read them, but this was overly complicated and error-prone. When we switched to GRUB 2, we could instead use its built-in loopback facilities, and we were able to simplify this. So all was more or less well, except for the elephant in the room. How are you going to load GRUB?
In a Wubi installation, NTLDR (or BOOTMGR in Windows Vista and newer) still owns the boot process. Ubuntu is added as a boot menu option using BCDEdit. You might then think that you can just have the Windows boot loader chain-load GRUB. Unfortunately, NTLDR only loads 16 sectors - 8192 bytes - from disk. GRUB won't fit in that: the smallest core.img you can generate at the moment is over 18 kilobytes. Thus, you need something that is small enough to be loaded by NTLDR, but that is intelligent enough to understand NTFS to the point where it can find a particular file in the root directory of a filesystem, load boot loader code from it, and jump to that. The answer for this was GRUB4DOS. Most of GRUB4DOS is based on GRUB Legacy, which is not of much interest to us any more, but it includes an assembly-language program called GRLDR that supports doing this very thing for FAT, NTFS, and ext2. In Wubi, we build GRLDR as
Now, the messages shown in the bug report suggested a failure either within GRLDR or very early in GRUB. The first thing I did was to remember that GRLDR has been integrated into the grub-extras
Well, yes, I mostly could, but that 8192-byte limit came back to bite me, along with an internal 2048-byte limit that GRLDR allocates for its NTFS bootstrap code. There were only a few spare bytes. Something like this would more or less fit, to print a single mark character at various points so that I could see how far it was getting:
pushal
xorw %bx, %bx		/* video page 0 */
movw $0x0e4d, %ax	/* print 'M' */
int $0x10
popal
In a few places, if I removed some code I didn't need on my test machine (say, CHS compatibility), I could even fit in cheap and nasty code to print a single register in hex (as long as you didn't mind 'A' to 'F' actually being ':' to '?' in ASCII; and note that this is real-mode code, so the loop counter is
/* print %edx in dumbed-down hex */
pushal
xorw %bx, %bx
movb $0xe, %ah
movw $8, %cx
1:
roll $4, %edx
movb %dl, %al
andb $0xf, %al
addb $0x30, %al		/* '0' for 0-9; 10-15 come out as ':' to '?' */
int $0x10
loop 1b
popal
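That ':' to '?' business is just ASCII arithmetic: the cheap hex printer adds '0' (0x30) to each nibble, so values 10 to 15 land in the punctuation range just past '9' rather than at 'A'. A throwaway Python sketch of the same mapping (function names are mine, purely illustrative; the real thing is of course the real-mode assembly above):

```python
# Map a 4-bit nibble to a character by adding ASCII '0' (0x30),
# exactly as the cheap assembly hex printer does.
def cheap_hex_digit(nibble):
    return chr(0x30 + nibble)

# Render a 32-bit value as eight "digits", most significant nibble first.
def cheap_hex(value):
    return "".join(cheap_hex_digit((value >> shift) & 0xF)
                   for shift in range(28, -4, -4))

# 0-9 come out as normal digits...
print("".join(cheap_hex_digit(n) for n in range(10)))      # 0123456789
# ...but 10-15 come out as ':' to '?' instead of 'A' to 'F'.
print("".join(cheap_hex_digit(n) for n in range(10, 16)))  # :;<=>?
print(cheap_hex(0x0000CAFE))                               # 0000<:?>
```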
After a considerable amount of work tracking down problems by bisection like this, I also observed that GRLDR's NTFS code bears quite a bit of resemblance in its logical flow to GRUB 2's NTFS module, and indeed the same person wrote much of both. Since I knew that the latter worked, I could use it to relieve my brain of trying to understand assembly code logic directly, and could compare the two to look for discrepancies. I did find a few of these, and corrected a simple one. Testing at this point suggested that the boot process was getting as far as GRUB but still wasn't printing anything. I removed some Ubuntu patches which quieten down GRUB's startup: still nothing - so I switched my attentions to grub-core/kern/i386/pc/startup.S, which contains the first code executed from GRUB's core image. Code before the first call to
Around this point I was venting on IRC, and somebody asked if it was reproducible in QEMU. Although I'd tried that already, I went back and tried again. Ubuntu's
(gdb) target remote | qemu -gdb stdio -no-kvm -hda /dev/sda
This let me run until the point when NTLDR was about to hand over control, then interrupt and set a breakpoint at
Single-stepping showed that GRLDR was loading the entirety of
The lesson for me from all of this has been to try hard to get an interactive debugger working. Really hard. It's worth quite a bit of up-front effort if it saves you from killing neurons stepping through pages of code by hand. I think the real-mode debugging tricks I picked up should be useful for working on GRUB in the future.
Sat, 11 Dec 2010
I've released libpipeline 1.1.0, and uploaded it to Debian unstable. The changes are mostly just to add a few occasionally useful interfaces:
The shared library actually ends up being a few kilobytes smaller on Debian than 1.0.0, probably because I tweaked the set of Gnulib modules I'm using.
Mon, 06 Dec 2010
The Ubuntu Technical Board is currently conducting a review of the top ten Brainstorm issues users have raised about Ubuntu, and Matt asked me to investigate and respond to Idea #25301: Keeping the time accurate over the Internet by default.
My first reaction was "hey, that's odd - I thought we already did that?". We install the
I brought up a clean virtual machine with a development version of Natty (the current Ubuntu development version, which will eventually become 11.04), and had a look in its logs: it was indeed synchronising its time from
So, I started tracing through the scripts to figure out what was going on. It turned out that I had an empty
That left the question of where that file came from. It didn't seem to be owned by any package; I was pretty sure I hadn't created it by hand either. I had a look through some bug reports, and soon found ntpdate 1:4.2.2.p4+dfsg-1ubuntu2 has a flawed configuration file. It turns out that
Once I knew where the problems were, it was easy to fix them. I've uploaded the following changes, which will be in the 11.04 release:
There are still a few problems. The "Synchronise now" button doesn't work quite right in general (bug #90524), and if your network doesn't allow time synchronisation from
It's always possible that I missed some other problem that breaks automatic time synchronisation for people. Please do file a bug report if it still doesn't work for you in 11.04, or contact me directly (cjwatson at ubuntu.com).
Thu, 02 Dec 2010
I just found out by chance that Fedora 14 switched from their old man package to man-db. This is great news: it should now be the beginning of the end of the divergence of man implementations that happened way back in the mid-1990s, when two different people took John W. Eaton's man package and developed it in different directions without being aware of each other's existence. For a while it looked as though man-db was stuck on just the Debian family and openSUSE, but a number of distributions have switched over in the last few years. As of now, the only remaining major distribution not using man-db is Gentoo, and they have a bug for switching which I think should be unblocked fairly soon.
In some ways man-db's package name didn't help it; people thought that the main difference was that man-db had a database backend stuck around apropos. These days, the database is one of the least important parts of man-db as far as I'm concerned. Other ways in which it's very significantly superior to anything man could do without years of equivalent effort include correct encoding support, robust child process handling, and use of more modern development facilities (dear catgets: you belong to a previous millennium, so please go away). I'm glad that Fedora has recognised this.
Fri, 29 Oct 2010
In my previous post, I described the pipeline library from man-db and asked whether people were interested in a standalone release of it. Several people expressed interest, and so I've now released libpipeline version 1.0.0. It's in the Debian NEW queue, and my PPA contains packages of it for Ubuntu lucid and maverick.
I gave a lightning talk on this at UDS in Orlando, and my slides are available. I hope there'll be a video at some point which I can link to.
Thanks to Scott James Remnant for code review (some time back), Ian Jackson for an extensive design review, and Kees Cook and Matthias Klose for helpful conversations.
Sun, 03 Oct 2010
When I took over man-db in 2001, one of the major problems that became evident after maintaining it for a while was the way it handled subprocesses. The nature of man and friends means that it spends a lot of time calling sequences of programs such as
In higher-level languages, there are usually standard constructs which are safer than just passing a command line to the shell. For example, in Perl you can use
I wrote a couple of thousand lines of library code in man-db to address this problem, loosely and now quite distantly based on code in groff. In the following examples, function names starting with
Constructing the simplified example pipeline from my first paragraph using this library looks like this:
pipeline *p;
int status;

p = pipeline_new ();
p->want_infile = "input-file";
pipeline_command_args (p, "zsoelim", NULL);
pipeline_command_args (p, "tbl", NULL);
pipeline_command_args (p, "nroff", "-mandoc", "-Tutf8", NULL);
pipeline_start (p);
status = pipeline_wait (p);
pipeline_free (p);
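The list-based argument passing that libpipeline enforces is also the right defence in higher-level languages: once each argument is a separate list element, shell metacharacters in data can't be reinterpreted as commands. A generic Python sketch of the principle (nothing to do with libpipeline's own API):

```python
import subprocess
import sys

# A filename chosen by an attacker; interpolated into a shell string,
# this would run rm. As a list element it is just one literal argument.
evil = "input; rm -rf /"

result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", evil],
    capture_output=True, text=True, check=True)
print(result.stdout.strip())  # input; rm -rf /
```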
You might want to construct a command more dynamically:
command *manconv = command_new_args ("manconv", "-f", from_code,
				     "-t", "UTF-8", NULL);
if (quiet)
	command_arg (manconv, "-q");
pipeline_command (p, manconv);
Perhaps you want an environment variable set only while running a certain command:
command *less = command_new ("less");
command_setenv (less, "LESSCHARSET", lesscharset);
You might find yourself needing to pass the output of one pipeline to several other pipelines, in a "tee" arrangement:
pipeline *source, *sink1, *sink2;

source = make_source ();
sink1 = make_sink1 ();
sink2 = make_sink2 ();
pipeline_connect (source, sink1, sink2, NULL);
/* Pump data among these pipelines until there's nothing left. */
pipeline_pump (source, sink1, sink2, NULL);
pipeline_free (sink2);
pipeline_free (sink1);
pipeline_free (source);
Maybe one of your commands is actually an in-process function, rather than an external program:
command *inproc = command_new_function ("in-process", &func, NULL, NULL);
pipeline_command (p, inproc);
Sometimes your program needs to consume the output of a pipeline, rather than sending it all to some other subprocess:
pipeline *p = make_pipeline ();
const char *line;

line = pipeline_peekline (p);
if (!strstr (line, "coding: UTF-8"))
	printf ("Unicode text follows:\n");
while ((line = pipeline_readline (p)))
	printf (" %s", line);
pipeline_free (p);
man-db deals with compressed files a lot, so I wrote an add-on library for opening compressed files (which is somewhat man-db-specific, but the implementation wasn't difficult given the underlying library):
pipeline *decomp_file = decompress_open (compressed_filename);
pipeline *decomp_stdin = decompress_fdopen (fileno (stdin));
This library has been in production in man-db for over five years now. The very careful signal handling code has been reviewed independently and the whole thing has been run through multiple static analysis tools, although I would always welcome more review; in particular I have no idea what it would take to make it safe for use in threaded programs since I generally avoid threading wherever possible. There have been a handful of bugs, which I've fixed promptly, and I've added various new features to support particular requirements of man-db (though in as general a way as possible). Every so often I see somebody asking about subprocess handling in C, and I wonder if I should split this library out into a standalone package so that it can be used elsewhere. Web searches for things like "pipeline library" and "libpipeline" don't reveal anything that's a particularly close match for what I have. The licensing would be GPLv2 or later; this isn't likely to be negotiable since some of the original code wasn't mine and in any case I don't feel particularly bad about giving an advantage to GPLed programs. For more details on the interface, the header file is well-commented.
Is there enough interest in this to make the effort of producing a separate library package worthwhile? As well as the general effort of creating a new package, I'd need to do some work to disentangle it from a few bits and pieces specific to man-db. If you maintain a specific package that could use this and you're interested, please contact me with details, mentioning any extensions you think you'd need. I intentionally haven't enabled comments on my blog for various reasons, but you can e-mail me at cjwatson at debian.org or man-db-devel at nongnu.org.
Sat, 28 Aug 2010
If you find that running Windows makes a GRUB 2-based system unbootable (Debian bug, Ubuntu bug), then I'd like to hear from you. This is a bug in which some proprietary Windows-based software overwrites particular sectors in the gap between the master boot record and the first partition, sometimes called the "embedding area". GRUB Legacy and GRUB 2 both normally use this part of the disk to store one of their key components: GRUB Legacy calls this component Stage 1.5, while GRUB 2 calls it the core image (comparison). However, Stage 1.5 is less useful than the core image (for example, the latter provides a rescue shell which can be used to recover from some problems), and is therefore rather smaller: somewhere around 10KB vs. 24KB for the common case of ext on plain block devices. It seems that the Windows-based software writes to a sector which is after the end of Stage 1.5, but before the end of the core image. This is why the problem appears to be new with GRUB 2.
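To make that geometry concrete, here's a back-of-the-envelope sketch in Python (assuming 512-byte sectors, the traditional DOS layout with the first partition at sector 63, and the approximate sizes above; the specific signature sector is hypothetical):

```python
SECTOR = 512
embedding_sectors = 62                   # sectors 1..62 between the MBR and sector 63
stage15_sectors = (10 * 1024) // SECTOR  # ~10KB Stage 1.5 -> 20 sectors
core_sectors = (24 * 1024) // SECTOR     # ~24KB core image -> 48 sectors

# Suppose the Windows software writes its signature at, say, sector 30.
signature_sector = 30
# That lands past the end of Stage 1.5, so GRUB Legacy never noticed...
print(signature_sector > stage15_sectors)  # True
# ...but inside the GRUB 2 core image, which it corrupts.
print(signature_sector <= core_sectors)    # True
```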
At least some occurrences of this are with software which writes a signature to the embedding area which hangs around even after uninstallation (even with one of those tools that tracks everything the installation process did and reverses it, I gather), so that you cannot uninstall and reinstall the application to defeat a trial period. This seems like a fine example of an antifeature, especially given its destructive consequences for free software, and is in general a poor piece of engineering; what happens if multiple such programs want to use the same sector, I wonder? They clearly aren't doing much checking that the sector is unused, not that that's really possible anyway. While I do not normally think that GRUB should go to any great lengths to accommodate proprietary software, this is a case where we need to defend ourselves against the predatory practices of some companies making us look bad: a relatively small number of people do enough detective work to realise that it's the fault of a particular Windows application, but many more simply blame our operating system because it won't start any more.
I believe that it may be possible to assemble a collection of signatures of such software, and arrange to avoid the disk sectors they have stolen. Indeed, I have a first draft of the necessary code. This is not a particularly pleasant solution, but it seems to be the most practical way around the problem; I'm hoping that several of the programs at fault are using common "licence manager" code or something like that, so that we can address most of the problems with a relatively small number of signatures. In order to do this, I need to hear from as many people as possible who are affected by this problem.
If you suffer from this problem, then please do the following:
I hope that this will help me to assemble enough information to fix this bug at least for most people, and of course if you provide this information then I can make sure to fix your particular version of this problem. Thanks in advance!
Sat, 10 Jul 2010
Apropos of my previous post, I see that dh has now overtaken CDBS as the most popular rules helper system of its kind in Debian unstable, and shows no particular sign of slowing its rate of uptake any time soon. The resolution of the graph is such that you can't see it yet, but dh drew dead level with CDBS on Thursday, and today 3836 packages are using dh as opposed to 3823 using CDBS.
Fri, 02 Jul 2010
... this version, or something not too far away from it, might actually stand a chance of getting into testing.
I've just uploaded grub2 1.98+20100702-1. The most significant set of changes in this release is that it switches
I did this work first in Ubuntu as one of my major goals for 10.04 LTS, which exposed a few problems that I wanted to fix before inflicting it on Debian as well (fixes for those are now under testing for 10.04.1). Most significantly, I felt it was necessary to start offering partitions in the select list for
My next priority will be making whatever fixes are necessary to get this version into testing, since the problems with
Other improvements since my last entry have included:
(This is partly a repost of material I've posted to bug reports and to debian-release, put together with some more detail for a wider audience.)
You could be forgiven for looking at the RC bug activity on grub2 over the last couple of days and thinking that it's all gone to hell in a handbasket with recent uploads. In fact, aside from an interesting case which turned out to be due to botched handling of the GRUB Legacy to GRUB 2 chainloading setup (which prompted me to fix three other RC bugs along the way), all the recent problems people have been having have been duplicates of one of these bugs which have existed essentially forever:
When GRUB boots, its boot sector first loads its "core image", which is usually embedded in the gap between the boot sector and the first partition on the same disk as the boot sector. This core image then figures out where to find /boot/grub, and loads grub.cfg from it as well as more GRUB modules.
The thing that tends to go wrong here is that the core image must be from the same version of GRUB as any modules it loads. /boot/grub/*.mod are updated only by grub-install, so this normally works OK. However, for various reasons (deliberate or accidental) some people install GRUB to multiple disks. In this case, grub-install might update /boot/grub/*.mod along with the core image on one disk, but your BIOS might actually be booting from a different disk. The effect of this will be that you'll have an old core image and new modules, which will probably blow up in any number of possible ways. Quite often, this problem lies dormant for a while because GRUB happens not to change in a way that causes incompatibility between the core image and modules, but then we get massive spikes of bug reports any time the interface does change. Since these bugs sometimes bite people upgrading from testing to unstable, they get interpreted as regressions from the version in testing even though that isn't strictly true (but it tends not to be very productive to argue this line; after all, people's computers suddenly don't boot!). Any problem that causes the core image to be installed to a disk other than the one actually being booted from, or not to be installed at all, will show up this way sooner or later.
On 2010-06-10, there was a substantial upstream change to the handling of list iterators (to reduce core image size and make code clearer and faster) which introduced an incompatibility between old core images and newer modules. This caused a bunch of dormant problems to flare up again, and so there was a flood of reports of booting problems with 1.98+20100614-1 and newer, often described as "the unaligned pointer bug" due to how it happened to manifest this time round. In previous cases, GRUB reported undefined symbols on boot, but it's all essentially the same problem even though there are different symptoms.
The confusing bit when handling bug reports is that not only are there different symptoms with the same cause, but there are also multiple causes for the same symptom! This takes a certain amount of untangling, especially when lots of people have thought "ooh, that bug looks a bit like mine" and jumped in with their own comments. Working through this was a worthwhile exercise, as it came up with an entirely new cause for a problem I thought was fairly well-understood (thanks to debugging assistance from Sedat Dilek). If you had set up GRUB 2 to be automatically chainloaded from GRUB Legacy (which happens automatically on upgrade from the latter to the former), never got round to running
Unless anything new shows up, that just leaves the problems that were already understood. Today, I posted a patch to generate stable device names in device.map by default. If this is accepted, then we can do something or other to fix up device.map on upgrade, switch over to
Since my last blog entry on GRUB 2, improvements have included:
The next upstream snapshot will bring several improvements to EFI video support, mainly thanks to Vladimir Serbinenko. I've been working on making
Various people observed in a long thread on debian-devel that the grub2 package was in a bit of a mess in terms of its release-critical bug count, and Jordi and Stefano both got in touch with me directly to gently point out that I probably ought to be doing something about it as one of the co-maintainers.
Actually, I don't think grub2 was in quite as bad a state as its 18 RC bugs suggested. Of course every boot loader failure is critical to the person affected by it, not to mention that GRUB 2 offers more complex functionality than any other boot loader (e.g. LVM and RAID), and so it tends to accumulate RC bugs at rather a high rate. That said, we'd been neglecting its bug list for some time; Robert and Felix have both been taking some time off, Jordi mostly only cared about PowerPC and can't do that any more due to hardware failure, and I hadn't been able to pick up the slack.
Most of my projects at work for the next while involve GRUB in one way or another, so I decided it was a perfectly reasonable use of work time to do something about this; I was going to need fully up-to-date snapshots anyway, and practically all the Debian grub2 bugs affect Ubuntu too. Thus, with the exception of some other little things like releasing the first Maverick alpha, I've spent pretty much the last week and a half solidly trying to get the grub2 package back into shape, with four uploads so far.
The RC issues that remain are:
If we can fix that lot, or even just the ones that are reasonably well-understood, I think we'll be in reasonable shape. I'd also like to make
On the upside, progress has been good. We have multiple terminal support thanks to a new upstream snapshot (#506707),
If you'd like to help, contact me, especially if there's something particular that isn't being handled that you think you could work on. GRUB 2 is actually quite a pleasant codebase to work on once you get used to its layout; it's certainly much easier to fix bugs in than GRUB Legacy ever was, as far as I'm concerned. Thanks to tools like
For various reasons, I chose to leave Ubuntu 10.04 LTS using OpenSSH 5.3p1. The new features in 5.4p1 such as certificate authentication, the new smartcard handling, netcat mode, and tab-completion in sftp are great, but unfortunately it was available just a little bit too late for me to be able to land it for 10.04 LTS. I realise that many Lucid users want to make use of these features for one reason or another, though, so as a compromise here's a PPA containing OpenSSH 5.5p1 for Lucid.
I intend to keep this up to date for as long as I reasonably can, and I'm happy to accept bug reports on it in the usual place.
Fri, 26 Mar 2010
Note: I wrote most of this before Neil Williams' recent comments on the 3.0 family of formats, so despite the timing this isn't really a reaction to that although I do have a couple of responses. On the whole I think I agree that the Lintian message is a bit heavy-handed and I'm not sure I'm thrilled about the idea of the default source format being changed (though I can see why the dpkg maintainers are interested in that). That said, as far as I personally am concerned, there is a vast cognitive benefit to me in having as much as possible be common to all my packages. Once I have more than a couple of packages that require patching and benefit from the
Anyway, on to the main body of this post:
I've been one of the holdouts resisting use of patch systems for a long time, on the basis that I felt strongly that
After experimenting with a couple of small packages, I moved over to the deep end and converted openssh a few weekends ago, since quite a few people have requested over the years that the Debian changes to openssh be easier to audit. This was a substantial job - over 6000 lines of upstream patches - but not actually as much work as I expected. I took a fairly simplistic approach: first, I unapplied all the upstream patches from my tree; then I ran
On the whole I'm satisfied with this, and the benefits definitely outweigh the costs. Thanks to the dpkg team for all their work on this!
Mon, 22 Mar 2010
I've started the transition of parted 2.2 to unstable. This is a major update needed for sensible support of newer hard disks with alignment requirements different from the archaic cylinder alignment tradition. I posted to debian-boot with a summary of the partman changes involved.
Wed, 03 Mar 2010
I don't know if anyone else has been tracking this recently, but a while back I got curious about the relative proportions of dh(1) and CDBS in the archive, and started running some daily analysis on the Lintian lab. Apologies for my poor graphing abilities, but the graph is here (occasionally updated):
Although dh is still a bit behind CDBS, the steady upward trend is quite striking - it looks set to break 20% soon, up from under 13% in September - compared with CDBS which has been sitting within half a percentage point of 25% the whole time.
Incidentally, was that an ftpmaster trying to sign his name in the graph over Christmas or something? :-)
Sun, 21 Feb 2010
I did a bit of catching up on my Debian backlog over the last week or so. Among the things I got round to:
So nothing really earth-shaking, and as ever openssh could use some attention, but I feel a bit better about my backlog now. I do still have a critical bug in makepasswd to fix, and a sponsored upload of parrot; those are the next two things on my to-do list.
Fri, 13 Nov 2009
In case it isn't obvious, in "Ubuntu 9.10 SP1 coming in spring 2010", "Ubuman" is blatantly lying in attributing a number of statements to me. None of the text there was written by me, and if you thought any of it was true then you should probably make sure your troll radar is working properly. Nice joke, but try harder next time - it doesn't even look like my writing style.
(I wouldn't normally bother to respond, since I'm probably just giving it more publicity, but apparently one or two people may already have been taken in by it. One person was sensible enough to write to me and check the facts.)
Fri, 31 Jul 2009
If you're generating one of these shiny new RSA keys, do please remember to generate an encryption subkey too if you expect people to sign it - at least your more obscure UIDs. I'm not going to mail unencrypted signatures around unless I have some out-of-band knowledge that the e-mail address actually belongs to the person I met.
I generated a new 4096-bit RSA key myself at DebConf (baa!), and have just published a key transition document. Please consider signing my new key if you signed my old one.
Tue, 14 Jul 2009
I recently implemented
The upshot is that, with a hot cache, man-db takes around 40 seconds to search all manual pages on my laptop; the man package (also with a hot cache) takes around five minutes, and interactive performance goes down the drain while it's doing it since it's spawning subprocesses like crazy. If I limit to a single section, the disparity is closer to 3x than 10x, but it's still very noticeable. It's interesting how much good libraries can do to help guide efficient approaches to problems.
Of course, a proper full-text search engine would be much better still, but that's a project for some other time ...
Thu, 02 Jul 2009
writes about creating pipelines with Python's
import signal
import subprocess

def subprocess_setup():
    # Python installs a SIGPIPE handler by default. This is usually not what
    # non-Python subprocesses expect.
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)

subprocess.Popen(command, preexec_fn=subprocess_setup)
I filed a patch a while back to add a
Joey Hess posted a draft of a code_swarm video for d-i a couple of weeks ago, which reminded me that I've been meaning to do something similar for Ubuntu for a while now as it's just about our archive's fifth birthday. I have a more or less complete archive of all our -changes mailing lists locally (I think I'm missing some of the very early ones, before the end of July 2004; let me know if you were one of the very early Canonical employees and have a record of these), and with the aid of launchpadlib it's fairly easy to map all the e-mail addresses into Launchpad user names, massage out some of the more obvious duplicates, and then treat the stream of uploads as if it were a stream of commits.
If you haven't seen code_swarm before, each dot represents an upload, and the dots "swarm" around their corresponding committers' names; more active committers have larger swarms of dots and brighter names. I assigned a colour to each of our archive components (uploads aren't really at the C code vs. Python code vs. translations vs. whatever kind of granularity that you see in other code_swarm videos), which mostly means that people who predominantly upload to main are in roughly an Ubuntu tan colour, people who predominantly upload to universe are coloured bluish, and people with a good mixture tend to come out coloured green. If I get a bit more time I may try to figure out enough about video editing software to add some captions.
Here's the video (194 MB).
Thu, 05 Mar 2009
I've been a bit surprised by the strong positive response to my previous post. People generally seemed to think it was quite non-ranty; maybe I should clean the rust off my flamethrower. :-) My hope was that I'd be able to persuade people to change some practices, so I guess that's a good thing.
Of course, there are many very smart people doing bug triage very well, and I don't want to impugn their fine work. Like its medical namesake, bug triage is a skilled discipline. While it's often repetitive, and there are lots of people showing up with similar symptoms, a triage nurse can really make a difference by spotting urgent cases, cleaning up some of the initial blood, and referring the patient quickly to a doctor for attention. Or, if a pattern of cases suddenly appears, a triage nurse might be able to warn of an incipient epidemic. [Note: I have no medical experience, so please excuse me if I'm talking crap here. :-)] The bug triagers who do this well are an absolute godsend; especially when they respond to repetitive tasks with tremendously useful pieces of automation like bughelper. The cases I have trouble with are more like somebody showing up untrained, going through everyone in the waiting room, and telling each of them that they just need to go home, get some rest, and stop complaining so much. Sometimes of course they'll be right, but without taking the time to understand the problem they're probably going to do more harm than good.
Ian Jackson reminded me that it's worth mentioning the purpose of bug reports on free software: namely, to improve the software. The GNU Project has some advice to maintainers on this. I think sometimes we stray into regarding bug reports more like support tickets. In that case it would be appropriate to focus on resolving each case as quickly as possible, if necessary by means of a workaround rather than by a software change, and only bother the developers when necessary. This is the wrong way to look at bug reports, though. The reason that we needed to set up a bug triage community in Ubuntu was that we had a relatively low developer-to-package ratio and a very high user-to-developer ratio, and we were getting a lot of bug reports that weren't fleshed out enough for a developer to investigate them without spending a lot of time in back-and-forth with the reporter, so a number of people volunteered to take care of the initial back-and-forth so that good clear bug reports could be handed over to developers. This is all well and good, and indeed I encouraged it because I was personally finding myself unable to keep up with incoming bugs and actually fix anything at the same time. Somewhere along the way, though, some people got the impression that what we wanted was a first-line support firewall to try to defend developers from users, which of course naturally leads to ideas such as closing wishlist bugs containing ideas because obviously those important developers wouldn't want to be bothered by them, and closing old bugs because clearly they must just be getting in developers' way. Let me be clear about this now: I absolutely appreciate help getting bug reports into a state where I can deal with them efficiently, but I do not want to be defended from my users! I don't have a basis from which to state that all developers feel the same way, but my guess is that most do.
Antti-Juhani Kaijanaho said he'd experienced most of these problems in Debian. I hadn't actually intended my post to go to Planet Debian - I'd forgotten that the "ubuntu" category on my blog goes there too, which generally I see as a feature, but if I'd remembered that I would have been a little clearer that I was talking about Ubuntu bug triage. If I had been talking about Debian bug triage I'd probably have emphasised different things. Nevertheless, it's interesting that at least one Debian (and non-Ubuntu) developer had experienced similar problems.
Justin Dugger mentions a practice of marking duplicate bugs invalid that he has problems with. I agree that this is suboptimal and try not to do it myself. That said, I don't object to it to the same extent. Given that the purpose of bug reports is to improve the software, the real goal is to be able to spend more time fixing bugs, not to get bugs into the ideal state when the underlying problem has already been solved. If it's a choice between somebody having to spend time tracking down the exact duplicate bug number and fixing another bug, I know which I'd take. Obviously, when doing this, it's worth apologising that you weren't able to find the original bug number, and explaining what the user can do if they believe you're mistaken (particularly if it's a bug that's believed to be fixed); the stock text people often use for this doesn't seem informative enough to me.
Sebastien Bacher commented that preferred bug triage practices differ among teams: for instance, the Ubuntu desktop team deals with packages that are very much to the forefront of users' attention and so get a lot of duplicate bugs. Indeed - and bug triagers who are working closely with the desktop team on this are almost certainly doing things the way the developers on the desktop team prefer, so I have no problem with that. The best advice I can give bug triagers is that their ultimate aim is to help developers, and so they should figure out which developers they need to work with and go and talk to them! That way, rather than duplicating work or being counterproductive, they can tailor their work to be most effective. Everybody wins.

Mon, 02 Mar 2009
I hate to say this, but often when somebody does lots of bug triage on a package I work on, I find it to be a net loss for me. I end up having to go through all the things that were changed, correct a bunch of them, occasionally pacify angry bug submitters, and all the rest of it, and often the benefits are minimal at best.
I would very much like this not to be the case. Bug triage is supposed to help developers be more efficient, and I think most people who do bug triage are generally well-intentioned and eager to help. Accordingly, here is a series of mini-rants intended to have educational value.
A while back, the BBC approached Canonical about providing seamless access to unencumbered BBC content for all Ubuntu users (in the UK and elsewhere). We agreed to approach this by way of a plugin for our primary media player, Totem, and asked Collabora Multimedia to do the plugin development work.
The results are in what will shortly be released as Ubuntu 8.10, and are looking really rather good. At the moment the content available from the BBC is mostly audio, but support for video is in place and the feed is expected to be fleshed out over time. We have a genre classification scheme in place, and will see how that scales as the amount of available content grows. The code has been submitted upstream, although there are still a few issues to work out there.
This is not the same thing as iPlayer; all the content available here is DRM-free. Some of it is geographically restricted to the UK, and these restrictions are handled on the server side to make sure that the client is free of encumbrances.
Christian Schaller from Collabora posted about this a little while ago. Since then, the UI has been improved somewhat and some I/O issues have been fixed to the point where we felt comfortable enabling the BBC plugin (as well as the YouTube plugin) by default in Ubuntu 8.10. Here's a screenshot of the current interface.
This is exciting stuff with a lot of potential. To try it out, run Applications -> Sound & Video -> Movie Player and select the "BBC" entry from the drop-down box labelled "Playlist". If you find bugs, please report them!

Mon, 23 Jun 2008
$ perl -le 'print "yoo" if (1 + 1) =~ /3/'
perlop(1) has a useful table of precedence.
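To spell out the trap: Perl's binding operator binds more tightly than binary addition, so leaving out the parentheses changes the parse entirely. A quick sketch of the two forms side by side:

```shell
# Without parentheses, =~ binds more tightly than binary +, so this
# parses as 1 + (1 =~ /3/).  The match fails, yielding "" (0 in numeric
# context); 1 + 0 is 1, the condition is true, and "yoo" is printed.
perl -le 'print "yoo" if 1 + 1 =~ /3/'
# With parentheses, (1 + 1) is 2, "2" does not match /3/, and nothing
# is printed.
perl -le 'print "yoo" if (1 + 1) =~ /3/'
```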
To my horror, I recently saw this online SSH key generator.
I hope nobody reading this needs to be told why this is a bad idea. However, in case you do, here are a few reasons: the whole point of public-key authentication is that your private key never leaves your control, whereas here it is generated on somebody else's server; you have no way to tell whether the site keeps a copy of every key it hands out; the private key has to travel over the network to reach you; and you have no way to verify that the keys are generated with a decent source of randomness in the first place.
I think this is probably being done in innocent seriousness (although I kind of hope it's a joke in poor taste), and have e-mailed the contact address offering to explain why it's a bad idea.

Sat, 12 Apr 2008
Ubuntu's live CD installer, Ubiquity, needs to suppress desktop automounting while it's doing partitioning and generally messing about with mount points, otherwise its temporary mount points end up busy on unmount due to some smart-arse desktop component that decides to open a window for it.
To date, it employs the following methods, each of which was sufficient at the time:
This is getting ridiculous. Dear desktop implementors: please pick a configuration mechanism and stick to it, and provide backward compatibility if you can't. This is not a rocket-science concept.
I rather liked the
I hacked together a little timesaver for developers this morning: omni completion for Launchpad bugs in Vim's debchangelog mode. To use it, install vim 7.1-138+1ubuntu3 once it hits the mirrors, open up a debian/changelog file, type "LP: #", and hit Ctrl-X Ctrl-O. It'll think for a while and then give you a list of all the bugs open in Launchpad against the package in question, from which you can select to insert the bug number into your changelog.
Here's a screenshot to make it clearer:
Thanks to Stefano Zacchiroli for doing the same for Debian bugs back in July.

Tue, 29 Jan 2008
See Encodings in man-db for context.
Yesterday, I uploaded man-db 2.5.1-1 to unstable. With this version, not only is it possible to install manual pages in UTF-8 (as with 2.5.0, although with fewer bugs), but it's also possible to ask man to produce a version of an arbitrary page in the encoding of your choice, and have it guess the source encoding for you fairly reliably. This finally provides enough support to have debhelper automatically recode manual pages to UTF-8.
It'll probably take a little while to shake out the corner-case bugs, but I'm generally pretty happy with this. Once the new man-db and debhelper land in testing, I'll send a note to debian-devel-announce and push harder on my policy amendment.
Considering the historical state of man-db when it comes to localisation, and all of the dependencies and general yak-shaving that had to be tackled to get here, this represents the end of probably several hundred hours of work, so I'm pretty happy that this is out the door. The only remaining step is to add UTF-8 input support to groff, which fortunately Brian M. Carlson is working on. After that, we can reasonably claim to have dragged manual pages kicking and screaming into the 21st century.

Thu, 29 Nov 2007
Erich: I do sometimes wonder why we don't relax the definition of "safe" upgrades to include installing new packages but still not removing old ones. I know that many of my uses of dist-upgrade are just for when something grows a new dependency that I didn't previously have installed.
(Of course this wouldn't always help as it wouldn't account for a new dependency that conflicted with an old dependency, but never mind. It would certainly do wonders for the metapackage case.)

Mon, 17 Sep 2007
I've spent some quality upstream time lately with man-db. Specifically, I've been upgrading its locale support. I recently published a pre-release, man-db 2.5.0-pre2, mainly for translators, but other people may be interested in having a look at it as well. I hope to release 2.5.0 quite soon so that all of this can land in Debian.
Firstly, man-db now supports creating and using databases for per-locale hierarchies of manual pages, not just English. This means that apropos and whatis can now display information about localised manual pages.
Secondly, I've been working on the transition to UTF-8 manual pages. Now, modulo some hacks, groff can't yet deal with Unicode input; some possible input characters are reserved for its internal use which makes full 32-bit input difficult to do properly until that's fixed. However, with a few exceptions, manual pages generally just need the subset of Unicode that corresponds to their language's usual legacy character set, so for now it's good enough to just recode on the fly from UTF-8 to some appropriate 8-bit character set and use groff's support for that.
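As a rough sketch of that recoding step (the file names here are made up for illustration, and this uses plain iconv rather than man-db's internal machinery): a UTF-8 page for a French reader would be recoded to ISO-8859-1, the usual legacy character set for French, before groff sees it.

```shell
# Hypothetical French manual page source, stored in UTF-8 ("caf\303\251"
# is "café").  The file name is invented for this example.
printf 'Une page de manuel sur le caf\303\251\n' > page.fr.1
# Recode on the fly from UTF-8 to the language's usual legacy character
# set; groff's existing 8-bit support can then format the result
# (e.g. with groff -Tlatin1).
iconv -f UTF-8 -t ISO-8859-1 < page.fr.1 > page.fr.latin1.1
```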
man-db has actually supported doing this kind of thing for a while, but it's been difficult to use since it only applies to
I'm still debating whether Debian policy should recommend installing UTF-8 manual pages in