Colin Watson
   


About
Colin Watson's blog
cjwatson@debian.org

Fri, 13 Nov 2009

Tissue of lies

In case it isn't obvious, in "Ubuntu 9.10 SP1 coming in spring 2010", "Ubuman" is blatantly lying in attributing a number of statements to me. None of the text there was written by me, and if you thought any of it was true then you should probably make sure your troll radar is working properly. Nice joke, but try harder next time - it doesn't even look like my writing style.

(I wouldn't normally bother to respond, since I'm probably just giving it more publicity, but apparently one or two people may already have been taken in by it. One person was sensible enough to write to me and check the facts.)

[/ubuntu] permanent link

Fri, 31 Jul 2009

Keysigning bits

If you're generating one of these shiny new RSA keys, please remember to generate an encryption subkey too if you expect people to sign it, or at least its more obscure UIDs. I'm not going to mail unencrypted signatures around unless I have some out-of-band knowledge that the e-mail address actually belongs to the person I met.

I generated a new 4096-bit RSA key myself at DebConf (baa!), and have just published a key transition document. Please consider signing my new key if you signed my old one.

[] permanent link

Tue, 14 Jul 2009

man-db: 'man -K'

I recently implemented man -K (full-text search over all manual pages) in man-db. This was inspired by a similar feature in Federico Lucifredi's man package (formerly maintained by Andries Brouwer). I think I did a much better job of it, though. The man package just forks grep for every manual page; man-db takes advantage of the pipeline library I wrote for it a while back and does it entirely in-process (decompression requires a fork but no exec, while the man package has to exec gunzip as well).

The upshot is that, with a hot cache, man-db takes around 40 seconds to search all manual pages on my laptop; the man package (also with a hot cache) takes around five minutes, and interactive performance goes down the drain while it's doing it since it's spawning subprocesses like crazy. If I limit the search to a single section, the disparity is closer to 3x than 10x, but it's still very noticeable. It's interesting how much good libraries can do to help guide efficient approaches to problems.
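To illustrate the general idea of searching in-process rather than forking grep and gunzip for every page, here is a minimal Python sketch. It is not how man-db does it (man-db is written in C and uses its own pipeline library); the function name `search_manual_pages` and the directory layout are assumptions made purely for illustration.

```python
import gzip
import os
import re


def search_manual_pages(root, pattern):
    """Return the sorted paths of pages under root whose text matches pattern.

    Hypothetical helper: every page is read and decompressed inside this
    process, so no grep or gunzip subprocess is spawned per page.
    """
    regex = re.compile(pattern.encode("utf-8"))
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Pick an in-process opener based on the file extension.
            opener = gzip.open if name.endswith(".gz") else open
            try:
                with opener(path, "rb") as page:
                    if regex.search(page.read()):
                        matches.append(path)
            except (OSError, EOFError):
                # Unreadable or corrupt page: skip it rather than abort.
                continue
    return sorted(matches)
```

Calling `search_manual_pages("/usr/share/man", "SIGPIPE")` would walk a typical manual page hierarchy; the path is only an example.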

Of course, a proper full-text search engine would be much better still, but that's a project for some other time ...

[] permanent link

Thu, 02 Jul 2009

Python SIGPIPE handling

Enrico writes about creating pipelines with Python's subprocess module, and notes that you need to take care to close stdout in non-final subprocesses so that subprocesses get SIGPIPE correctly. This is correct as far as it goes (and true in any language, although there's a Python bug report requesting that subprocess be able to do this itself), but there's an additional gotcha with Python that he missed.

Python ignores SIGPIPE on startup, because it prefers to check every write and raise an IOError exception rather than taking the signal. This is all well and good for Python itself, but most Unix subprocesses don't expect to work this way. Thus, when you are creating subprocesses from Python, it is very important to set SIGPIPE back to the default action. Before I realised this was necessary, I wrote code that caused serious data loss due to a child process carrying on out of control after its parent process died!

import signal
import subprocess

def subprocess_setup():
    # Python ignores SIGPIPE by default (it sets the disposition to SIG_IGN).
    # This is usually not what non-Python subprocesses expect.
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)

# 'command' stands for whatever argument list you want to run.
subprocess.Popen(command, preexec_fn=subprocess_setup)

I filed a patch a while back to add a restore_sigpipe option to subprocess.Popen, which would take care of this. As I say in that bug report, in a future release I think this ought to be made the default, as it's very easy to get things dangerously wrong right now.
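Putting both points together (closing stdout in the non-final subprocess and restoring SIGPIPE in the children), a two-stage pipeline might look like the following sketch. The `yes | head -n 1` pipeline and the function names are just illustrative choices. Note also that later Python 3 releases added a `restore_signals` parameter to `subprocess.Popen`, enabled by default, which covers SIGPIPE; the explicit `preexec_fn` is kept here to match the approach described above.

```python
import signal
import subprocess


def restore_sigpipe():
    # Runs in the child between fork and exec: undo Python's SIG_IGN.
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)


def first_line_of_yes():
    """Run the equivalent of `yes | head -n 1`.

    Returns head's output and the exit status of yes.
    """
    producer = subprocess.Popen(
        ["yes"], stdout=subprocess.PIPE, preexec_fn=restore_sigpipe)
    consumer = subprocess.Popen(
        ["head", "-n", "1"], stdin=producer.stdout,
        stdout=subprocess.PIPE, preexec_fn=restore_sigpipe)
    # Close our copy of the pipe's read end so that, once head exits,
    # there are no readers left and yes receives SIGPIPE on its next write.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output, producer.returncode
```

If the signal handling is right, `yes` is terminated by SIGPIPE as soon as `head` exits, and its return code is the negated signal number rather than the process spinning on forever.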

[] permanent link

Thu, 28 May 2009

code_swarm video of Ubuntu uploads

Joey Hess posted a draft of a code_swarm video for d-i a couple of weeks ago, which reminded me that I've been meaning to do something similar for Ubuntu for a while now as it's just about our archive's fifth birthday. I have a more or less complete archive of all our -changes mailing lists locally (I think I'm missing some of the very early ones, before the end of July 2004; let me know if you were one of the very early Canonical employees and have a record of these), and with the aid of launchpadlib it's fairly easy to map all the e-mail addresses into Launchpad user names, massage out some of the more obvious duplicates, and then treat the stream of uploads as if it were a stream of commits.

If you haven't seen code_swarm before, each dot represents an upload, and the dots "swarm" around their corresponding committers' names; more active committers have larger swarms of dots and brighter names. I assigned a colour to each of our archive components (uploads aren't really at the C code vs. Python code vs. translations vs. whatever kind of granularity that you see in other code_swarm videos), which mostly means that people who predominantly upload to main are in roughly an Ubuntu tan colour, people who predominantly upload to universe are coloured bluish, and people with a good mixture tend to come out coloured green. If I get a bit more time I may try to figure out enough about video editing software to add some captions.

Here's the video (194 MB).

[/ubuntu] permanent link

Thu, 05 Mar 2009

Bug triage, redux

I've been a bit surprised by the strong positive response to my previous post. People generally seemed to think it was quite non-ranty; maybe I should clean the rust off my flamethrower. :-) My hope was that I'd be able to persuade people to change some practices, so I guess that's a good thing.

Of course, there are many very smart people doing bug triage very well, and I don't want to impugn their fine work. Like its medical namesake, bug triage is a skilled discipline. While it's often repetitive, and there are lots of people showing up with similar symptoms, a triage nurse can really make a difference by spotting urgent cases, cleaning up some of the initial blood, and referring the patient quickly to a doctor for attention. Or, if a pattern of cases suddenly appears, a triage nurse might be able to warn of an incipient epidemic. [Note: I have no medical experience, so please excuse me if I'm talking crap here. :-)] The bug triagers who do this well are an absolute godsend; especially when they respond to repetitive tasks with tremendously useful pieces of automation like bughelper. The cases I have trouble with are more like somebody showing up untrained, going through everyone in the waiting room, and telling each of them that they just need to go home, get some rest, and stop complaining so much. Sometimes of course they'll be right, but without taking the time to understand the problem they're probably going to do more harm than good.

Ian Jackson reminded me that it's worth mentioning the purpose of bug reports on free software: namely, to improve the software. The GNU Project has some advice to maintainers on this. I think sometimes we stray into regarding bug reports more like support tickets. In that case it would be appropriate to focus on resolving each case as quickly as possible, if necessary by means of a workaround rather than by a software change, and only bother the developers when necessary. This is the wrong way to look at bug reports, though.

The reason that we needed to set up a bug triage community in Ubuntu was that we had a relatively low developer-to-package ratio and a very high user-to-developer ratio, and we were getting a lot of bug reports that weren't fleshed out enough for a developer to investigate them without spending a lot of time in back-and-forth with the reporter. A number of people therefore volunteered to take care of the initial back-and-forth so that good clear bug reports could be handed over to developers. This is all well and good, and indeed I encouraged it because I was personally finding myself unable to keep up with incoming bugs and actually fix anything at the same time.

Somewhere along the way, though, some people got the impression that what we wanted was a first-line support firewall to try to defend developers from users, which of course naturally leads to ideas such as closing wishlist bugs containing ideas because obviously those important developers wouldn't want to be bothered by them, and closing old bugs because clearly they must just be getting in developers' way. Let me be clear about this now: I absolutely appreciate help getting bug reports into a state where I can deal with them efficiently, but I do not want to be defended from my users! I don't have a basis from which to state that all developers feel the same way, but my guess is that most do.

Antti-Juhani Kaijanaho said he'd experienced most of these problems in Debian. I hadn't actually intended my post to go to Planet Debian - I'd forgotten that the "ubuntu" category on my blog goes there too, which generally I see as a feature, but if I'd remembered that I would have been a little clearer that I was talking about Ubuntu bug triage. If I had been talking about Debian bug triage I'd probably have emphasised different things. Nevertheless, it's interesting that at least one Debian (and non-Ubuntu) developer had experienced similar problems.

Justin Dugger mentions a practice of marking duplicate bugs invalid that he has problems with. I agree that this is suboptimal and try not to do it myself. That said, this is not something I object to to the same extent. Given that the purpose of bugs is to improve the software, the real goal is to be able to spend more time fixing bugs, not to get bugs into the ideal state when the underlying problem has already been solved. If it's a choice between somebody having to spend time tracking down the exact duplicate bug number versus fixing another bug, I know which I'd take. Obviously, when doing this, it's worth apologising that you weren't able to find the original bug number, and explaining what the user can do if they believe that you're mistaken (particularly if it's a bug that's believed to be fixed); the stock text people often use for this doesn't seem informative enough to me.

Sebastien Bacher commented that preferred bug triage practices differ among teams: for instance, the Ubuntu desktop team deals with packages that are very much to the forefront of users' attention and so get a lot of duplicate bugs. Indeed - and bug triagers who are working closely with the desktop team on this are almost certainly doing things the way the developers on the desktop team prefer, so I have no problem with that. The best advice I can give bug triagers is that their ultimate aim is to help developers, and so they should figure out which developers they need to work with and go and talk to them! That way, rather than duplicating work or being counterproductive, they can tailor their work to be most effective. Everybody wins.

[/ubuntu] permanent link

Mon, 02 Mar 2009

Bug triage rants

I hate to say this, but often when somebody does lots of bug triage on a package I work on, I find it to be a net loss for me. I end up having to go through all the things that were changed, correct a bunch of them, occasionally pacify angry bug submitters, and all the rest of it, and often the benefits are minimal at best.

I would very much like this not to be the case. Bug triage is supposed to help developers be more efficient, and I think most people who do bug triage are generally well-intentioned and eager to help. Accordingly, here is a series of mini-rants intended to have educational value.

  • Bugs are not like fruit.

    Fruit goes bad if you leave it too long. By and large, bugs don't, especially if they're on software that doesn't change very much. There is no reason why a bug filed against a package in Ubuntu 4.10 where the relevant code hasn't changed much since shouldn't still be perfectly valid. Even if it isn't, it deserves proper consideration.

    My biggest single annoyance with bug triage is people coming around and asking if bugs are still valid when they haven't put any effort into reproducing them themselves. This annoys bug submitters too; every so often somebody replies and says "didn't you even bother to check?". This gives a very bad impression of us as a project - wouldn't it be better if we looked as if we knew what we were talking about? There is a good reason to do this kind of check, of course: random undiagnosed crash reports and the like may well go away due to related changes, and it is occasionally worth checking. But if the bug is already well-understood and/or well-described, you should just go and check whether it's still there rather than asking.

    As I understand it, the intended workflow is that people file bugs, then if they aren't clear enough bug triagers work with the submitter to gather information until they are, then they're passed to developers for further work. We seem to have added an extra step wherein submitters must periodically give their bug a health-check, and if they don't then it gets closed as being out of date. In a small minority of cases this is useful; in most cases, frankly, it makes us look a bit clueless. Can we please stop doing this? The more we waste people's time doing this, the less likely it is that they'll bother to respond to us, and this might help our statistics but doesn't help the project as a whole.

    I know that there's a problem with bug count. I think every project of non-trivial size has that problem. But, honestly, the right answer is to fix more bugs - and, personally, I would be able to spend more time doing that if I weren't often running around trying to make sure that bugs I care about aren't getting overenthusiastically closed just because somebody thinks they've been lying around too long.

    There is a good way to expire bugs like this, of course. It goes something like this: "I've read through your bug and tried to reproduce it with a current release, but I'm afraid I can't do so. Are you still experiencing it? If not, then I think it might have been fixed by [this change I found in the package's history that seems to be related]." You can't do this en masse, but you'll get a much better response from submitters, you'll learn more doing it, and in the process of doing the necessary investigation of each bug you'll find that there are many cases you don't have to ask about at all.

  • Wishlist bugs are not intrinsically bad.

    There are certainly cases where something is far too broad or vague for a bug report; but there are also plenty of cases, probably far more, where the wish in question is a relatively small change to the program, or doesn't need any more sophisticated tracking, and a wishlist bug is just right. If you don't know the program very well, it may be difficult to tell whether a wishlist bug is appropriate or not; in that case, just leave the bug alone.

    Please, for the love of all that's holy, don't close wishlist bugs saying that people should use Brainstorm or write a specification instead! If you don't want to see wishlist bugs in your statistics, just filter them out; it's quite easy to do. Even worse, don't tell people that something probably isn't a good idea when you aren't familiar with the software; people who have gone to the effort of writing up their idea for us deserve a response from somebody who knows the software well. I've encountered cases where friends of mine submitted a bug report (sometimes even at my request) and then a triager told them it was a bad idea and closed their bug. This sort of thing puts people off Ubuntu.

    Specifications are software design documents. As such, they are best written by software designers. People who tell other people to go and write a specification may not realise that as a result of doing this for three years it's now essentially impossible to find anything in the specification system! The intent was never that every user of Ubuntu would need to write a specification to get anything changed; specifications are used by developers to document the results of discussions and write up plans. They are not a straightforward alternative to wishlist bugs, nor do they turn out to work very well as what many formal processes call "requirements documents"; the process of refining the latter in the context of Ubuntu might involve wishlist bugs, mailing list threads, wiki pages, private discussions with developers, or things of that nature, and probably shouldn't involve creating a specification until the requirements-gathering process is well underway.

  • Closing a bug is taking an item off somebody's to-do list.

    You wouldn't go up to a colleague's whiteboard and take an eraser to it unless you were sure that was OK, would you? Yet people seem to do that all the time with bugs. It's OK when the bug is really just like a support request - "help, it crashed, what do I do?" - and either you're pretty sure it's user error or there's just no way to get enough information to fix it. But once the initial triage process is done, now it's on somebody's to-do list.

    This is closely related to ...

  • If a developer has accepted it, leave it alone.

    Every so often I find that there's a bug that I have accepted by way of a bug comment or setting to Triaged or whatever, or even a bug that I filed on a package I work on as a reminder to myself, and somebody comes along and asks for more information or asks if we can still reproduce it or something. The hit rate on this kind of thing is extraordinarily low. There's a good chance that the developer went and verified the bug against the code, and in that case it certainly doesn't need more information (or they would have asked for it) and it probably isn't going to go away without anyone noticing.

    In most other free software projects, developers file bug reports themselves as a reminder about things that need to be done, and people leave them alone unless they're intending to help with the fix. In Ubuntu, developers also have to spend time making sure that those to-do items don't get expired. Nobody is helped by this.

    launchpad-gm-scripts includes a Greasemonkey script called lp_karma_suffix, which can help you to identify developers without having to spend lots of time clicking around.

  • Check whether the package is being actively worked on.

    Some packages are actively worked on in Ubuntu; some aren't (e.g. we just sync packages from Debian, or they're basically orphaned, or whatever). It's worth checking which is which before doing any kind of extensive triage work. If it's being actively worked on, why not go and talk to the developer(s) in question first? It's only polite, and it will probably help you to do a better job.

[/ubuntu] permanent link

Mon, 27 Oct 2008

Totem BBC plugin

A while back, the BBC approached Canonical about providing seamless access to unencumbered BBC content for all Ubuntu users (in the UK and elsewhere). We agreed to approach this by way of a plugin for our primary media player, Totem, and asked Collabora Multimedia to do the plugin development work.

The results are in what will shortly be released as Ubuntu 8.10, and are looking really rather good. At present, the content available from the BBC is mostly audio, but support for video is in place and the feed is expected to be fleshed out over time. We have a genre classification scheme in place, and will see how that scales as the amount of available content grows. The code has been submitted upstream, although there are still a few issues to work out there.

This is not the same thing as iPlayer; all the content available here is DRM-free. Some of it is geographically restricted to the UK, and these restrictions are handled on the server side to make sure that the client is free of encumbrances.

Christian Schaller from Collabora posted about this a little while ago. Since then, the UI has been improved somewhat and some I/O issues have been fixed to the point where we felt comfortable enabling the BBC plugin (as well as the YouTube plugin) by default in Ubuntu 8.10. Here's a screenshot of the current interface.

This is exciting stuff with a lot of potential. To try it out, run Applications -> Sound & Video -> Movie Player and select the "BBC" entry from the drop-down box labelled "Playlist". If you find bugs, please report them!

[/ubuntu] permanent link

Mon, 23 Jun 2008

Re: Perl is strange

Christoph: That's because =~ binds more tightly than +. This does what you meant:

$ perl -le 'print "yoo" if (1 + 1) =~ /3/'

perlop(1) has a useful table of precedence.

[/debian] permanent link

Don't use sshkeygen.com to generate keys!

To my horror, I recently saw this online SSH key generator.

I hope nobody reading this needs to be told why this is a bad idea. However, in case you do, here are a few reasons:

  • Every SSH implementation I know of that supports public key authentication - certainly all the major ones - also provides a key generation utility. Even aside from all the good reasons not to, there is simply no reason why you should need to use a web-based tool in the first place.
  • How can you trust the person running this site? Without implying that I know he or she is untrustworthy (I don't), and with the best will in the world, it's a big Internet with a lot of nasty people on it. Do you really want somebody you don't know in a position to keep a copy of all your private keys?
  • Even if the person is trustworthy, the server running sshkeygen.com is now a giant blinking target. If lots of people use it, there is every incentive in the world for the bad guys to try to take control of it so that they can keep a copy of all your private keys. (Or, as we know from recent bitter experience, they can just give out keys from a limited set and it will probably take a couple of years before anyone notices ...)
  • The front page of sshkeygen.com says that the keys are escrowed. The plain English meaning of this would be that the operator of that site keeps a copy of the private key, to be held in trust in case (presumably) you lose it and need to retrieve it. Normally this sort of thing depends on a legal trust relationship, perhaps linked to a contract. What does it mean here? Is it just a buzzword? If it isn't, then this just makes sshkeygen.com even more of a target.
  • sshkeygen.com delivers keys to you over unencrypted HTTP. Yes, this is on its to-do list. That isn't really an excuse.
  • Even if keys were delivered over HTTPS, that still relies on people diligently checking the authenticity of the certificate. A self-signature (as suggested as an alternative in the to-do list) would be impossible to check with any reliability; and will people who have trouble with non-web-based key generation software really be able or inclined to confirm the signature chain? Browsers typically don't enforce this very strictly, or if they do they provide fairly simple ways to bypass the enforcement, simply because so many sites have broken or poorly-signed SSL certificates, and keeping up with all the CAs is pretty hard work too.
  • Furthermore, delivering private keys over HTTPS makes that SSL certificate a single giant blinking target. Might it be compromised? How would you tell? What servers would need to be compromised in order to get a copy of the private SSL key?
  • Sure, Debian is in an awkward position here given the recent OpenSSL random number generation vulnerability. However, how do you know that sshkeygen.com is running on a system that doesn't suffer from this? (As it happens, I have checked, and it doesn't appear to suffer from this vulnerability - but most people won't check and won't know how to check.)
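For comparison, generating a key locally takes a single command. The sketch below wraps OpenSSH's `ssh-keygen` from Python; the helper names, the default key size, and the example path and comment are assumptions for illustration, and running `ssh-keygen` directly from a shell works just as well. No passphrase option is passed, so `ssh-keygen` prompts for one interactively and the private key never leaves your machine.

```python
import subprocess


def ssh_keygen_command(key_path, comment, bits=4096):
    """Build an ssh-keygen command line for a local RSA key (hypothetical helper)."""
    return [
        "ssh-keygen", "-q",
        "-t", "rsa", "-b", str(bits),
        "-C", comment,
        "-f", key_path,
    ]


def generate_key_locally(key_path, comment):
    # ssh-keygen prompts for a passphrase on the terminal and writes
    # key_path (private key) and key_path + ".pub" (public key).
    subprocess.check_call(ssh_keygen_command(key_path, comment))
```

For example, `generate_key_locally("/home/user/.ssh/id_rsa", "user@example.org")` would create the key pair in the usual location; both arguments here are placeholders.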

I think this is probably being done in innocent seriousness (although I kind of hope it's a joke in poor taste), and have e-mailed the contact address offering to explain why it's a bad idea.

[] permanent link

Sat, 12 Apr 2008

Desktop automounting pain

Ubuntu's live CD installer, Ubiquity, needs to suppress desktop automounting while it's doing partitioning and generally messing about with mount points; otherwise its temporary mount points end up busy on unmount due to some smart-arse desktop component that decides to open a window for them.

To date, it employs the following methods, each of which was sufficient at the time:

  • Set the /desktop/gnome/volume_manager/automount_drives and /desktop/gnome/volume_manager/automount_media gconf keys to false.
  • Tell kded to unload its medianotifier module, and load it again just before the installer exits.
  • Set the /apps/nautilus/desktop/volumes_visible gconf key to false.
  • Set the AutomountDrives and AutomountMedia keys in $HOME/.config/Thunar/volmanrc to FALSE.
  • Set the /apps/nautilus/preferences/media_automount and /apps/nautilus/preferences/media_automount_open gconf keys to false.
  • The entire installer is run under hal-lock --interface org.freedesktop.Hal.Device.Storage --exclusive.
  • Set the /apps/nautilus/preferences/media_autorun_never gconf key to true (experimental, but apparently now required since nautilus uses the gio volume monitor).
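To give a feel for how much configuration that list amounts to, here is a rough Python sketch that builds `gconftool-2` invocations for the gconf keys listed above. This is not Ubiquity's actual code: the table and function names are made up for illustration, the use of `gconftool-2` as the mechanism is an assumption, and a real implementation would also need to record and restore the previous values.

```python
# (key, value) pairs taken from the gconf items in the list above.
AUTOMOUNT_GCONF_SETTINGS = [
    ("/desktop/gnome/volume_manager/automount_drives", False),
    ("/desktop/gnome/volume_manager/automount_media", False),
    ("/apps/nautilus/desktop/volumes_visible", False),
    ("/apps/nautilus/preferences/media_automount", False),
    ("/apps/nautilus/preferences/media_automount_open", False),
    ("/apps/nautilus/preferences/media_autorun_never", True),
]


def gconf_set_commands(settings=AUTOMOUNT_GCONF_SETTINGS):
    """Return one gconftool-2 argument list per boolean key (hypothetical helper)."""
    return [
        ["gconftool-2", "--type", "bool", "--set", key,
         "true" if value else "false"]
        for key, value in settings
    ]
```

Each returned list could be passed to `subprocess.call`; the kded, Thunar and hal-lock items are separate mechanisms and are not covered by this sketch.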

This is getting ridiculous. Dear desktop implementors: please pick a configuration mechanism and stick to it, and provide backward compatibility if you can't. This is not a rocket-science concept.

I rather liked the hal-lock mechanism; it was simple and involved minimal fuss. I had hoped that it might end up as a standard, but I guess that would be too easy.

[/ubuntu] permanent link

Thu, 31 Jan 2008

Vim omni completion for Launchpad bugs

I hacked together a little timesaver for developers this morning: omni completion for Launchpad bugs in Vim's debchangelog mode. To use it, install vim 7.1-138+1ubuntu3 once it hits the mirrors, open up a debian/changelog file, type "LP: #", and hit Ctrl-X Ctrl-O. It'll think for a while and then give you a list of all the bugs open in Launchpad against the package in question, from which you can select to insert the bug number into your changelog.

Here's a screenshot to make it clearer:

Thanks to Stefano Zacchiroli for doing the same for Debian bugs back in July.

[/ubuntu] permanent link

Tue, 29 Jan 2008

UTF-8 manual pages

See Encodings in man-db for context.

Yesterday, I uploaded man-db 2.5.1-1 to unstable. With this version, not only is it possible to install manual pages in UTF-8 (as with 2.5.0, although with fewer bugs), but it's also possible to ask man to produce a version of an arbitrary page in the encoding of your choice, and have it guess the source encoding for you fairly reliably. This finally provides enough support to have debhelper automatically recode manual pages to UTF-8.

It'll probably take a little while to shake out the corner-case bugs, but I'm generally pretty happy with this. Once the new man-db and debhelper land in testing, I'll send a note to debian-devel-announce and push harder on my policy amendment.

Considering the historical state of man-db when it comes to localisation, and all of the dependencies and general yak-shaving that had to be tackled to get here, this represents the end of probably several hundred hours of work, so I'm pretty happy that this is out the door. The only remaining step is to add UTF-8 input support to groff, which fortunately Brian M. Carlson is working on. After that, we can reasonably claim to have dragged manual pages kicking and screaming into the 21st century.

[] permanent link

Thu, 29 Nov 2007

aptitude safe-upgrade

Erich: I do sometimes wonder why we don't relax the definition of "safe" upgrades to include installing new packages but still not removing old ones. I know that many of my uses of dist-upgrade are just for when something grows a new dependency that I didn't previously have installed.

(Of course this wouldn't always help as it wouldn't account for a new dependency that conflicted with an old dependency, but never mind. It would certainly do wonders for the metapackage case.)

[/debian] permanent link

Mon, 17 Sep 2007

Encodings in man-db

I've spent some quality upstream time lately with man-db. Specifically, I've been upgrading its locale support. I recently published a pre-release, man-db 2.5.0-pre2, mainly for translators, but other people may be interested in having a look at it as well. I hope to release 2.5.0 quite soon so that all of this can land in Debian.

Firstly, man-db now supports creating and using databases for per-locale hierarchies of manual pages, not just English. This means that apropos and whatis can now display information about localised manual pages.

Secondly, I've been working on the transition to UTF-8 manual pages. Now, modulo some hacks, groff can't yet deal with Unicode input; some possible input characters are reserved for its internal use which makes full 32-bit input difficult to do properly until that's fixed. However, with a few exceptions, manual pages generally just need the subset of Unicode that corresponds to their language's usual legacy character set, so for now it's good enough to just recode on the fly from UTF-8 to some appropriate 8-bit character set and use groff's support for that.

man-db has actually supported doing this kind of thing for a while, but it's been difficult to use since it only applies to /usr/share/man/ll_CC.UTF-8/ directories, while manual pages usually aren't country-specific. So, man-db 2.5.0 supports using /usr/share/man/ll.UTF-8/ instead, which is a bit more appropriate. Also, following a discussion with Adam Borowski, man-db can now try decoding manual pages as UTF-8 and fall back to 8-bit encodings even in directories without an explicit encoding tag; if this fails for some reason, you can put a '\" -*- coding: UTF-8 -*- line at the top of the page.
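The guessing strategy described above (honour an explicit coding line, otherwise try UTF-8 and fall back to a legacy 8-bit encoding) can be sketched in a few lines of Python. This is only an illustration of the idea, not man-db's C implementation; the function names and the `iso-8859-1` default are assumptions.

```python
import re

# Matches the Emacs-style tag inside a roff comment on the first line.
CODING_RE = re.compile(rb"-\*- coding: ([A-Za-z0-9._-]+) -\*-")


def declared_encoding(data):
    """Return the encoding named on a leading '\\" comment line, or None."""
    first_line = data.split(b"\n", 1)[0]
    if not first_line.startswith(b"'\\\""):
        return None
    match = CODING_RE.search(first_line)
    return match.group(1).decode("ascii") if match else None


def decode_page(data, legacy_encoding="iso-8859-1"):
    """Decode page source: explicit tag first, then UTF-8, then legacy."""
    encoding = declared_encoding(data)
    if encoding is not None:
        return data.decode(encoding)
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so assume the language's usual 8-bit charset.
        return data.decode(legacy_encoding)


def recode_for_groff(data, legacy_encoding, target_encoding):
    """Recode page source to an 8-bit charset that groff can handle."""
    text = decode_page(data, legacy_encoding)
    return text.encode(target_encoding, errors="replace")
```

The fallback works because text in most legacy 8-bit encodings is rarely valid UTF-8 by accident, which is what makes guessing reasonably reliable.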

I'm still debating whether Debian policy should recommend installing UTF-8 manual pages in /usr/share/man/ll.UTF-8/ or just in /usr/share/man/ll/. Initially I was very strongly in favour of an encoding declaration, but now that man-db can do a pretty good job of guesswork I'm coming round to Adam Borowski's position that people should be able to forget about character sets with UTF-8. Opinions here would be welcome. One thing I haven't moved on is that any design that assumes that the encoding of manual pages on the filesystem has anything to do with the user's locale is demonstrably incorrect and broken; I'm not going to use LC_CTYPE for anything except output. However, maybe "UTF-8 or the usual legacy encoding provided that the latter is not typically confused for the former" is a good enough specification, and that still has the desirable property of not requiring a flag day. I'll try to come down from the fence before unleashing this code on the world.

[] permanent link

Wed, 04 Jul 2007

Keysigning public service announcement

If your key has so many UIDs and such a combinatorially exploded number of signatures on it that it takes gpg minutes just to start up in --edit-key mode, then I probably won't bother signing it. HTH, HAND.

[] permanent link

Sat, 23 Dec 2006

Moving conffiles between packages, redux

I spent far too much of today cleaning up an upgrade bug to do with conffiles, which I suspect also affects other packages that have attempted to work around dpkg conffile prompts when moving conffiles between packages. If you maintain such a package, please review your code to make sure that it works properly when upgrading both with sarge's dpkg and with etch's dpkg. See my debian-devel post for a full description.

[/debian] permanent link

Fri, 26 May 2006

Google Summer of Code project started (Ubuntu)

More on the Google Summer of Code: as well as the project I'm mentoring for Debian, I'm mentoring Evan Dandrea (no blog yet?), who is writing a migration assistant for Ubiquity.

I haven't talked much about Ubiquity, mostly because I've been far too busy writing it. Ubuntu has needed an installer for its live CD for a while, partly because, well, loads of users want it, and partly because it will cut Canonical's costs quite a lot if we only have to send out half the number of CDs in shipit (apparently single-CD packaging is significantly cheaper than double-CD packaging). I'd been resisting doing something from scratch because I love d-i and I think it would be a really bad idea to end up maintaining two installer implementations from the ground up (live CDs are nice, but they don't cut it for everyone). So when I was given the task of doing a shiny live CD installer with a custom-designed UI, while I started with a more-or-less from-scratch implementation put together by Guadalinex (thanks!), I fairly quickly diverged from that and morphed it into something that uses d-i code for as much of the backend operation and logic as it sensibly can. I'd call it a sort of debconf frontend except that its design is almost opposite to how a debconf frontend should work, in that it's highly specialised to react to particular question names. This had a lot of advantages in terms of being fairly quick to write, although in the long term I think I might prefer something closer to cdebconf plugins for the job; we'll see how things turn out.

Evan wanted to write an extension to this to automatically migrate settings and documents from an existing Windows installation, which I think would be an absolutely excellent thing to have: automatic migration is a real killer feature in an installer. In fact, d-i already has a start at this, namely os-prober, which in conjunction with some bootloader installer code magically sets up boot menu entries for other operating systems on your disk. So, I suggested to Evan that he might want to put most of the clever logic in a udeb, so that it can be used in d-i too, and to my relief he seemed quite enthused by the idea. He's starting on the preliminary work now and I look forward to seeing his progress.

Best of luck, Evan!

[/summerofcode/2006] permanent link

Google Summer of Code project started (Debian)

I'm mentoring Matheus Morais in the Google Summer of Code, porting d-i to the Hurd. We've exchanged a few mails and he has in hand all the preliminary (but not yet functional; wouldn't want to make it too easy :-)) patches I've put together in the past. I think I should be reasonably well-placed to judge his progress.

Best of luck, Matheus!

[/summerofcode/2006] permanent link

Mon, 06 Feb 2006

Unix tools: sponge

Joey writes about the lack of new tools that fit into the Unix philosophy. My favourite of such things I've written is sponge. It addresses the problem of editing files in-place with Unix tools, namely that if you just redirect output to the file you're trying to edit then the redirection takes effect (clobbering the contents of the file) before the first command in the pipeline gets round to reading from the file. Switches like sed -i and perl -i work around this, but not every command you might want to use in a pipeline has such an option, and you can't use that approach with multiple-command pipelines anyway.

I normally use sponge a bit like this:

sed '...' file | grep '...' | sponge file

Since it's so trivial I imagine lots of other people have written something similar (another common name for it seems to be inplace; my name indicates soaking up all the input and then squeezing it all out again); but I do keep meaning to try to get a rewritten version into coreutils at some point.
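
For anyone who hasn't seen the trick, the whole idea fits in a few lines. The following is a minimal Python sketch of the technique, not the actual sponge source; the function name and signature are assumptions for the example. The important point is that the output file is only opened (and therefore truncated) after the input has hit end-of-file.

```python
def sponge(stream, path):
    """Soak up everything from a binary stream, then write it to path.

    The file at path is not opened for writing until the stream has
    been read to EOF, so an upstream command that is still reading
    the same file is not clobbered part-way through.
    """
    data = stream.read()  # blocks until the producer closes its end
    with open(path, "wb") as out:  # truncation happens only now
        out.write(data)
```

Used as a command, this would be called with sys.stdin.buffer as the stream and the file name from the command line as the path.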

[] permanent link

Fri, 27 Jan 2006

debconf/cdebconf coinstallability

Joey has been campaigning for a while to get everything in the archive changed to depend on debconf | debconf-2.0 or similar rather than just debconf, in order that we can start rolling out cdebconf as its replacement. Like most jobs that involve touching the bulk of the archive, this looks set to take quite a while, as the list of bugs should indicate.

Recently it occurred to me that we didn't necessarily have to do it that way round. In a bout of late-night hacking while staying awake to look after a sick child (he seems mostly OK now, although the rushed trip to the hospital earlier was a bit on the nerve-wracking side), I've shuffled things around in the cdebconf package so that it no longer has any file conflicts with debconf or debconf-doc, and changed the debconf confmodule to fire up the cdebconf frontend rather than its own if the DEBCONF_USE_CDEBCONF environment variable is non-empty. (The details of this may change before it actually gets uploaded, as I'd like to get Joey to look it over and approve it first.) This allows you to install cdebconf, set that environment variable, and play around with cdebconf with relative ease; when we come to switch to cdebconf for real, instead of a huge conflicting mess that apt will probably have trouble resolving, it'll just be a matter of changing a couple of lines in /usr/share/debconf/confmodule.
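
The switch itself is just a "non-empty environment variable" test. The real confmodule is a shell library and, as noted, its details may still change; the Python sketch below only illustrates the decision, and the function name and the returned labels are hypothetical.

```python
import os


def choose_frontend(environ=os.environ):
    """Return which frontend a confmodule would start (illustrative labels).

    Any non-empty value of DEBCONF_USE_CDEBCONF selects cdebconf; an
    unset or empty variable leaves debconf's own frontend in charge.
    """
    if environ.get("DEBCONF_USE_CDEBCONF", ""):
        return "cdebconf"
    return "debconf"
```

Since the test is for non-emptiness rather than truthiness, setting the variable to 0 would still select cdebconf in this sketch.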

Of course, don't expect cdebconf to be a complete working replacement for debconf just yet; if you try using it for a dist-upgrade run it'll fall over. Due to its d-i heritage, it doesn't yet load templates automatically; that has to be done by hand. Frontend names differ from debconf's, which will need some migration code. At the moment it can only handle UTF-8 templates, which are mandated in the installer but only optional in the rest of the system. It doesn't have all of debconf's rich array of database modules. I haven't adapted the Perl or Python confmodules yet. The list goes on. However, I think we at least stand a chance of getting a handle on the problem now.

(I'll post this article to debian-devel once the changes have been reviewed and uploaded.)

[/debian] permanent link

Mon, 09 Jan 2006

Killer apps: bzr shelve

Working on free software has made me fairly revision control system-agnostic; I can't afford to get too wedded to any one system because as soon as I do somebody will invent something new and I'll have to convert again, so I just work with whatever other people on the same project are using. Even CVS doesn't make a lot of difference to the way I work as long as I'm working online and have cvsps handy. And of course I usually don't bother with revision control if I'm just tweaking somebody else's Debian source package a bit (in which case I just use debdiff for paranoia).

Using bzr at work, though, I think I just found my killer app in Michael Ellerman's shelve plugin. My working style generally involves alternating between doing lots and lots of stuff in the one working copy and (after testing) going through and committing it in logical chunks. This is fine if everything's in separate files (most revision control systems let you commit just some files), but if several of the chunks are in the one file then I'm reduced to saving diffs and manually editing out the bits I don't want to commit yet, which is obviously pretty tedious and error-prone.

bzr shelve presents each diff hunk in your working copy to you in turn and asks you whether you want to keep it. If you say no, that hunk gets unapplied and goes into a "shelf", where bzr unshelve can later reapply it. In the meantime commits act as though the shelved hunks didn't exist. This doesn't help if you want to defer only one of two immediately adjacent changes that end up in the same hunk, of course, but it vastly reduces the scale of the problem.
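
To make the mechanics concrete, here is a small Python sketch of the hunk-by-hunk idea. It is not the shelve plugin's code: it uses difflib's opcodes as a stand-in for diff hunks, and the function name, the keep callback, and the shape of the shelf are all assumptions for the example.

```python
import difflib


def shelve(base, working, keep):
    """Split the changes between base and working (lists of lines).

    keep(old_lines, new_lines) is asked about each changed region.
    Returns (result, shelf): result has only the kept changes applied,
    and shelf lists the (old_lines, new_lines) pairs that were unapplied.
    """
    matcher = difflib.SequenceMatcher(a=base, b=working, autojunk=False)
    result, shelf = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old, new = base[i1:i2], working[j1:j2]
        if tag == "equal" or keep(old, new):
            result.extend(new)  # unchanged text, or a change we keep
        else:
            result.extend(old)  # unapply the change...
            shelf.append((old, new))  # ...and remember it for later
    return result, shelf
```

A commit made from result behaves as though the shelved changes didn't exist; reapplying the pairs in shelf later is the job an unshelve step would do, which this sketch leaves out.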

I suppose it would be easy enough to write a shelve-a-like for any other system; it's just that I haven't seen it for any other system yet. If working with systems that lack it really starts to annoy me, I may have to rip out the guts of shelve and figure out how to make it generic.

[] permanent link

Tue, 03 Jan 2006

Single-stage installer

Hot on the heels of Joey's tale of getting rid of base-config (the second stage of the installer) in Debian, we've now pretty much got rid of it in Ubuntu Dapper too. The upshot of this is that rather than asking a bunch of questions, installing the base system, and rebooting to install everything else, we now just install everything in one go and reboot into a completed system.

This does mean that, if your system doesn't boot, you don't get to find out about it for a bit longer. However, it has lots of advantages in terms of speed (the much-maligned archive-copier mostly goes away), reducing code duplication (base-config had a bunch of infrastructure of its own which was done better in the core installer anyway), comprehensibility, and killing off some annoying bugs like #13561 (duplicate mirror questions in netboot installs), #15213 (second stage hangs if you skip archive-copier in the first stage), and #19571 (kernel messages scribble over base-config's UI).

To go with Joey's Debian timeline, the Ubuntu history looks a bit like this:

  • 2004 (Jul): First base-config modifications for Ubuntu; we need to install the default desktop rather than dropping into tasksel.
  • 2004 (Aug): Mark phones me up and asks if I can make the installer not need the CD in the second stage by copying all the packages across beforehand. Although it's a bit awkward, I can see the UI advantages in that, so I write archive-copier at the Canonical conference in Oxford.
  • 2004 (Sep): Mark asks me if we can ask the timezone, user/password, and apt configuration questions before the first reboot. With less than a month to go until our first release, I have a heart-attack at how much needs to be done, and it eventually gets deferred to Hoary.
  • 2005 (Jan): Matt fixes up debconf's passthrough frontend for use on the live CD, and we realise that this is an obvious way to run bits of base-config before the first reboot. It's rather messy and takes until March or so before it really works right, but we get there in the end.
  • 2005 (Apr): I get "put a progress bar in front of the dpkg output in the second stage" as a goal for Breezy. Naïvely, I think it's a simple matter of programming, since I'd already done something similar for debootstrap and base-installer the previous year.
  • 2005 (May): I hack progress bar support into debconf. Nothing actually uses it for anything yet, except as a convenient passthrough stub.
  • 2005 (Jul/Aug): I actually try to implement the second-stage progress bar and realise that it's about an order of magnitude harder than I thought, requiring a whole load of extra infrastructure in apt. Fortunately Michael Vogt saves the day here by writing lots of working code, and the progress bar works by early August.
  • 2005 (Sep-Dec): Upstream d-i development ramps back up again, with tzsetup, clock-setup, apt-setup, and user-setup all being cranked out in short order and the corresponding pieces removed from base-config. I merge these as they mature, and manage to get agreement on including the Ubuntu debconf template changes in upstream apt-setup, which helps the diff size a lot.
  • 2005 (Nov/Dec): Joey and I chat one evening about the Ubuntu second-stage progress bar work, and we end up designing and writing debconf-apt-progress based on its ideas, after which Joey knocks up pkgsel in no time flat.
  • 2006 (Jan): The rest of the pieces land in Ubuntu, and we drop base-config out of the installer. To my surprise, nearly everything still just works.

Although it caused some friction, I'm glad that we did the first cuts of many of these things outside Debian and got to try things out before landing version-2-quality code in Debian. The end result is much nicer than the intermediate ones ever were.

[/ubuntu] permanent link

Forwarding bugs to the IETF

Sometimes following up on a bug takes you a lot further than you expected. Debian bug #337041 looked like it was going to be fairly straightforward once I upgraded coreutils to figure out what the new IUTF8 flag actually did, since the SSH2 protocol already supports transferring termios flags around.

Unfortunately, since IUTF8 is relatively new, it doesn't have a number assigned in the draft connection protocol. Moreover, that Internet-Draft is in the last stages before becoming an RFC and can't be modified any more, and it doesn't include any facility for private-use extensions. D'oh. To add further complication, since IUTF8 is Linux-specific, it's not hard to imagine that some other OS might introduce something with the same name but subtly different semantics, and so the SSH protocols can't just defer to POSIX for the definition but instead have to spell out exactly what they mean.

As a result of all of this, it looks like the best way to make progress might be for me to write an I-D myself that creates a channel extension to set or clear IUTF8, and attempt to enlist support from some upstream implementors. I didn't expect bug triage to lead me into the Internet standardisation process quite so quickly!

[/debian] permanent link

Hello!

New year, new blog. I've had a LiveJournal for a while, but don't write very much in it, and many of its readers wouldn't be interested in me talking about Debian and such anyway. I think the best solution is for me to keep technical posts here.

[] permanent link