Government email looks like spam !
Richard Clayton
ukcrypto@chiark.greenend.org.uk
Wed, 8 Oct 2003 18:47:20 +0100
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
This isn't really crypto policy, so apologies for that, but it is a
lot to do with wider issues of e-commerce and in particular using email
to communicate with the UK Government.
As seems regrettably common when I start a thread, I need to set the
scene first.. the interesting policy-related stuff comes towards the end
of rather a long email. Sorry!
People have probably heard of SpamAssassin, which is a widely used tool
for classifying incoming email using a wide range of heuristics. The
idea is that "real email" will score low and "spam" will score highly.
As a rule of thumb you can reckon that no real email will ever score
more than 4 (with most scoring less than 1) and only a very small amount
of spam will ever score less than 3 (with most scoring 6 or a lot more).
People set their own tolerance levels as to when they discard email or
place it into a seldom looked at folder... but popular decision points
are at 10 (very generous, see a lot of spam), 5 (a reasonable median)
and 3 (very seldom see spam or lose anything important).
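
To make those decision points concrete, here is a minimal Python sketch of how a score might be turned into a keep-or-filter decision. The function name and the labels are hypothetical and are not SpamAssassin's own interface. The "at or above the threshold" comparison is my assumption about how the cut-off is applied. Only the thresholds 10, 5 and 3 come from the text above.

```python
def classify(score: float, threshold: float = 5.0) -> str:
    """Hypothetical helper: label a message by comparing its score
    with the recipient's chosen threshold (assumed inclusive)."""
    return "spam" if score >= threshold else "ham"


# The same message, scoring 3.5, seen by three differently tuned recipients.
for threshold in (10.0, 5.0, 3.0):
    print(f"threshold {threshold:>4}: {classify(3.5, threshold)}")
```

The point is simply that the same message can be delivered by one recipient and binned by another, depending on where they have chosen to draw the line.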
One of the heuristics that excites SpamAssassin is email which contains
HTML. This is not because the authors think this a sign of foolishness
in the writer of the email in arrogantly assuming that the remote site
can cope with more than mere ASCII; but because their experience from
looking at lots and lots of spam is that a lot of bulk unsolicited email
uses HTML these days.
Many of the people in corporate Britain, the Government and the police
use Microsoft (let's name names! it's MS creating much of the problem)
email clients which tend to generate lots of HTML output by default :(
This means that their email is more likely to look like spam than the
sort of email that somebody like myself would send.
Since, in general, you can reckon on an incoming HTML email scoring 1.2
or so on the "spamminess" scale, and if your correspondent is using some
big fonts (perhaps in a fancy signature) and some colours (red and blue
score 0.1, green 0.7!) then this email can often hit a score exceeding
2.0. Viz: this type of corporate email is still being classified as
"proper email", but looks a great deal more like spam than most of the
email that arrives in one's inbox.
BTW: SpamAssassin has recently been _incorrectly_ detecting users
of the very latest versions of Outlook as being spam senders
(Microsoft changed some of the header formats and that fooled the
heuristics). This was worth 4 points all on its own!! and it
turns out there's a lot of police etc playing with these new
versions.. and scoring loads of "looks like spam" points. See
http://bugzilla.spamassassin.org/show_bug.cgi?id=2538 for all the
gory details of the fix ... and if you're using SpamAssassin with
a rejection level of 4 or so and some correspondents don't seem
to write any more -- now you know why! Upgrade!!
Anyway, with the groundwork laid, I can now finally start getting to the
point: one of the several hundred tests that SpamAssassin makes is to
determine whether any of the mail machines through which the email has
been sent is listed in "ipwhois.rfc-ignorant.org". This organisation
keeps several different lists, the common theme being that they list
people who do not fulfill their RFC (strictly STD) mandated obligations.
The reason for the test is, as ever, that spam tends to fail this test
and legitimate email does not. So it's a useful discriminator and the
more useful it is, the higher the score it gets.
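
For anyone who has not met DNS-based lists before, the usual convention is to reverse the octets of the IP address, append the list's zone name, and look the result up as an ordinary hostname; getting an answer back means "listed". The sketch below only builds the name and deliberately does no network lookup. The function name is made up, the zone "dnsbl.example.org" and the address 192.0.2.1 are placeholders, and I am assuming (not asserting) that the list discussed here follows this standard convention.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the hostname a DNS-based list lookup would query.

    Hypothetical helper: validates the IPv4 address, reverses its
    octets and appends the list's zone name.
    """
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


# Placeholder address and zone, for illustration only.
print(dnsbl_query_name("192.0.2.1", "dnsbl.example.org"))
# prints: 1.2.0.192.dnsbl.example.org
```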
Now if you look at the headers of any email you receive from Government
departments or from policemen (and doubtless other similar
organisations) you will see that the Government's part of the Internet
infrastructure uses IP addresses from 51.0.0.0/8 (what we once called
the class A network 51.0.0.0 -> 51.255.255.255).
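
If you want to check that range for yourself, Python's standard ipaddress module will do the arithmetic. This is just a quick sketch; the sample address 51.64.0.10 is invented for the example and is not taken from any real header.

```python
import ipaddress

# The whole /8 discussed above.
net = ipaddress.ip_network("51.0.0.0/8")

print(net[0], "->", net[-1])   # first and last address in the block
print(net.num_addresses)       # 2**24 addresses in a /8

# Hypothetical relay address, as it might appear in a Received: header.
relay = ipaddress.ip_address("51.64.0.10")
print(relay in net)            # membership test
```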
If you look up these addresses at the US-based ARIN regional registry
(not at the European RIPE registry, because this netblock was allocated
a long long long time ago) you will find a single WHOIS record that
reads:
OrgName: Department of Social Security of UK
OrgID: DSSU
Address: Naming and Addressing Authority c/o DITA
Address: Government Buildings - GZI
Address: Moorland Road
Address: Lytham St. Annes, Lancashire FY8 3ZZ
City:
StateProv:
PostalCode:
Country: GB
NetRange: 51.0.0.0 - 51.255.255.255
CIDR: 51.0.0.0/8
NetName: ITSANET
NetHandle: NET-51-0-0-0-1
Parent:
NetType: Direct Assignment
Comment:
RegDate: 1991-09-16
Updated: 1999-04-13
and you will also see that there are no "whois" records for any smaller
allocations.
There are some problems with this "whois" record. The Department of Social
Security was abolished over two years ago, there is no phone number
(should you suddenly wish to talk with the operator of a 51.0.0.0/8
machine) and there is no email address if you wanted to report some
abuse, and indeed, having spent some time with BT's phone number
searching system trying to locate any better contact details than the
postal address given above, there's not even an obvious switchboard
number to have a chat with!
Now this inability to talk to a responsible sysadmin person _doesn't_
_really_ _actually_ _matter_ because you'll find that 51.0.0.0/8 is not
currently announced on the Internet at all (there are no routes to it
and a traceroute will go a couple of hops to a box on your border and
then fail). So you'll never get any genuine traffic from this network or
ever send any packets to it -- so why would you ever want to speak to a
sysadmin there ??
So all was fine and dandy with nothing very much going wrong, until mid-
afternoon on the 15th September 2003 when someone at the College de
France, 11, Place Marcelin Berthelot, Paris (venerable institution, been
around since 1530 or so) decided to submit 51.0.0.0/8 to the "ipwhois"
part of the rfc-ignorant website on the basis that they did not feel
that the record above constituted sufficiently detailed "whois"
information in the sense of RFC954 ....
... and ever since 17th September when the submission was accepted into
the rfc-ignorant list and published to the world, SpamAssassin has been
scoring all email from the UK Government, police etc which passes
through a mail machine on the 51.0.0.0/8 network (which is a very high
proportion of the email from this type of organisation) with an extra
1.45 points on the spamminess scoreboard !
This extra score is sufficient, when combined with other factors such as
writing in HTML, to make some of the UK Government's email look so much
like spam that I strongly suspect that a lot of SpamAssassin users are
now filtering it out :(
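
As a worked example of how the numbers stack up, here is a small Python sketch that adds together the figures quoted in this email: about 1.2 for HTML, 0.7 for some green text, and the extra 1.45 for the listing. The particular combination is my own illustration, not a measurement of any real message, and the rule labels are descriptive names rather than SpamAssassin's actual rule identifiers.

```python
# Scores quoted earlier in this email; labels are descriptive only.
html_mail = {"HTML body": 1.2, "green text": 0.7}
listing_penalty = 1.45

before = sum(html_mail.values())
after = before + listing_penalty

print(f"before listing: {before:.2f}")
print(f"after listing:  {after:.2f}")

# Compare against the three popular decision points.
for threshold in (10, 5, 3):
    verdict = "filtered" if after >= threshold else "delivered"
    print(f"threshold {threshold}: {verdict}")
```

On those assumptions the message goes from 1.90 to 3.35, which is enough to tip it over the line for anyone filtering at 3, though not for someone filtering at 5.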
Now you might argue that SpamAssassin is at fault here, but they'd say
that this is just one heuristic amongst many and produce some statistics
to show that it's a really good measure of spam.
You might argue that rfc-ignorant is at fault, but they'd say that they
were only applying the tests in the standards documentation.
You might say that the 'Department of Social Security' is at fault (hey,
maybe this is on-topic for ukcrypto@ after all?) but I suspect even the
people at the Department for Work and Pensions that they morphed into
have forgotten that they're the nominal owner of all the UK Government's
IP address space :(
You might say all the people writing HTML email are at fault (and you'd
not get a huge argument from me about that!)... or you may decide that
it's all so much simpler to blame the French!
But perhaps it's no-one's fault, but just one of those things. In that
case I wonder who will fix it ??
The UK Government has an oft-quoted mantra to the effect that they'd
wish to be the best place in the world to do e-business. Well, just at
the moment, we can quantify how well they are doing -- and they're at a
1.45 point disadvantage to everyone else!
- --
richard Richard Clayton
They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety. Benjamin Franklin
-----BEGIN PGP SIGNATURE-----
Version: PGPsdk version 1.7.1
iQA/AwUBP4RNqBfnRQV/feRLEQKpfgCfXz2Fwvupf89353HEeY8k6k6btYgAnikM
pWuZ4mc9fU/8ySUNC5rieJiO
=iWw2
-----END PGP SIGNATURE-----