Buckinghamshire CC ANPR cameras

John Wilson tugwilson at gmail.com
Mon Feb 6 10:24:48 GMT 2012


On 5 February 2012 22:55, Brian Morrison <bdm at fenrir.org.uk> wrote:
> On Sun, 5 Feb 2012 21:56:38 +0000
> Ian Batten <igb at batten.eu.org> wrote:
>
>> You have to wonder how on earth an obscure, unreviewed algorithm published
>> in a hobbyist magazine ends up being used in a production system,
>> don't you?
>
> Sadly no, I imagine that the reason it was used was because someone
> found it and didn't do any more thinking about what was needed and how
> the algorithm would affect those needs.


The document claims they tested a range of hash functions with 100,000
valid registration numbers. It seems quite a small test sample.

I'm not sniffy about contributors to Dr Dobb's. Back in the day it was
a good deal more useful to the working programmer than the
Communications of the ACM.

I've done some quick tests using generated registration numbers. If
you try all the possible valid registration numbers with age
identifiers from 51 to 12 (that's about 130 million), 24% have 10
collisions or fewer, 53% have 25 collisions or fewer, and 92% have 100
collisions or fewer. 1% of the numbers have 246 collisions or more.
The highest number of collisions is 2080. Of course, in the field you
will have pre-2001 registration numbers, "cherished numbers" and
foreign numbers, none of which I take into account in this test.
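
For anyone who wants to try this kind of test, here is a minimal
sketch of the method in Python. It is not the code I ran and it does
not use the hash function from the document: toy_hash below is a
stand-in (SHA-256 truncated to a few bits), and the area codes and
suffix letters are tiny made-up samples rather than the real DVLA
rules. The age identifier sequence (51, 02, 52, ... 61, 12) is my
understanding of the current scheme.

```python
import hashlib
from collections import Counter
from itertools import product


def toy_hash(plate: str, bits: int = 16) -> int:
    """Stand-in hash: SHA-256 truncated to `bits` bits (bits <= 32).

    Assumption: this is NOT the hash function under discussion.
    """
    digest = hashlib.sha256(plate.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - bits)


def age_identifiers():
    """Age identifiers from 51 (Sept 2001) to 12 (March 2012)."""
    ids = ["51"]
    for year in range(2, 12):
        ids += [f"{year:02d}", f"{year + 50}"]
    ids.append("12")
    return ids


def generate_plates(area_codes, age_ids, suffix_letters):
    """Yield every plate of the form <area><age><three suffix letters>."""
    suffixes = product(suffix_letters, repeat=3)
    for area, age, suffix in product(area_codes, age_ids, suffixes):
        yield f"{area}{age}{''.join(suffix)}"


def collision_counts(plates, bits=16):
    """Map each plate to the number of OTHER plates sharing its hash."""
    plates = list(plates)
    buckets = Counter(toy_hash(p, bits) for p in plates)
    return {p: buckets[toy_hash(p, bits)] - 1 for p in plates}


def share_at_most(counts, limit):
    """Fraction of plates with `limit` collisions or fewer."""
    return sum(1 for c in counts.values() if c <= limit) / len(counts)


if __name__ == "__main__":
    # Hypothetical, deliberately tiny sample so it runs instantly.
    plates = generate_plates(["AB", "BD"], age_identifiers(), "ABC")
    counts = collision_counts(plates, bits=8)
    for limit in (2, 5, 10):
        print(f"{share_at_most(counts, limit):.0%} have {limit} collisions or fewer")
    print("highest number of collisions:", max(counts.values()))
```

Swapping in the real hash function and the full set of valid area
codes and suffix letters would give figures comparable to the ones
above; the sample here only shows the shape of the calculation.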

It would be interesting to know if the DVLA manages the numbers it
issues to minimise the number of collisions. They know all the current
registration numbers so could suppress new numbers which would have a
high collision rate.

I think the hash function is good enough (that is, it produces few
enough collisions) to argue that large datasets of traffic data are
not really fully anonymised.
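
To make that argument concrete: because the space of valid numbers is
small enough to enumerate, anyone can build a reverse index from hash
value to candidate plates. The sketch below is illustrative only.
toy_hash is again a stand-in (truncated SHA-256), not the function
from the document, and the candidate list is a made-up handful of
plates rather than the full 130 million.

```python
import hashlib
from collections import defaultdict


def toy_hash(plate: str, bits: int = 16) -> int:
    """Stand-in hash: SHA-256 truncated to `bits` bits (bits <= 32).

    Assumption: this is NOT the hash function under discussion.
    """
    digest = hashlib.sha256(plate.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - bits)


def build_reverse_index(candidates, bits=16):
    """Group every candidate plate under its hash value."""
    index = defaultdict(list)
    for plate in candidates:
        index[toy_hash(plate, bits)].append(plate)
    return index


def candidates_for(hash_value, index):
    """Return the plates that could have produced `hash_value`."""
    return index.get(hash_value, [])


if __name__ == "__main__":
    # Hypothetical sample; a real attempt would enumerate all valid plates.
    known_plates = ["AB51AAA", "AB51AAB", "BD02ABC", "BD12CBA"]
    index = build_reverse_index(known_plates)
    observed = toy_hash("BD02ABC")  # a "pseudonymised" record
    print(candidates_for(observed, index))
```

With a few tens of candidates per hash value, a second piece of
information about the vehicle would usually be enough to single out
one plate.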

John Wilson


