cleanfeed and wikipedia

Ian Batten ukcrypto at chiark.greenend.org.uk
Wed, 10 Dec 2008 10:21:15 +0000


On 10 Dec 08, at 0821, Roland Perry wrote:

> In article <493EF0D4.3080707@ernest.net>, Nicholas Bohm
> <nbohm@ernest.net> writes
>>> You are going for the "vacuum" theory, which I believe to be  
>>> unsustainable.
>>
>> No, I'm not going for a theory, I'm asking for facts.
>
>        "The reports received via the IWF internet 'Hotline' are
>        assessed by Internet Content Analysts (ICAs) who have
>        comprehensive and in-depth training on relevant UK legislation
>        by the appropriate UK police personnel"

Let's assume that they are trained, by a combination of rigorous  
checklists and a pre-assessed collection of examples, so that their  
assessments will align with those a court would give to a reasonable  
extent.  It's like risk assessment frameworks: the best you can hope  
for is that the same assessment done by different people will cluster  
with a reasonably normal distribution.

That still doesn't explain the training on relevant UK legislation  
that would have been involved in yesterday's decision to remove the  
image at hand from the blacklist.

The IWF had previously been process-driven, which made allegations of  
bias and general badness hard to sustain.  Their assessors acted as an  
oracle for PoCA 1978 and other legislation, and that was used to  
accept or reject reports for inclusion in the blacklist.  The process  
was very simple: take a report, test it with the PoCA oracle, then  
either include it in the blacklist or drop it on the floor.

If you had a feed of the complaints they received, and a similarly  
capable oracle, you would be able to construct a blacklist that was,  
plus or minus, the same.  That's the test of a reproducible process:  
given the inputs, the documentation and the trained staff, you can  
produce a similar output.  The definition of `similar' is the stuff  
that keeps auditors in work.

And that oracle can be independently constructed: I'd be fairly  
confident that with the co-operation of the police, you could train a  
group of people who had never met an IWF employee to produce the same  
decisions, +/- 10%.
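
To make that test concrete, here is a minimal sketch in Python.  It is  
purely illustrative: the names (Oracle, build_blacklist, disagreement)  
and the toy oracles are my own assumptions, not anything the IWF  
publishes, and real assessments are obviously not a function call.

```python
from typing import Callable, Iterable, Set

# Hypothetical stand-in for a trained assessor: True means
# "assessed as potentially illegal under the relevant legislation".
Oracle = Callable[[str], bool]


def build_blacklist(reports: Iterable[str], oracle: Oracle) -> Set[str]:
    """Process-driven pipeline: each report is either listed or dropped."""
    return {report for report in reports if oracle(report)}


def disagreement(reports: Iterable[str], a: Oracle, b: Oracle) -> float:
    """Fraction of reports on which two independent oracles differ."""
    items = list(reports)
    if not items:
        return 0.0
    differing = sum(1 for report in items if a(report) != b(report))
    return differing / len(items)


# Toy data: 20 reports, and two oracles that differ on exactly one.
reports = [f"report-{i}" for i in range(20)]


def oracle_a(report: str) -> bool:
    return int(report.split("-")[1]) % 2 == 0


def oracle_b(report: str) -> bool:
    # Same rule as oracle_a, except for a different call on report-3.
    return oracle_a(report) or report == "report-3"


rate = disagreement(reports, oracle_a, oracle_b)
reproducible = rate <= 0.10  # the "+/- 10%" tolerance, as an assumption
```

With these toy inputs the two oracles differ on one report in twenty,  
so the process passes under a 10% tolerance.  Where exactly to set  
that tolerance is the auditors' argument about `similar'.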

As of yesterday, however, that all changed.  The process is now that  
reports are processed by the oracle to determine legality, and then  
re-processed by the board of directors with no published criteria.  You  
now can't replicate their work: there's a black box on the output  
which you can't reproduce.  No external body can predict the decisions  
that IWF will make when confronted with a given case.

As I said last night, I think the IWF have reached the right outcome  
by entirely the wrong route, and everyone will live to regret it.   
They've admitted a whole stack of criteria --- availability, age,  
collateral damage --- which are not in the legislation, not in their  
charter and not codified anywhere.  They've taken something that was,  
outside a small pool of people, uncontentious and given it a political  
dimension that was previously missing.

ian