Mastering the Internet

Peter Fairbrother ukcrypto at chiark.greenend.org.uk
Tue, 05 May 2009 17:00:27 +0100


ken wrote:
> Peter Fairbrother wrote:
> 
>  > What if the black boxes look for keywords,
>  > encrypted material, etc. in content, and just
>  > send that type of content back to Cheltenham?
>  > That solves the bandwidth problem.
> 
> Only if you either store the content somewhere so you can look at it 
> later if it turns out to have been interesting, or else then use that 
> keyword information to set up a more detailed monitoring of some stream 
> of communications in future.

Or both.
> 
> You have to actually READ the stuff. If you don't, keywords don't win 
> you anything that traffic analysis hadn't already.

I'm not entirely sure that that's true - but even if it is, what's the 
problem? GCHQ has hectares of analysts.
> 
> There is (I imagine but don't know for sure) too much encrypted or 
> compressed data flying around already for the mere presence of 
> encryption in otherwise untargeted traffic to be sufficient to attract 
> the attention of spooks.
> 
>  > Or maybe GCHQ has a magical compression algorithm?
>  > We have very little idea of the actual entropy of
>  > most communications.
> 
> Yes we do. Really. People who do boring things like designing routers 
> and network cards have a very good idea of what is sent in real life. 
> And the maths is the same whatever magic they have in Cheltenham. And is 
> exactly the same maths as we use to do bioinformatics and stuff - for 
> example trying to tell genes from "junk" DNA, or find where genes start 
> and end on a genome. Linguists do it too. Telling signal from noise can 
> be hard, but we know about compression and entropy. Really.

What's the average per-second entropy of spoken English?
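For anyone who wants to poke at the entropy question themselves, here is a minimal sketch, assuming the crudest possible model: order-0 byte frequencies, with no context at all. The function name byte_entropy and the sample inputs are my own illustration, not anything a black box is known to run. Because it ignores context, it overestimates the real entropy of English, and it says nothing about speech, which would need a model of the audio rather than of text.

```python
import math
import os
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Empirical order-0 Shannon entropy of `data`, in bits per byte.

    Assumption: each byte is treated as an independent symbol, so this is
    only an upper-bound style estimate for structured data such as text.
    """
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    # H = sum over symbols of p * log2(1/p), with p = count / n
    return sum((c / n) * math.log2(n / c) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical inputs: a line of plain English and some random bytes
    # standing in for encrypted or well-compressed traffic.
    english = (b"What if the black boxes look for keywords, "
               b"encrypted material, etc. in content?")
    random_like = os.urandom(65536)

    print("english:     %.2f bits/byte" % byte_entropy(english))
    print("random-like: %.2f bits/byte" % byte_entropy(random_like))
```

Plain text comes out well below 8 bits per byte and random-looking data comes out very close to 8, which is about all a frequency count can tell you. Better models (longer contexts, or a real compressor) push the estimate for text lower.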

Splitting my reply to your long email in two, so bye for now,

-- Peter Fairbrother