Mastering the Internet
Peter Fairbrother
ukcrypto at chiark.greenend.org.uk
Tue, 05 May 2009 17:00:27 +0100
ken wrote:
> Peter Fairbrother wrote:
>
> > What if the black boxes look for keywords,
> > encrypted material, etc in content, and just
> > send that type of content back to Cheltenham?
> > That solves the bandwidth problem.
>
> Only if you either store the content somewhere so you can look at it
> later if it turns out to have been interesting, or else then use that
> keyword information to set up a more detailed monitoring of some stream
> of communications in future.
Or both.
>
> You have to actually READ the stuff. If you don't, keywords don't win
> you anything that traffic analysis hasn't already.
I'm not entirely sure that that's true - but even if it is, what's the
problem? GCHQ has hectares of analysts.
>
> There is (I imagine but don't know for sure) too much encrypted or
> compressed data flying around already for the mere presence of
> encryption in otherwise untargeted traffic to be sufficient to attract
> the attention of spooks.
>
> > Or maybe GCHQ has a magical compression algorithm?
> > We have very little
> > idea of the actual entropy of most communications.
>
> Yes we do. Really. People who do boring things like designing routers
> and network cards have a very good idea of what is sent in real life.
> And the maths is the same whatever magic they have in Cheltenham. And is
> exactly the same maths as we use to do bioinformatics and stuff - for
> example trying to tell genes from "junk" DNA, or find where genes start
> and end on a genome. Linguists do it too. Telling signal from noise can
> be hard, but we know about compression and entropy. Really.
What's the average per-second entropy of spoken English?
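For stored bytes, as opposed to speech, the textbook measurement is easy to sketch. What follows is a minimal illustration in Python and not anything proposed in this thread; the function names byte_entropy and compressed_bits_per_byte are made up for the example. The first counts single-byte frequencies only, so it overestimates the real entropy of language, which has longer-range structure. The second uses zlib as a rough upper bound. Neither can tell compressed data from encrypted data, since both look close to 8 bits per byte.

```python
import math
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Order-0 Shannon entropy in bits per byte (0.0 to 8.0).

    Uses single-byte frequencies only, so it ignores any structure
    between bytes (words, grammar, repetition).
    """
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    # Sum of p * log2(1/p) over the byte values actually observed.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def compressed_bits_per_byte(data: bytes) -> float:
    """Rough upper bound on entropy: zlib output bits per input byte."""
    if not data:
        return 0.0
    return 8.0 * len(zlib.compress(data, 9)) / len(data)


if __name__ == "__main__":
    # Illustrative sample only: one sentence repeated, so zlib does very well.
    sample = (b"Telling signal from noise can be hard, "
              b"but we know about compression and entropy. ") * 50
    print("order-0 entropy:", byte_entropy(sample))
    print("zlib bits/byte: ", compressed_bits_per_byte(sample))
```

Run on plain English text the first figure should come out well under 8 bits per byte, and on already compressed or encrypted data close to 8. That is a statement about bytes on the wire; it says nothing about bits per second of speech.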
I'm splitting my reply to your long email in two, so bye for now,
-- Peter Fairbrother