Mastering the Internet
ken
ukcrypto at chiark.greenend.org.uk
Tue, 05 May 2009 13:14:59 +0100
Peter Fairbrother wrote:
> What if the black boxes look for keywords,
> encrypted material, etc in content, and just
> send that type of content back to Cheltenham?
> That solves the bandwidth problem.
Only if you either store the content somewhere so you can look
at it later if it turns out to have been interesting, or else
use that keyword information to set up more detailed monitoring
of some stream of communications in future.
You have to actually READ the stuff. If you don't, keywords
don't win you anything that traffic analysis hadn't already.
There is (I imagine but don't know for sure) too much encrypted
or compressed data flying around already for the mere presence
of encryption in otherwise untargeted traffic to be sufficient
to attract the attention of spooks.
> Or maybe GCHQ has a magical compression algorithm?
> We have very little
> idea of the actual entropy of most communications.
Yes we do. Really. People who do boring things like designing
routers and network cards have a very good idea of what is sent
in real life. And the maths is the same whatever magic they have
in Cheltenham. And is exactly the same maths as we use to do
bioinformatics and stuff - for example trying to tell genes from
"junk" DNA, or find where genes start and end on a genome.
Linguists do it too. Telling signal from noise can be hard, but
we know about compression and entropy. Really.
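To make the entropy point concrete, here is a minimal Python
sketch (mine, purely illustrative) that measures the Shannon
entropy of the byte frequencies in a buffer. The sample text and
the name byte_entropy are made up for the example, and os.urandom
is only a stand-in for ciphertext. Byte-frequency entropy is a
crude first-order estimate: it ignores repetition and structure,
which is exactly what a real compressor exploits.

```python
import math
import os
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-frequency distribution, in bits per byte (0..8)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical sample: repetitive English text, as a stand-in for ordinary traffic.
text = ("Telling signal from noise can be hard, but we know about "
        "compression and entropy. ") * 200
plain = text.encode("ascii")
packed = zlib.compress(plain)      # what a compressor makes of it
noise = os.urandom(len(plain))     # stand-in for encrypted data (an assumption)

for label, blob in [("plain text", plain),
                    ("zlib output", packed),
                    ("random bytes", noise)]:
    print(f"{label:12s} {len(blob):6d} bytes  {byte_entropy(blob):.2f} bits/byte")
```

Plain English text should come out well under 8 bits per byte
and random bytes very close to 8, so a black box can cheaply
guess that a stream is compressed or encrypted. What it cannot
do this way is tell which of the two it is, or what it says.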
For what it's worth I have very little doubt that GCHQ and the
other three and four letter acronym agencies in this country and
abroad now and again read people's mail, whatever the law says.
I have no idea at all whether they do that a few dozen times a
year or a few hundred times a second. My guess is the latter is
more likely.
Which is (pretty obviously) the reasoning behind the UK government's
reluctance to use intercept data in criminal prosecutions.
Sometimes the ordinary police get tipped off from illegal
intercepts.
But it's just very UNlikely that they can intelligently scan ALL
electronic communications, even all unencrypted electronic
communication, and extract sense from it. Not only, as Roland
said, would they need a computer & comms infrastructure of the
same order of magnitude as the whole rest of the world (& the
NSA & the CIA & the FBI are big but they are not THAT big -
although they are conveniently located right next to the US end
of loads of transatlantic cables) but also to read the stuff
they would either have to employ ten percent of the population
to snoop on the other ninety percent (like they used to in
Romania and East Germany) or else they would have to have some
sort of AI that we are pretty sure they don't (because if they
did it would be getting used for all sorts of other things by now).
Keyword scans don't cut it. They can only alert you to some
possible information of interest but you still need to read it
to be able to take action on it. That goes even for really clever
stuff like Google does. (Google is relevant. I suspect that the
spooks are the followers not the leaders on the software side of
this technology, as they have been on the hardware side for decades.)