Mastering the Internet
Peter Fairbrother
ukcrypto at chiark.greenend.org.uk
Wed, 06 May 2009 01:12:02 +0100
ken wrote:
> But its just very UNlikely that they can intelligently scan ALL
> electronic communications, even all unencrypted electronic
> communication, and extract sense from it. Not only, as Roland said,
> would they need a computer & comms infrastructure of the same order
> of magnitude as the whole rest of the world
AFAIR, Roland didn't mention computer infrastructure. He said it would
take ten times the volume of the network for the boxes to send all
traffic back for analysis, as a single message would on average be seen
by about ten black boxes.
However the boxes could filter duplicate traffic, at least in part - and
more important, the boxes don't have to send all the traffic they see to
GCHQ.
All that has to happen to that volume of traffic is an initial
examination, which could be done by the black boxes, whose search
parameters etc would be updated in real time by GCHQ.
The boxes would then send on their product to GCHQ - but the volume of
product would be much smaller, and perhaps tailored to fill the capacity
of the links back to GCHQ (or vice-versa).
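
A rough sketch of that backhaul arithmetic follows. The "about ten boxes
per message" figure is from the discussion above; the network volume, the
share of duplicates the boxes drop, and the share of traffic selected as
product are made-up illustrative numbers, not estimates.

```python
def backhaul_gbps(network_gbps, copies_seen=10.0,
                  dedup_removed=0.0, selected_fraction=1.0):
    """Estimate total traffic sent from all boxes back for analysis, in Gbit/s.

    network_gbps      -- total traffic on the network (hypothetical figure)
    copies_seen       -- average number of boxes that see each message
    dedup_removed     -- fraction of the duplicate copies the boxes drop (assumed)
    selected_fraction -- fraction of examined traffic sent on as product (assumed)
    """
    duplicates = copies_seen - 1.0
    copies_after_dedup = 1.0 + duplicates * (1.0 - dedup_removed)
    return network_gbps * copies_after_dedup * selected_fraction


# Naive case: every box forwards everything it sees.
naive = backhaul_gbps(1000.0)

# Boxes drop half the duplicates and forward 1% of what they examine.
filtered = backhaul_gbps(1000.0, dedup_removed=0.5, selected_fraction=0.01)

print(f"naive: {naive:.0f} Gbit/s, filtered: {filtered:.1f} Gbit/s")
```

With these invented inputs the naive case is ten times the network
volume, while the filtered case is a small fraction of it. The point is
only that the selection rate, not the tap rate, sets the size of the
links back.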
I've decided to rough-out a design for an actual black box, just to see
what might be possible.
I reckon 300 well-distributed 10-GigE/OC-192 capable boxes could access
99% plus of all UK internet traffic, assuming installing them became
mandatory (either by law, bribery, blackmail or by machinegun).
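
For scale, a back-of-envelope sketch of what 300 such boxes could see.
It assumes a nominal 10 Gbit/s per box in one direction with the link
fully saturated (OC-192 is slightly under that), so these are upper
bounds rather than expected loads.

```python
BOXES = 300
LINE_RATE_BPS = 10e9        # nominal 10 Gbit/s per box, one direction (assumed)
SECONDS_PER_DAY = 86400

# Aggregate tap capacity across all boxes, in Tbit/s.
aggregate_tbps = BOXES * LINE_RATE_BPS / 1e12

# Bytes passing a single saturated box in one day, in terabytes.
per_box_tb_per_day = LINE_RATE_BPS / 8 * SECONDS_PER_DAY / 1e12

print(f"aggregate: {aggregate_tbps:.1f} Tbit/s")
print(f"per box:   {per_box_tb_per_day:.0f} TB/day at full line rate")
```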
Hmmm, thinking about it, don't the FBI do black boxes?
The other question is what GCHQ could do with that data - but first
let's see how much data, and of what type, they could get.
I'll stop for a bit now, until the design is nearer done
- perhaps tomorrow - except to append the (entirely imaginary) wishlist
below.
-- Peter Fairbrother
[><] GCHQ Blackbox Wishlist:
0) All available traffic is examined. All search parameters etc can be
updated in real time.
1) Data selected by to: and from: IP numbers is provided in full.
1a) Data selected by human names or other data identifying the
sender/recipient is provided in full for analysis.
1b) Data selected according to keyword, keyword combination etc
is provided in full for analysis.
2) "Data of interesting types", e.g. encrypted data, is dealt with as
appropriate, including providing IP etc headers so GCHQ can make a map
of who's using encrypted comms, VPNs etc.
2a) "Data of possible interest", selected by IP, keyword and/or other
search terms is stored in entirety in the box for a (6 month) period,
with selected portions, e.g. headers, provided, and the rest provided if
actually requested.
2b) Analyses of total data are provided.
3) The entire (filtered to remove duplicates?) stream is stored for a
(5 day) period, and available for retrieval or for locally performed
special analysis, including searching, as and if needed.
[><]
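
The wishlist can be read as a per-packet triage rule. Below is a minimal
sketch of that reading, not a design. The `Packet` and `BlackBox` classes,
the action names, and the `encrypted` flag are all hypothetical. Selection
by human names and the analyses of total data are left out, and a real box
would have to reassemble streams and detect encryption itself.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Packet:
    src: str
    dst: str
    payload: bytes
    encrypted: bool = False   # assumed to be set by some earlier detection stage


class BlackBox:
    def __init__(self, watched_ips=(), keywords=(), possible_ips=()):
        self.seen = set()     # digests of traffic already examined
        self.update(watched_ips, keywords, possible_ips)

    def update(self, watched_ips, keywords, possible_ips=()):
        """Item 0: search parameters can be replaced at any time."""
        self.watched_ips = set(watched_ips)
        self.keywords = [k.lower().encode() for k in keywords]
        self.possible_ips = set(possible_ips)

    def triage(self, pkt):
        """Return the action for one packet, following the wishlist order."""
        digest = hashlib.sha256(
            pkt.src.encode() + b"|" + pkt.dst.encode() + b"|" + pkt.payload
        ).digest()
        if digest in self.seen:
            return "drop_duplicate"
        self.seen.add(digest)

        # Item 1: selected by to:/from: IP, provided in full.
        if pkt.src in self.watched_ips or pkt.dst in self.watched_ips:
            return "send_full"
        # Keyword selection, provided in full.
        text = pkt.payload.lower()
        if any(k in text for k in self.keywords):
            return "send_full"
        # Item 2: interesting types such as encrypted data, headers provided.
        if pkt.encrypted:
            return "send_headers"
        # Item 2a: possible interest, stored in the box for the longer period.
        if pkt.src in self.possible_ips or pkt.dst in self.possible_ips:
            return "store_long"
        # Item 3: everything else is kept for the short period.
        return "store_short"


box = BlackBox(watched_ips={"192.0.2.1"}, keywords=["example"],
               possible_ips={"198.51.100.7"})
first = box.triage(Packet("192.0.2.1", "203.0.113.5", b"hello"))
again = box.triage(Packet("192.0.2.1", "203.0.113.5", b"hello"))
print(first, again)
```

The second call sees the same packet again and drops it, which is the
duplicate filtering mentioned earlier done inside a single box. Removing
duplicates seen by different boxes would need shared state or central
deduplication.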