What is Communications Data?
Quentin Campbell
Q.G.Campbell at newcastle.ac.uk
Wed, 13 Nov 2002 08:07:12 +0000 (GMT)
On Tue, 12 Nov 2002, Charles Lindsey wrote in reply to the following:
>
> On Tue, 12 Nov 2002 13:09:48 +0000 (GMT)
> Richard D G Cox <Richard.Cox@mandarin.org> said...
>
> > It seems to me, in the spirit of RIPA and other legislation, that anything in
> > any electronic message that is capable of being falsified by or on behalf of
> > the sender, can NOT be considered to be "Comms Data" - and that in reality
> > only such data that are known to be generated on or by machine(s) that is/are
> > fully trusted, by reference (where relevant) to other data (such as
> > time references) that are ALSO similarly trusted, can properly be regarded
> > as "Comms Data" whether for the purposes of investigation/intelligence,
> > or any other decision making.
>
> But that's not what RIPA says. It specifically includes data that
> "purports" to identify the sender or recipient, and that is _precisely_
> what the From and To headers do.
But what if they are _purely content_ as in the example I have constructed
below? I can generate an SMTP session by hand using "telnet some.mailhost
25" and cut and paste the following text after the SMTP "DATA" command:
---- cut here
Received: from it.linx.net ([123.456.789.101])
        by london.linx.net with SMTP id a-made-up-one for
        <ukcrypto@chiark.greenend.org.uk; Weekend, 31 June 2019 25:48:41
x-esmtp: 0 0 1
Message-ID: <1234@perry.co.uk>
To: LittleRedRidingHood
From: Woolfie@bigsofty.com
Subject: Hey there Little Red Riding Hood you sure are looking good
Date: Sun, 31 June 2019 25:34:48

Whose "content" is it anyway? This line is the body of the message.
---- cut here
and follow it with <CR LF>.<CR LF> to terminate the message, then QUIT to
end the SMTP session.
Most [1] MTAs will accept that text as a message and deliver it to the
mailbox specified as the envelope recipient address by the SMTP "RCPT TO:"
command.
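
As a rough sketch of the session described above, the following Python
builds the lines a client would type, without opening any connection. The
HELO name and both envelope addresses are hypothetical placeholders, and
the payload is a shortened copy of the sham text. It is only meant to show
that the "RCPT TO:" line is separate from everything sent after "DATA".

```python
# Sketch of the client side of a hand-typed SMTP session (no network used).
# The HELO name and envelope addresses are hypothetical placeholders.

def build_smtp_dialogue(helo_name, envelope_from, envelope_to, payload):
    """Return, in order, the lines a client would send."""
    lines = [
        f"HELO {helo_name}",
        f"MAIL FROM:<{envelope_from}>",  # envelope sender
        f"RCPT TO:<{envelope_to}>",      # envelope recipient, used for delivery
        "DATA",
    ]
    for line in payload.splitlines():
        # Dot-stuffing: a payload line that starts with "." gets an extra "."
        # so that it is not mistaken for the end-of-data marker.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")     # <CR LF>.<CR LF> ends the DATA phase
    lines.append("QUIT")  # ends the SMTP session
    return lines


SHAM_PAYLOAD = (
    "To: LittleRedRidingHood\n"
    "From: Woolfie@bigsofty.com\n"
    "Subject: Hey there Little Red Riding Hood you sure are looking good\n"
    "\n"
    'Whose "content" is it anyway? This line is the body of the message.\n'
)

dialogue = build_smtp_dialogue(
    "client.example",
    "someone@example.org",
    "real.recipient@example.net",
    SHAM_PAYLOAD,
)
print("\r\n".join(dialogue))
```

The "To:" line inside the payload never appears in an envelope command, so
it plays no part in where the message is delivered.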
This sham message illustrates a number of points made by Nigel
Metherington, Bruno Postle, Richard Cox and others about "communications
data" versus content. You will observe:
1. That the message To:/From: addresses are meaningless as traffic data
2. That the Date: header cannot be relied upon
3. That you cannot even rely on Received: headers
4. That at the time this message left the originating MTA _everything_
in the message was _strictly content_ AND had no bearing whatsoever
on where the message ended up or how it got there.
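
Points 1 and 2 can be illustrated with Python's standard email package. In
this sketch, with a shortened copy of the sham message, the parser returns
the To:/From:/Date: values exactly as the sender typed them, and a simple
plausibility check on the Date: value fails. The helper name
date_is_plausible is made up for this example.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Shortened copy of the sham message; every header here is sender-typed text.
SHAM = (
    "Message-ID: <1234@perry.co.uk>\n"
    "To: LittleRedRidingHood\n"
    "From: Woolfie@bigsofty.com\n"
    "Date: Sun, 31 June 2019 25:34:48\n"
    "\n"
    "This line is the body of the message.\n"
)


def date_is_plausible(value):
    """Hypothetical helper: True if the Date: value is a real calendar time."""
    try:
        parsedate_to_datetime(value)
        return True
    except (TypeError, ValueError):
        # 31 June and hour 25 do not exist, so the conversion is rejected.
        return False


msg = message_from_string(SHAM)
claims = {name: msg[name] for name in ("From", "To", "Date")}

for name, value in claims.items():
    print(f"{name}: {value!r}")
print("Date plausible?", date_is_plausible(claims["Date"]))
```

The parser has no way to check these values against the envelope; it only
reports what the message text claims.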
This last point is particularly important. The actual delivery address was
specified on the SMTP envelope ("RCPT TO:"). All the message headers and
the one line of the message body were entered as _content_ following the
SMTP "DATA" command.
The question "whose content is it anyway?" is relevant because at the
point this message reached the receiving MTA _everything_ in the message
was still pure content, including the bogus "Received:" header. This
follows from my point (4).
Thus an ISP involved in routeing this message has no business under Pt I
Ch 2 accessing the part of the communication that contained my original
message. How does he know where in the communication my message ends? In
general he may not be able to determine this reliably.
If he lifts all of the "Received:" headers he finds in the communication
because they are data that "purports" to identify the sender or the
recipient, then he has copied (i.e. "intercepted") some of my original
content.
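
The difficulty can be sketched in Python under some simplifying
assumptions: a relay is modelled as doing nothing except prepending one
trace line, and the relay's host names and timestamp are made up. Collecting
every "Received:" header from the stored message then returns the forged
one along with the genuine one.

```python
from email import message_from_string

# Hypothetical trace line that a relaying MTA might add; names are made up.
GENUINE_TRACE = (
    "Received: from client.example by relay.example with SMTP; "
    "Wed, 13 Nov 2002 08:07:12 +0000\n"
)

# What the sender typed after DATA, including a forged Received: header.
SENDER_CONTENT = (
    "Received: from it.linx.net ([123.456.789.101])\n"
    "\tby london.linx.net with SMTP id a-made-up-one\n"
    "To: LittleRedRidingHood\n"
    "From: Woolfie@bigsofty.com\n"
    "\n"
    "This line is the body of the message.\n"
)


def relay(payload, trace_line=GENUINE_TRACE):
    """Model a relay that only prepends its own trace line to the payload."""
    return trace_line + payload


def received_headers(raw):
    """Return every Received: header value found in the raw message."""
    return message_from_string(raw).get_all("Received", [])


as_stored = relay(SENDER_CONTENT)
found = received_headers(as_stored)

for value in found:
    print(repr(value))
```

Both entries come back from the same call and in the same form. Only the
relay that added the first line knows that the sender's content starts at
the second.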
[1] I am impressed that anti-spam measures used by chiark's MTA prevent
me demonstrating this directly to the UKCRYPTO list. However, most
other MTAs are not so clever. Try this experiment out for yourself.
Quentin
--
PHONE: +44 191 222 8209 Computing Service, University of Newcastle
FAX: +44 191 222 8765 Newcastle upon Tyne, United Kingdom, NE1 7RU.
-------------------------------------------------------------------------
"Any opinions expressed above are mine. The University can get its own."