Data retention question

Thu Jul 17 19:32:54 BST 2014

On 17/07/14 17:36, Alex Burr wrote:
> Records do not correspond to packets or TCP sessions; they correspond to 'an entire connectivity session' (I'm not familiar enough with mobile phone protocols to identify that), which is apparently 'closed' due to, for example, a timeout, which can happen after 5 minutes of inactivity. So it is plausible that a poll every 10 minutes by an email client could cause a comprehensive location record for a UK user, as in the Spitze case. But I don't have evidence of whether this is typical.

I know nothing about how the UK operators are set up, but I know a
little about billing in mobile networks in general.

For those operators who retain location for data use, they almost
certainly take it from the "CDR" (call detail record) which is the
notification sent to the billing system about any (potentially)
chargeable event.  That is consistent with the description you reproduced.

For voice calls, the CDR is normally only generated at the end of the
call (except for very long-lived calls -- most networks will generate a
CDR after 24 hours if the call is still going).

For data, there is no standardisation, and the situation changes as more
equipment gets put into the network.  In general, for data, many network
elements may generate CDRs for the same data session and it is up to the
network which ones they actually use or record.

A simplified discussion (the details are different for 2G, 3G, 4G as
well as for different networks)... It is almost certain a CDR will be
generated when an IP "connectivity session" is disconnected.  However,
the connection being talked about here is the one between the phone and
some router (more than just a router, but basically a router), at a
lower level than IP.  It is normally disconnected either because the
phone drops off the network or because the phone decides it does not
need to send/receive any more data for a while.  The disconnection is
not necessarily related to any TCP connections, although in simple cases
it often is.  CDRs for
data may also be generated when a phone moves (most often when it moves
out of a Location Area, which is a group of cell sites).  It is quite
likely (depending on many things in the phone and network configuration)
that a CDR will be generated after each of the mail polling sessions, in
your example.
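
To make the polling example concrete, here is a rough sketch in Python
of how an inactivity timeout turns periodic polls into separate
records.  It is only an illustration under stated assumptions: the
SessionRecord fields, the records_from_activity helper, the 300 second
default and the cell names are all made up for this sketch, and real
networks differ in what triggers a CDR and which location they put in
it.

```python
from dataclasses import dataclass


@dataclass
class SessionRecord:
    """Hypothetical record emitted when a connectivity session closes."""
    start: int  # time of first activity, in seconds
    end: int    # time of last activity, in seconds
    cell: str   # assumption: the cell seen when the session opened


def records_from_activity(events, idle_timeout=300):
    """Split (timestamp, cell) events, sorted by time, into sessions.

    A session is closed, and one record emitted, once the gap since the
    last activity exceeds idle_timeout seconds.
    """
    records = []
    current = None
    for ts, cell in events:
        if current is not None and ts - current.end > idle_timeout:
            records.append(current)  # timeout passed: close the session
            current = None
        if current is None:
            current = SessionRecord(start=ts, end=ts, cell=cell)
        else:
            current.end = ts
    if current is not None:
        records.append(current)  # close whatever is still open
    return records


# An email client polling every 10 minutes for an hour, changing cell
# half way through (cell names are invented).
polls = [(t, f"cell-{t // 1800}") for t in range(0, 3600, 600)]
records = records_from_activity(polls)
```

With a 5 minute timeout every poll ends up as its own record, each one
carrying a location.  Passing idle_timeout=900 instead collapses the
same polls into a single record, which is why the answer depends so
much on how the phone and network are configured.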

As networks get more sophisticated, much more CDR processing is
occurring.  For example, it is very common to combine CDRs that relate
to the same user at a low level in the network to save on
storage/processing.  So, the several CDRs from your email polling may
get combined back into one CDR before it gets to be "retained" (or it
might be after it is retained -- that is up to the billing architect).
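
As a rough illustration of that kind of combining, the Python sketch
below merges several CDRs for one user into a single summary record.
The field names (user, start, end, bytes, cell) and the combine_cdrs
helper are invented for this example; real CDR formats and merge rules
vary by vendor and operator, and whether the combined record keeps
every cell, only one, or none is a design decision I am only guessing
at here.

```python
def combine_cdrs(cdrs):
    """Merge all CDRs for the same user into one summary record each."""
    combined = {}
    for cdr in cdrs:
        merged = combined.get(cdr["user"])
        if merged is None:
            # First CDR seen for this user: start a new summary record.
            combined[cdr["user"]] = {
                "user": cdr["user"],
                "start": cdr["start"],
                "end": cdr["end"],
                "bytes": cdr["bytes"],
                "cells": [cdr["cell"]],
            }
            continue
        merged["start"] = min(merged["start"], cdr["start"])
        merged["end"] = max(merged["end"], cdr["end"])
        merged["bytes"] += cdr["bytes"]
        # Assumption: distinct cells are kept, in order of first sighting.
        if cdr["cell"] not in merged["cells"]:
            merged["cells"].append(cdr["cell"])
    return list(combined.values())


# Three CDRs from email polling (all values invented).
polling_cdrs = [
    {"user": "alice", "start": 0, "end": 5, "bytes": 1200, "cell": "A"},
    {"user": "alice", "start": 600, "end": 604, "bytes": 900, "cell": "A"},
    {"user": "alice", "start": 1200, "end": 1207, "bytes": 1500, "cell": "B"},
]
summary = combine_cdrs(polling_cdrs)
```

Here the three polling CDRs become one record covering the whole
period, so what ends up in the retention store depends on whether the
combining happens before or after the data is retained.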

Also, equipment like policy enforcement can generate more detailed CDRs
including the protocols used (and other information like web pages
accessed, if it includes deep packet inspection, which many do).  Those
CDRs typically do align with TCP sessions (although not always), but
they may or may not include information like location.  And they may or
may not go into the "retention" database.

Bottom line: even if the authors of that document understood correctly
how it is handled for each of the UK networks at the time they asked, it
has probably changed several times since!

Note that this assumes the data is taken from (some of the) CDRs.  That
is the most likely source for the bulk retention.  However, tracking
location at all times requires a completely different process, fairly
expensive in network resources, so it is probably only available when
the operator is served with a specific request (and with a price
attached).

Graham