* Design of new, multi-subnet secnet protocol

Like the first (1995/6) version, we're tunnelling IP packets inside
UDP packets. To defeat various restrictions which may be imposed on us
by network providers (like the prohibition of incoming TCP
connections) we're sticking with UDP for everything this time,
including key setup. This means we have to handle retries, etc.

Other new features include being able to deal with subnets hidden
behind changing 'real' IP addresses, and the ability to choose
algorithms and keys per pair of communicating sites.

** Configuration and structure

[The original plan]

The network is made up from a number of 'sites'. These are collections
of machines with private IP addresses. The new secnet code runs on
machines which have interfaces on the private site network and some
way of accessing the 'real' internet.

Each end of a tunnel is identified by a name. Often it will be
convenient for every gateway machine to use the same name for each
tunnel endpoint, but this is not vital. Individual tunnels are
identified by their two endpoint names.

[The new plan]

It appears that people want to be able to use secnet on mobile
machines like laptops as well as to interconnect sites. In particular,
they want to be able to use their laptop in three situations:

1) connected to their internal LAN by a cable; no tunnel involved
2) connected via wireless, using a tunnel to protect traffic
3) connected to some other network, using a tunnel to access the
internal LAN.

They want the laptop to keep the same IP address all the time.

Case (1) is simple.

Case (2) requires that the laptop run a copy of secnet, and have a
tunnel configured between it and the main internal LAN default
gateway. secnet must support the concept of a 'soft' tunnel where it
adds a route and causes the gateway to do proxy-ARP when the tunnel is
up, and removes the route again when the tunnel is down.

The usual prohibition of packets coming in from one tunnel and going
out another must be relaxed in this case (in particular, the
destination address of packets from these 'mobile station' tunnels may
be another tunnel as well as the host).

(Quick sanity check: if chiark's secnet address was in
192.168.73.0/24, would this work properly? Yes, because there will be
an explicit route to it, and proxy ARP will be done for it. Do we want
packets from the chiark tunnel to be able to go out along other
routes? No. So, spotting a 'local' address in a remote site's list of
networks isn't sufficient to switch on routing for a site. We need an
explicit option. NB packets may be routed if the source OR the
destination is marked as allowing routing [otherwise packets couldn't
get back from eg. chiark to a laptop at greenend]).
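
The routing rule in the sanity check above can be sketched in a few
lines. This is only an illustration, not secnet's code: the `Site`
class and its `allow_route` field are hypothetical names for the
'explicit option' the note calls for.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    # Hypothetical name for the explicit per-site routing option.
    allow_route: bool = False


def may_forward(src: Site, dst: Site) -> bool:
    """Allow tunnel-to-tunnel forwarding if the source OR the
    destination site is marked as allowing routing."""
    return src.allow_route or dst.allow_route


laptop = Site("laptop", allow_route=True)  # a 'mobile station' tunnel
chiark = Site("chiark")                    # ordinary site, no routing
other = Site("other-site")                 # another ordinary site
```

With these definitions a packet from chiark to the laptop is forwarded
(the destination allows routing), but a packet from chiark to
other-site is not.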

[The even newer plan]

secnet sites are configured to grant access to particular IP address
ranges to the holder of a particular public key.  The key can certify
other keys, which will then be permitted to use a subrange of the IP
address range of the certifying key.

This means that secnet won't know in advance (i.e. at configuration
time) how many tunnels it might be required to support, so we have to
be able to create them (and routes, and so on) on the fly.
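
The subrange rule for certified keys could be checked roughly as
follows. This is a sketch under assumptions: the function name
`may_delegate` is made up for illustration, and ranges are assumed to
be written in CIDR notation.

```python
import ipaddress


def may_delegate(certifier_range: str, requested_range: str) -> bool:
    """Return True if requested_range lies entirely within
    certifier_range (both given in CIDR notation)."""
    parent = ipaddress.ip_network(certifier_range)
    child = ipaddress.ip_network(requested_range)
    # subnet_of() raises TypeError when IP versions differ, so
    # compare versions first and treat a mismatch as "not allowed".
    if parent.version != child.version:
        return False
    return child.subnet_of(parent)
```

For example, a key granted 192.168.0.0/16 could certify another key
for 192.168.73.0/24, but not for 10.0.0.0/8.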

** VPN-level configuration

At a high level we just want to be able to indicate which groups of
users can claim ownership of which ranges of IP addresses. Assuming
these users (or their representatives) all have accounts on a single
machine, we can automate the submission of keys and other information
to make up a 'sites' file for the entire VPN.

The distributed 'sites' file should be in a more restricted format
than the secnet configuration file, to prevent attackers who manage to
distribute bogus sites files from taking over their victims' machines.

The distributed 'sites' file is read one line at a time. Each line
consists of a keyword followed by other information. It defines a
number of VPNs; within each VPN it defines a number of locations;
within each location it defines a number of sites. These VPNs,
locations and sites are turned into a secnet.conf file fragment using
a script.

Some keywords are valid at any 'level' of the distributed 'sites'
file, indicating defaults.

The keywords are:

vpn n: we are now declaring information to do with VPN 'n'. Must come first.

location n: we are now declaring information for location 'n'.

site n: we are now declaring information for site 'n'.
endsite: we're finished declaring information for the current site

restrict-nets a b c ...: restrict the allowable 'networks' for the current
  level to those in this list.
end-definitions: prevent definition of further vpns and locations, and
  modification of defaults at VPN level

dh x y: the current VPN uses the specified group; x=modulus, y=generator

hash x: which hash function to use. Valid options are 'md5' and 'sha1'.

admin n: administrator email address for current level

key-lifetime n
setup-retries n
setup-timeout n
wait-time n
renegotiate-time n

address a b: a=dnsname, b=port
networks a b c ...
pubkey x y z: x=keylen, y=encryption key, z=modulus
mobile: declare this to be a 'mobile' site
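
To illustrate the line-at-a-time, keyword-first structure, here is a
minimal sketch of a reader that tracks the current level. It is not
the real conversion script: the function name `parse_sites`, the
treatment of blank lines, and the sample names ('example', 'home',
'gateway') are all assumptions made for illustration, and no keyword
arguments are validated.

```python
def parse_sites(lines):
    """Yield (level, keyword, args) for each non-blank line.

    level is 'vpn', 'location' or 'site', according to the most
    recent vpn/location/site/endsite keyword seen.
    """
    level = None
    for lineno, raw in enumerate(lines, start=1):
        words = raw.split()
        if not words:
            continue  # assumption: blank lines are ignored
        keyword, args = words[0], words[1:]
        if level is None and keyword != "vpn":
            # 'vpn' must come first.
            raise ValueError(f"line {lineno}: '{keyword}' before 'vpn'")
        if keyword in ("vpn", "location", "site"):
            level = keyword
        yield level, keyword, args
        if keyword == "endsite":
            level = "location"  # back to the enclosing location


example = """\
vpn example
hash sha1
location home
site gateway
networks 192.168.73.0/24
endsite
""".splitlines()

parsed = list(parse_sites(example))
```

Here `hash sha1` is recorded at VPN level, where it acts as a default,
while `networks` is recorded at site level.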

** Logging etc.

There are several possible ways of running secnet:

'reporting' only: --version, --help, etc. command line options and the
--just-check-config mode.

'normal' run: perform setup in the foreground, and then background.

'failed' run: setup in the foreground, and terminate with an error
before going to background.

'reporting' modes should never output anything except to stdout/stderr.
'normal' and 'failed' runs output to stdout/stderr before
backgrounding, then thereafter output only to log destinations.

** Protocols

*** Protocol environment:

Each gateway machine serves a particular, well-known set of private IP
addresses (i.e. the agreement over which addresses it serves is
outside the scope of this discussion). Each gateway machine has an IP
address on the interconnecting network (usually the Internet), which
may be dynamically allocated and may change at any point.

Each gateway knows the RSA public keys of the other gateways with
which it wishes to communicate. The mechanism by which this happens is
outside the scope of this discussion. There exists a means by which
each gateway can look up the probable IP address of any other.

*** Protocol goals:

The ultimate goal of the protocol is for the originating gateway
machine to be able to forward packets from its section of the private
network to the appropriate gateway machine for the destination
machine, in such a way that it can be sure that the packets are being
sent to the correct destination machine, the destination machine can
be sure that the source of the packets is the originating gateway
machine, and the contents of the packets cannot be understood other
than by the two communicating gateways.

XXX not sure about the address-change stuff; leave it out of the first
version of the protocol. From experience, IP addresses seem to be
quite stable so the feature doesn't gain us much.

**** Protocol sub-goal 1: establish a shared key

Definitions:

A is the originating gateway machine name
B is the destination gateway machine name
A+ and B+ are the names with optional additional data, see below
PK_A is the public RSA key of A
PK_B is the public RSA key of B
PK_A^-1 is the private RSA key of A
PK_B^-1 is the private RSA key of B
x is the fresh private DH key of A
y is the fresh private DH key of B
k is g^xy mod m
g and m are generator and modulus for Diffie-Hellman
nA is a nonce generated by A
nB is a nonce generated by B
iA is an index generated by A, to be used in packets sent from B to A
iB is an index generated by B, to be used in packets sent from A to B
i? is appropriate index for receiver

Note that 'i' may be re-used from one session to the next, whereas 'n'
is always fresh.
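
As a worked example of the definition of k, the following sketch uses
deliberately tiny, insecure toy values for m, g, x and y (chosen here
purely for illustration) to show that both ends arrive at the same
key from the values exchanged in messages 3 and 4.

```python
# Toy Diffie-Hellman parameters; real groups are far larger.
m = 23  # modulus
g = 5   # generator

x = 4   # A's fresh private DH key (toy value)
y = 3   # B's fresh private DH key (toy value)

gx = pow(g, x, m)  # g^x mod m, sent by A in message 3
gy = pow(g, y, m)  # g^y mod m, sent by B in message 4

k_at_a = pow(gy, x, m)  # A computes (g^y)^x mod m
k_at_b = pow(gx, y, m)  # B computes (g^x)^y mod m
```

Both `k_at_a` and `k_at_b` equal g^xy mod m, which is the shared key k.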

The optional additional data after the sender's name consists of some
initial subset of the following list of items:
 * A 32-bit integer with a set of capability flags, representing the
   abilities of the sender.
 * More data which is yet to be defined and which must be ignored
   by receivers.
The optional additional data after the receiver's name is not
currently used.  If any is seen, it must be ignored.

Capability flag bits must be in one of the following two categories:

1. Early capability flags must be advertised in MSG1 or MSG2, as
   applicable.  If MSG3 or MSG4 advertise any "early" capability bits,
   MSG1 or MSG2 (as applicable) must have advertised them too.  Sadly,
   advertising an early capability flag will produce MSG1s which are
   not understood by versions of secnet which predate the capability
   mechanism.

2. Late capability flags are advertised in MSG2 or MSG3, as
   applicable.  They may also appear in MSG1, but this is not
   guaranteed.  MSG4 must advertise the same set as MSG2.

No capability flags are currently defined.  Unknown capability flags
should be treated as late ones.
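
The two consistency rules can be expressed as bit-mask checks. This is
a sketch only: since no flags are defined yet, `EARLY_MASK` is a
purely hypothetical set of early bits, and the function names are made
up. It assumes the earlier message to compare against is the one sent
by the same party (MSG1 for MSG3, MSG2 for MSG4). Any bit outside the
mask is treated as late, matching the rule for unknown flags.

```python
EARLY_MASK = 0x00000001  # hypothetical: bits designated "early"


def early_flags_consistent(first_caps: int, later_caps: int,
                           early_mask: int = EARLY_MASK) -> bool:
    """Every early bit in the sender's later message (MSG3/MSG4)
    must already have appeared in its first message (MSG1/MSG2)."""
    return (later_caps & early_mask) & ~first_caps == 0


def late_flags_match(msg2_caps: int, msg4_caps: int,
                     early_mask: int = EARLY_MASK) -> bool:
    """MSG4 must advertise the same set of late flags as MSG2."""
    late_mask = 0xFFFFFFFF & ~early_mask
    return (msg2_caps & late_mask) == (msg4_caps & late_mask)
```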


Messages:

1) A->B: *,iA,msg1,A+,B+,nA

The '*' index must be encoded as 0.  (However, it is permitted for a
site to use zero as its "index" for another site.)

2) B->A: iA,iB,msg2,B+,A+,nB,nA

(The order of B and A reverses in alternate messages so that the same
code can be used to construct them...)

3) A->B: {iB,iA,msg3,A+,B+,[chosen-transform],nA,nB,g^x mod m}_PK_A^-1

If message 1 was a replay then A will not generate message 3, because
it doesn't recognise nA.

If message 2 was from an attacker then B will not generate message 4,
because it doesn't recognise nB.

4) B->A: {iA,iB,msg4,B+,A+,nB,nA,g^y mod m}_PK_B^-1

At this point, A and B share a key, k. B must keep retransmitting
message 4 until it receives a packet encrypted using key k.

5) A->B: iB,iA,msg5,(ping/msg5)_k

6) B->A: iA,iB,msg6,(pong/msg6)_k

(Note that these are encrypted using the same transform that's used
for normal traffic, so they include sequence number, MAC, etc.)

The ping and pong messages can be used by either end of the tunnel at
any time, but using msg0 as the unencrypted message type indicator.
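
The remark that the order of A and B reverses so the same code can
build each message can be illustrated as follows. This sketch only
shows the field order of messages 1 and 2 from the sender's point of
view; it does not produce the wire encoding, and the function name
`setup_fields` and the string placeholders are assumptions. Messages 3
and 4 follow the same pattern with extra fields and a signature.

```python
def setup_fields(msg_type, remote_index, local_index,
                 local_name, remote_name,
                 local_nonce, remote_nonce=None):
    """Return the fields of a setup message in transmission order:
    receiver's index, sender's index, type, sender's name,
    receiver's name, sender's nonce, then the receiver's nonce
    if it is already known."""
    fields = [remote_index, local_index, msg_type,
              local_name, remote_name, local_nonce]
    if remote_nonce is not None:
        fields.append(remote_nonce)
    return tuple(fields)


# Message 1, built by A: B's index is not yet known, so it is 0.
msg1 = setup_fields("msg1", 0, "iA", "A+", "B+", "nA")

# Message 2, built by B with the same code, roles swapped.
msg2 = setup_fields("msg2", "iA", "iB", "B+", "A+", "nB", "nA")
```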

**** Protocol sub-goal 2: end the use of a shared key

7) i?,i?,msg0,(end-session/msg7,A,B)_k

This message can be sent by either party. Once sent, k can be
forgotten. Once received and checked, k can be forgotten. No need to
retransmit or confirm reception. It is suggested that this message be
sent when a key times out, or the tunnel is forcibly terminated for
some reason.

**** Protocol sub-goal 3: send a packet

8) i?,i?,msg0,(send-packet/msg9,packet)_k

**** Other messages

9) i?,i?,NAK (NAK is encoded as zero)

If the link-layer can't work out what to do with a packet (session has
gone away, etc.) it can transmit a NAK back to the sender.

This can alert the sender to the situation where the sender has a key
but the receiver doesn't (eg because it has been restarted).  The
sender, on receiving the NAK, will try to initiate a key exchange.

Forged (or overly delayed) NAKs can cause wasted resources due to
spurious key exchange initiation, but there is a limit on this because
of the key exchange retry timeout.

10) i?,i?,msg8,A,B,nA,nB,msg?

This is an obsolete form of NAK packet which is not sent by any even
vaguely recent version of secnet.  (In fact, there is no evidence in
the git history of it ever being sent.)

This message number is reserved.

11) *,*,PROD,A,B

Sent in response to a NAK from B to A.  Requests that B initiates a
key exchange with A, if B is willing and lacks a transport key for A.
(If B doesn't have A's address configured, implicitly supplies A's
public address.)

This is necessary because if one end of a link (B) is restarted while
a key exchange is in progress, the following bad state can persist:
the non-restarted end (A) thinks that the key is still valid and keeps
sending packets, but B either doesn't realise that a key exchange with
A is necessary or (if A is a mobile site) doesn't know A's public IP
address.

Normally in these circumstances B would send NAKs to A, causing A to
initiate a key exchange.  However if A and B were already in the
middle of a key exchange then A will not want to try another one until
the first one has timed out ("setup-timeout" x "setup-retries") and then
the key exchange retry timeout ("wait-time") has elapsed.

However if B's setup has timed out, B would be willing to participate
in a key exchange initiated by A, if A could be induced to do so.
This is the purpose of the PROD packet.

We send no more PRODs than we would want to send data packets, to
avoid a traffic amplification attack.  We also send them only in state
WAIT, as in other states we wouldn't respond favourably.  And we only
honour them if we don't already have a key.

With PROD, the period of broken communication due to a key exchange
interrupted by a restart is limited to the key exchange total
retransmission timeout, rather than also including the key exchange
retry timeout.
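
To make the difference concrete, the two bounds can be computed from
the configuration values. The function name and the numbers used
below are hypothetical, chosen only to show the arithmetic; they are
not taken from any real configuration, and the unit is whatever the
three settings share.

```python
def outage_bounds(setup_timeout, setup_retries, wait_time):
    """Return (with_prod, without_prod): upper bounds on the period
    of broken communication after an interrupted key exchange."""
    total_retransmit = setup_timeout * setup_retries
    with_prod = total_retransmit                 # PROD skips the wait
    without_prod = total_retransmit + wait_time  # must also wait
    return with_prod, without_prod


# Hypothetical settings: 2 time units per try, 5 tries, wait of 20.
with_prod, without_prod = outage_bounds(2, 5, 20)
```

With these illustrative numbers the bound drops from 30 time units to
10 when PROD is in use.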
