X-Git-Url: https://www.chiark.greenend.org.uk/ucgi/~ianmdlvl/git?a=blobdiff_plain;f=NOTES;h=001c118e96f9401592560ea3332583b653ac2c55;hb=147b444d6faa9a621e33d653b7a72c29724203c3;hp=b2023817f2beb3c80a9cab908a9623f60bf58fff;hpb=4eac60b2aeaba0ff660fd9f58fcdce210592bac0;p=secnet.git diff --git a/NOTES b/NOTES index b202381..001c118 100644 --- a/NOTES +++ b/NOTES @@ -1,24 +1,452 @@ -&= => subdir -&_ => subdir_ -&/ => subdir/ -&CAPS => subdir_CAPS -&lc => subdir/lc - -&.= &._ &./ &.CAPS &.lc => $(top_srcdir)/subdir -&:= &:_ &:/ &:CAPS &:lc => $(abs_top_srcdir)/subdir -& thing thing => like &thing &thing (until EOL) - -&& => & - -&! spc disables & *until* EOL - -&!STUFF STUFF is recognised instead of & (beyond EOL) - STUFF is either all punct or all alphanum (incl _) - any lwsp after STUFF is discarded too -eg notably - STUFF!& now & is recognised instead (ie back to normal) - STUFFSTUFF STUFF - -eg - &!@@@ @@@ is recognised instead of & - @@@!& go back to & +* Design of new, multi-subnet secnet protocol + +Like the first (1995/6) version, we're tunnelling IP packets inside +UDP packets. To defeat various restrictions which may be imposed on us +by network providers (like the prohibition of incoming TCP +connections) we're sticking with UDP for everything this time, +including key setup. This means we have to handle retries, etc. + +Other new features include being able to deal with subnets hidden +behind changing 'real' IP addresses, and the ability to choose +algorithms and keys per pair of communicating sites. + +** Configuration and structure + +[The original plan] + +The network is made up from a number of 'sites'. These are collections +of machines with private IP addresses. The new secnet code runs on +machines which have interfaces on the private site network and some +way of accessing the 'real' internet. + +Each end of a tunnel is identified by a name. Often it will be +convenient for every gateway machine to use the same name for each +tunnel endpoint, but this is not vital. Individual tunnels are +identified by their two endpoint names. + +[The new plan] + +It appears that people want to be able to use secnet on mobile +machines like laptops as well as to interconnect sites. In particular, +they want to be able to use their laptop in three situations: + +1) connected to their internal LAN by a cable; no tunnel involved +2) connected via wireless, using a tunnel to protect traffic +3) connected to some other network, using a tunnel to access the +internal LAN. + +They want the laptop to keep the same IP address all the time. + +Case (1) is simple. + +Case (2) requires that the laptop run a copy of secnet, and have a +tunnel configured between it and the main internal LAN default +gateway. secnet must support the concept of a 'soft' tunnel where it +adds a route and causes the gateway to do proxy-ARP when the tunnel is +up, and removes the route again when the tunnel is down. + +The usual prohibition of packets coming in from one tunnel and going +out another must be relaxed in this case (in particular, the +destination address of packets from these 'mobile station' tunnels may +be another tunnel as well as the host). + +(Quick sanity check: if chiark's secnet address was in +192.168.73.0/24, would this work properly? Yes, because there will be +an explicit route to it, and proxy ARP will be done for it. Do we want +packets from the chiark tunnel to be able to go out along other +routes? No. So, spotting a 'local' address in a remote site's list of +networks isn't sufficient to switch on routing for a site. We need an +explicit option. NB packets may be routed if the source OR the +destination is marked as allowing routing [otherwise packets couldn't +get back from eg. chiark to a laptop at greenend]). + +[the even newer plan] + +secnet sites are configured to grant access to particular IP address +ranges to the holder of a particular public key. The key can certify +other keys, which will then be permitted to use a subrange of the IP +address range of the certifying key. + +This means that secnet won't know in advance (i.e. at configuration +time) how many tunnels it might be required to support, so we have to +be able to create them (and routes, and so on) on the fly. + +** VPN-level configuration + +At a high level we just want to be able to indicate which groups of +users can claim ownership of which ranges of IP addresses. Assuming +these users (or their representatives) all have accounts on a single +machine, we can automate the submission of keys and other information +to make up a 'sites' file for the entire VPN. + +The distributed 'sites' file should be in a more restricted format +than the secnet configuration file, to prevent attackers who manage to +distribute bogus sites files from taking over their victim's machines. + +The distributed 'sites' file is read one line at a time. Each line +consists of a keyword followed by other information. It defines a +number of VPNs; within each VPN it defines a number of locations; +within each location it defines a number of sites. These VPNs, +locations and sites are turned into a secnet.conf file fragment using +a script. + +Some keywords are valid at any 'level' of the distributed 'sites' +file, indicating defaults. + +The keywords are: + +vpn n: we are now declaring information to do with VPN 'n'. Must come first. + +location n: we are now declaring information for location 'n'. + +site n: we are now declaring information for site 'n'. +endsite: we're finished declaring information for the current site + +restrict-nets a b c ...: restrict the allowable 'networks' for the current + level to those in this list. +end-definitions: prevent definition of further vpns and locations, and + modification of defaults at VPN level + +dh x y: the current VPN uses the specified group; x=modulus, y=generator + +hash x: which hash function to use. Valid options are 'md5' and 'sha1'. + +admin n: administrator email address for current level + +key-lifetime n +setup-retries n +setup-timeout n +wait-time n +renegotiate-time n + +address a b: a=dnsname, b=port +networks a b c ... +pubkey x y z: x=keylen, y=encryption key, z=modulus +mobile: declare this to be a 'mobile' site + +** Logging etc. + +There are several possible ways of running secnet: + +'reporting' only: --version, --help, etc. command line options and the +--just-check-config mode. + +'normal' run: perform setup in the foreground, and then background. + +'failed' run: setup in the foreground, and terminate with an error +before going to background. + +'reporting' modes should never output anything except to stdout/stderr. +'normal' and 'failed' runs output to stdout/stderr before +backgrounding, then thereafter output only to log destinations. + +** Site long-term keys + +We use authenticated DH. Sites identify themselves to each other +using long-term signing keys. + +These signing keys may be for a variety of algorithms. (An algorithm +specifies completely how to do a signature and verification.) + +Each site may have several keys. This helps support key rollover and +algorithm agility. Several keys of different algorithms can form a +key group. Usually a key group consists of keys generated at the same +time. A key is identified by a 4-byte group id (invented by its +publisher and opaque) plus a 1-byte algorithm id (defined by the +protocol spec for each algorithm). + +Keys are published in key sets. A key set is a collection of key +groups (including older keys as well as newer ones) published at a +particular time. Key sets have their own 4-byte ids; these are +invented by the publisher but are ordered using sequence number +arithmetic. This allows reliers to favour new sets over old ones. + +Within each key set, some groups may be marked as `fallback'. This +means a group that should be tolerated by a relier only if the relier +doesn't support any non-fallback keys. + +Keys within groups, and groups within sets, are ordered (by the +publisher of the set), from most to least preferred. + +When deciding which public keys to accept, a relier should: + Process each group within the key set. + Discard unknown algorithms. + Choose a preferred algorithm: + Earliest in the group + (or local config could have algorithm prefererence). + Discard empty groups. + Discard unneeded fallback groups: + If any (non-empty) non-fallback groups found, discard all + fallback groups. Otherwise there are only fallback groups; + discard all but first group in the set. + Discard any keys exceeding limit on number of keys honoured: + Limit is at least 4 + Discard keys later in the set + In wire protocol, offer the resulting subset of keyids to + the peer and a allow the signer to select which key to use + from that subset. + +In configuration and key management, long-term private and public keys +are octet strings. Private keys are generally stored in disk files, +one key per file. The octet string for a private key should identify +the algorithm so that passing the private key to the code for the +wrong algorithm does not produce results which would leak or weaken +the key. The octet string for a public key need not identify the +algorithm; when it's loaded the algorithm will be known from context. + +The group id 00000000 is special. It should contain only one key, +algorithm 00. Key 0000000000 refers to the rsa1 key promulgated +before the key rollover/advertisement protocols, or the key which +should be used by sites running old software. + +The key set id 00000000 is special and is considered older than all +othere key sets (ie this is an exception to the sequence number +arithmetic). It is the implied key set id of the rsa1 key +promulgated before the key rollover/advertisement protocols. + +The algorithm 00 is special and refers to the old rsa1 signature +protocol but unusually does not identify the hash function. The hash +function is conventional and must be specified out of band. In known +existing installations it is SHA-1. + +** Protocols + +*** Protocol environment: + +Each gateway machine serves a particular, well-known set of private IP +addresses (i.e. the agreement over which addresses it serves is +outside the scope of this discussion). Each gateway machine has an IP +address on the interconnecting network (usually the Internet), which +may be dynamically allocated and may change at any point. + +Each gateway knows the RSA public keys of the other gateways with +which it wishes to communicate. The mechanism by which this happens is +outside the scope of this discussion. There exists a means by which +each gateway can look up the probable IP address of any other. + +*** Protocol goals: + +The ultimate goal of the protocol is for the originating gateway +machine to be able to forward packets from its section of the private +network to the appropriate gateway machine for the destination +machine, in such a way that it can be sure that the packets are being +sent to the correct destination machine, the destination machine can +be sure that the source of the packets is the originating gateway +machine, and the contents of the packets cannot be understood other +than by the two communicating gateways. + +XXX not sure about the address-change stuff; leave it out of the first +version of the protocol. From experience, IP addresses seem to be +quite stable so the feature doesn't gain us much. + +**** Protocol sub-goal 1: establish a shared key + +Definitions: + +A is the originating gateway machine name +B is the destination gateway machine name +A+ and B+ are the names with optional additional data, see below +PK_A is the public RSA key of A +PK_B is the public RSA key of B +PK_A^-1 is the private RSA key of A +PK_B^-1 is the private RSA key of B +x is the fresh private DH key of A +y is the fresh private DH key of B +k is g^xy mod m +g and m are generator and modulus for Diffie-Hellman +nA is a nonce generated by A +nB is a nonce generated by B +iA is an index generated by A, to be used in packets sent from B to A +iB is an index generated by B, to be used in packets sent from A to B +i? is appropriate index for receiver + +Note that 'i' may be re-used from one session to the next, whereas 'n' +is always fresh. + +The optional additional data after the sender's name consists of some +initial subset of the following list of items: + * A 32-bit integer with a set of capability flags, representing the + abilities of the sender. + * In MSG3/MSG4: a 16-bit integer being the sender's MTU, or zero. + (In other messages: nothing.) See below. + * In MSG2/MSG3: a list of the peer's public keys that the sender will + accept: (i) a 1-byte integer count (ii) that many 5-byte key ids. + If not present, implicitly only the special key id 0000000000. + * In MSG3/MSG4: an 8-bit integer being an index into the + receiver's public key acceptance list, with which the message + is signed. If not present, implicitly the key id 00000000000. + * More data which is yet to be defined and which must be ignored + by receivers. +The optional additional data after the receiver's name is not +currently used. If any is seen, it must be ignored. + +Capability flag bits must be in one the following two categories: + +1. Early capability flags must be advertised in MSG1 or MSG2, as + applicable. If MSG3 or MSG4 advertise any "early" capability bits, + MSG1 or MSG3 (as applicable) must have advertised them too. + +2. Late capability flags may be advertised only in MSG2 or MSG3, as + applicable. They are only in MSG1 with newer secnets; older + versions omit them. MSG4 must advertise the same set as MSG2. + +Currently, the low 16 bits are allocated for negotiating bulk-crypto +transforms. Bits 8 to 15 are used by Secnet as default capability +numbers for the various kinds of transform closures: bit 8 is for the +original CBCMAC-based transform, and bit 9 for the new EAX transform; +bits 10 to 15 are reserved for future expansion. The the low eight bits +are reserved for local use, e.g., to allow migration from one set of +parameters for a particular transform to a different, incompatible set +of parameters for the same transform. Bit 31, if advertised by both +ends, indicates that a mobile end gets priority in case of crossed MSG1. +The remaining bits have not yet been assigned a purpose. + +Whether a capability number is early depends on its meaning, rather than +being a static property of its number. That said, the mobile-end-gets +priority bit (31) is always sent as an `early' capability bit. + + +MTU handling + +In older versions of secnet, secnet was not capable of fragmentation +or sending ICMP Frag Needed. Administrators were expected to configure +consistent MTUs across the network. + +It is still the case in the current version that the MTUs need to be +configured reasonably coherently across the network: the allocated +buffer sizes must be sufficient to cope with packets from all other +peers. + +However, provided the buffers are sufficient, all packets will be +processed properly: a secnet receiving a packet larger than the +applicable MTU for its delivery will either fragment it, or reject it +with ICMP Frag Needed. + +The MTU additional data field allows secnet to advertise an MTU to the +peer. This allows the sending end to handle overlarge packets, before +they are transmitted across the underlying public network. This can +therefore be used to work around underlying network braindamage +affecting large packets. + +If the MTU additional data field is zero or not present, then the peer +should use locally-configured MTU information (normally, its local +netlink MTU) instead. + +If it is nonzero, the peer may send packets up to the advertised size +(and if that size is bigger than the peer's administratively +configured size, the advertiser promises that its buffers can handle +such a large packet). + +A secnet instance should not assume that just because it has +advertised an mtu which is lower than usual for the vpn, the peer will +honour it, unless the administrator knows that the peers are +sufficiently modern to understand the mtu advertisement option. So +secnet will still accept packets which exceed the link MTU (whether +negotiated or assumed). + + +Messages: + +1) A->B: i*,iA,msg1,A+,B+,nA + +i* must be encoded as 0. (However, it is permitted for a site to use +zero as its "index" for another site.) + +2) B->A: iA,iB,msg2,B+,A+,nB,nA + +(The order of B and A reverses in alternate messages so that the same +code can be used to construct them...) + +3) A->B: {iB,iA,msg3,A+,B+,[chosen-transform],nA,nB,g^x mod m}_PK_A^-1 + +If message 1 was a replay then A will not generate message 3, because +it doesn't recognise nA. + +If message 2 was from an attacker then B will not generate message 4, +because it doesn't recognise nB. + +4) B->A: {iA,iB,msg4,B+,A+,nB,nA,g^y mod m}_PK_B^-1 + +At this point, A and B share a key, k. B must keep retransmitting +message 4 until it receives a packet encrypted using key k. + +5) A: iB,iA,msg5,(ping/msg5)_k + +6) B: iA,iB,msg6,(pong/msg6)_k + +(Note that these are encrypted using the same transform that's used +for normal traffic, so they include sequence number, MAC, etc.) + +The ping and pong messages can be used by either end of the tunnel at +any time, but using msg0 as the unencrypted message type indicator. + +**** Protocol sub-goal 2: end the use of a shared key + +7) i?,i?,msg0,(end-session/msg7,A,B)_k + +This message can be sent by either party. Once sent, k can be +forgotten. Once received and checked, k can be forgotten. No need to +retransmit or confirm reception. It is suggested that this message be +sent when a key times out, or the tunnel is forcibly terminated for +some reason. + +**** Protocol sub-goal 3: send a packet + +8) i?,i?,msg0,(send-packet/msg9,packet)_k + +**** Other messages + +9) i?,i?,NAK (NAK is encoded as zero) + +If the link-layer can't work out what to do with a packet (session has +gone away, etc.) it can transmit a NAK back to the sender. + +This can alert the sender to the situation where the sender has a key +but the receiver doesn't (eg because it has been restarted). The +sender, on receiving the NAK, will try to initiate a key exchange. + +Forged (or overly delayed) NAKs can cause wasted resources due to +spurious key exchange initiation, but there is a limit on this because +of the key exchange retry timeout. + +10) i?,i?,msg8,A,B,nA,nB,msg? + +This is an obsolete form of NAK packet which is not sent by any even +vaguely recent version of secnet. (In fact, there is no evidence in +the git history of it ever being sent.) + +This message number is reserved. + +11) *,*,PROD,A,B + +Sent in response to a NAK from B to A. Requests that B initiates a +key exchange with A, if B is willing and lacks a transport key for A. +(If B doesn't have A's address configured, implicitly supplies A's +public address.) + +This is necessary because if one end of a link (B) is restarted while +a key exchange is in progress, the following bad state can persist: +the non-restarted end (A) thinks that the key is still valid and keeps +sending packets, but B either doesn't realise that a key exchange with +A is necessary or (if A is a mobile site) doesn't know A's public IP +address. + +Normally in these circumstances B would send NAKs to A, causing A to +initiate a key exchange. However if A and B were already in the +middle of a key exchange then A will not want to try another one until +the first one has timed out ("setup-time" x "setup-retries") and then +the key exchange retry timeout ("wait-time") has elapsed. + +However if B's setup has timed out, B would be willing to participate +in a key exchange initiated by A, if A could be induced to do so. +This is the purpose of the PROD packet. + +We send no more PRODs than we would want to send data packets, to +avoid a traffic amplification attack. We also send them only in state +WAIT, as in other states we wouldn't respond favourably. And we only +honour them if we don't already have a key. + +With PROD, the period of broken communication due to a key exchange +interrupted by a restart is limited to the key exchange total +retransmission timeout, rather than also including the key exchange +retry timeout.