[PATCH 21/21] site: Negotiate (configurable) MTU

Ian Jackson ijackson at chiark.greenend.org.uk
Thu Apr 24 02:37:40 BST 2014

Calculate the inter-site mtu for each virtual link to a peer site, and
honour it.  This is the maximum of our desired mtu, and the peer's.

Here our desired MTU is normally our local kernel netlink mtu, but it
can be configured separately.

Extend MSG3/MSG4 optional data to contain an MTU advertisement.

The result is that overall it is possible to set an mtu-target which
is lower than the vpn's conventional mtu and expect shorter
on-the-wire packets to a particular site - provided that site's peers
are all upgraded.

In this patch we also update NOTES for the new protocol and README for
the new configuration option.

Signed-off-by: Ian Jackson <ijackson at chiark.greenend.org.uk>
 NOTES                    |   41 +++++++++++++++++++++++++++++++++++++++++
 README                   |   15 ++++++++++++++-
 debian/changelog         |    1 +
 site.c                   |   37 +++++++++++++++++++++++++++++++++++--
 test-example/inside.conf |    1 +
 5 files changed, 92 insertions(+), 3 deletions(-)

diff --git a/NOTES b/NOTES
index 40cbf04..f5ebc65 100644
--- a/NOTES
+++ b/NOTES
@@ -198,6 +198,8 @@ The optional additional data after the sender's name consists of some
 initial subset of the following list of items:
  * A 32-bit integer with a set of capability flags, representing the
    abilities of the sender.
+ * In MSG3/MSG4: a 16-bit integer being the sender's MTU, or zero.
+   (In other messages: nothing.)  See below.
  * More data which is yet to be defined and which must be ignored
    by receivers.
 The optional additional data after the receiver's name is not
@@ -220,6 +222,45 @@ No capability flags are currently defined.  Unknown capability flags
 should be treated as late ones.
+MTU handling
+In older versions of secnet, secnet was not capable of fragmentation
+or sending ICMP Frag Needed.  Administrators were expected to configure
+consistent MTUs across the network.
+It is still the case in the current version that the MTUs need to be
+configured reasonably coherently across the network: the allocated
+buffer sizes must be sufficient to cope with packets from all other
+However, provided the buffers are sufficient, all packets will be
+processed properly: a secnet receiving a packet larger than the
+applicable MTU for its delivery will either fragment it, or reject it
+with ICMP Frag Needed.
+The MTU additional data field allows secnet to advertise an MTU to the
+peer.  This allows the sending end to handle overlarge packets, before
+they are transmitted across the underlying public network.  This can
+therefore be used to work around underlying network braindamage
+affecting large packets.
+If the MTU additional data field is zero or not present, then the peer
+should use locally-configured MTU information (normally, its local
+netlink MTU) instead.
+If it is nonzero, the peer may send packets up to the advertised size
+(and if that size is bigger than the peer's administratively
+configured size, the advertiser promises that its buffers can handle
+such a large packet).
+A secnet instance should not assume that just because it has
+advertised an mtu which is lower than usual for the vpn, the peer will
+honour it, unless the administrator knows that the peers are
+sufficiently modern to understand the mtu advertisement option.  So
+secnet will still accept packets which exceed the link MTU (whether
+negotiated or assumed).
 1) A->B: *,iA,msg1,A+,B+,nA
diff --git a/README b/README
index 71a5a44..94334b3 100644
--- a/README
+++ b/README
@@ -329,6 +329,19 @@ site: dict argument
     check that there are no links both ends of which are allegedly
     mobile (which is not supported, so those links are ignored) and
     to change some of the tuning parameter defaults. [false]
+  mtu-target (integer): Desired value of the inter-site MTU for this
+    peering.  This value will be advertised to the peer (which ought
+    to affect incoming packets), and if the peer advertises an MTU its
+    value will be combined with this setting to compute the inter-site
+    MTU.  (secnet will still accept packets which exceed the
+    (negotiated or assumed) inter-site MTU.)  Setting a lower
+    inter-site MTU can be used to try to restrict the sizes of the
+    packets sent over the underlying public network (e.g. to work
+    around network braindamage).  It is not normally useful to set a
+    larger value for mtu-target than the VPN's general MTU (which
+    should be reflected in the local private interface MTU, ie the mtu
+    parameter to netlink).  If this parameter is not set, or is set
+    to 0, the default is to use the local private link mtu.
 Links involving mobile peers have some different tuning parameter
 default values, which are generally more aggressive about retrying key
@@ -375,7 +388,7 @@ a netlink closure:
       other tunnels as well as the host (used for mobile devices like laptops)
     soft: remove these routes from the host's routing table when
       the tunnel link quality is zero
-  mtu (integer): default MTU over this link; may be updated by tunnel code
+  mtu (integer): MTU of host's tunnel interface
 Netlink will dump its current routing table to the system/log on
 receipt of SIGUSR1.
diff --git a/debian/changelog b/debian/changelog
index 46645f4..0c8249e 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -10,6 +10,7 @@ secnet (0.3.1~~unstable) unstable; urgency=low
   * SECURITY: Correctly set "unused" ICMP header field.
   * Do not send ICMP errors in response to unknown incoming ICMP.
   * SECURITY: Fix IP length check not to crash on very short packets.
+  * Make the inter-site MTU configurable, and negotiate it with the peer.
diff --git a/site.c b/site.c
index e6c34c8..f87328f 100644
--- a/site.c
+++ b/site.c
@@ -234,6 +234,7 @@ struct site {
     string_t tunname; /* localname<->remotename by default, used in logs */
     string_t address; /* DNS name for bootstrapping, optional */
     int remoteport; /* Port for bootstrapping, optional */
+    uint32_t mtu_target;
     struct netlink_if *netlink;
     struct comm_if **comms;
     int ncomms;
@@ -282,6 +283,7 @@ struct site {
        timeout before we can listen for another setup packet); perhaps
        we should keep a list of 'bad' sources for setup packets. */
     uint32_t remote_capabilities;
+    uint16_t remote_adv_mtu;
     struct transform_if *chosen_transform;
     uint32_t setup_session_id;
     transport_peers setup_peers;
@@ -386,6 +388,14 @@ static void dispose_transform(struct transform_inst_if **transform_var)
     type=buf_unprepend_uint32((b)); \
     if (type!=(t)) return False; } while(0)
+static _Bool type_is_msg34(uint32_t type)
+    return
+	type == LABEL_MSG3 ||
+	type == LABEL_MSG3BIS ||
+	type == LABEL_MSG4;
 struct parsedname {
     int32_t len;
     uint8_t *name;
@@ -399,6 +409,7 @@ struct msg {
     struct parsedname remote;
     struct parsedname local;
     uint32_t remote_capabilities;
+    uint16_t remote_mtu;
     int capab_transformnum;
     uint8_t *nR;
     uint8_t *nL;
@@ -487,6 +498,9 @@ static bool_t generate_msg(struct site *st, uint32_t type, cstring_t what)
     if ((st->local_capabilities & CAPAB_EARLY) || (type != LABEL_MSG1)) {
+    if (type_is_msg34(type)) {
+	buf_append_uint16(&st->buffer,st->mtu_target);
+    }
@@ -542,10 +556,15 @@ static bool_t unpick_msg(struct site *st, uint32_t type,
     if (!unpick_name(msg,&m->remote)) return False;
+    m->remote_mtu=0;
     if (m->remote.extrainfo.size) {
+    if (type_is_msg34(type) && m->remote.extrainfo.size) {
+	CHECK_AVAIL(&m->remote.extrainfo,2);
+	m->remote_mtu=buf_unprepend_uint16(&m->remote.extrainfo);
+    }
     if (!unpick_name(msg,&m->local)) return False;
     if (type==LABEL_PROD) {
@@ -722,6 +741,8 @@ static bool_t process_msg3_msg4(struct site *st, struct msg *m)
+    st->remote_adv_mtu=m->remote_mtu;
     return True;
@@ -1145,7 +1166,15 @@ static void activate_new_key(struct site *st)
-    slog(st,LOG_ACTIVATE_KEY,"new key activated");
+    /* Compute the inter-site MTU.  This is min( our_mtu, their_mtu ).
+     * But their mtu be unspecified, in which case we just use ours. */
+    uint32_t intersite_mtu=
+	MIN(st->mtu_target, st->remote_adv_mtu ?: ~(uint32_t)0);
+    st->netlink->set_mtu(st->netlink->st,intersite_mtu);
+    slog(st,LOG_ACTIVATE_KEY,"new key activated"
+	 " (mtu ours=%"PRId32" theirs=%"PRId32" intersite=%"PRId32")",
+	 st->mtu_target, st->remote_adv_mtu, intersite_mtu);
@@ -1765,6 +1794,7 @@ static list_t *site_apply(closure_t *self, struct cloc loc, dict_t *context,
     st->setup_retries=        CFG_NUMBER("setup-retries", SETUP_RETRIES);
     st->setup_retry_interval= CFG_NUMBER("setup-timeout", SETUP_RETRY_INTERVAL);
     st->wait_timeout=         CFG_NUMBER("wait-time",     WAIT_TIME);
+    st->mtu_target= dict_read_number(dict,"mtu-target",False,"site",loc,0);
     st->mobile_peer_expiry= dict_read_number(
@@ -1833,7 +1863,10 @@ static list_t *site_apply(closure_t *self, struct cloc loc, dict_t *context,
     /* We need to register the remote networks with the netlink device */
-    st->netlink->reg(st->netlink->st, site_outgoing, st, 0);
+    uint32_t netlink_mtu; /* local virtual interface mtu */
+    st->netlink->reg(st->netlink->st, site_outgoing, st, &netlink_mtu);
+    if (!st->mtu_target)
+	st->mtu_target=netlink_mtu;
     for (i=0; i<st->ncomms; i++)
 	st->comms[i]->request_notify(st->comms[i]->st, st, site_incoming);
diff --git a/test-example/inside.conf b/test-example/inside.conf
index 32d1081..060a0bf 100644
--- a/test-example/inside.conf
+++ b/test-example/inside.conf
@@ -17,4 +17,5 @@ comm udp {
 local-name "test-example/inside/inside";
 local-key rsa-private("test-example/inside.key");
 local-mobile True;
+mtu-target 1260;
 include test-example/common.conf

More information about the sgo-software-discuss mailing list