+.SH "CHARACTER ENCODING"
+All data sent by both server and client is encoded using UTF-8.
+Moreover it must be valid UTF-8, i.e. non-minimal sequences are not
+permitted, nor are surrogates, nor are code points outside the
+Unicode code space.
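+.PP
+As a rough illustration only, a client could check outgoing or incoming bytes
+against these rules before using them.
+The sketch below is in Python 3 and relies on its strict UTF-8 decoder, which
+rejects non-minimal sequences, surrogates and code points beyond U+10FFFF.
+The name is_valid_utf8 is hypothetical and is not part of the protocol or of
+any existing API.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is strictly valid UTF-8.

    Hypothetical helper for illustration: it delegates to Python 3's
    strict decoder, which refuses overlong (non-minimal) encodings,
    encoded surrogates, and code points above U+10FFFF.
    """
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True
```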
+.PP
+There are no particular normalization requirements on either side of the
+protocol.
+The server currently converts to NFC internally, but clients should not rely
+on this.
+A client that needs a particular normalized form for further processing must
+normalize the responses itself.
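+.PP
+A minimal sketch of client-side normalization, assuming a Python 3 client that
+has already decoded a response to a string.
+The function name normalize_response is hypothetical; unicodedata.normalize is
+the standard library call that does the work.

```python
import unicodedata


def normalize_response(text: str, form: str = "NFC") -> str:
    """Return text converted to the requested Unicode normalization form.

    Hypothetical client-side helper: the protocol imposes no
    normalization form, so the client picks whichever one it needs.
    """
    return unicodedata.normalize(form, text)
```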
+.PP
+The various characters which divide up lines may not be followed by combining
+characters.
+For instance all of the following are prohibited:
+.TP
+.B o
+LINE FEED followed by a combining character.
+For example the sequence LINE FEED, COMBINING GRAVE ACCENT is never permitted.
+.TP
+.B o
+APOSTROPHE or QUOTATION MARK followed by a combining character when used to
+delimit fields.
+For instance a line starting APOSTROPHE, COMBINING CEDILLA is prohibited.
+.IP
+Note that such sequences are not prohibited when the quote character cannot be
+interpreted as a field delimiter.
+For instance APOSTROPHE, REVERSE SOLIDUS, APOSTROPHE, COMBINING CEDILLA,
+APOSTROPHE would be permitted.
+.TP
+.B o
+REVERSE SOLIDUS (BACKSLASH) followed by a combining character in a quoted
+string when it is the first character of an escape sequence.
+For instance a line starting APOSTROPHE, REVERSE SOLIDUS, COMBINING TILDE
+is prohibited.
+.IP
+As above, such sequences are not prohibited when the character is not being
+used to start an escape sequence.
+For instance APOSTROPHE, REVERSE SOLIDUS, REVERSE SOLIDUS, COMBINING TILDE,
+APOSTROPHE is permitted.
+.TP
+.B o
+Any of the field-splitting whitespace characters followed by a combining
+character when not part of a quoted field.
+For instance a line starting COLON, SPACE, COMBINING CANDRABINDU is prohibited.
+.IP
+As above, non-delimiter uses are permitted.
+.TP
+.B o
+FULL STOP followed by a combining character when used to quote or delimit a
+body.
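+.PP
+The rules above depend on parsing context, so an exact check needs a real
+parser.
+The Python 3 sketch below is deliberately conservative instead: it reports
+every position where a possible delimiter is followed by a combining
+character, whether or not the delimiter is acting as one.
+Two things here are assumptions and do not come from the protocol: the
+delimiter set (in particular which whitespace characters split fields) and
+the choice to treat any character in a Unicode Mark category as combining.
+All names are hypothetical.

```python
import unicodedata

# Assumption: characters that may act as delimiters. The exact set of
# field-splitting whitespace is a guess, so pass your own set if it differs.
SUSPECT_DELIMITERS = frozenset("\n \t'\"\\.")


def is_combining(ch: str) -> bool:
    """Treat any character in a Mark category (Mn, Mc, Me) as combining."""
    return unicodedata.category(ch).startswith("M")


def suspect_positions(text: str, delimiters=SUSPECT_DELIMITERS) -> list:
    """Return indexes of possible delimiters followed by a combining character.

    Conservative: context is ignored, so sequences that the rules
    permit (for example inside a quoted field) are reported too.
    """
    return [
        i
        for i in range(len(text) - 1)
        if text[i] in delimiters and is_combining(text[i + 1])
    ]
```

+.PP
+A client that never produces such a position in what it sends stays clear of
+all the prohibited cases, at the cost of also avoiding some permitted ones.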
+.PP
+Furthermore none of these characters are permitted to appear in the context of
+a canonical decomposition (i.e. they must still be present when converted to
+NFC).
+In practice, however, this is not an issue in Unicode 5.0.
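+.PP
+The NFC condition can be sketched in Python 3 as a check that a delimiter is
+still the first character after the delimiter and a following combining
+character are normalized together.
+The name survives_nfc is hypothetical, and the result reflects the Unicode
+version shipped with the Python interpreter, not necessarily Unicode 5.0.

```python
import unicodedata


def survives_nfc(delimiter: str, mark: str) -> bool:
    """Return True if delimiter is still present after NFC normalization.

    Hypothetical helper: if the pair composes into a single precomposed
    character, the delimiter disappears and this returns False.
    """
    composed = unicodedata.normalize("NFC", delimiter + mark)
    return composed.startswith(delimiter)
```

+.PP
+For comparison, EQUALS SIGN followed by COMBINING LONG SOLIDUS OVERLAY composes
+to NOT EQUAL TO under NFC, so the check fails for that pair.
+EQUALS SIGN is not a protocol delimiter and is mentioned only to show what a
+failing case looks like.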
+.PP
+These rules are consistent with the observation that the split() function is
+essentially a naive ASCII parser.
+The implication is not that these sequences never actually appear in
+the protocol, merely that the server is not required to honor them in
+any useful way nor be consistent between versions.
+In current versions the result will be lines and fields that start with
+combining characters and are not necessarily split where you expect.
+Future versions may remove them, reject them or ignore some or all of the
+delimiters that have following combining characters, and no notice will be
+given of any change.