--- /dev/null
+OCB3 test vectors for larger block sizes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This distribution contains (what I intend to be) a reference
+implementation for OCB3 with unusual block sizes (i.e., not 128 bits).
+
+
+Extending OCB3
+
+Unfortunately, defining OCB3 for a new block size is not trivial. Here
+are the details which need to be addressed for a new block size w.
+
+ * OCB3 does arithmetic in the finite field GF(2^w). It consistently
+ uses a polynomial basis representation, so GF(2^w) =~
+ GF(2)[x]/(p(x)) for an irreducible polynomial p. Choose p to be the
+ lexicographically earliest of those irreducible polynomials with
+ smallest degree.
+
+ * OCB3 uses a strongly XOR-universal hash to amortize the overhead of
+ encrypting nonces over several messages. This hash stretches a
+ w-bit string x into a 2 w bit string x || (x XOR x << c), and then
+ picks out i overlapping w-bit windows starting at position i, for 0
+ <= i < D for a limit D. For w = 128, D = 64 and c = 8. In the
+ general case, D is the largest power of two such that c exists, and
+ c is the largest possible value out of those for which c mod 8 is
+ minimal -- i.e., we prefer shifts which are byte-aligned over ones
+ which aren't, and subjet to that we prefer larger shifts to smaller
+ ones.
+
+ * OCB3 encodes the intended tag length t into the leftmost bits of the
+ nonce. The number of bits available for encoding the tag size is
+ ceil(log_2(w)): if w is a power of two, then t = w won't fit, and is
+ encoded as zero.
+
+
+Example block ciphers
+
+Not everyone necessarily has Rijndael-256 knocking about, so here's a
+simple block cipher for testing. It's four-round Luby--Rackoff with AES
+as the round function, but with all of the details specified. Four
+rounds is enough for strong-PRP security, but don't use this to extend
+block size in real life because you lose when there are internal
+collisions in the Luby--Rackoff construction, and that happens much
+earlier than you'd hope for a blockcipher of this width. Also, it's
+slow.
+
+As usual, let || denote concatenation of bitstrings. For an integer a
+in {0, 1, ..., 2^w - 1}, let [a]_w be w-bit big-endian encoding of a,
+i..e, the unique bitstring a_{w-1} || ... || a_1 || a_0 such that a =
+SUM_{0<=i<w} a_i 2^i and each a_i in {0, 1} is a bit. For a bitstring
+x, let l(x) be the length of x; if 0 <= i <= j <= l(x), then x[i..j] is
+the (j - i)-bit substring of x, starting from position i, where the
+leftmost bit is at position 0; hence, x = x[0..i] || x[i..j] ||
+x[j..l(x)].
+
+Let w be the width of the blockcipher E that you already have, in bits.
+Let w' <= 2 w be the width you wanted; I'm going to assume that w' is
+even. Let K be a key suitable for E, and let k = l(K) be the length of
+K in bits. Set n = ceil(k/w) and define K_i = (E(K; [n i]_w) ||
+E(K; [n i + 1]_w) || ... || E(K; [n i + n - 1]_w))[0..k] for
+0 <= i < 4.
+
+Given a w'-bit block x, set L_0 = x[0..w'/2], R_0 = x[w'/2..w'], and
+then
+
+ L_{i+1} = R_i XOR E(K_i; L_i || 0^{w-w'/2})[0..w'/2]
+ R_{i+1} = L_i
+
+Then E'(K, x) = R_4 || L_4. (As usual, there's no final swap in this
+Feistel network.)
+
+Define LRAESw' to be E', with the given w' and E = AES (so w = 128).
+Examples with intermediate results are included in the tarball, as
+`lraesW.verbose'. Furthermore, to obtain a 512-bit block cipher, define
+DLRAESw'' to be E'', with the given w'', w' = 256, and E = AES (so w =
+128).
+
+
+Test vector files
+
+I used the following blockciphers:
+
+ * 64-bit: 3DES and LRAES64;
+ * 96-bit: LRAES96;
+ * 128-bit: AES, Kalyna-128, and LRAES128;
+ * 192-bit: Rijndael-192 and LRAES192; and
+ * 256-bit: Rijndael-256, Kalyna-256, and LRAES256.
+ * 512-bit: Kalyna-512, and DLRAES512.
+
+For each blockcipher B, I made the following files:
+
+ * ocb3-B-tT-nN.kat: A list of simple known-answer tests, using similar
+ input patterns to RFC7539, using blockcipher B with T-bit tag and
+ N-bit nonce. In particular, ocb3-aes-t128-n96.kat matches RFC7539.
+ It was just as easy for me to make a complete collection for the
+ alternative tag size, so ocb3-aes-t96-n96.kat has a complete
+ collection including the example from RFC7539, rather than just the
+ one selection.
+
+ * ocb3-B-tT-nN.verbose: a worked example of one of the known-answer
+ tests, showing intermediate results. This differs in a few respects
+ from RFC7539: I picked a different example so as to demonstrate the
+ action of the `Auth' function (essentially PMAC3, I think); it
+ includes the augmented nonce with padding and the tag-length
+ indicator; and the order of the outputs is changed.
+
+ * ocb3-B-nN.mct: the Monte--Carlo test, using blockcipher B with N-bit
+ nonce. The algorithm is as for RFC7539 (and, in particular,
+ ocb3-aes-n96.mct contains the same values as RFC7539), with the
+ following generalizations: the nonce is now variable-size, because a
+ 96-bit nonce is impossible for a 64-bit blockcipher; the initial
+ K-bit key is simply the K-bit big-endian encoding of the chosen tag
+ length T (which is equivalent to the RFC7539 algorithm when T <
+ 256), adjusted for odd parity for DES3 (thereby introducing some
+ ambiguity, which I have ignored).
+
+I made some arbitrary decisions about the various lengths to test. It's
+quite possible that I failed to understand something important.
+
+ * For a W-bit blockcipher, I picked W and 3/4 W bits as the tag
+ lengths to test.
+
+ * For a W-bit blockcipher, I picked 3/4 W bits as the nonce length.
+ 96 bits is (presumably because of GCM) the IETF's official
+ nonce-length selection, but it certainly won't work with 64- or
+ 96-bit blockciphers (with OCB3), and seems immensely wasteful at
+ larger block sizes. Of the decisions I took, this seems like the
+ one I'm most likely to have to change. That's fine.
+
+ * For a W-bit blockcipher, I picked i/2 W (for 0 < i < 6) as the
+ message and header lengths to test, so as to test a decent mix of
+ block-aligned and non-block-aligned input lengths.