--- /dev/null
+Catacomb
+
+
+ Catacomb is a cryptographic library. It covers quite a lot of
+ the `standard' cryptographic primitives, although there's plenty
+ of scope for improvement, implementing more block ciphers and
+ hash functions for example. It contains a relatively extensive
+ multiprecision arithmetic library suitable for implementing a
+ wide range of public-key algorithms, although there's little
+ direct support for any particular system.
+
+
+Objectives
+
+ My objectives in writing Catacomb are:
+
+ * Security. Of course, in most cases I'm implementing
+ algorithms and protocols invented by other people, so my
+ security is bounded by the security of the algorithms I'm
+ implementing. The important thing is that (a) I document
+ the weaknesses I'm aware of, and (b) I don't add more of my
+ own.
+
+ * Trust. I want people to be able to trust Catacomb. I'd
+ like to be able to trust that the library (a) implements its
+ various functions correctly, and (b) doesn't leak any other
+ information, or allow malicious input to make the library
+ misbehave in some other way. I have a fairly extensive set
+ of test vectors for various components, and I add more
+ regularly.
+
+ * Breadth. I want to cover a lot of ground. I'm more
+ interested in covering different sorts of cryptographic
+ primitives and operations than in implementing standard
+ protocols. I'm more likely to add support for elliptic
+ curve-based public-key cryptography and secret-sharing
+ systems than supporting something like SSL or the PKCS suite
+ of standards.
+
+ * Portability. Almost all of Catacomb assumes nothing more
+ than plain old ANSI C, and should therefore work on any
+ conforming implementation of C. That's an awful lot of
+ platforms. A few places make `reasonable' assumptions,
+ usually in a fairly localized way, such as ASCII as a
+ character set (in mptext.c). I've made sure I don't assume
+ too much about the properties of integer arithmetic, for
+ example. (Other exceptions include the key-file management
+ code, which uses system-dependent locking primitives, and
+ noise acquisition for the random-number generator.)
+
+ Notice that efficiency isn't on the list. Catacomb isn't
+ ever-so slow, but it's not particularly quick either. I've
+ mostly used the right algorithms, and made occasional little
+ performance tweaks, but Catacomb will never be the fastest
+ cryptographic library if that means sacrificing other
+ objectives.
+
+
+Licensing, and trust
+
+ Catacomb is, as is explained at the top of every source file,
+ free software; you may modify and/or redistribute it under the
+ conditions described in the GNU Library General Public License.
+ This is for two reasons, and the second one is more important
+ than the first:
+
+ * The first reason is that I think that software should be
+ free. All of it. I think that you get better software that
+ way, and that users are better served by free software than
+ by being tied into restrictive licences by vendors of
+ proprietary systems.
+
+ * The second, and in this case overriding, reason is that I
+ want to encourage trust in Catacomb. I can best do this by
+ showing everyone what I've done and how it works, by being
+ as open as I can about everything I do, and allowing the
+ community at large to either poke holes in it (which I'm
+ sure will happen, and I'll fix any problems as best I can),
+ or fail in the attempt.
+
+ I've chosen the GNU Library General Public License, rather than
+ the more restrictive (but, to me, ideologically more appealing)
+ plain GPL because I think that the world is better served by
+ having trustworthy software than free software. Under the terms
+ of the LGPL, a program linked against Catacomb must come with
+ the Catacomb source code and be provided in such a form that it
+ can be linked against a recompiled version of the library.
+ Since the cryptographic components are provided in an open form,
+ they can be scrutinized and trusted. In addition, modifications
+ to the library can fix any problems found there, and to a large
+ extent patch up weaknesses in the (proprietary) client program.
+
+ Consider the case of a program which, among other functions,
+ signs messages on behalf of its user using the Digital Signature
+ Algorithm (DSA). One of the problems with the DSA is that it's
+ the host for a particularly nasty `subliminal channel' -- a
+ hostile implementation can, undetectably, leak bits of your
+ private key in each signed message. This works by carefully
+ choosing a supposedly random parameter to the signature
+ function.
+
+ Once your adversary has acquired a few signed messages, which
+ shouldn't be too hard, he can recover either your entire key, or
+ enough that he can work out the rest in a reasonable amount of
+ time, and then he can forge signatures. If his program can find
+ any other keys, it can leak them too.
+
+ A small modification to Catacomb can patch this weakness. In
+ particular, the code
+
+ /* --- Randomize the number @k@ --- *
+ *
+ * Replace `secret string' with some unguessable string
+ * of your own choosing.
+ */
+
+ {
+ rmd160_ctx rmd;
+ blowfish_cbcctx bf;
+ octet hash[RMD160_HASHSZ];
+ static const char phrase[] = "Secret string";
+
+ rmd160_init(&rmd);
+ rmd160_hash(&rmd, phrase, sizeof(phrase));
+ rmd160_hash(&rmd, k->v, MPWS(MP_LEN(k)));
+ rmd160_done(&rmd, hash);
+ blowfish_cbcinit(&bf, hash, sizeof(hash));
+ blowfish_cbcencrypt(&bf, k->v, k->v, MPWS(MP_LEN(k)));
+ }
+
+ at the top of the function `dsa_mksig' in `dsa-sign.c' will
+ randomize the parameter @k@, closing the channel. (The code
+ could be tidier -- in particular, it's not completely portable
+ as it stands. A portable implementation would allocate a buffer
+ of `mp_octets(k)' bytes, extract the value `k' to it using
+ `mp_storel', encrypt the buffer, and load back using
+ `mp_loadl'.)
+
+ The `phrase' ensures that the output of the hash is
+ unpredictable -- without it, at the cost of squaring his
+ computational effort, our adversary could compute a `k' such
+ that both `k' and E_{H(k)}(k) leak similar information, *and*
+ could tell whether this transformation had been applied!
+
+ Of course, the program might not actually use Catacomb for DSA
+ signing. That on its own should be sufficient to cause
+ suspicion -- requiring a cryptographic library and not using it
+ is certainly strange.
+
+
+Documentation
+
+ There's not a lot at the moment. Sorry. A manual is in
+ progress.
+
+ Eventually, I want to thoroughly document all functions and
+ macros provided by Catacomb. The manual, which I've already
+ started, will also include commentary on algorithms and
+ techniques used by Catacomb which should help programmers
+ understand and use the library more effectively. The manual is
+ written using LaTeX, because it's quite mathematical in places
+ and using text would probably just confuse matters. There
+ probably won't be manual pages, because keeping everything
+ up-to-date would be too hard.
+
+ Until that's ready (and progress is fairly good at the moment),
+ you'll have to rely on the comments in the header files.
+ They're mostly good, but rely on a few concepts which haven't
+ been properly explained. In particular, some parts of the
+ multiprecision maths library are quite subtle and require a
+ little mathematical background to understand fully.
+
+ I've written a collection of README files which cover things in
+ fairly broad brushstrokes, to try and set the header file
+ comments in context.
+
+Future directions
+
+ The following things will likely appear in later versions of
+ Catacomb:
+
+ * A manual. See above.
+
+ * Better key management. Particular attention will be paid to
+ management for public-key systems. This needs a lot of
+ thought, however.
+
+ * Secret-sharing systems. Take a secret, and give n people a
+ `share' in it, so that any k <= n of them can recover the
+ secret, but fewer than k have no hope.
+
+ * Arithmetic in finite fields other than the prime-order
+ fields constructed by integer arithmetic with a prime
+ modulus. Interesting variants of Diffie-Hellman and other
+ discrete-log-based systems occur in such fields.
+
+ * Support for elliptic curve groups. Unfortunately, elliptic
+ curve cryptography is fraught with patent issues.
+
+ Other stuff not listed here will almost certainly be added. If
+ people have suggestions then I'll consider them fairly, although
+ they shouldn't conflict with my main objectives.
+
+--
+[mdw]
+
+\f
+Local variables:
+mode: text
+End:
--- /dev/null
+Symmetric ciphers
+
+
+ Catacomb provides a number of symmetric ciphers, together with a
+ generic cipher interface. More ciphers may be added later.
+
+
+Block cipher interface
+
+ There are a number of block ciphers implemented, all with
+ extremely similar interfaces. However, block ciphers aren't
+ actually at all pleasant to use directly. They're really
+ intended to be used only by higher-level `modes'.
+
+ Anyway, I'll take Bruce Schneier's Blowfish as an example.
+
+ Before doing any encryption or decryption, you need to
+ initialize a `context'. The context data is stored in an object
+ of type `blowfish_ctx'. You initialize the context by calling
+ `blowfish_init' with the address of the context, the address of
+ some key data, and the length of the key.
+
+ Data is encrypted using `blowfish_eblk' and decrypted by
+ `blowfish_dblk'. Both functions are given data as an array of
+ `uint32' words. (Since Blowfish uses 64-bit blocks, you give it
+ arrays of two words.)
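+
+ To make the calling convention concrete, here's a toy sketch
+ which follows the same shape. The names and the `cipher'
+ itself (adding key words mod 2^32) are made up for
+ illustration -- this is not a real Catacomb algorithm:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy cipher context, mirroring the shape of blowfish_ctx.  The
 * `cipher' (adding key words mod 2^32) is NOT secure -- it only
 * illustrates the calling convention.  (Catacomb spells the
 * 32-bit type `uint32'.) */
typedef struct toy_ctx { uint32_t k[2]; } toy_ctx;

/* Initialize a context from key material, like blowfish_init. */
static void toy_init(toy_ctx *ctx, const void *key, size_t sz)
{
  const unsigned char *p = key;
  size_t i;
  ctx->k[0] = ctx->k[1] = 0;
  for (i = 0; i < sz; i++)
    ctx->k[i % 2] = (ctx->k[i % 2] << 8) | p[i];
}

/* Encrypt one 64-bit block, given as an array of two words. */
static void toy_eblk(const toy_ctx *ctx, const uint32_t *s, uint32_t *d)
{
  d[0] = s[0] + ctx->k[0];
  d[1] = s[1] + ctx->k[1];
}

/* Decrypt: the inverse of toy_eblk. */
static void toy_dblk(const toy_ctx *ctx, const uint32_t *s, uint32_t *d)
{
  d[0] = s[0] - ctx->k[0];
  d[1] = s[1] - ctx->k[1];
}
```

+ The real ciphers follow exactly this pattern, with the cipher's
+ own prefix in place of `toy'.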
+
+ A number of constants are defined to describe further properties
+ of the cipher:
+
+ BLOWFISH_KEYSZ Is zero, to indicate that Blowfish doesn't care
+ much about the size of key you give it.
+
+ BLOWFISH_BLKSZ Is 8, because Blowfish works on 64-bit blocks,
+ which are therefore 8 bytes wide.
+
+ BLOWFISH_CLASS Is the triple (N, B, 64). This is explained
+ below.
+
+ The BLOWFISH_CLASS macro contains information useful to other
+ macros, rather than to direct users of the interface. The three
+ components are:
+
+ The `type' Simply N if specific macros for handling blocks
+ of the appropriate width have been written, or X
+ if the macros should use a loop instead.
+
+ The `endianness'
+ Either `B' for big-endian, or `L' for
+ little-endian.
+
+ The `width' The cipher's block size in bits.
+
+ This simple interface is thoroughly inconvenient for general
+ use, although it makes writing block ciphers very easy.
+
+
+ The peculiarities of the various ciphers are described below.
+
+ Blowfish Fairly standard, really. Accepts arbitrary-
+ sized keys up to 448 bits. (The original
+ definition only specified keys with a multiple
+ of 32 bits -- the extension I use is due, I
+ think, to Eric Young.) Blowfish is fast and
+ looks very secure.
+
+ IDEA Requires a 128-bit key. Not very fast. No
+ known attacks on the full cipher. Used in
+ PGP2. Patented!
+
+ DES Good old reliable. Been around for donkey's
+ years and still going. Single-DES (implemented
+ here) has a small key, but apart from that is
+ looking remarkably robust. Uses a 56-bit key
+ which may be either 8 bytes with (ignored)
+ parity bits in the bottom bit of each byte, or 7
+ bytes with no parity.
+
+ DES3 Two- or three-key triple DES. Slow, but strong
+ and almost universally trusted. Accepts 56-,
+ 112- and 168-bit keys. (56 bits gives you
+ single DES at a third of the speed.) Again,
+ parity may be included or not, so the full range
+ of key sizes in bytes is: 7, 8, 14, 16, 21 or
+ 24.
+
+ RC5 Arbitrary-sized key. Designed by Ron Rivest.
+ Not completely convincing in security. About as
+ fast as Blowfish, but with a quicker key
+ schedule. Patented, I think.
+
+
+Block cipher modes
+
+ There are four block cipher modes defined, all of which create a
+ useful cipher from a block cipher. Modes are implemented
+ separately from ciphers, so it's easy to add either, and easy to
+ apply modes to ciphers.
+
+ A few definitions will be helpful to explain the modes. Let E
+ denote the encryption function, P be the current plaintext
+ block, C be the current ciphertext, and C' be the previous
+ ciphertext block. Let `XOR' denote the bitwise exclusive or
+ operation.
+
+ Then the modes Electronic Code Book (ECB), Cipher Block Chaining
+ (CBC) and Ciphertext Feedback (CFB) are defined as:
+
+ ECB C = E(P)
+ CBC C = E(P XOR C')
+ CFB C = P XOR E(C')
+
+ Finally, Output Feedback is defined like this: let O be the
+ current output, and O' be the previous output. Then
+
+ OFB O = E(O'), C = P XOR O
+
+ The `previous ciphertext' or `previous output' for the first
+ block is provided by an `initialization vector' or IV.
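+
+ Here's a miniature rendering of those recurrences, using a
+ made-up one-byte `block cipher' E(x) = x + 0x5a mod 256 as a
+ stand-in (the real modes work on whole blocks, of course):

```c
#include <assert.h>
#include <stddef.h>

/* Toy 8-bit `block cipher': E(x) = x + 0x5a mod 256.  Utterly
 * insecure -- it only makes the mode equations concrete. */
static unsigned char E(unsigned char x) { return (unsigned char)(x + 0x5a); }
static unsigned char D(unsigned char x) { return (unsigned char)(x - 0x5a); }

/* CBC: C = E(P XOR C'), with the IV as the initial C'. */
static void cbc_encrypt(const unsigned char *p, unsigned char *c,
                        size_t n, unsigned char iv)
{
  size_t i;
  for (i = 0; i < n; i++) { c[i] = E(p[i] ^ iv); iv = c[i]; }
}

/* CBC decryption inverts it: P = D(C) XOR C'. */
static void cbc_decrypt(const unsigned char *c, unsigned char *p,
                        size_t n, unsigned char iv)
{
  size_t i;
  for (i = 0; i < n; i++) {
    unsigned char cc = c[i];
    p[i] = D(cc) ^ iv;
    iv = cc;
  }
}

/* CFB: C = P XOR E(C').  Note that only the *encryption*
 * function E is ever needed, even to decrypt. */
static void cfb_encrypt(const unsigned char *p, unsigned char *c,
                        size_t n, unsigned char iv)
{
  size_t i;
  for (i = 0; i < n; i++) { c[i] = p[i] ^ E(iv); iv = c[i]; }
}

static void cfb_decrypt(const unsigned char *c, unsigned char *p,
                        size_t n, unsigned char iv)
{
  size_t i;
  for (i = 0; i < n; i++) { p[i] = c[i] ^ E(iv); iv = c[i]; }
}
```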
+
+ The above definitions imply that only data which comes in
+ multiples of the block size can be encrypted. Normally this is
+ the case. However, Catacomb implements all four modes so that
+ almost arbitrary sizes of plaintext can be encrypted (without
+ having to pad out the ciphertext). The details are complicated:
+ read the source, or look up `ciphertext stealing' in your copy
+ of Schneier's `Applied Cryptography'.
+
+ ECB must have *at least* one entire block to work with, but
+ apart from that can cope with odd-size inputs. Both ECB and CBC
+ insert `boundaries' when you encrypt an odd-size input -- you
+ must decrypt in exactly the same-size chunks as you encrypted,
+ otherwise you'll only get rubbish out.
+
+ CFB and OFB have no restrictions on input sizes, and do not
+ normally insert boundaries, although it's possible to explicitly
+ request one.
+
+ Be especially careful with OFB mode. Since it generates an
+ output stream independent of the plaintext, and then XORs the
+ two, if you ever reuse the same key and IV pair, both encrypted
+ messages are compromised. (Take the two ciphertexts, and XOR
+ them together -- then the OFB stream cancels and you have the
+ plaintexts XORed. This is fairly trivial to unravel.)
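+
+ The cancellation is mechanical enough to write down. With a toy
+ generator standing in for E (any keystream generator fails the
+ same way), XORing two ciphertexts made with the same key and IV
+ leaves exactly the two plaintexts XORed:

```c
#include <assert.h>
#include <stddef.h>

/* OFB with a toy byte-wide `block cipher' (add 0x5a mod 256).
 * Reusing the same key and IV for two messages makes
 * C1 XOR C2 == P1 XOR P2, leaking plaintext structure. */
static void ofb_crypt(const unsigned char *p, unsigned char *c,
                      size_t n, unsigned char iv)
{
  size_t i;
  for (i = 0; i < n; i++) {
    iv = (unsigned char)(iv + 0x5a);   /* O = E(O') with a toy E */
    c[i] = p[i] ^ iv;                  /* C = P XOR O */
  }
}
```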
+
+ OFB mode makes a good random byte generator. See README.random
+ for details about random number generators in Catacomb.
+
+
+ The modes all have similar interfaces. CFB is probably the best
+ example, although CBC is more useful in practice. I'll take
+ Blowfish as my example cipher again.
+
+ You need to initialize a context block. For Blowfish in CFB
+ mode, this is called `blowfish_cfbctx'. You initialize it by
+ passing the context address, a key, the key length, and a
+ pointer to an IV (which must be BLOWFISH_BLKSZ bytes long) to
+ blowfish_cfbinit. If you pass a null pointer instead of an IV,
+ a zero IV is used. This is usually OK for CBC, but bad for OFB
+ or CFB unless you make sure that the key itself is only used
+ once.
+
+ Data is encrypted using blowfish_cfbencrypt and
+ blowfish_cfbdecrypt -- both are given the address of the
+ context, a pointer to the source data, a pointer to the
+ destination (which may overlap the source), and the size of the
+ data to encrypt or decrypt.
+
+ The IV may be changed by calling blowfish_cfbsetiv. The current
+ IV (really meaning the previous ciphertext) can be obtained with
+ blowfish_cfbgetiv. The key may be changed without altering the
+ IV using blowfish_cfbsetkey. A boundary may be inserted in the
+ ciphertext or plaintext using blowfish_cfbbdry.
+
+ ECB doesn't use IVs, so there aren't ecbsetiv or ecbgetiv
+ calls. You can't insert boundaries in ECB or CBC mode.
+
+ OFB encryption and decryption are the same, so there's no
+ separate ofbdecrypt call. However, ofbencrypt has some useful
+ tricks:
+
+ * If the destination pointer is null, it just churns the output
+ round for a while, without emitting any data.
+
+ * If the source pointer is null, it simply spits out the
+ output blocks from the feedback process. This is equivalent
+ to giving an input full of zero bytes.
+
+
+Implementing new modes: nasty macros
+
+ Block cipher modes are implemented as macros which define the
+ appropriate functions. They're given the prefixes (upper- and
+ lowercase) and expected to get on with life.
+
+ Data can be shunted around fairly efficiently using the BLKC
+ macros. These are fairly ugly, so don't try to work out how
+ they work.
+
+ In the following notation, `b' denotes a pointer to bytes, and
+ `w' and `wx' denote pointers to words. `PRE' is the uppercase
+ cipher prefix. I'll abuse this notation a little and use the
+ names to refer to the entire arrays, since their lengths are
+ known to be PRE_BLKSZ (in bytes) or PRE_BLKSZ / 4 (in words)
+ long.
+
+ BLKC_STORE(PRE, b, w) Set b = w
+ BLKC_XSTORE(PRE, b, w, wx) Set b = w XOR wx
+ BLKC_LOAD(PRE, w, b) Set w = b
+ BLKC_XLOAD(PRE, w, b) Set w = w XOR b
+ BLKC_MOVE(PRE, w, wx) Set w = wx
+ BLKC_XMOVE(PRE, w, wx) Set w = w XOR wx
+
+ These should be enough for most purposes. More can be added,
+ but involves a strong stomach and an ability to do things with C
+ macros which most people wouldn't like to think about over
+ dinner.
+
+
+Other ciphers
+
+ There's only one stream cipher implemented at the moment, and
+ that's RC4. It was designed by Ron Rivest. It's the fastest
+ cipher in Catacomb. It looks fairly strong (although see the
+ note about churning the context after keying below). Note also
+ that it works in output feedback -- you just XOR the output
+ from RC4 with the plaintext. Never reuse an RC4 key!
+
+ RC4 includes an OFB-like interface which should be familiar. It
+ also includes a pair of strange macros RC4_OPEN and RC4_BYTE.
+ These are used to actually get bytes out of the RC4 generator.
+
+ RC4_OPEN is really a new syntactic form. If `r' is a pointer to
+ an RC4 context, then
+
+ RC4_OPEN(r, <statements>);
+
+ executes <statements> within the opened RC4 context. The
+ significance of this is that the expression RC4_BYTE(x) extracts
+ the next byte from the innermost open context, and stores it in
+ x. The standard RC4 encrypt function is written in terms of
+ RC4_OPEN and RC4_BYTE.
+
+ RC4 makes an excellent and fast random-byte generator.
+
+ RSA Data Security Inc. claim that RC4 is a trade secret of
+ theirs. It doesn't look very secret to me.
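+
+ The algorithm itself has been published far and wide. Here's a
+ minimal standalone rendering (not Catacomb's own code, which
+ uses the RC4_OPEN/RC4_BYTE macros described above):

```c
#include <assert.h>
#include <stddef.h>

/* A minimal RC4: a key schedule followed by the byte generator. */
typedef struct rc4 { unsigned char s[256]; unsigned i, j; } rc4;

/* Key schedule: permute the state array under the key. */
static void rc4_key(rc4 *r, const unsigned char *k, size_t ksz)
{
  unsigned i, j = 0;
  for (i = 0; i < 256; i++) r->s[i] = (unsigned char)i;
  for (i = 0; i < 256; i++) {
    unsigned char t;
    j = (j + r->s[i] + k[i % ksz]) & 0xff;
    t = r->s[i]; r->s[i] = r->s[j]; r->s[j] = t;
  }
  r->i = r->j = 0;
}

/* Extract the next keystream byte. */
static unsigned char rc4_byte(rc4 *r)
{
  unsigned char t;
  r->i = (r->i + 1) & 0xff;
  r->j = (r->j + r->s[r->i]) & 0xff;
  t = r->s[r->i]; r->s[r->i] = r->s[r->j]; r->s[r->j] = t;
  return r->s[(r->s[r->i] + r->s[r->j]) & 0xff];
}

/* Encrypt (or decrypt -- it's the same operation) by XORing. */
static void rc4_crypt(rc4 *r, const unsigned char *src,
                      unsigned char *dst, size_t sz)
{
  size_t i;
  for (i = 0; i < sz; i++) dst[i] = src[i] ^ rc4_byte(r);
}
```

+ Feeding the ciphertext back through with a freshly keyed context
+ recovers the plaintext.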
+
+
+Generic cipher interfaces
+
+ It can be convenient to implement routines where the cipher to
+ use is a parameter. Hence, Catacomb provides a generic
+ interface to (symmetric) ciphers. The generic interface is
+ defined in <catacomb/gcipher.h>.
+
+ The basic type in the interface is `gcipher', which represents
+ an `instance' of a cipher. You don't see lone cipher objects,
+ only pointers to them, so really everything's in terms of
+ `gcipher *'.
+
+ A `gcipher' is a structured type with one member, called `ops',
+ which points to a collection of functions and other useful
+ information. If `c' is a cipher...
+
+ c->ops->b->name Name of the cipher being used
+
+ c->ops->b->keysz Key size in bytes (or zero for
+ `don't care')
+
+ c->ops->b->blksz Block size in bytes (or zero for
+ `not a block cipher')
+
+ c->ops->encrypt(c, s, t, sz) Encrypt the sz bytes stored in
+ s, and store the ciphertext at
+ t.
+
+ c->ops->decrypt(c, s, t, sz) Like encrypt, only it decrypts.
+
+ c->ops->destroy(c) Destroys the cipher object `c'.
+
+ c->ops->setiv(c, iv) Sets the IV to be `iv' -- must
+ be blksz bytes long.
+
+ c->ops->bdry(c) Inserts a boundary.
+
+ Note that `setiv' and `bdry' aren't implemented by all ciphers,
+ so these may be null pointers. It's best to check first.
+
+ Generic cipher instances are created from `generic cipher
+ classes' (type `gccipher' -- note the extra `c'). This contains
+ two members:
+
+ b The `class base' -- this is the object pointed
+ to by `c->ops->b', and contains `name', `keysz'
+ and `blksz' members.
+
+ init The constructor. You give it a pointer to some
+ key data and the key size, and it returns a
+ generic cipher instance.
+
+ Note that new generic ciphers always have zero IVs (if they
+ understand the concept), so you may need to call setiv if you
+ want to reuse keys.
+
+ Always remember to destroy gcipher instances when you're
+ finished with them.
+
+ The generic cipher class for CBC-mode Blowfish is called
+ `blowfish_cbc' -- the others are named similarly. The RC4
+ generic cipher class is called simply `rc4'.
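+
+ The machinery behind all this is just a structure of function
+ pointers. Here's a cut-down sketch of the pattern with made-up
+ names (a toy one-byte XOR `cipher' stands in for a real class;
+ Catacomb's actual declarations live in <catacomb/gcipher.h>):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* The instance type: one member, pointing at the vtable. */
typedef struct gcipher { const struct gcipher_ops *ops; } gcipher;

struct gcipher_ops {
  const char *name;
  void (*encrypt)(gcipher *c, const void *s, void *t, size_t sz);
  void (*destroy)(gcipher *c);
};

/* One concrete `class': a toy XOR stream keyed by a single byte. */
typedef struct xorcipher { gcipher g; unsigned char k; } xorcipher;

static void xor_encrypt(gcipher *c, const void *s, void *t, size_t sz)
{
  xorcipher *x = (xorcipher *)c;
  const unsigned char *p = s;
  unsigned char *q = t;
  size_t i;
  for (i = 0; i < sz; i++) q[i] = p[i] ^ x->k;
}

static void xor_destroy(gcipher *c) { free(c); }

static const struct gcipher_ops xor_ops =
  { "toy-xor", xor_encrypt, xor_destroy };

/* The class constructor: builds an instance, returns a gcipher *. */
static gcipher *xor_init(const void *key, size_t sz)
{
  xorcipher *x = malloc(sizeof(*x));
  if (!x) return 0;
  x->g.ops = &xor_ops;
  x->k = sz ? ((const unsigned char *)key)[0] : 0;
  return &x->g;
}
```

+ A caller just writes c->ops->encrypt(c, src, dst, sz) and
+ c->ops->destroy(c), never naming the concrete cipher at all.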
+
+
+--
+[mdw]
+
+\f
+Local variables:
+mode: text
+End:
--- /dev/null
+Cryptographic hash functions
+
+
+ Hash functions are very useful cryptographic primitives.
+ Catacomb provides a few of the best-known cryptographic hashes.
+
+
+Hash function interface
+
+ Hash functions share a regular interface. I'll take Ron
+ Rivest's MD5 as an example.
+
+ Using a hash function is a three-stage process. You initialize
+ a context, you hash some data, and you get the result out of the
+ end. The data to be hashed need not be contiguous, and you
+ don't have to have it all before you start. Hash function
+ contexts don't use up lots of memory.
+
+ An MD5 context is called `md5_ctx'. You initialize it with
+ `md5_init'. You hash a block of data using `md5_hash' giving it
+ the context, the pointer to the data and its length. Finally,
+ you extract the data using `md5_done', giving it the address of
+ a buffer of MD5_HASHSZ bytes for it to write the result.
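+
+ Here's the same three-stage pattern, sketched with a toy 32-bit
+ checksum (FNV-1a -- emphatically not a cryptographic hash)
+ standing in for MD5. Note how hashing in pieces gives the same
+ answer as hashing everything in one go:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The init/hash/done pattern, with a toy 32-bit FNV-1a checksum
 * standing in for a real hash function. */
typedef struct fnv_ctx { uint32_t h; } fnv_ctx;

/* Stage 1: initialize the context. */
static void fnv_init(fnv_ctx *c) { c->h = 2166136261u; }

/* Stage 2: hash data, a block at a time. */
static void fnv_hash(fnv_ctx *c, const void *p, size_t sz)
{
  const unsigned char *q = p;
  size_t i;
  for (i = 0; i < sz; i++) c->h = (c->h ^ q[i]) * 16777619u;
}

/* Stage 3: extract the result. */
static void fnv_done(fnv_ctx *c, uint32_t *out) { *out = c->h; }
```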
+
+ There are some other standard operations as well, but they're
+ not often used: see the header files for details.
+
+ The hash functions supported are:
+
+ MD4 By Ron Rivest -- returns a 128-bit hash. MD4 is
+ not collision-resistant, and may not even be
+ second-preimage-resistant. Don't depend on its
+ security. On the other hand, MD4 is *very*
+ fast.
+
+ MD5 Also by Rivest, also returns a 128-bit hash.
+ MD5 is slower than MD4, and more conservative,
+ but there are still grave doubts about its
+ security.
+
+ SHA1 Designed by the US National Security Agency.
+ Returns a 160-bit hash. Slower than MD5. Looks
+ strong. Fixes a problem in the earlier SHA.
+
+ RIPEMD-160 Designed by the people who broke MD4 and MD5.
+ Returns a 160-bit hash. Slower than SHA1. My
+ personal preference.
+
+
+HMAC interface
+
+ It's possible to construct a `keyed hash' or `message
+ authentication code' from a hash function. Most methods for
+ doing this are insecure. HMAC is a good method, with rigorously
+ proven security properties.
+
+ Each hash function above has an HMAC mode defined for it. This
+ works much the same way as block cipher modes.
+
+ Using HMAC is a two-step process. First, you initialize a MAC
+ key block `mackey' to contain the key you'll use, and then you
+ initialize MAC contexts `macctx' from it to actually hash the
+ data. Hashing works just the same as the basic hash function,
+ except that you use the `macinit', `machash' and `macdone'
+ functions rather than the plain `init', `hash' and `done'.
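+
+ Here's the general shape of the construction,
+ H(K XOR opad, H(K XOR ipad, m)): the key block precomputes the
+ two padded keys, and each context is then used like a plain
+ hash. The names and the toy checksum are made up for
+ illustration -- HMAC's security argument needs a real hash
+ function underneath:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BLKSZ = 16 };   /* toy block size; real hashes use 64 */

/* Toy compression: a 32-bit FNV-style checksum, NOT a real hash. */
static uint32_t toy_hash(uint32_t h, const void *p, size_t sz)
{
  const unsigned char *q = p;
  size_t i;
  for (i = 0; i < sz; i++) h = (h ^ q[i]) * 16777619u;
  return h;
}

typedef struct mackey { unsigned char ki[BLKSZ], ko[BLKSZ]; } mackey;
typedef struct macctx { const mackey *k; uint32_t h; } macctx;

/* Step 1: precompute the inner (ipad) and outer (opad) padded
 * keys.  (Keys longer than BLKSZ would first be hashed down;
 * omitted here for brevity.) */
static void mac_key(mackey *mk, const void *key, size_t sz)
{
  size_t i;
  memset(mk->ki, 0, BLKSZ);
  memcpy(mk->ki, key, sz < BLKSZ ? sz : BLKSZ);
  for (i = 0; i < BLKSZ; i++) {
    mk->ko[i] = mk->ki[i] ^ 0x5c;
    mk->ki[i] ^= 0x36;
  }
}

/* Step 2: per-message contexts, used like a plain hash. */
static void mac_init(macctx *c, const mackey *mk)
{ c->k = mk; c->h = toy_hash(2166136261u, mk->ki, BLKSZ); }

static void mac_hash(macctx *c, const void *p, size_t sz)
{ c->h = toy_hash(c->h, p, sz); }

/* Finish the inner hash, then run the outer hash over it. */
static uint32_t mac_done(macctx *c)
{
  uint32_t inner = c->h;
  uint32_t h = toy_hash(2166136261u, c->k->ko, BLKSZ);
  return toy_hash(h, &inner, sizeof(inner));
}
```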
+
+
+Generic interfaces
+
+ Generic interfaces to hash functions and MACs are provided. See
+ README.cipher to get an idea for how a similar generic interface
+ works -- I'll only explain the differences here.
+
+ The generic hash object, `ghash' contains an `ops' member
+ referring to:
+
+ h->ops->b->name The name of the hash function.
+
+ h->ops->b->hashsz The output size of the hash function.
+
+ h->ops->hash(h, p, sz) Hash sz bytes of data starting at
+ address p.
+
+ h->ops->done(h, b) Stop hashing, write the result to buffer
+ b with size `hashsz'.
+
+ h->ops->destroy(h) Destroy the generic hash object.
+
+ The generic hash class, `gchash', contains a base which has the
+ same members as `ops->b' above, and an `init' function which
+ takes no arguments and returns a pointer to a `ghash'.
+
+ There's a generic MAC interface too. A MAC class, `gcmac'
+ contains a hash base (exactly the same, but with a different
+ name), and a function `key' which takes a pointer to some key
+ data and the key size, and returns a pointer to a `generic mac'
+ object, `gmac'.
+
+ A `gmac' contains an `ops' member:
+
+ m->ops->b->name, m->ops->b->hashsz
+ As above.
+
+ m->ops->init(m) Returns a generic *hash* object to
+ actually compute a MAC over some data.
+
+ m->ops->destroy(m) Destroys the generic MAC block.
+
+
+ That was quite complex. Here's an example of using a generic
+ MAC.
+
+ void compute_mac(gcmac *gcm, const void *k, size_t ksz,
+ const void *p, size_t sz,
+ void *hash)
+ {
+ gmac *m = gcm->key(k, ksz);
+ ghash *h = m->ops->init(m);
+ m->ops->destroy(m);
+ h->ops->hash(h, p, sz);
+ h->ops->done(h, hash);
+ h->ops->destroy(h);
+ }
+
+ Note that the hash doesn't depend on the MAC object continuing
+ to exist.
+
+--
+[mdw]
+
+\f
+Local variables:
+mode: text
+End:
--- /dev/null
+Multiprecision maths library
+
+
+ Catacomb comes with a fairly reasonable multiprecision maths
+ library. It's not stunningly fast, and doesn't do everything,
+ but it seems good for the sorts of things needed in
+ cryptographic algorithms (which, for example, GMP isn't[1]).
+
+ There are basically three layers to the system:
+
+ * The low-level `mpx' layer, which implements some basic
+ arithmetic but is fairly unpleasant to actually use.
+
+ * A high-level `mp' layer, which provides a useful `mp'
+ multiprecision type and implements useful arithmetic on
+ them.
+
+ * Everything else. A collection of miscellaneous routines for
+ performing particular operations efficiently or simply.
+
+ [1] In particular, GMP is truly rotten at converting numbers
+ into well-defined binary formats. It has its own format,
+ but it's not defined in the manual, has changed once before,
+ probably doesn't conform to any standards in the
+ cryptographic community, and can only be written to or read
+ from a file stream, so you can't use it in memory (for
+ example to encrypt it). Basically, it's a right pain.
+
+
+The low-level interface
+
+ The low level is described in `mpx.h'. It includes everything
+ else it needs.
+
+ An integer is represented in an array of words. Each word has
+ type `mpw' (multi precision word). The least significant word
+ comes first. An array is described by its `base' -- the address
+ of the first element -- and its `limit' -- the address *just
+ after* the last element. Thus, the length of the array is
+ `limit - base'. This representation is particularly convenient
+ for many things. An old version used base and length, which was
+ a nuisance.
+
+ The largest value which can be represented in a word is
+ MPW_MAX. The number of bits used in the representation of a
+ word is MPW_BITS. Actually, you might be able to use more bits
+ and store larger numbers, but everything will go wrong if you
+ try.
+
+ The macro call MPW(x) returns the low MPW_BITS bits of x. It's
+ very useful in low-level arithmetic.
+
+ The call MPWS(n) tells you the `sizeof' an array of n words.
+ It's also very useful. Try not to get MPW and MPWS confused
+ because they mean very different things.
+
+ The actual low-level macros and functions are documented
+ relatively well. They are given source and destination arrays
+ for things to be read and written to. Usually the sources
+ mustn't overlap the destinations, and destinations certainly
+ mustn't overlap other destinations; sometimes this restriction
+ is lifted, where it would be both convenient and easy to do so.
+ Sources can overlap each other without problems, because they're
+ not written to.
+
+ The difficult parts are the Karatsuba functions. Karatsuba
+ and Ofman discovered a neat recursive way to multiply large
+ numbers. It requires quite a bit of workspace, and adds some
+ overhead, but has lower asymptotic complexity than the standard
+ multiply algorithm. Usually, Karatsuba functions get used
+ automatically by the rest of the library when appropriate and
+ you don't have to care.
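+
+ The trick itself is a small identity: three multiplications of
+ half-sized numbers where the schoolbook method needs four. Here
+ it is on 32-bit numbers split into 16-bit halves (the mpx
+ functions apply the same idea recursively to long word arrays):

```c
#include <assert.h>
#include <stdint.h>

/* The Karatsuba identity on 32-bit numbers: with B = 2^16 and
 * x = x1 B + x0, y = y1 B + y0,
 *   xy = z2 B^2 + z1 B + z0
 * where z2 = x1 y1, z0 = x0 y0, and the middle term is recovered
 * from one more multiplication:
 *   z1 = (x1 + x0)(y1 + y0) - z2 - z0. */
static uint64_t kara32(uint32_t x, uint32_t y)
{
  uint32_t x1 = x >> 16, x0 = x & 0xffff;
  uint32_t y1 = y >> 16, y0 = y & 0xffff;
  uint64_t z2 = (uint64_t)x1 * y1;
  uint64_t z0 = (uint64_t)x0 * y0;
  uint64_t z1 = (uint64_t)(x1 + x0) * (y1 + y0) - z2 - z0;
  return (z2 << 32) + (z1 << 16) + z0;
}
```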
+
+
+The high-level interface
+
+ The high-level layer invents a type, `mp'. It's a structured type,
+ with the following members:
+
+ mpw *v, *vl Base and limit of the word array.
+ size_t sz Number of words actually allocated to the array.
+ unsigned f Various flags.
+ unsigned ref Number of references to the integer.
+
+ The flags are interesting. Here's what they mean:
+
+ MP_NEG The number is negative
+ MP_BURN Burn the number after using it
+ MP_CONST The number mustn't be modified or freed
+ MP_UNDEF The number contains no interesting value
+ MP_DESTROYED The number has been freed once already
+
+ The last one is handy for catching bugs. MP_NEG is obvious --
+ Catacomb uses a signed-magnitude representation because things
+ are usually easier that way. MP_CONST is set on various
+ constants provided for general convenience so they don't get
+ destroyed by accident. MP_UNDEF is used to avoid copying
+ useless data.
+
+ MP_BURN marks a `secret' number. Secret numbers are overwritten
+ with zeroes (`burnt') when they're freed. Secrecy is a viral
+ concept: anything calculated from a secret number is also
+ secret, so they get burnt too.
+
+ Most of the time, you refer to numbers by their addresses. Lots
+ of `mp *' pointers get used. You hardly ever see a real `mp',
+ just pointers to them.
+
+ There's a collection of utility functions for keeping the
+ system running -- for creating new numbers without interesting
+ values, for throwing old ones away, for extending the amount of
+ memory they use, and so on.
+
+ Throwing numbers away is simple. You pass the pointer to the
+ number to `mp_drop'. If someone else still has a reference to
+ the number, it stays where it is, but remembers that you're not
+ interested in it any more. If nobody's interested then the
+ number is disposed of.
+
+ You can add a reference using MP_COPY. It returns the new
+ reference. The new reference is the same as the old one, but
+ it's a useful fiction to pretend that it's different.
+
+ Real arithmetic happens in a fairly uniform way. There's an
+ important point to make about `destinations', though. Just
+ about all of the arithmetic functions take `destination'
+ arguments (one for each result), with the idea that, at least
+ conceptually, they store the results there. What actually
+ happens is that a reference is lost to whatever the destination
+ used to refer to, and a reference to the new result is gained.
+ The address might be different, and then again it might not.
+ Destinations can be the same as sources -- that works fine.
+ Finally, there's the magic value `MP_NEW'. MP_NEW is a special
+ constant which, as a destination, means `create a new place and
+ put the answer there'.
+
+ The idea is that, sometimes it helps that memory is already
+ allocated for a destination, so it doesn't have to be obtained
+ specially, and, even more often, you want to throw away old
+ intermediate results while you're calculating new ones.
+
+ Here are some examples of how to do it right:
+
+ x = mp_add(MP_NEW, y, z);
+ Assumes x didn't have a useful value before.
+ Afterwards it refers to the sum of y and z. y and z
+ themselves are unchanged.
+
+ y = mp_add(y, y, z);
+ Add z to y. Now y is the sum of z and the old y. z is
+ unchanged.
+
+ q = MP_NEW;
+ mp_div(&q, &x, x, y);
+ Sets q (a new value) equal to x / y, and puts the
+ remainder back in x. y is unchanged.
+
+ z = mp_sub(z, y, z);
+ Subtract z from y, putting the result back in z. y is
+ unchanged.
+
+ x = mp_mul(y, y, z);
+ Multiply y by z, putting the result in x, but making y
+ no longer useful.
+
+ Here's some examples of how to do it wrong:
+
+ mp_mul(y, y, z);
+ mp_mul might choose somewhere other than y's storage to
+ write its result. (In fact, the current version
+ certainly will.) So y is trashed because there are no
+ references to it any more, and the result of the
+ multiplication is lost.
+
+ x = mp_mul(y, y, z);
+ x = mp_add(x, x, y);
+ Whoops. We stored the result in x, but y still got
+ trashed and might contain anything. Adding it to x is a
+ bad move.
+
+ It's not hard once you get the hang of it, although it might
+ feel a bit weird to start with.
+
+
+Modular multiplication and exponentiation
+
+ Lots of public-key crypto uses modular multiplication and
+ exponentiation operations. Doing them efficiently is very
+ important. Multiplication on its own is easy, and Catacomb's
+ multiplication algorithms are fairly good (though not up with
+ the very best). Doing the modular reduction afterwards is the
+ tricky bit.
+
+ There are three approaches to modular reduction in Catacomb.
+ All of them have their good and bad points.
+
+ 1. Use mp_div
+
+ For a one-off calculation, this is a good idea. Division is a
+ slow operation -- that's a shame, but life's like that. But
+ when you're doing a one-shot operation, that's the best you can
+ manage.
+
+ 2. Use Barrett reduction
+
+ Barrett reduction is a system whereby you do a little
+ precomputation up-front (one division, in fact) and are then
+ able to perform modular reductions without resorting to
+ expensive division operations. To use it, you initialize a
+ `context' with the modulus, reduce some numbers, and then
+ destroy the context when you've finished.
+
+ Conceptually, Barrett reduction is simple. You give it x and it
+ gives you back x mod m. The mathematics behind it is fairly
+ clever, though -- the real manual has a page of explanation for
+ how it works.
+
+ Barrett reduction works on any positive modulus m. It will do a
+ modular reduction on any integer x which is at most twice as
+ long as m (in words). x can be larger than m^2, but it mustn't
+ be more than twice as long.
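The idea is easier to see in a single-precision sketch. This is not Catacomb's multiprecision mpbarrett code, just an illustration of the trick with a small modulus: one up-front division produces mu ~ 2^32/m, and each later reduction needs only multiplies, shifts, and a couple of conditional subtractions.

```c
#include <assert.h>
#include <stdint.h>

/* Single-precision Barrett reduction sketch: m fits in 16 bits,
 * x in 32 bits, so all intermediate products fit in 64 bits. */

typedef struct { uint16_t m; uint32_t mu; } barrett;

/* The `expensive' division, done once per modulus. */
static void barrett_init(barrett *b, uint16_t m)
{
  b->m = m;
  b->mu = 0xffffffffu / m;      /* approximately floor(2^32 / m) */
}

/* Reduce any 32-bit x mod m without dividing.  The estimate
 * q <= floor(x/m) can be low by a small constant, so a short
 * loop of subtractions finishes the job. */
static uint16_t barrett_reduce(const barrett *b, uint32_t x)
{
  uint32_t q = (uint32_t)(((uint64_t)x * b->mu) >> 32);
  uint32_t r = x - q * b->m;
  while (r >= b->m) r -= b->m;
  return (uint16_t)r;
}
```

The multiprecision version is the same shape: the shifts become word-aligned truncations of the big number, which is why x mustn't be more than twice as long as m.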
+
+ 3. Use Montgomery reduction
+
+ Montgomery reduction is a little more clever. It does a little
+ more work up-front than Barrett reduction, and understanding the
+ benefits is a little harder. However, it's actually slightly
+ more efficient for general use. There's also a disadvantage
+ that Montgomery reduction only works when the modulus is *odd*.
+ It all goes horribly wrong with an even modulus. Don't try it.
+
+ I'm not going to explain how Montgomery reduction actually
+ works. But I am going to explain how to use it.
+
+ The precomputation for Montgomery reduction, performed when you
+ initialize a context, computes a value called R, and also R^2
+ (both mod m). (It also computes some other stuff, but that's
+ not important for this discussion.) Note that R is dependent on
+ the modulus m.
+
+ A Montgomery reduction takes a number x, which should be less
+ than m^2 (it can actually be a bit larger) and computes the
+ value x R^{-1} mod m. That doesn't sound very useful, does it?
+
+ Things start looking more hopeful when you multiply your inputs
+ by R. (There's a clever way of doing that: see below.) To
+ compute xy mod m, you can first calculate xR and yR, multiply
+ them together to get xyR^2, and do a Montgomery reduction to get
+ xyR^2 R^{-1} mod m. Then the R^{-1} cancels one of the Rs and
+ you end up with xyR mod m. Essentially, you do your entire
+ calculation with an extra factor of R all the way through it.
+
+ Removing the factor of R at the end is easy. You just run the
+ Montgomery reduction algorithm again -- the R^{-1} cancels the R
+ and the answer pops out the end. Getting the factor of R in the
+ first place is easy -- you multiply by R^2 and do a reduction.
+
+ The multiply-and-reduce thing is so common that there's a
+ dedicated function which does them both in one go.
+
+ Here's a simple example of multiplying two numbers mod m:
+
+ mpmont mm;
+ mp *ar, *br, *p;
+ mpmont_create(&mm, m);
+ ar = mp_mul(MP_NEW, a, mm.r2); /* aR mod m */
+ br = mp_mul(MP_NEW, b, mm.r2); /* bR mod m */
+ p = mpmont_mul(&mm, MP_NEW, ar, br); /* abR mod m */
+ p = mpmont_reduce(&mm, p, p); /* ab mod m */
+ mpmont_destroy(&mm);
+ mp_drop(ar);
+ mp_drop(br);
+
+ Actually, it's not a very clever example. Here's a better way,
+ for such a simple calculation:
+
+ mpmont mm;
+ mp *p;
+ mpmont_create(&mm, m);
+ p = mpmont_mul(&mm, MP_NEW, a, b); /* abR^{-1} mod m */
+ p = mpmont_mul(&mm, p, p, mm.r2); /* ab mod m */
+ mpmont_destroy(&mm);
+
+ Of course, for a single multiply-and-reduce, you're better off
+ just using
+
+ mp *p = mp_mul(MP_NEW, a, b);
+ mp_div(0, &p, p, m);
+
+ the traditional division method.
+
+ Both Barrett and Montgomery reduction methods have modular
+ exponentiation functions. If m is a message, n is an RSA
+ modulus, and e is the public exponent, I can do an RSA
+ encryption simply by doing
+
+ mpmont mm;
+ mpmont_create(&mm, n); /* RSA modulus is odd */
+ c = mpmont_exp(&mm, MP_NEW, m, e);
+ mpmont_destroy(&mm);
+
+ (Yes, it's well worth creating a Montgomery context for a
+ full-blown exponentiation, unless someone's deliberately gone
+ out of their way to make it easy for you by choosing a very
+ small public exponent.)
+
+
+Finding prime numbers
+
+ Prime numbers are important too. Finding them is not ever so
+ easy.
+
+ Catacomb has two useful facilities. There's a fast sequential-
+ search filtering system called `pgen', and a good (but
+ probabilistic) primality tester which implements the Rabin-
+ Miller test.
+
+ The idea is that you initialize a pgen object with a starting
+ place. It spends a short time initializing itself, and then
+ tells you whether the number you gave it is (a) definitely
+ composite, (b) definitely prime (only for small numbers), or (c)
+ it doesn't know. The advantage of the pgen system is that it's
+ good at stepping regularly. You can advance it by 2 or 4 very
+ rapidly, and it will give you the same sort of answer for the
+ next number.
+
+ When you find a number which pgen says it doesn't know about,
+ you must test it more thoroughly. This is time-consuming, but
+ pgen has weeded out many no-hopers so it's not so bad.
+
+ The Rabin-Miller test takes a number to test, and stores it in a
+ context. You then give it some random numbers, and it gives you
+ results. When you've finished, you destroy the context.
+
+ The Rabin-Miller test has two possible responses: (a) definitely
+ not prime, and (b) probably prime. Answer (a) is certain.
+ Answer (b) you may get with a composite number at most one time
+ in four. If you try n times, the probability of a composite
+ number going unnoticed is at most one in 4^n. In fact, the real
+ probability is *much* lower than this.
+
+ It's also important to bear in mind that this is examining the
+ probability that a number passes n Rabin-Miller tests given
+ that it's composite. What's actually interesting is the
+ probability that a number is composite given that it passed n
+ tests. This probability is lower still. For 1024-bit numbers,
+ which are about the right size by current standards of security,
+ you need only five tests to ensure a probability of less than 1
+ in 2^80 of a composite number slipping through.
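The shape of a single Rabin-Miller round can be sketched in plain C for 32-bit candidates. Catacomb's real interface keeps the candidate in a context and is fed random witnesses, as described above; the function names here are illustrative, not the library's.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t mulmod(uint32_t a, uint32_t b, uint32_t n)
{ return (uint32_t)((uint64_t)a * b % n); }

static uint32_t powmod(uint32_t a, uint32_t e, uint32_t n)
{
  uint32_t r = 1;
  a %= n;
  while (e) {
    if (e & 1) r = mulmod(r, a, n);
    a = mulmod(a, a, n);
    e >>= 1;
  }
  return r;
}

/* One Rabin-Miller round on odd n > 2 with witness a.
 * Returns 1 for `probably prime', 0 for `definitely composite'. */
static int rabin_miller(uint32_t n, uint32_t a)
{
  uint32_t d = n - 1, x;
  int s = 0, i;
  while (!(d & 1)) { d >>= 1; s++; }    /* n - 1 = 2^s d, d odd */
  x = powmod(a, d, n);
  if (x == 1 || x == n - 1) return 1;
  for (i = 1; i < s; i++) {
    x = mulmod(x, x, n);
    if (x == n - 1) return 1;
  }
  return 0;
}
```

Each round with an independent random witness multiplies the failure bound by 1/4, which is where the 4^n figure above comes from.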
+
+ There are specialized functions for finding various sorts of
+ prime numbers with particular properties. Read the header files
+ to find out what they do.
+
+
+Conclusion
+
+ This basically concludes the tour of Catacomb's multiprecision
+ maths library. I hope it's provided enough background that the
+ comments start making sense.
+
+--
+[mdw]
+
+\f
+Local variables:
+mode: text
+End:
--- /dev/null
+Catacomb's random number generators
+
+
+ Catacomb contains a large number of pseudorandom number
+ generators. Some are dedicated to generating pseudorandom
+ numbers; others are ciphers in output-feedback mode. Some are
+ cryptographically strong, others are fast or simple. All have
+ been statistically tested and come out OK.
+
+
+The `rand' generator
+
+ Cryptographic applications need a source of true random data.
+ The `rand' generator is Catacomb's true-random-data source.
+
+ The design is my own. I wasn't happy with any existing designs,
+ and I'm still not -- even Counterpane's Yarrow gives me an
+ uneasy feeling[1]. I'm not aware of any cryptanalytic attack
+ against my generator, and I'd very much like to hear if anyone
+ else knows differently.
+
+ The generator's state is held in an object of type `rand_pool'.
+ Data can be added into the pool using the `rand_add' function.
+ Random numbers can be extracted by calling `rand_get'.
+
+ It's possible to attach a `noise acquisition' function to a
+ rand_pool. The function `rand_getgood' extracts a number of
+ bytes from the generator, calling the noise acquisition function
+ if it thinks it's low on randomness. This shouldn't actually
+ be necessary once the generator has been topped up, and the
+ noise acquisition makes the generator very slow.
+
+ It's also possible to key the generator. The output
+ transformation then depends on the key, making it even harder
+ for an adversary to work out anything useful about the
+ generator.
+
+ The underlying transformation used by the generator is
+
+ x' = E_{H(x)}(x)
+
+ i.e., hash, use the hash as a key, and encrypt.
+
+
+ [1] The Yarrow paper states that it offers 160 bits-worth of
+ security, and that an adversary can't distinguish its output
+ from real random bits. I have an attack which takes about
+ 2^64 64-bit outputs and tells the difference between
+ Yarrow-160 and real random data, by analysing the frequency
+ of adjacent 64-bit blocks being equal.
+
+
+Noncryptographic generators
+
+ Two pseudorandom-number generators are supplied which have no
+ cryptographic strength whatever. They are, respectively, a
+ traditional linear-congruential generator (lcrand) and a
+ Fibonacci generator (fibrand).
+
+ The linear-congruential generator has three (fixed) parameters,
+ a, c and p, and calculates
+
+ x[i] = (a x[i - 1] + c) mod p
+
+ The prime p is fairly large (2^32 - 5, in fact), and a and c are
+ carefully chosen. The generator has a period of p - 1. There
+ is one fixed point (i.e., it transforms to itself rather than
+ being in the cycle) which must be avoided.
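The recurrence itself is one line of C. The multiplier and increment below are placeholders for illustration, not lcrand's carefully chosen parameters; only the modulus p = 2^32 - 5 is taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

#define P 4294967291u   /* the prime 2^32 - 5 */
#define A 1103515245u   /* illustrative multiplier, NOT lcrand's */
#define C 12345u        /* illustrative increment, NOT lcrand's */

/* One step of x' = (a x + c) mod p; the 64-bit intermediate
 * keeps the product from overflowing. */
static uint32_t lc_step(uint32_t x)
{ return (uint32_t)(((uint64_t)A * x + C) % P); }
```

With real parameters the map cycles through p - 1 values, plus the one fixed point mentioned above, which maps to itself and must be avoided as a seed.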
+
+ The Fibonacci generator works by adding together old results:
+
+ x[i] = (x[i - 24] + x[i - 55]) mod 2^32
+
+ It has a very long period, and is the fastest pseudorandom
+ number generator in Catacomb.
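The lagged-Fibonacci recurrence is usually run over a 55-entry circular buffer, along these lines. How fibrand actually seeds its buffer isn't shown here; the throwaway seeding below is an arbitrary placeholder.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t x[55]; int i; } fib;

/* Fill the lag buffer from a seed.  This little LCG is just a
 * placeholder; it is not how fibrand seeds itself. */
static void fib_init(fib *f, uint32_t seed)
{
  int i;
  for (i = 0; i < 55; i++) {
    seed = seed * 1664525u + 1013904223u;
    f->x[i] = seed;
  }
  f->i = 0;
}

/* x[i] = (x[i-24] + x[i-55]) mod 2^32: slot f->i holds x[i-55],
 * slot (f->i + 31) % 55 holds x[i-24]; the new value overwrites
 * the oldest slot.  Unsigned overflow gives the mod 2^32 free. */
static uint32_t fib_step(fib *f)
{
  uint32_t r = f->x[f->i] + f->x[(f->i + 31) % 55];
  f->x[f->i] = r;
  f->i = (f->i + 1) % 55;
  return r;
}
```

Each step is an addition and a couple of index updates, which is why this is the fastest generator in the library.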
+
+ These two generators are very useful when the cryptographic
+ strength of the output isn't important, for example when
+ choosing random numbers for primality testing.
+
+
+The Blum-Blum-Shub generator
+
+ This is by far the strongest and by even further the slowest
+ pseudorandom-number generator in Catacomb. Its output has been
+ proven to be indistinguishable from random data by *any*
+ polynomial-time test, assuming that factoring large numbers is
+ difficult.
+
+ Use the BBS generator when you're feeling really paranoid, and
+ you've got a few hours with nothing to do.
+
+
+Output-feedback ciphers
+
+ Ciphers in output-feedback mode simply emit a pseudorandom
+ stream. To construct a cipher, you XOR the stream with the
+ plaintext. It's even easier just to use the output as random
+ numbers.
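The XOR construction is symmetric, so encryption and decryption are the same operation. Here's a self-contained sketch with a toy LCG standing in for a real cipher's output-feedback stream:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy keystream generator -- a placeholder for a real cipher in
 * output-feedback mode, NOT anything cryptographically strong. */
static uint32_t ks_next(uint32_t *s)
{ *s = *s * 1664525u + 1013904223u; return *s; }

/* XOR the keystream into the buffer.  Running this twice with the
 * same key restores the original plaintext. */
static void ofb_crypt(uint32_t key, unsigned char *p, size_t sz)
{
  uint32_t s = key;
  size_t i;
  for (i = 0; i < sz; i++)
    p[i] ^= (unsigned char)(ks_next(&s) >> 24);
}
```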
+
+
+Generic interface
+
+ There's a fairly comprehensive generic interface for Catacomb's
+ various generators.
+
+ Each generator provides a function which returns a pointer to a
+ `generic pseudorandom-number generator', of type `grand'.
+
+ A `grand' is a structured type containing one member, `ops',
+ which is a pointer to various useful things. Let `r' be a
+ pointer to a generic pseudorandom-number generator; then,
+
+ r->ops->name The name of the generator.
+
+ r->ops->max One greater than the maximum output of
+ the `raw' function, or zero if it just
+ emits words.
+
+ r->ops->misc(r, op, ...)
+ Performs a miscellaneous operation. The
+ various standard `op' values are
+ described below.
+
+ r->ops->destroy(r) Destroy the generic generator.
+
+ r->ops->raw(r) Return a raw output from the generator.
+ For `lcrand', this is an integer in the
+ range [0, p); for other generators, it's
+ the same as `word'.
+
+ r->ops->byte(r) Return an octet. The result is
+ uniformly distributed over the integers
+ in the interval [0, 256).
+
+ r->ops->word(r) Returns a word. The result is uniformly
+ distributed over the integers in the
+ interval [0, 2^32).
+
+ r->ops->range(r, l) Returns an integer uniformly distributed
+ in the interval [0, l), l < 2^32.
+
+ r->ops->fill(r, p, sz) Fill the sz bytes at address p with
+ uniformly distributed bytes (i.e.,
+ integers in the interval [0, 256)).
+
+ All of these operations are supported by all generators. Some
+ may be inefficient, however, since they're synthesized using
+ operations more suited to the particular generator. For
+ example, `word' is not efficient on the linear-congruential
+ generator, but is for the Fibonacci generator.
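The ops-table arrangement can be sketched in a few self-contained lines. The member names follow the description above, but these definitions are an illustration of the pattern, not Catacomb's actual headers:

```c
#include <assert.h>
#include <stdint.h>

typedef struct grand grand;
typedef struct grand_ops {
  const char *name;
  uint32_t (*word)(grand *r);
  void (*destroy)(grand *r);
} grand_ops;
struct grand { const grand_ops *ops; };

/* A toy generator backing the interface.  Putting the grand first
 * makes the cast between grand * and toyrand * valid. */
typedef struct { grand g; uint32_t s; } toyrand;

static uint32_t toy_word(grand *r)
{
  toyrand *t = (toyrand *)r;
  t->s = t->s * 1664525u + 1013904223u;
  return t->s;
}

static void toy_destroy(grand *r) { (void)r; /* nothing to free */ }

static const grand_ops toy_ops = { "toy", toy_word, toy_destroy };

static grand *toyrand_create(toyrand *t, uint32_t seed)
{ t->g.ops = &toy_ops; t->s = seed; return &t->g; }
```

A caller then only ever touches the generic pointer: `x = r->ops->word(r); r->ops->destroy(r);` -- which is exactly how the operations above are invoked on any of Catacomb's generators.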
+
+ The misc-ops are as follows:
+
+ misc(r, GRAND_CHECK, op) Return nonzero if `op' is a
+ recognized misc-op for this
+ generator.
+
+ misc(r, GRAND_SEEDINT, s) Seed the generator using the
+ integer s.
+
+ misc(r, GRAND_SEEDUINT32, s) Seed the generator using the
+ 32-bit word s.
+
+ misc(r, GRAND_SEEDBLOCK, p, sz) Seed the generator using the
+ block of sz bytes starting at
+ address p.
+
+ misc(r, GRAND_SEEDMP, m) Seed the generator using the
+ multiprecision integer m.
+
+ misc(r, GRAND_SEEDRAND, r') Seed the generator using some
+ output from the generic pseudo-
+ random number generator r'.
+
+
+Statistical testing
+
+ I generated 10M of data with each of Catacomb's random number
+ generators, and tested them with George Marsaglia's DIEHARD
+ suite. While the random data is not included here, the
+ following trivial Bourne shell script will generate the exact
+ output streams I used, assuming that the supplied `rspit'
+ program is in the current PATH:
+
+ for i in des 3des idea blowfish rc5 rc4; do
+ rspit $i -kseed -z10m -o$i.rand
+ done
+ for i in lcrand fibrand; do
+ rspit $i -p -s0 -z10m -o$i.rand
+ done
+ rspit rand -p -z10m -orand.rand
+ nice rspit bbs -p -s2 -z10m -obbs.rand
+
+ (Note that generating the BBS output data can take over an
+ hour. Make sure you have a good book to read.)
+
+ Looking at the results, I think that they all passed the tests
+ relatively convincingly.
+
+ I've not actually included the DIEHARD output files in the
+ distribution, because they're quite large, and anyone can
+ reproduce my results exactly using the publicly available
+ DIEHARD distribution and the code supplied. If you do actually
+ want them, send me some mail and I'll send you a 60K tar.gz
+ file by (approximate) return.
+--
+[mdw]
+
+\f
+Local variables:
+mode: text
+End: