| 1 | Catacomb's random number generators |
| 2 | |
| 3 | |
| 4 | Catacomb contains a large number of pseudorandom number |
| 5 | generators. Some are dedicated to generating pseudorandom |
| 6 | numbers; others are ciphers in output-feedback mode. Some are |
| 7 | cryptographically strong, others are fast or simple. All have |
| 8 | been statistically tested and come out OK. |
| 9 | |
| 10 | |
| 11 | The `rand' generator |
| 12 | |
| 13 | Cryptographic applications need a source of true random data. |
| 14 | The `rand' generator is Catacomb's true-random-data source. |
| 15 | |
| 16 | The design is my own. I wasn't happy with any existing designs, |
| 17 | and I'm still not -- even Counterpane's Yarrow gives me an |
| 18 | uneasy feeling[1]. I'm not aware of any cryptanalytic attack |
| 19 | against my generator, and I'd very much like to hear if anyone |
| 20 | else knows differently. |
| 21 | |
| 22 | The generator's state is held in an object of type `rand_pool'. |
| 23 | Data can be added into the pool using the `rand_add' function. |
| 24 | Random numbers can be extracted by calling `rand_get'. |
| 25 | |
| 26 | It's possible to attach a `noise acquisition' function to a |
| 27 | rand_pool. The function `rand_getgood' extracts a number of |
| 28 | bytes from the generator, calling the noise acquisition function |
| 29 | if it thinks it's low on randomness. This shouldn't actually be |
| 30 | necessary once the generator has been topped up once. It makes |
| 31 | the generator very slow. |
| 32 | |
| 33 | It's also possible to key the generator. The output |
| 34 | transformation then depends on the key, making it even harder |
| 35 | for an adversary to work out anything useful about the |
| 36 | generator. |
| 37 | |
| 38 | The underlying transformation used by the generator is |
| 39 | |
| 40 | x' = E_{H(x)}(x) |
| 41 | |
| 42 | i.e., hash, use the hash as a key, and encrypt. |
| 43 | |
| 44 | |
| 45 | [1] The Yarrow paper states that it offers 160 bits-worth of |
| 46 | security, and that an adversary can't distinguish its output |
| 47 | from real random bits. I have an attack which takes about |
| 48 | 2^64 64-bit outputs and tells the difference between |
| 49 | Yarrow-160 and real random data, by analysing the frequency |
| 50 | of adjacent 64-bit blocks being equal. |
| 51 | |
| 52 | |
| 53 | Non-cryptographic generators |
| 54 | |
| 55 | Two pseudorandom-number generators are supplied which have no |
| 56 | cryptographic strength whatever. They are, respectively, a |
| 57 | traditional linear-congruential generator (lcrand) and a |
| 58 | Fibonacci generator (fibrand). |
| 59 | |
| 60 | The linear-congruential generator has three (fixed) parameters, |
| 61 | a, c and p, and calculates |
| 62 | |
| 63 | x[i] = (a x[i - 1] + c) mod p |
| 64 | |
| 65 | The prime p is fairly large (2^32 - 5, in fact), and a and c are |
| 66 | carefully chosen. The generator has a period of p - 1. There |
| 67 | is one fixed point (i.e., it transforms to itself rather than |
| 68 | being in the cycle) which must be avoided. |
| 69 | |
| 70 | The Fibonacci generator works by adding together old results: |
| 71 | |
| 72 | x[i] = (x[i - 24] + x[i - 55]) mod 2^32 |
| 73 | |
| 74 | It has a very long period, and is the fastest pseudorandom |
| 75 | number generator in Catacomb. |
| 76 | |
| 77 | These two generators are very useful when the cryptographic |
| 78 | strength of the output isn't important, for example when |
| 79 | choosing random numbers for primality testing. |
| 80 | |
| 81 | |
| 82 | The Blum-Blum-Shub generator |
| 83 | |
| 84 | This is by far the strongest and by even further the slowest |
| 85 | pseudorandom-number generator in Catacomb. Its output has been |
| 86 | proven to be indistinguishable from random data by *any* |
| 87 | polynomial-time test, assuming that factoring large numbers is |
| 88 | difficult. |
| 89 | |
| 90 | Use the BBS generator when you're feeling really paranoid, and |
| 91 | you've got a few hours with nothing to do. |
| 92 | |
| 93 | |
| 94 | Output-feedback ciphers |
| 95 | |
| 96 | Ciphers in output-feedback mode simply emit a pseudorandom |
| 97 | stream. To construct a cipher, you XOR the stream with the |
| 98 | plaintext. It's even easier just to use the output as random |
| 99 | numbers. |
| 100 | |
| 101 | |
| 102 | Generic interface |
| 103 | |
| 104 | There's a fairly comprehensive generic interface for Catacomb's |
| 105 | various generators. |
| 106 | |
| 107 | Each generator provides a function which returns a pointer to a |
| 108 | `generic pseudorandom-number generator', of type `grand'. |
| 109 | |
| 110 | A `grand' is a structured type containing one member, `ops', |
| 111 | which is a pointer to various useful things. Let `r' be a |
| 112 | pointer to a generic pseudorandom-number generator; then, |
| 113 | |
| 114 | r->ops->name The name of the generator. |
| 115 | |
| 116 | r->ops->max One greater than the maximum output of |
| 117 | the `raw' function, or zero if it just |
| 118 | emits words. |
| 119 | |
| 120 | r->ops->misc(r, op, ...) |
| 121 | Performs a miscellaneous operation. The |
| 122 | various standard `op' values are |
| 123 | described below. |
| 124 | |
| 125 | r->ops->destroy(r) Destroy the generic generator. |
| 126 | |
| 127 | r->ops->raw(r) Return a raw output from the generator. |
| 128 | For `lcrand', this is an integer in the |
| 129 | range [0, p); for other generators, it's |
| 130 | the same as `word'. |
| 131 | |
| 132 | r->ops->byte(r) Return an octet. The result is |
| 133 | uniformly distributed over the integers |
| 134 | in the interval [0, 256). |
| 135 | |
| 136 | r->ops->word(r) Returns a word. The result is uniformly |
| 137 | distributed over the integers in the |
| 138 | interval [0, 2^32). |
| 139 | |
| 140 | r->ops->range(r, l) Returns an integer uniformly distributed |
| 141 | in the interval [0, l), l < 2^32. |
| 142 | |
| 143 | r->ops->fill(r, p, sz) Fill the sz bytes at address p with |
| 144 | uniformly distributed bytes (i.e., |
| 145 | integers in the interval [0, 256). |
| 146 | |
| 147 | All of these operations are supported by all generators. Some |
| 148 | may be inefficient, however, since they're synthesized using |
| 149 | operations more suited to the particular generator. For |
| 150 | example, `word' is not efficient on the linear-congruential |
| 151 | generator, but is for the Fibonacci generator. |
| 152 | |
| 153 | The misc-ops are as follows: |
| 154 | |
| 155 | misc(r, GRAND_CHECK, op) Return nonzero if `op' is a |
| 156 | recognized misc-op for this |
| 157 | generator. |
| 158 | |
| 159 | misc(r, GRAND_SEEDINT, s) Seed the generator using the |
| 160 | integer s. |
| 161 | |
| 162 | misc(r, GRAND_SEEDUINT32, s) Seed the generator using the |
| 163 | 32-bit word s. |
| 164 | |
| 165 | misc(r, GRAND_SEEDBLOCK, p, sz) Seed the generator using the |
| 166 | block of sz bytes starting at |
| 167 | address p. |
| 168 | |
| 169 | misc(r, GRAND_SEEDMP, m) Seed the generator using the |
| 170 | multiprecision integer m. |
| 171 | |
| 172 | misc(r, GRAND_SEEDRAND, r') Seed the generator using some |
| 173 | output from the generic pseudo- |
| 174 | random number generator r'. |
| 175 | |
| 176 | |
| 177 | Statistical testing |
| 178 | |
| 179 | I generated 10M of data with each of Catacomb's random number |
| 180 | generators, and tested them with George Marsaglia's DIEHARD |
| 181 | suite. While the random data is not included here, the |
| 182 | following trivial Bourne shell script will generate the exact |
| 183 | output streams I used, assuming that the supplied `rspit' |
| 184 | program is in the current PATH: |
| 185 | |
| 186 | for i in des 3des idea blowfish rc5 rc4; do |
| 187 | rspit $i -kseed -z10m -o$i.rand |
| 188 | done |
| 189 | for i in lcrand fibrand; do |
| 190 | rspit $i -p -s0 -z10m -o$i.rand |
| 191 | done |
| 192 | rspit rand -p -z10m -orand.rand |
| 193 | nice rspit bbs -p -s2 -z10m -obbs.rand |
| 194 | |
| 195 | (Note that generating the BBS output data can take over an |
| 196 | hour. Make sure you have a good book to read.) |
| 197 | |
| 198 | Looking at the results, I think that they all passed the tests |
| 199 | relatively convincingly. |
| 200 | |
| 201 | I've not actually included the DIEHARD output files in the |
| 202 | distribution, because they're quite large, and anyone can |
| 203 | reproduce my results exactly using the publicly available |
| 204 | DIEHARD distribution and the code supplied. If you do actually |
| 205 | want them, send me some mail and I'll send you a 60K tar.gz |
| 206 | file by (approximate) return. |
| 207 | |
| 208 | -- [mdw] |
| 209 | |
| 210 | \f |
| 211 | Local variables: |
| 212 | mode: text |
| 213 | End: |