chiark - git - mdw - catacomb/blame_incremental

... / ...

Commit	Line	Data
	1	Catacomb's random number generators
	2
	3
	4	Catacomb contains a large number of pseudorandom number
	5	generators. Some are dedicated to generating pseudorandom
	6	numbers; others are ciphers in output-feedback mode. Some are
	7	cryptographically strong, others are fast or simple. All have
	8	been statistically tested and come out OK.
	9
	10
	11	The `rand' generator
	12
	13	Cryptographic applications need a source of true random data.
	14	The `rand' generator is Catacomb's true-random-data source.
	15
	16	The design is my own. I wasn't happy with any existing designs,
	17	and I'm still not -- even Counterpane's Yarrow gives me an
	18	uneasy feeling[1]. I'm not aware of any cryptanalytic attack
	19	against my generator, and I'd very much like to hear if anyone
	20	else knows differently.
	21
	22	The generator's state is held in an object of type `rand_pool'.
	23	Data can be added into the pool using the `rand_add' function.
	24	Random numbers can be extracted by calling `rand_get'.
	25
	26	It's possible to attach a `noise acquisition' function to a
	27	rand_pool. The function `rand_getgood' extracts a number of
	28	bytes from the generator, calling the noise acquisition function
	29	if it thinks it's low on randomness. This shouldn't actually be
	30	necessary once the generator has been topped up once. It makes
	31	the generator very slow.
	32
	33	It's also possible to key the generator. The output
	34	transformation then depends on the key, making it even harder
	35	for an adversary to work out anything useful about the
	36	generator.
	37
	38	The underlying transformation used by the generator is
	39
	40	x' = E_{H(x)}(x)
	41
	42	i.e., hash, use the hash as a key, and encrypt.
	43
	44
	45	[1] The Yarrow paper states that it offers 160 bits-worth of
	46	security, and that an adversary can't distinguish its output
	47	from real random bits. I have an attack which takes about
	48	2^64 64-bit outputs and tells the difference between
	49	Yarrow-160 and real random data, by analysing the frequency
	50	of adjacent 64-bit blocks being equal.
	51
	52
	53	Non-cryptographic generators
	54
	55	Two pseudorandom-number generators are supplied which have no
	56	cryptographic strength whatever. They are, respectively, a
	57	traditional linear-congruential generator (lcrand) and a
	58	Fibonacci generator (fibrand).
	59
	60	The linear-congruential generator has three (fixed) parameters,
	61	a, c and p, and calculates
	62
	63	x[i] = (a x[i - 1] + c) mod p
	64
	65	The prime p is fairly large (2^32 - 5, in fact), and a and c are
	66	carefully chosen. The generator has a period of p - 1. There
	67	is one fixed point (i.e., it transforms to itself rather than
	68	being in the cycle) which must be avoided.
	69
	70	The Fibonacci generator works by adding together old results:
	71
	72	x[i] = (x[i - 24] + x[i - 55]) mod 2^32
	73
	74	It has a very long period, and is the fastest pseudorandom
	75	number generator in Catacomb.
	76
	77	These two generators are very useful when the cryptographic
	78	strength of the output isn't important, for example when
	79	choosing random numbers for primality testing.
	80
	81
	82	The Blum-Blum-Shub generator
	83
	84	This is by far the strongest and by even further the slowest
	85	pseudorandom-number generator in Catacomb. Its output has been
	86	proven to be indistinguishable from random data by any
	87	polynomial-time test, assuming that factoring large numbers is
	88	difficult.
	89
	90	Use the BBS generator when you're feeling really paranoid, and
	91	you've got a few hours with nothing to do.
	92
	93
	94	Output-feedback ciphers
	95
	96	Ciphers in output-feedback mode simply emit a pseudorandom
	97	stream. To construct a cipher, you XOR the stream with the
	98	plaintext. It's even easier just to use the output as random
	99	numbers.
	100
	101
	102	Generic interface
	103
	104	There's a fairly comprehensive generic interface for Catacomb's
	105	various generators.
	106
	107	Each generator provides a function which returns a pointer to a
	108	`generic pseudorandom-number generator', of type `grand'.
	109
	110	A `grand' is a structured type containing one member, `ops',
	111	which is a pointer to various useful things. Let `r' be a
	112	pointer to a generic pseudorandom-number generator; then,
	113
	114	r->ops->name The name of the generator.
	115
	116	r->ops->max One greater than the maximum output of
	117	the `raw' function, or zero if it just
	118	emits words.
	119
	120	r->ops->misc(r, op, ...)
	121	Performs a miscellaneous operation. The
	122	various standard `op' values are
	123	described below.
	124
	125	r->ops->destroy(r) Destroy the generic generator.
	126
	127	r->ops->raw(r) Return a raw output from the generator.
	128	For `lcrand', this is an integer in the
	129	range [0, p); for other generators, it's
	130	the same as `word'.
	131
	132	r->ops->byte(r) Return an octet. The result is
	133	uniformly distributed over the integers
	134	in the interval [0, 256).
	135
	136	r->ops->word(r) Returns a word. The result is uniformly
	137	distributed over the integers in the
	138	interval [0, 2^32).
	139
	140	r->ops->range(r, l) Returns an integer uniformly distributed
	141	in the interval [0, l), l < 2^32.
	142
	143	r->ops->fill(r, p, sz) Fill the sz bytes at address p with
	144	uniformly distributed bytes (i.e.,
	145	integers in the interval [0, 256).
	146
	147	All of these operations are supported by all generators. Some
	148	may be inefficient, however, since they're synthesized using
	149	operations more suited to the particular generator. For
	150	example, `word' is not efficient on the linear-congruential
	151	generator, but is for the Fibonacci generator.
	152
	153	The misc-ops are as follows:
	154
	155	misc(r, GRAND_CHECK, op) Return nonzero if `op' is a
	156	recognized misc-op for this
	157	generator.
	158
	159	misc(r, GRAND_SEEDINT, s) Seed the generator using the
	160	integer s.
	161
	162	misc(r, GRAND_SEEDUINT32, s) Seed the generator using the
	163	32-bit word s.
	164
	165	misc(r, GRAND_SEEDBLOCK, p, sz) Seed the generator using the
	166	block of sz bytes starting at
	167	address p.
	168
	169	misc(r, GRAND_SEEDMP, m) Seed the generator using the
	170	multiprecision integer m.
	171
	172	misc(r, GRAND_SEEDRAND, r') Seed the generator using some
	173	output from the generic pseudo-
	174	random number generator r'.
	175
	176
	177	Statistical testing
	178
	179	I generated 10M of data with each of Catacomb's random number
	180	generators, and tested them with George Marsaglia's DIEHARD
	181	suite. While the random data is not included here, the
	182	following trivial Bourne shell script will generate the exact
	183	output streams I used, assuming that the supplied `rspit'
	184	program is in the current PATH:
	185
	186	for i in des 3des idea blowfish rc5 rc4; do
	187	rspit $i -kseed -z10m -o$i.rand
	188	done
	189	for i in lcrand fibrand; do
	190	rspit $i -p -s0 -z10m -o$i.rand
	191	done
	192	rspit rand -p -z10m -orand.rand
	193	nice rspit bbs -p -s2 -z10m -obbs.rand
	194
	195	(Note that generating the BBS output data can take over an
	196	hour. Make sure you have a good book to read.)
	197
	198	Looking at the results, I think that they all passed the tests
	199	relatively convincingly.
	200
	201	I've not actually included the DIEHARD output files in the
	202	distribution, because they're quite large, and anyone can
	203	reproduce my results exactly using the publicly available
	204	DIEHARD distribution and the code supplied. If you do actually
	205	want them, send me some mail and I'll send you a 60K tar.gz
	206	file by (approximate) return.
	207
	208	-- [mdw]
	209
	210	\f
	211	Local variables:
	212	mode: text
	213	End: