Mark Wooding [Thu, 15 Aug 2019 17:16:02 +0000 (18:16 +0100)]
symm/chacha.c: Set the correct nonce size for `xchachaNN'.
Oops.
(cherry picked from commit 9acc7e10f1da03be55e3bc2cdcbbd5775253e3d0)
Mark Wooding [Fri, 9 Nov 2018 22:44:40 +0000 (22:44 +0000)]
symm/idea.c: Fix key-size descriptor.
Missing terminator. Oops.
(cherry picked from commit 9c22e9e0d174ee0c1e649464755568fe61c0e949)
Mark Wooding [Wed, 18 Sep 2019 18:47:47 +0000 (19:47 +0100)]
progs/Makefile.am: Don't link `pixie' against the main `libcatacomb.la'.
It doesn't actually do any cryptography. Instead, just pick out the
`base' and `key' libraries which contain its (very light) requirements.
This is the conclusion I reached following an Android ARM64 build
failure caused by lack of maths functions.
Mark Wooding [Wed, 18 Sep 2019 17:35:34 +0000 (18:35 +0100)]
key/key-misc.c: Fix bogus parentheses in macro.
The old, bogus behaviour was that it would report `KERR_READONLY' if the
keyring was neither open for writing, /nor/ modified. I think this is
relatively benign, but still well deserving of fixing.
Spotted by Clang.
Mark Wooding [Wed, 18 Sep 2019 17:24:49 +0000 (18:24 +0100)]
symm/rijndael-arm64-crypto.S: Fix bogus element-to-GP move.
Spotted by Clang's assembler. GAS is obviously too lenient.
Mark Wooding [Wed, 18 Sep 2019 16:37:33 +0000 (17:37 +0100)]
configure.ac: Fix the bug report for unexpected CPU or ABI.
Mark Wooding [Wed, 18 Sep 2019 16:28:01 +0000 (17:28 +0100)]
configure.ac: Set the `ASM_DEBUG' automake conditional at the right time.
Most significantly, after we actually know whether we want it turned on.
Mark Wooding [Wed, 18 Sep 2019 16:26:44 +0000 (17:26 +0100)]
configure.ac: Don't force `ENABLE_ASM_DEBUG' on unconditionally.
Oops. I bungled a `case'.
Mark Wooding [Tue, 17 Sep 2019 09:40:39 +0000 (10:40 +0100)]
symm/keccak1600.c: Eliminate the unnecessary temporary vector `b'.
Mark Wooding [Thu, 12 Sep 2019 19:19:15 +0000 (20:19 +0100)]
configure.ac, base/asm-common.h: Check explicitly for `_' on symbols.
There's an autoconf macro for this in the Debian `libltdl-dev' package,
though not in the main `libtool' package.
I think some BSDs are foolish enough to put `_' on symbols even though
they notionally use ELF. This may not be enough to make things work on
them, but it should at least help a bit.
Mark Wooding [Thu, 12 Sep 2019 19:18:59 +0000 (20:18 +0100)]
configure.ac: Give the `asm-debug' stanza a heading.
Mark Wooding [Thu, 12 Sep 2019 19:18:09 +0000 (20:18 +0100)]
configure.ac: Move `asm-debug' after we've finished CPU/ABI detection.
Fortunately it doesn't actually print anything, but if it did it would
produce much confusion.
Mark Wooding [Thu, 15 Aug 2019 15:25:32 +0000 (16:25 +0100)]
math/pgen.h, math/pgen-granfrob.c: Fix typo in function comment.
Mark Wooding [Mon, 12 Aug 2019 13:44:35 +0000 (14:44 +0100)]
math/mp-sqrt.c: Explain the algorithm, and particularly the end condition.
The gnomic remark that `Increasing x is pointless when -q < 2 x + 1' was
enough to lead me back on the right lines, but is hardly adequate. I
ended up wasting quite a lot of time with a whiteboard, because the
`... + 1' makes the termination condition look just enough like `the
fraction comes out zero'. But no: that's (not quite) a
coincidence. (Thinking more carefully, it's no surprise that the delta
looks similar to the derivative, but it's definitely the former we're
interested in here, rather than the latter.)
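As a rough illustration of the end condition, here is a sketch on machine integers rather than the multiprecision code in `math/mp-sqrt.c'. It assumes that q means x^2 - a, so that `-q < 2 x + 1' says a < (x + 1)^2: the bound 2 x + 1 is the gap between consecutive squares, not a derivative. The name `isqrt32' and the starting estimate are inventions for this example.

```c
#include <stdint.h>

/* Floor of the square root of a, by Newton's iteration from above.
 * Assumption for illustration: q = x^2 - a, so the loop stops once
 * 0 <= -q < 2 x + 1, which is the same as x^2 <= a < (x + 1)^2.
 */
static uint32_t isqrt32(uint32_t a)
{
  uint64_t x = (uint64_t)a/2 + 1;	/* never smaller than sqrt(a) */

  while (!(x*x <= a && a - x*x < 2*x + 1))
    x = (x + a/x)/2;			/* Newton step, rounded down */
  return ((uint32_t)x);
}
```

Starting at or above the true root, each step moves x strictly downwards without passing below the floor of the root, so the loop terminates and x is never zero when the division happens.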
Mark Wooding [Sun, 8 Sep 2019 17:36:28 +0000 (18:36 +0100)]
Merge branch 'mdw/rsvr'
* mdw/rsvr: (49 commits)
progs/cc-kem.c: Reimplement the `naclbox' bulk cipher in terms of AEAD.
progs/cc-kem.c: Split `aead_init' into two pieces.
symm/latinpoly-def.h: Implement Bernstein's `crypto_secretbox'.
symm/latinpoly-def.h, etc.: Refactor in preparation for a related scheme.
symm/gaead.h: Specify a flag for `AEAD' schemes which don't do AAD.
symm/t/chacha: Add IETF test vector for XChacha20-Poly1305.
symm/gcm-*.S: GCM acceleration using hardware polynomial multiplication.
symm/gcm.c: Make `gcm_mktable' and `gcm_mulk_...' be CPU-dependent.
symm/gcm.c: Add low-level multiplication tests.
base/regdump.[ch], etc.: Fancy register dumping infrastructure.
base/asm-common.h: Add some macros for shifting entire NEON vectors.
base/asm-common.h: Use `push' and `pop', for Thumb compatibility.
base/asm-common.h: Provide default frame pointer registers.
base/asm-common.h: Prefer `nil' as the unspecified-argument sentinel.
base/asm-common.h: Fix bogus indentation.
base/asm-common.h: Settle on no spaces around keyword-argument `='.
base/asm-common.h: Add an `IMM' macro for immediate operands.
base/asm-common.h: Implement the `r' decorator for `MEM' accesses.
base/asm-common.h: Hoist the `_DECOR_mem_...' definitions.
base/asm-common.h: Put `l' suffix on `si', `di', etc. under `CPUFAM_AMD'.
...
Mark Wooding [Tue, 20 Aug 2019 13:19:21 +0000 (14:19 +0100)]
progs/cc-kem.c: Reimplement the `naclbox' bulk cipher in terms of AEAD.
Mark Wooding [Tue, 20 Aug 2019 13:18:36 +0000 (14:18 +0100)]
progs/cc-kem.c: Split `aead_init' into two pieces.
This will let us use the same machinery with different user interfaces.
Mark Wooding [Fri, 16 Aug 2019 11:49:33 +0000 (12:49 +0100)]
symm/latinpoly-def.h: Implement Bernstein's `crypto_secretbox'.
Mark Wooding [Fri, 16 Aug 2019 11:48:22 +0000 (12:48 +0100)]
symm/latinpoly-def.h, etc.: Refactor in preparation for a related scheme.
No functional change at this time, but a bunch of things are renamed and
parts which will be common between the two are factored out.
Mark Wooding [Fri, 16 Aug 2019 11:33:22 +0000 (12:33 +0100)]
symm/gaead.h: Specify a flag for `AEAD' schemes which don't do AAD.
This is a useful shape, and, in particular, it covers the NaCl
`crypto_secretbox' transform.
Mark Wooding [Thu, 15 Aug 2019 17:17:27 +0000 (18:17 +0100)]
symm/t/chacha: Add IETF test vector for XChacha20-Poly1305.
Mark Wooding [Tue, 13 Nov 2018 11:28:53 +0000 (11:28 +0000)]
symm/gcm-*.S: GCM acceleration using hardware polynomial multiplication.
Add assembler implementations of the low-level GCM arithmetic which make
use of polynomial multiplication instructions on x86 (the delightfully
named `pclmul{l,h}q{l,h}dq' instructions) and ARM processors (the ARM32
`vmull.p64' and ARM64 `pmull{,2}' instructions). Of course, this
involves adding the necessary CPU feature detection.
GCM's bit and byte order is remarkably confusing. I've tried quite hard
to write the code so as to help the reader keep track of which bits are
where, but it's very difficult.
There's also a Python implementation which has proven invaluable while
debugging these things.
Mark Wooding [Tue, 13 Nov 2018 11:26:56 +0000 (11:26 +0000)]
symm/gcm.c: Make `gcm_mktable' and `gcm_mulk_...' be CPU-dependent.
A couple of other changes to ease the way:
* Split `gcm_mulk_...' into two endianness variants, so that
CPU-specific variants don't have to track what's going on through
the key table.
* Abstract out `recover_k' to decode the key value from a table, for
the use of `gcm_concat'. This is, of course, necessary if the table
format is CPU-dependent.
* Add testing to make sure that `mktable'/`recover_k' agree with each
other.
There are currently no fancy implementations, but you can tell what's
coming. No actual functional change, except for logging if you set
`CATACOMB_CPUDISPATCH_DEBUG' in the environment.
Mark Wooding [Tue, 13 Nov 2018 11:21:59 +0000 (11:21 +0000)]
symm/gcm.c: Add low-level multiplication tests.
Mark Wooding [Sun, 18 Aug 2019 01:08:07 +0000 (02:08 +0100)]
base/regdump.[ch], etc.: Fancy register dumping infrastructure.
Mark Wooding [Sat, 7 Sep 2019 13:20:19 +0000 (14:20 +0100)]
base/asm-common.h: Add some macros for shifting entire NEON vectors.
The `vext' (A32 NEON) or `ext' (A64) instructions can be (ab)used for
shifting vectors left and right if you have a spare zero vector lying
around. But using them is kind of confusing: left shifts, especially,
need a reversed shift quantity, and the shift is measured in bytes
rather than bits.
Add a couple of macros to make this less strange.
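The reversed shift quantity can be seen in a small C model of the A64 `ext' instruction. This is only a sketch: `ext16', `shr_bytes' and `shl_bytes' are hypothetical names for this example, not the macros added to `base/asm-common.h', and byte 0 is taken to be the least significant byte of the vector.

```c
#include <stdint.h>
#include <string.h>

/* Model of `ext d, n, m, #k': take 16 bytes starting at byte k of the
 * 32-byte concatenation whose low half is n and whose high half is m.
 */
static void ext16(uint8_t d[16], const uint8_t n[16],
		  const uint8_t m[16], unsigned k)
{
  uint8_t cat[32];

  memcpy(cat, n, 16); memcpy(cat + 16, m, 16);
  memcpy(d, cat + k, 16);
}

/* Shift x right (towards byte 0) by k bytes, 0 <= k <= 15. */
static void shr_bytes(uint8_t d[16], const uint8_t x[16], unsigned k)
{
  static const uint8_t zero[16] = { 0 };
  ext16(d, x, zero, k);
}

/* Shift x left (away from byte 0) by k bytes, 1 <= k <= 15.  Note the
 * reversed quantity 16 - k, and that the zero vector comes first.
 */
static void shl_bytes(uint8_t d[16], const uint8_t x[16], unsigned k)
{
  static const uint8_t zero[16] = { 0 };
  ext16(d, zero, x, 16 - k);
}
```

A shift by a number of bits that is a multiple of 8 corresponds to k = bits/8 here.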
Mark Wooding [Mon, 2 Sep 2019 11:53:54 +0000 (12:53 +0100)]
base/asm-common.h: Use `push' and `pop', for Thumb compatibility.
I still prefer `stmfd' and `ldmfd' for general code, but these are
important macros for which Thumb compatibility might be valuable.
Mark Wooding [Mon, 2 Sep 2019 11:51:05 +0000 (12:51 +0100)]
base/asm-common.h: Provide default frame pointer registers.
And use the default in the obvious places.
Mark Wooding [Mon, 2 Sep 2019 11:50:11 +0000 (12:50 +0100)]
base/asm-common.h: Prefer `nil' as the unspecified-argument sentinel.
Mark Wooding [Mon, 2 Sep 2019 11:49:26 +0000 (12:49 +0100)]
base/asm-common.h: Fix bogus indentation.
Mark Wooding [Mon, 2 Sep 2019 11:49:03 +0000 (12:49 +0100)]
base/asm-common.h: Settle on no spaces around keyword-argument `='.
Mark Wooding [Sun, 18 Aug 2019 01:13:18 +0000 (02:13 +0100)]
base/asm-common.h: Add an `IMM' macro for immediate operands.
The most useful version of this is `IMM(r, ...)', because that varies
according to the target architecture, but the others might be considered
an improvement over the Intel syntax.
It turns out that I don't actually need this: the motivation was caused
by a typo-ed register field. But I think it still makes a useful
addition.
Mark Wooding [Sun, 18 Aug 2019 01:17:13 +0000 (02:17 +0100)]
base/asm-common.h: Implement the `r' decorator for `MEM' accesses.
I think this was an unintentional omission.
Mark Wooding [Sun, 18 Aug 2019 01:11:55 +0000 (02:11 +0100)]
base/asm-common.h: Hoist the `_DECOR_mem_...' definitions.
In particular, the various `_DECOR_mumble_...' groups go above the
special `..._r' suffix.
Mark Wooding [Sun, 18 Aug 2019 01:09:30 +0000 (02:09 +0100)]
base/asm-common.h: Put `l' suffix on `si', `di', etc. under `CPUFAM_AMD'.
Mark Wooding [Fri, 6 Sep 2019 09:30:13 +0000 (10:30 +0100)]
base/asm-common.h: Add include guards.
Mark Wooding [Sat, 2 Mar 2019 13:11:25 +0000 (13:11 +0000)]
**/*.S: Arrange assembler preambles consistently.
Mark Wooding [Sat, 3 Nov 2018 10:54:40 +0000 (10:54 +0000)]
symm/ocb3.h, symm/ocb3-def.h: Implement the OCB3 auth'ned encryption mode.
Note that there is no PMAC3 corresponding to OCB3, like there is for the
previous two versions. The OCB3 header-processing is not a secure
standalone MAC.
Mark Wooding [Mon, 5 Nov 2018 17:34:41 +0000 (17:34 +0000)]
utils/advmodes: Implement (only) a toy version of OCB2.
I doubt this will ever end up as a high-quality mode implementation in
Catacomb, because it doesn't actually provide authenticity. See
`Cryptanalysis of OCB2' by Akiko Inoue and Kazuhiko Minematsu,
https://eprint.iacr.org/2018/1040.
This is enough to confirm their result.
* First, choose an arbitrary key and nonce, and encrypt a two-block
message whose first block contains len(0^{128}) = 128; the second
block is arbitrary.
$ ./advmodes ocb2-enc rijndael
00112233445566778899aabbccddeeff 00112233445566778899aabbccddeeff ""
0000000000000000000000000000008000112233445566778899aabbccddeeff
0e6475201e14155a2744eb78f396581c3ffbfcf1d7a2505ef8f5e56b2824f4bb
5973f3fdd62e411b05c9d9d982769bbc
* Ask Python to XOR pieces of message and ciphertext:
>>> import catacomb as C
>>> C.bytes('00000000000000000000000000000080') ^ C.bytes('0e6475201e14155a2744eb78f396581c')
bytes('0e6475201e14155a2744eb78f396589c')
>>> C.bytes('00112233445566778899aabbccddeeff') ^ C.bytes('3ffbfcf1d7a2505ef8f5e56b2824f4bb')
bytes('3feadec293f73629706c4fd0e4f91a44')
* Use the first result as the ciphertext and the second as the MAC.
$ ./advmodes ocb2-dec rijndael
00112233445566778899aabbccddeeff 00112233445566778899aabbccddeeff ""
0e6475201e14155a2744eb78f396589c 3feadec293f73629706c4fd0e4f91a44
c5ecf37c57e1b262c83c0739468037e4
Oops.
Mark Wooding [Fri, 2 Nov 2018 22:15:14 +0000 (22:15 +0000)]
symm/ocb1.h, symm/pmac1.h, ...: Implement PMAC1 and OCB1.
Also bump the required mLib version to 2.3.0, for <mLib/compiler.h>.
Mark Wooding [Wed, 31 Oct 2018 22:59:13 +0000 (22:59 +0000)]
symm/ccm.h, symm/ccm-def.h: Implement the CCM authenticated encryption mode.
This is pretty grim, really.
Mark Wooding [Fri, 2 Nov 2018 00:00:02 +0000 (00:00 +0000)]
symm/gcm.h, symm/gcm-def.h: Implement the GCM authenticated encryption mode.
Mark Wooding [Wed, 31 Oct 2018 16:45:06 +0000 (16:45 +0000)]
symm/eax.h, symm/eax-def.h: Implement the EAX authenticated encryption mode.
Mark Wooding [Wed, 31 Oct 2018 12:05:48 +0000 (12:05 +0000)]
symm/cmac.h, symm/cmac-def.h: Implement the CMAC (OMAC) message auth'n mode.
Also introduce `utils/advmodes' containing toy implementations of
`fancy' blockcipher modes, which is useful as a reference and
playground.
Mark Wooding [Sat, 10 Nov 2018 14:06:17 +0000 (14:06 +0000)]
progs/perftest.c: Add measurement support for AEAD schemes.
Mark Wooding [Fri, 9 Nov 2018 18:45:51 +0000 (18:45 +0000)]
progs/catcrypt.c: Support the use of AEAD schemes.
Mark Wooding [Fri, 9 Nov 2018 18:25:52 +0000 (18:25 +0000)]
symm/latinpoly.c, etc.: AEADs based on Salsa20 and ChaCha with Poly1305.
This is an extension of the scheme specified in RFC7539.
Mark Wooding [Fri, 9 Nov 2018 18:21:34 +0000 (18:21 +0000)]
base/keysz.c: New function to find smallest `key' size larger.
Now that AEAD schemes are (ab)using key-size lists for permitted nonce
lengths, it's useful to ask: what's the smallest acceptable size bigger
than the amount of stuff I want to pack into this nonce? The new
`keysz_pad' function answers this question.
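The question being asked can be sketched as follows. This is not the `keysz_pad' implementation or its signature: it assumes, purely for illustration, that the permitted lengths are given as a plain array of octet counts, whereas Catacomb's key-size descriptors are more compact. The name `pad_size' is hypothetical, and `bigger' is read here as `at least as large'.

```c
#include <stddef.h>

/* Hypothetical helper: find the smallest entry of sizes[0 .. n - 1]
 * which is at least want.  On success, store it in *out and return
 * zero; if no permitted size is large enough, return -1.
 */
static int pad_size(const size_t *sizes, size_t n,
		    size_t want, size_t *out)
{
  size_t i, best = 0;
  int found = 0;

  for (i = 0; i < n; i++) {
    if (sizes[i] >= want && (!found || sizes[i] < best))
      { best = sizes[i]; found = 1; }
  }
  if (!found) return (-1);
  *out = best; return (0);
}
```

With permitted nonce lengths of 8, 12 and 24 octets, asking for room for 10 octets would select 12.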
Mark Wooding [Fri, 9 Nov 2018 18:17:42 +0000 (18:17 +0000)]
symm/gaead.h: Introduce a new abstraction for authenticated encryption.
... with additional data.
The build system is aware that these can be constructed from
blockciphers, and there is a table of the things. Alas, there aren't
any implemented yet, so the table is empty. For now, at any rate...
Mark Wooding [Wed, 4 Sep 2019 17:42:32 +0000 (18:42 +0100)]
symm/chacha.c, symm/salsa20.c: Merge the `zerononce' values.
Previously, each file had four separate `zerononce' values, for no good
reason. Consolidate them. (I haven't merged them across the files, to
keep the implementations self-contained.)
Mark Wooding [Thu, 15 Aug 2019 17:16:37 +0000 (18:16 +0100)]
symm/t/chacha: Add IETF test vector for XChacha20.
At least I get the answer right.
Mark Wooding [Thu, 15 Aug 2019 17:16:02 +0000 (18:16 +0100)]
symm/chacha.c: Set the correct nonce size for `xchachaNN'.
Oops.
Mark Wooding [Fri, 1 Mar 2019 12:21:38 +0000 (12:21 +0000)]
math/f25519.c: Order 10-bit constants the same as 26-bit constants.
Mark Wooding [Fri, 1 Mar 2019 12:21:16 +0000 (12:21 +0000)]
math/f25519.c, math/fgoldi.c: Remove some unused constant definitions.
Mark Wooding [Fri, 9 Nov 2018 18:14:37 +0000 (18:14 +0000)]
symm/: Introduce the idea of MAC modes based on blockciphers.
This is just a build-system tweak. No such modes exist yet.
Hint, hint.
Mark Wooding [Fri, 9 Nov 2018 18:35:18 +0000 (18:35 +0000)]
symm/chacha.h: Fix indentation.
Mark Wooding [Wed, 31 Oct 2018 12:03:16 +0000 (12:03 +0000)]
symm/blkc.h: Add macros for binary-field shifts.
Mark Wooding [Tue, 30 Oct 2018 22:33:54 +0000 (22:33 +0000)]
symm/blkc.h: Add explicitly big- and little-endian `STEP', `ADD' and `SET'.
We shall have need of these soon.
Mark Wooding [Tue, 30 Oct 2018 13:49:54 +0000 (13:49 +0000)]
symm/seal.c: Spruce up a bit.
I'm not deploying the reservoir code here because the core update is
entangled with the buffering in an unusual way, to avoid having to spill
the working state. Switching to the reservoir logic would mean having
to factor out the core update, which would lead to spillage.
Instead, settle for renaming the buffering variables (and switching
which end we count from) and reformatting the code a bit.
Mark Wooding [Fri, 5 Jan 2018 04:34:47 +0000 (04:34 +0000)]
symm/...: Start deploying the `rsvr' machinery.
Mark Wooding [Fri, 5 Jan 2018 04:31:08 +0000 (04:31 +0000)]
base/rsvr.[ch]: New hack for buffering input to block-oriented functions.
Mark Wooding [Sun, 28 Oct 2018 22:51:57 +0000 (22:51 +0000)]
symm/blkc.h: Define a new `BLKC_ADD' macro.
And rewrite `BLKC_STEP' in terms of it.
Mark Wooding [Tue, 30 Oct 2018 10:24:29 +0000 (10:24 +0000)]
symm/modes-test.c: Test discarding output by changing encryption order.
Some of the code paths for discarding output are, or might become, quite
complicated, so they're worth exercising.
Mark Wooding [Tue, 30 Oct 2018 10:29:39 +0000 (10:29 +0000)]
symm/cbc-def.h: Fix discarding output for short inputs.
You got a segfault if the input was smaller than the block size and the
destination pointer was null. We need a temporary place for shuffling
the buffer around anyway, so it seems like the best approach is just to
make a (necessarily small) dummy destination.
Mark Wooding [Tue, 30 Oct 2018 10:26:05 +0000 (10:26 +0000)]
symm/ecb-def.h: Simplify the discarding-output path.
Because ECB is stateless, there is nothing to do if we discard the
output.
Mark Wooding [Sun, 28 Oct 2018 17:45:18 +0000 (17:45 +0000)]
symm/...: Reformat encryption mode loops and related code.
* Rename the various variables consistently. Now `off' is the progress
into a buffer, `b' is the buffer or reservoir, `t' and `u' are
temporary internal-format blocks, `y' is a temporary octet.
* Hoist variable declarations to function top-levels.
* Squish compound statement bodies vertically.
* Invert some conditions to reduce nesting depth.
* Move loop-variable updates closer to where the thing they measure is
actually used.
* Elide pointless use of `register' storage class.
* Remove spaces around dyadic `*' and `/' operators.
Mark Wooding [Sun, 28 Oct 2018 18:09:43 +0000 (18:09 +0000)]
symm/*-def.h: Fix layout bogosities.
Mark Wooding [Fri, 9 Nov 2018 22:44:40 +0000 (22:44 +0000)]
symm/idea.c: Fix key-size descriptor.
Missing terminator. Oops.
Mark Wooding [Mon, 29 Oct 2018 22:48:49 +0000 (22:48 +0000)]
symm/*-def.h: Overhaul encryption mode testing.
Introduce a new source file (not part of the library proper) containing
the main code. The old version only checked that the modes supported
round trips. This is an improvement in several respects:
* The per-mode code is now nearly trivial, and specific to the mode in
question.
* The new code checks that block-aligned (at least, in the case of ECB
and CBC) or arbitrarily misaligned (in the case of CFB, OFB,
counter, and MGF1, which are resumable) splits result in identical
ciphertexts.
* The new code can generate and/or check against regression-test
data (in a binary format, because these can be big for non-resumable
modes) to prevent cross-version interoperability bugs. This data is
generated automatically by `make distdir', and version controlled.
Mark Wooding [Sat, 17 Nov 2018 19:21:43 +0000 (19:21 +0000)]
math/: Implement Grantham's Frobenius (primality) test.
This is a rather heavyweight test which is effective when checking
possibly adversarial numbers.
There are no known composites which pass both this test and the
Miller--Rabin test with witness 2 (although infinitely many are
conjectured to exist); the combination is called the `Baillie--PSW'
test (after Baillie, Pomerance, Selfridge, and Wagstaff). Modify
`pgen_primep' to use Baillie--PSW.
Since Baillie--PSW is somewhat faster than the many rounds of Miller--
Rabin which `pgen_primep' used to use, celebrate by raising the `keen'
threshold in the `dh-param.c' test.
This work was prompted by the paper `Prime and Prejudice', by Martin
R. Albrecht, Jake Massimo, Kenneth G. Paterson, and Juraj Somorovsky;
though, since Catacomb already used 32 iterations of Miller--Rabin with
random witnesses, I can confidently state that the previous
implementation was inefficient but secure when used with a good
randomness source.
Mark Wooding [Sat, 24 Nov 2018 21:53:58 +0000 (21:53 +0000)]
Merge branch '2.4.x'
* 2.4.x:
progs/cc-progress.c: Use `fstat' to discover the file size.
math/mpx-mul4-amd64-sse2.S: Always collect iteration count as 32 bits.
math/mpx-mul4-amd64-sse2.S: Fix stack-argument offset for 64-bit Windows.
symm/salsa20-x86ish-sse2.S: Fix typo in 64-bit Windows code.
symm/desx.c, symm/desx.h (desx_init): Fix documentation.
symm/t/rijndael256: Add tests for small key sizes.
progs/cc-kem.c (getkem): Parse the `kdf' spec after bulk crypto.
progs/..., symm/...: Fix 32-bit right-shift idiom.
Mark Wooding [Fri, 9 Nov 2018 18:45:16 +0000 (18:45 +0000)]
progs/catcrypt.1: Rephrase the descriptions of the available lists.
Mark Wooding [Fri, 9 Nov 2018 18:39:11 +0000 (18:39 +0000)]
progs/catcrypt.1: The default `cipher' depends on the bulk transform.
As does the space of acceptable names. Refactor the manual a little to
describe this properly.
Mark Wooding [Sat, 10 Nov 2018 14:03:21 +0000 (14:03 +0000)]
progs/perftest.c: Report cycle counts per operation where possible.
This is a much more useful figure to work with. Use `rdtsc' on x86,
falling back to `perf_event_open' on Linux where available. On other
platforms, you don't get cycle counts: sorry.
Mark Wooding [Sat, 10 Nov 2018 13:58:54 +0000 (13:58 +0000)]
progs/perftest.c: Fix key-size handling.
* Allow a key-size parameter to `enc', because algorithms like
Rijndael have key-size-dependent performance. This uses the `-b'
option, because `-B' is already the buffer size for the inner loop.
* For consistency, use `-b' for the key size in `ksched' too.
* Finally, check explicit key sizes for validity rather than just
rounding off and potentially crashing.
Mark Wooding [Sat, 10 Nov 2018 13:55:14 +0000 (13:55 +0000)]
progs/perftest.c: Document the `-n' option for `enc' and `hash'.
Mark Wooding [Sat, 10 Nov 2018 13:47:15 +0000 (13:47 +0000)]
progs/perftest.c: Rename `c_start', `c_stop' to `c0', `c1'.
Mark Wooding [Sat, 10 Nov 2018 13:41:29 +0000 (13:41 +0000)]
progs/perftest.c: Introduce top-level option for batching.
`-kN' runs N iterations of the underlying job between looking at the
clock, without affecting the other statistics. The main purpose here is
to reduce the impact of the measurement overhead.
Mark Wooding [Fri, 9 Nov 2018 18:06:53 +0000 (18:06 +0000)]
symm/stub.h.in: Fix include-guard names to be identifier-safe.
The header's own include-guard was fixed, but not the guards for the
individual headers.
Mark Wooding [Fri, 2 Nov 2018 22:09:50 +0000 (22:09 +0000)]
symm/blkc.h (BLKC_SHOW): Capture operand as `const'.
Mark Wooding [Mon, 12 Nov 2018 11:03:05 +0000 (11:03 +0000)]
base/asm-common.h: Reverse the order of `SHUF' arguments.
The original idea was this: since one can change one's view of how the
bits in an XMM register are divided into lanes on a per-instruction
basis, it would make more sense if I took a single consistent view of
how the bits are arranged, with the least significant on the right and
the most significant on the left. Therefore, I listed the shuffle
indices from left to right, counting from right to left.
This, I now realise, was a mistake. The thing which finally made this
clear to me was that it makes the order of indices in the `SHUF' macro
be inconsistent with the order of bytes in a table for the SSSE3
`pshufb' instruction, and I can't do anything about that.
So: change the order of the arguments, and track down all uses of this
macro to fix them. Sorry about that.
To verify that I got them all:
for i in $(git grep -l SHUF); do
git blame -- $i | grep SHUF
done | less
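To make the two conventions concrete, here is a sketch of how a four-lane shuffle immediate might be built either way. Both macro names are hypothetical, and the definitions are my guess at the shape of the thing rather than the contents of `base/asm-common.h'. The only fact relied on is that bits 0 and 1 of the immediate select the source for lane 0, bits 2 and 3 for lane 1, and so on.

```c
/* Hypothetical: indices listed with the most significant lane first,
 * i.e., written left to right while counting lanes right to left.
 */
#define SHUF_MSFIRST(d, c, b, a) (64*(d) + 16*(c) + 4*(b) + (a))

/* Hypothetical: indices listed with lane 0 first, which matches the
 * order of bytes in an in-memory table for `pshufb'.
 */
#define SHUF_LSFIRST(a, b, c, d) ((a) + 4*(b) + 16*(c) + 64*(d))
```

Under these definitions, rotating the lanes down by one position is `SHUF_LSFIRST(1, 2, 3, 0)' or `SHUF_MSFIRST(0, 3, 2, 1)'; both give 0x39.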
Mark Wooding [Fri, 9 Nov 2018 17:28:47 +0000 (17:28 +0000)]
**/.gitignore: Push patterns downwards, and format.
The top-level `.gitignore' was getting too unwieldy, and subsidiary
`.gitignore' files existed but weren't used much.
Push patterns for specific files down into the appropriate directories.
Also, gather and sort the patterns in a vaguely logical way.
Mark Wooding [Sat, 24 Nov 2018 19:06:45 +0000 (19:06 +0000)]
progs/cc-progress.c: Use `fstat' to discover the file size.
And `lseek' to discover the current offset. Annoyingly, Android only
developed `ftello64' and `fseeko64' in API24, so we can't use these (and
it was a pretty grim circumlocution anyway). On the other hand, Android
has had `lseek64' forever, and its `fstat' is natively 64-bit; and
there's no portability benefit to using the other functions because
Windows doesn't have them anyway. (Indeed, `lseek' and `stat' are
ancient Unix, so probably more portable.)
Mark Wooding [Fri, 16 Nov 2018 12:51:39 +0000 (12:51 +0000)]
math/mpx-mul4-amd64-sse2.S: Always collect iteration count as 32 bits.
Some ABIs, at least, don't guarantee to zero-extend arguments, and we
use the counter as an address offset.
Mark Wooding [Fri, 16 Nov 2018 12:49:42 +0000 (12:49 +0000)]
math/mpx-mul4-amd64-sse2.S: Fix stack-argument offset for 64-bit Windows.
I failed to account for either the 160 bytes of saved XMM registers
(because the stupid ABI demands that XMM6--XMM15 be preserved across
calls), or for the daft 32-byte shadow space between the return address
and the stacked arguments.
Mark Wooding [Fri, 16 Nov 2018 12:22:27 +0000 (12:22 +0000)]
symm/salsa20-x86ish-sse2.S: Fix typo in 64-bit Windows code.
Goes to show how often I test on Windows. :-(
Mark Wooding [Fri, 9 Nov 2018 21:46:56 +0000 (21:46 +0000)]
symm/desx.c, symm/desx.h (desx_init): Fix documentation.
The two documentation comments disagreed about the orders of the key
pieces. The implementation had it right: the DES key comes first,
followed by the whitening keys. Fix the header, and a stupid typo.
Mark Wooding [Wed, 31 Oct 2018 13:47:47 +0000 (13:47 +0000)]
symm/t/rijndael256: Add tests for small key sizes.
Commit 388489cbb302cb86ee0fd4927243a24525dfd5ee (released in 2.4.2)
added more round constants so that we give the correct answers for
large-block Rijndael with small keys -- and this works fine for clean
builds. Unfortunately, Catacomb's build system doesn't regenerate
recomputed tables automatically (and that would anyway be a problem for
cross builds), which means that old working trees will still be building
broken code.
Add some tests so that developers notice and hopefully rebuild the
offending tables.
Mark Wooding [Sat, 10 Nov 2018 17:26:43 +0000 (17:26 +0000)]
progs/cc-kem.c (getkem): Parse the `kdf' spec after bulk crypto.
Otherwise the buffer holding the remains of the kemalgspec is clobbered.
Mark Wooding [Tue, 30 Oct 2018 22:05:18 +0000 (22:05 +0000)]
progs/..., symm/...: Fix 32-bit right-shift idiom.
This one has a long and troubled history. Writing
x >> 32
is undefined behaviour if x is only 32 bits wide. On the other hand, if
it's /not/, then this is necessary to get hold of the upper bits.
The obvious escape plan is to write
(x >> 16) >> 16
(the parentheses are unfortunately necessary), but some Microsoft
compilers managed to bungle compiling this: they merged the two shifts
together and then decided that a shift by 32 places was a no-op.
So I wrote
((x&~MASK32) >> 16) >> 16
which stood for many years. Unfortunately this is really wrong too: if
x is wider than 32 bits, that's nice, but MASK32 /isn't/ necessarily, so
~MASK32 is all-bits zero and the high bits of x are just lost.
Fix this by casting MASK32 to the-type-of-x before inverting it.
Ugh.
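The failure and the fix can be seen in a small sketch. It pins the type of x to `uint64_t' and defines its own `MASK32' as an unsigned 32-bit constant; both are assumptions for illustration (the real code has to cope with whatever type x actually has), and the function names are hypothetical. It also assumes the usual case where `unsigned int' is 32 bits wide.

```c
#include <stdint.h>

#define MASK32 ((uint32_t)0xffffffffu)

/* Old idiom: ~MASK32 is evaluated at the width of MASK32, so it is
 * zero, and the high bits of x are masked away before the shifts.
 */
static uint64_t high_old(uint64_t x)
  { return (((x&~MASK32) >> 16) >> 16); }

/* Fixed idiom: cast MASK32 to the type of x before inverting it. */
static uint64_t high_new(uint64_t x)
  { return (((x&~(uint64_t)MASK32) >> 16) >> 16); }
```

For x = 0x123456789abcdef0, the old form yields zero and the fixed form yields 0x12345678.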
Mark Wooding [Sat, 27 Oct 2018 10:54:43 +0000 (11:54 +0100)]
math/mpx-mul4-*-sse2.S: Fix commentary notation.
Write a `;' between the two halves of an XMM register to emphasize when
we're thinking of it as two 64-bit lanes or four 32-bit lanes.
Mark Wooding [Sat, 27 Oct 2018 09:43:24 +0000 (10:43 +0100)]
math/mpx-mul4-*-sse2.S (squash): We don't care about the top half of c3 here.
The previous version of the comment erroneously claimed that the top
half of c3 held y_1; in fact it holds y_2, but we'll clobber it anyway
because the objective is to carry up into y_1, so mark it as
don't-care (like lo).
Mark Wooding [Thu, 23 Aug 2018 04:13:55 +0000 (05:13 +0100)]
(x86 asm): Zero the high parts of the ?MM registers if available.
There's a performance penalty to trying to preserve the upper parts of
the SSE/AVX vector registers, and it's pointless because we don't need
to preserve them. (Earlier AVX-capable processors would carefully snip
off the upper parts of the registers and put them in a box, and then
glue them back on when they were wanted, which isn't so bad. Later
processors instead just track the upper part of the register as an
additional operand, which leads to unnecessary latency.)
Add AVX-specific entry points to the necessary routines, and call them
when AVX is detected. This would all be easier if Intel had chosen
`vzeroupper' from an existing `nop' encoding space.
Mark Wooding [Mon, 13 Aug 2018 20:30:07 +0000 (21:30 +0100)]
progs/catsign.c: Don't gratuitously try to open a temporary file.
The `merry dance' where we open the necessary output files was bungled,
which caused a temporary file to be opened unless an explicit output
file was requested without buffering.
Mark Wooding [Mon, 30 Jul 2018 11:24:04 +0000 (12:24 +0100)]
base/asm-common.h: Fix the description comment at the top of the file.
Mark Wooding [Fri, 22 Jun 2018 09:20:44 +0000 (10:20 +0100)]
Add support for fancy AArch64 assembler code.
It's a fun instruction set, and maybe this will improve my crypto on
Raspberry Pi 3.
Mark Wooding [Fri, 22 Jun 2018 09:21:10 +0000 (10:21 +0100)]
configure.ac: Don't be so picky about identifying ARM variants.
They're all pretty much the same, really. If I had some good way to
identify big-endian ARM targets, I'd try that, but I don't know how to
do that right now.
Mark Wooding [Fri, 22 Jun 2018 09:21:55 +0000 (10:21 +0100)]
symm/salsa20-arm-neon.S: Remove extra copy of the state-layout diagram.
I think this is leftover debris from when I was first figuring out this
layout, but it certainly doesn't belong here.
Mark Wooding [Sat, 23 Jun 2018 03:17:13 +0000 (04:17 +0100)]
symm/rijndael-arm-crypto.S: Use `vmov' rather than `veor' to zero-init.
I think I'd been doing too much x86 coding when I came to do this.
Mark Wooding [Fri, 22 Jun 2018 09:21:32 +0000 (10:21 +0100)]
symm/rijndael-arm-crypto.S: Delete a redundant instruction.
We've already loaded the previous-cycle word by the time we get to `1:'
here, so we don't need to do it again. The pointers don't move, so this
was harmless but pointless.