3 \section{Low-level details: MPX}
7 \subsection{Data types for low-level arithmetic}
9 The header file \hdr{<catacomb/mpw.h>} defines two types, \code{mpw} and
10 \code{mpd}, and a collection of macros describing them.
12 The `multiprecision word' type, \code{mpw}, is an unsigned integer type whose
13 representation uses \code{MPW_BITS}. The largest value representable in an
14 \code{mpw} is $\code{MPW_MAX} = 2^{\code{MPW_BITS}} - 1$.
15 The value of \code{MPW_BITS} is at least 16. Note that, on some
16 architectures, an \code{mpw} may be capable of representing values larger
17 than \code{MPW_MAX}. The expression \code{MPW($x$)} returns the value of $x
18 \bmod \code{MPW_MAX}$ cast to type \code{mpw}.
20 The `multiprecision doubleword' type, \code{mpd}, is an unsigned integer type
21 with at least double the width of \code{mpw}. Hence, an \code{mpd} is
22 capable of representing the product of two \code{mpw}s, and is useful when
23 performing multiplications and divisions. The constants \code{MPD_BITS} and
24 \code{MPD_MAX} are defined to be the number of bits used in the
25 representation of an \code{mpd} and the largest value representable in an
26 \code{mpd} respectively.
28 A few macros useful for manipulating \code{mpw}s are defined.
30 \subsubsection{The \code{MPW} macro}
36 |#include <catacomb/mpw.h>| \\
42 The value of $x \bmod 2^{\code{MPW_BITS}}$, as an \code{mpw}. This is a
43 frequently used abbreviation for |(mpw)(|$x$| & MPW_MAX)|.
45 \subsubsection{The \code{MPWS} macro}
51 |#include <catacomb/mpw.h>| \\
52 |size_t MPWS(size_t |$n$|);|
57 The number of bytes occupied by an array of $n$~\code{mpw}s (i.e., $n \cdot
60 \subsubsection{The \code{MPW_RQ} macro}
66 |#include <catacomb/mpw.h>| \\
67 |size_t MPW_RQ(size_t |$\sz$|);|
72 The number of \code{mpw}s required to represent a multiprecision integer
73 occupying $\sz$~octets in an external representation.
76 \subsection{Low-level multiprecision integer representation}
78 The low-level multiprecision definitions are exported in the header file
79 \hdr{<catacomb/mpx.h>}. This header includes \hdr{<catacomb/mpw.h>}, so
80 there's usually no need to include it explicitly.
82 A multiprecision integer is represented as an array of \code{mpw}s,
83 least-significant first. The array's bounds are described by a a pair of
84 pointers to the array's \emph{base} (i.e., its first element) and
85 \emph{limit} (i.e., the location immediately following its last element).
87 Let $v$ be the base and $\vl$ the limit of a multiprecision integer array.
88 The array itself is notated as~$v .. \vl$. The array's size in words may be
89 computed as $\vl - v$ in the normal C fashion. The integer represented by
90 the array, denoted $\mp(v .. \vl)$, is defined to be
93 \sum_{0 \le i < \vl - v}
94 2^{\code{MPW_BITS} \cdot i} v[i]
96 If the array is empty (i.e., $v = \vl$) then the number is zero. If the
97 array is empty, or the final word is nonzero, then the representation is said
98 to be \emph{shrunken}. Shrunken representations are more efficient, since
99 the arithmetic algorithms don't need to consider high-order words which make
100 no difference to the final result anyway.
102 Whenever a result is too large to be represented in the memory allocated for
103 it, high-order bits are discarded. Thus, a result written to an array of
104 $k$~words is reduced modulo $2^{\code{MPW_BITS} \cdot k}$.
107 \subsection{Low-level macros}
108 \label{sec:mpx-macro}
110 The following macros perform various simple operations useful for
111 manipulating multiprecision integers at the MPX level.
113 \subsubsection{The \code{MPX_SHRINK} macro}
114 \label{fn:MPX-SHRINK}
119 |#include <catacomb/mpx.h>| \\
120 |MPX_SHRINK(const mpw *|$v$|, const mpw *|$\vl$|);|
125 The \code{MPX_SHRINK} macro reduces the limit~$\vl$ of a multiprecision
126 integer array so that either $v = \vl$ or~$\vl[-1] \ne 0$. The argument~$\vl$ must be an lvalue, since the macro writes the result back when it
129 \subsubsection{The \code{MPX_BITS} macro}
135 |#include <catacomb/mpx.h>| \\
136 |MPX_BITS(unsigned long |$b$|, const mpw *|$v$|, const mpw *|$\vl$|);|
141 Determines the smallest number of bits which could represent the number
142 $\mp(v .. \vl)$, and writes the answer to~$b$, which must therefore be an
143 lvalue. The result is zero if the number is zero; otherwise $b$ is the
144 largest integer such that
145 \[ \mp(v .. \vl) \ge 2^{(b - 1) \bmod \code{MPW_BITS}}.\]
147 \subsubsection{The \code{MPX_OCTETS} macro}
148 \label{fn:MPX-OCTETS}
153 |#include <catacomb/mpx.h>| \\
154 |MPX_OCTETS(size_t |$o$|,|
155 |const mpw *|$v$|, const mpw *|$\vl$|);|
160 Determines the smallest number of octets which could represent the number
161 $\mp(v .. \vl)$, and writes the answer to~$o$, which must therefore be an
162 lvalue. This is useful for determining appropriate buffer sizes for the
163 results of \code{mpx_storel} and \code{mpx_storeb}.
165 The result $o$ can be calculated from the number of bits~$b$ reported by
166 \code{MPX_BITS} as $o = \lceil b / 8 \rceil$; the algorithm used by
167 \code{MPX_OCTETS} is more efficient than this, however.
169 \subsubsection{The \code{MPX_COPY} macro}
175 |#include <catacomb/mpx.h>| \\
176 |MPX_COPY(mpw *|$\dv$|, mpw *|$\dvl$|,|
177 |const mpw *|$\av$|, const mpw *|$\avl$|);|
182 Copies a multiprecision integer from the source array $\av .. \avl$ to the
183 destination array $\dv .. \dvl$. If the destination array is large enough,
184 the result is equal to the source; otherwise, high-order bits are discarded
187 \subsubsection{The \code{MPX_ZERO} macro}
193 |#include <catacomb/mpx.h>| \\
194 |MPX_ZERO(mpw *|$v$|, mpw *|$\vl$|);|
199 Sets the integer $\mp(v .. \vl)$ to zero.
202 \subsection{Transfer formats: loading and storing}
205 The MPX layer can translate between internal representations of integers as
206 arrays of \code{mpw}s and external formats where integers are stored as
207 arrays of octets. Both little- and big-endian orders are supported.
209 If $a_0, a_1, \ldots, a_{k - 1}$ is an array of $k$ octets, with
210 $0 \le a_i < 256$ for all $0 \le i < k$, then the number $n$ represented by
211 $a$ in little-endian byte order is
212 \[ n = \sum_{0 \le i < k} 256^i a_i .\]
213 Similarly, the number $n'$ represented by $a$ in big-endian byte order is
214 \[ n' = \sum_{0 \le i < k} 256^{k - i - 1} a_i .\]
216 If, in a store operation, the destination octet array is not large enough to
217 represent the number to be stored, high-order octets are discarded; hence,
218 the number is reduced modulo $2^{8k}$, where $k$ is the size of the
219 destination array in octets.
221 Two useful macros help when working out how much memory is needed for
222 translation between internal and transfer formats for multiprecision
223 integers. The macro \code{MPX_OCTETS} (page~\pageref{fn:MPX-OCTETS})
224 calculates the smallest octet array size which can represent a multiprecision
225 integer. The macro \code{MPW_RQ} (page~\pageref{fn:MPW-RQ}) calculates the
226 smallest \code{mpw} array which can represent a multiprecision integer held
227 in an octet array of a particular size.
229 \subsubsection{The \code{mpx_storel} function}
230 \label{fn:mpx-storel}
235 |#include <catacomb/mpx.h>| \\
236 |void mpx_storel(const mpw *|$v$|, const mpw *|$\vl$|,|
237 |void *|$p$|, size_t |$\sz$|);|
242 Stores the number held in the array $v .. \vl$ to the array of $\sz$~octets
243 starting at address~$p$ in little-endian byte order (i.e., least significant
246 \subsubsection{The \code{mpx_loadl} function}
252 |#include <catacomb/mpx.h>| \\
253 |void mpx_loadl(mpw *|$v$|, mpw *|$\vl$|,|
254 |const void *|$p$|, size_t |$\sz$|);|
259 Loads into the array $v .. \vl$ the number represented in the array of
260 $\sz$~octets starting at address~$p$ in little-endian byte order (i.e., least
261 significant byte first).
263 \subsubsection{The \code{mpx_storeb} function}
264 \label{fn:mpx-storeb}
269 |#include <catacomb/mpx.h>| \\
270 |void mpx_storeb(const mpw *|$v$|, const mpw *|$\vl$|,|
271 |void *|$p$|, size_t |$\sz$|);|
276 Stores the number held in the array $v .. \vl$ to the array of $\sz$~octets
277 starting at address~$p$ in big-endian byte order (i.e., least significant
280 \subsubsection{The \code{mpx_loadb} function}
286 |#include <catacomb/mpx.h>| \\
287 |void mpx_loadb(mpw *|$v$|, mpw *|$\vl$|,|
288 |const void *|$p$|, size_t |$\sz$|);|
293 Loads into the array $v .. \vl$ the number represented in the array of
294 $\sz$~octets starting at address $p$ in big-endian byte order (i.e., least
295 significant byte last).
298 \subsection{Bit shifting operations}
299 \label{sec:mpx-shift}
301 The MPX layer provides functions for efficient multiplication and division of
302 multiprecision integers by powers of two, performed simply by shifting the
303 binary representation left or right by a number of bits. Shifts by one bit
304 and by a multiple of \code{MPW_BITS} are particularly fast.
306 There are two shift functions, one for left shifts (multiplications) and one
307 for right shifts (divisions). Each function is passed an array containing
308 the number to be shifted, a destination array for the result (which may be
309 the source array), and the number of bits to be shifted as an unsigned
313 \subsubsection{The \code{mpx_lsl} function}
320 |#include <catacomb/mpx.h>| \\
321 |void mpx_lsl(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,| \\
322 \>|const mpw *|$\av$|, const mpw *|$\avl$|, size_t |$n$|);|
328 Stores in $\dv .. \dvl$ the result of shifting $\mp(\av .. \avl)$ left by
329 $n$~bits (i.e., multiplying it by~$2^n$).
331 \subsubsection{The \code{mpx_lsr} function}
338 |#include <catacomb/mpx.h>| \\
339 |void mpx_lsr(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,| \\
340 \>|const mpw *|$\av$|, const mpw *|$\avl$|, size_t |$n$|);|
346 Stores in $\dv .. \dvl$ the result of shifting $\mp(\av .. \avl)$ right by
347 $n$~bits (i.e., dividing it by~$2^n$, rounding towards zero).
350 \subsection{Low-level arithmetic}
352 The remaining functions perform standard arithmetic operations on large
353 integers. The form for the arguments is fairly well-standardized:
354 destination arrays are given first, followed by the source arrays in the
355 conventional order. (This ordering reflects the usual order in a C
356 assignment expression, the \code{strcpy} and \code{memcpy} functions, and the
357 order of operands in many assembly languages.)
359 Some functions allow the destination array to be the same as one (or both)
360 source arrays; others forbid this. Under no circumstances may the
361 destination partially overlap the sources.
364 \subsubsection{The \code{mpx_2c} function}
370 |#include <catacomb/mpx.h>| \\
371 |void mpx_2c(mpw *|$\dv$|, mpw *|$\dvl$|,|
372 |const mpw *|$v$|, const mpw *|$\vl$|);|
377 Computes the two's complement of the number $\mp(v .. \vl)$ and stores it in
378 $\dv .. \dvl$. The two arrays $v .. \vl$ and $\dv .. \dvl$ may be the same.
380 \subsubsection{The \code{mpx_ucmp} function and \code{MPX_UCMP} macro}
386 |#include <catacomb/mpx.h>| \\
387 |int mpx_ucmp(|\=|const mpw *|$\av$|, const mpw *|$\avl$|,| \\
388 \>|const mpw *|$\bv$|, const mpw *|$\bvl$|);| \\
389 |int MPX_UCMP(|\=|const mpw *|$\av$|, const mpw *|$\avl$|, |\synt{rel-op}|,|
390 \\ \>|const mpw *|$\bv$|, const mpw *|$\bvl$|);|
396 The function \code{mpx_ucmp} performs an unsigned comparison of two unsigned
397 multiprecision integers $a$ and~$b$, passed in the arrays $\av .. \avl$ and
398 $\bv .. \bvl$ respectively.
400 The macro \code{MPX_UCMP} provides a slightly more readable interface for
401 comparing integers. The \synt{rel-op} may be any C relational operator
402 (e.g., |!=|, or |<=|). The macro tests whether
403 $a \mathrel{\synt{rel-op}} b$.
407 The function \code{mpx_ucmp} returns a value less then, equal to, or
408 greater than zero depending on whether $a$ is less than, equal to or greater
411 The macro \code{MPX_UCMP} returns a nonzero result if $a
412 \mathrel{\synt{rel-op}} b$ is true, and zero if false.
414 \subsubsection{The \code{mpx_uadd} function}
421 |#include <catacomb/mpx.h>| \\
422 |void mpx_uadd(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,|
423 |const mpw *|$\av$|, const mpw *|$\avl$|,| \\
424 \>|const mpw *|$\bv$|, const mpw *|$\bvl$|);|
430 Adds two multiprecision integers. The sum of the two arguments
431 $\mp(\av .. \avl) + \mp(\bv .. \bvl)$ is stored in $\dv .. \dvl$. The
432 destination array may be equal to either or both source arrays.\footnote{%
433 Adding a number to itself is a poor way of doubling. A call to
434 \code{mpx_lsl} (page~\pageref{fn:mpx-lsl}) is much more efficient.} %
436 \subsubsection{The \code{mpx_uaddn} function and \code{MPX_UADDN} macro}
442 |#include <catacomb/mpx.h>| \\
443 |void mpx_uaddn(mpw *|$\dv$|, mpw *|$\dvl$|, mpw |$n$|);| \\
444 |void MPX_UADDN(mpw *|$\dv$|, mpw *|$\dvl$|, mpw |$n$|);|
449 The function \code{mpx_uaddn} adds a small integer~$n$ (expressed as a single
450 \code{mpw}) to the multiprecision integer held in $\dv .. \dvl$.
452 The macro \code{MPX_UADDN} performs exactly the same operation, but uses
453 inline code rather than calling a function.
455 \subsubsection{The \code{mpx_usub} function}
462 |#include <catacomb/mpx.h>| \\
463 |void mpx_usub(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,|
464 |const mpw *|$\av$|, const mpw *|$\avl$|,| \\
465 \>|const mpw *|$\bv$|, const mpw *|$\bvl$|);|
471 Subtracts one multiprecision integer from another. The difference of the two
472 arguments $\mp(\av .. \avl) - \mp(\bv .. \bvl)$ is stored in $\dv .. \dvl$.
473 The destination array may be equal to either or both source
475 Subtracting a number from itself is a particularly poor way of clearing an
476 integer to zero. A call to \code{MPX_ZERO} (page~\pageref{fn:MPX-ZERO}) is
477 much more efficient.} %
479 Because overly large results are truncated to fit the destination array, if
480 the second argument is larger than the first, a two's-complement result is
483 \subsubsection{The \code{mpx_usubn} function and \code{MPX_USUBN} macro}
489 |#include <catacomb/mpx.h>| \\
490 |void mpx_usubn(mpw *|$\dv$|, mpw *|$\dvl$|, mpw |$n$|);| \\
491 |void MPX_USUBN(mpw *|$\dv$|, mpw *|$\dvl$|, mpw |$n$|);|
496 The function \code{mpx_usubn} subtracts a small integer~$n$ (expressed as a
497 single \code{mpw}) from the multiprecision integer held in $\dv .. \dvl$.
499 The macro \code{MPX_USUBN} performs exactly the same operation, but uses
500 inline code rather than calling a function.
502 \subsubsection{The \code{mpx_umul} function}
509 |#include <catacomb/mpx.h>| \\
510 |void mpx_umul(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,|
511 |const mpw *|$\av$|, const mpw *|$\avl$|,| \\
512 \>|const mpw *|$\bv$|, const mpw *|$\bvl$|);|
518 Multiplies two multiprecision integers. The product of the two arguments
519 $\mp(\av .. \avl) \times \mp(\bv .. \bvl)$ is stored in $\dv .. \dvl$. The
520 destination array may not be equal to either source array.
522 \subsubsection{The \code{mpx_umuln} function and \code{MPX_UMULN} macro}
529 |#include <catacomb/mpx.h>| \\
530 |void mpx_umuln(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,| \\
531 \> |const mpw *|$\av$|, const mpw *|$\avl$|, mpw |$n$|);| \\
532 |void MPX_UMULN(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,| \\
533 \> |const mpw *|$\av$|, const mpw *|$\avl$|, mpw |$n$|);|
539 The function \code{mpx_umuln} multiplies the multiprecision integer passed in
540 $\av .. \avl$ by a small integer~$n$ (expressed as a single \code{mpw}),
541 writing the product $n \cdot \mp(\av .. \avl)$ to the destination array $\dv
542 .. \dvl$. The destination array may be equal to the source array $\av
545 The macro \code{MPX_UMULN} performs exactly the same operation, but uses
546 inline code rather than calling a function.
548 \subsubsection{The \code{mpx_umlan} function and \code{MPX_UMLAN} macro}
555 |#include <catacomb/mpx.h>| \\*
556 |void mpx_umlan(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,| \\*
557 \> |const mpw *|$\av$|, const mpw *|$\avl$|, mpw |$n$|);| \\
558 |void MPX_UMLAN(|\=|mpw *|$\dv$|, mpw *|$\dvl$|,| \\
559 \> |const mpw *|$\av$|, const mpw *|$\avl$|, mpw |$n$|);|
565 The function \code{mpx_umlan} multiplies the multiprecision integer passed in
566 $\av .. \avl$ by a small integer~$n$ (expressed as a single \code{mpw}), and
567 adds it to the value already stored in the destination array $\dv .. \dvl$.
568 The destination array may be equal to the source array $\av .. \avl$,
569 although this isn't very useful.
571 The macro \code{MPX_UMLAN} performs exactly the same operation, but uses
572 inline code rather than calling a function.
574 \subsubsection{The \code{mpx_usqr} function}
580 |#include <catacomb/mpx.h>| \\
581 |void mpx_usqr(mpw *|$\dv$|, mpw *|$\dvl$|,|
582 |const mpw *|$\av$|, const mpw *|$\avl$|);|
587 Squares a multiprecision integer. The result $\mp(\av .. \avl)^2$ is stored
588 in $\dv .. \dvl$. The destination array may not be equal to the source
591 Squaring a number is approximately twice as fast as multiplying a number by
594 \subsubsection{The \code{mpx_udiv} function}
601 |#include <catacomb/mpx.h>| \\
602 |void mpx_udiv(|\=|mpw *|$\qv$|, mpw *|$\qvl$|, mpw *|$\rv$|, mpw *|$\rvl$|,|
603 \\ \>|const mpw *|$\dv$|, const mpw *|$\dvl$|,|
604 |mpw *|$\mathit{sv}$|, mpw *|$\mathit{svl}$|);|
610 Calculates the quotient and remainder when one multiprecision integer is
613 The calling convention is slightly strange. The dividend and divisor are
614 passed in the arrays $\rv .. \rvl$ and $\dv .. \dvl$ respectively. The
615 quotient is written to the array $\qv .. \qvl$; the remainder is left in $\rv
618 The division function requires some workspace. The `scratch' array
619 $\mathit{sv} .. \mathit{svl}$ must be at least one word longer than the
620 divisor array $\dv .. \dvl$. The scratch array'sfinal contents are not
621 useful. Also, the remainder array $\rv .. \rvl$ must have at least two
625 Given a dividend $x$ and a divisor $y$, the function calculates a quotient
626 $q$ and remainder $r$ such that $q = \lfloor x / y \rfloor$ and $x = qy + r$.
627 In particular, this definition implies that $r$ has the same sign as $y$,
628 which is a useful property when performing modular reductions.
633 %%% TeX-master: "catacomb"