X-Git-Url: http://www.chiark.greenend.org.uk/ucgi/~mdw/git/sod/blobdiff_plain/e520bc2484f96c38991fb8d3b49cd9a3b2410842..9e91c8e7b5fcdeb6389ac7ccbcd9c77348c4493a:/doc/runtime.tex diff --git a/doc/runtime.tex b/doc/runtime.tex index 8b8e080..f6f0846 100644 --- a/doc/runtime.tex +++ b/doc/runtime.tex @@ -29,6 +29,560 @@ This chapter describes the runtime support macros and functions provided by the Sod library. The common structure of object instances and classes is described in \xref{ch:structures}. +%%%-------------------------------------------------------------------------- +\section{Keyword argument support} \label{sec:runtime.keywords} + +This section describes the types, macros, and functions exposed in the +@|| header file which provides support for defining and +calling functions which make use of keyword arguments; see \xref{sec:concepts.keywords}. + + +\subsection{Type definitions} \label{sec:sec:runtime.keywords.types} + +The header file defines two simple structure types, and a function type which +will be described later. + +\begin{describe}[struct kwval]{type} + {struct kwval \{ \\ \ind + const char *kw; \\ + const void *val; \- \\ + \};} + + The @|kwval| structure describes a keyword argument name/value pair. The + @|kw| member points to the name, as a null-terminated string. The @|val| + member always contains the \emph{address} of the value. (This somewhat + inconvenient arrangement makes the size of a @|kwval| object independent of + the actual argument type.) +\end{describe} + +\begin{describe}[struct kwtab]{type} + {struct kwtab \{ \\ \ind + const struct kwval *v; \\ + size_t n; \- \\ + \};} + + The @|kwtab| structure describes a list of keyword arguments, represented + as a vector of @|kwval| structures. The @|v| member points to the start of + the vector; the @|n| member contains the number of elements in the vector. 
+\end{describe} + + +\subsection{Calling functions with keyword arguments} +\label{sec:runtime.keywords.calling} + +Functions which accept keyword arguments are ordinary C functions with +variable-length argument tails. Hence, they can be called using ordinary C +(of the right kind) and all will be well. However, argument lists must +follow certain rules (which will be described in full below); failure to do +this will result in \emph{undefined behaviour}. + +The header file provides integration with some C compilers in the form of +macros which can be used to help the compiler diagnose errors in calls to +keyword-accepting functions; but such support is rather limited at the +moment. Some additional macros are provided for use in calls to such +functions, and it is recommended that, where possible, these are used. In +particular, it's all too easy to forget the trailing null terminator which +marks the end of a list of keyword arguments. + +That said, the underlying machinery is presented first, and the convenience +macros are described later. + +\subsubsection{Keyword argument mechanism} +The argument tail, following the mandatory arguments, consists of a sequence +of zero or more alternating keyword names, as pointers to null-terminated +strings (with type @|const char~*|), and their argument values. This +sequence is finally terminated by a null pointer (again with type @|const +char~*|) in place of a keyword name. + +Each function may define for itself which keyword names it accepts, +and what types the corresponding argument values should have. +There are also (currently) three special keyword names. +\begin{description} \let\makelabel\code + +\item[kw.valist] This special keyword is followed by a pointer to a + variable-length argument tail cursor object, of type @|va_list~*|. This + cursor object will be modified as the function extracts successive + arguments from the tail. 
+ The argument tail should consist of alternating + keyword names and argument values, as described above, including the first + keyword name. (This is therefore different from the convention used when + calling keyword argument parser functions: see the description of the + \descref{KWSET_PARSEFN}[macro]{mac} for more details about these.) The + argument tail may itself contain the special keywords. + +\item[kw.tab] This special keyword is followed by \emph{two} argument values: + a pointer to the base of a vector of @|kwval| structures, and the number of + elements in this vector (as a @|size_t|). Each element of the vector + describes a single keyword argument: the @|kw| member points to the + keyword's name, and the @|val| member points to the value. + + The vector may contain special keywords. The @|val| pointer for a + @|kw.valist| argument should contain the address of an object of type + @|va_list~*| (and not point directly to the cursor object, since @|val| + has type @|const void~*| but the cursor will be modified as its argument + tail is traversed). The @|val| pointer for a @|kw.tab| argument should + contain the address of a @|kwtab| structure which itself contains the base + address and length of the argument vector to be processed. + +\item[kw.unknown] This keyword is never accepted by any function. If it is + encountered, the @|kw_unknown| function is called to report the situation + as an error; see below. + +\end{description} +It is possible to construct a circular structure of indirect argument lists +(in a number of ways). Don't try to pass such a structure to a function: the +result will be unbounded recursion or some other bad outcome. + +\subsubsection{Argument list structuring macros} +The following macros are intended to help with constructing keyword argument +lists. Their use is not essential, but may help prevent errors. 
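To see what a call looks like without any helper macros, the following is a minimal sketch of the raw mechanism described above. The function @|demo_area| and its keywords @|width| and @|height| are hypothetical and not part of the library; the sketch handles none of the special keywords, and it returns an error code on an unknown keyword where a real parser would call @|kw_unknown|.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical keyword-accepting function, parsed by hand: it accepts
 * `width' and `height' (both int) and stores their product in *area.
 * Returns 0 on success, or -1 on meeting an unknown keyword.
 */
static int demo_area(int *area, const char *kwfirst, ...)
{
  va_list ap;
  const char *kw;
  int width = 1, height = 1, rc = 0;

  assert(area != NULL);
  va_start(ap, kwfirst);
  /* Alternate keyword names and values until the null terminator. */
  for (kw = kwfirst; kw; kw = va_arg(ap, const char *)) {
    if (strcmp(kw, "width") == 0) width = va_arg(ap, int);
    else if (strcmp(kw, "height") == 0) height = va_arg(ap, int);
    else { rc = -1; break; }    /* type of the value is unknown: stop */
  }
  va_end(ap);
  if (!rc) *area = width * height;
  return rc;
}
```

A hand-written call then reads @|demo_area(&a, "width", 3, "height", 4, (const char *)0)|. The cast on the terminator matters, because a bare @|0| or @|NULL| in a variable-length argument tail need not have pointer type; omitting the terminator altogether gives undefined behaviour.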
+ +\begin{describe}[KWARGS]{mac}{KWARGS(@)} + The @ argument consists of a sequence of calls to the keyword-argument + macros described below, one after another without any separation. + + In C89, macro actual arguments are not permitted to be empty; if there are + no keyword arguments to provide, and you're using a C89 compiler, then use + @|NO_KWARGS| (below) instead. If your compiler supports C99 or later, it's + fine to just write @|KWARGS()|. +\end{describe} + +\begin{describe}{mac}{NO_KWARGS} + A marker, to be written instead of a @|KWARGS| invocation, to indicate that + no keyword arguments are to be passed to a function. + + This is unnecessary with compilers which support C99 or later, since one + can use @|KWARGS()| with an empty @ argument. +\end{describe} + +The following keyword-argument macros can be used within the @|KWARGS| +@ argument. + +\begin{describe}[K]{mac}{K(@, @)} + Passes a keyword @ and its corresponding @, as a pair of + arguments. The @ should be a single identifier (not a quoted + string). The @ may be any C expression of the appropriate type. +\end{describe} + +\begin{describe}[K_VALIST]{mac}{K_VALIST(@)} + Passes an indirect variable-length argument tail. The argument @ + should be an lvalue of type @|va_list|, which will be passed by reference. +\end{describe} + +\begin{describe}[K_TAB]{mac}{K_TAB(@, @)} + Passes a vector of keyword arguments. The argument @ should be the base + address of the vector, and @ should be the number of elements in the + vector. +\end{describe} + + +\subsection{Defining functions with keyword arguments} +\label{sec:runtime.keywords.defining} + +\subsubsection{Keyword sets} +A \emph{keyword set} defines the collection of keyword arguments accepted by +a particular function. The same keyword set may be used by several +functions. 
(If your function currently accepts no keyword arguments, but you +plan to add some later, do not define a keyword set, and use the +@|KWPARSE_EMPTY| macro described below.) + +Each keyword set has a name, which is a C identifier. It's good to choose +meaningful and distinctive names for keyword sets. Keyword set names are +meaningful at runtime: they are used as part of the @|kw_unknown| protocol +(\xref{sec:runtime.keywords.unknown}), and may be examined by handler +functions, or reported to a user in error messages. For a keyword set which +is used only by a single function, it is recommended that the set be given +the same name as the function. + +The keyword arguments for a keyword set named @ are described by a `list +macro' named @|@{}_KWSET|. This macro takes a single argument, +conventionally named @`_'. + +It should expand to a sequence of one or more list items of the form +\begin{prog} + _(@, @, @) +\end{prog} +with no separation between them. + +For example: +\begin{prog} + \#define example_KWSET(_) @\\ \\ \ind + _(int, x, 0) @\\ \\ + _(const char *, y, NULL) +\end{prog} + +Each @ should be a distinct C identifier; they will be used to name +structure members. An argument @ should not end with the suffix +@`_suppliedp' (for reasons which will soon become apparent). + +Each @ should be a C @ such that +\begin{prog} + @ @ ; +\end{prog} +is a valid declaration: so it may consist of declaration specifiers and +(possibly qualified) pointer declarator markers, but not array or function +markers (since they would have to be placed after the @). This is the +same requirement made by the standard \man{va_arg}{3} macro. + +Each @ should be an initializer expression or brace-enclosed list, +suitable for use in an aggregate initializer for a variable with automatic +storage duration. (In C89, aggregate initializers may contain only constant +expressions; this restriction was lifted in C99.) 
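A list macro of this kind can be expanded several times with different definitions supplied for @`_'. The sketch below reuses the @|example_KWSET| definition from above; the @|DEMO_|\dots{} expanders, the structure @|struct example_demo|, and the function @|example_demo_reset| are hypothetical illustrations, and the library's own macros may well be written differently. It also suggests why an argument name ending in @`_suppliedp' would be troublesome: each argument gets a companion flag member with that suffix.

```c
#include <assert.h>
#include <stddef.h>

#define example_KWSET(_) \
  _(int, x, 0) \
  _(const char *, y, NULL)

/* Hypothetical expanders: each one is passed as the `_' argument. */
#define DEMO_MEMBER(type, name, dflt) type name; unsigned name##_suppliedp: 1;
#define DEMO_RESET(type, name, dflt) kw->name = dflt; kw->name##_suppliedp = 0;
#define DEMO_NAME(type, name, dflt) #name,

/* One value member and one 1-bit flag per keyword. */
struct example_demo { example_KWSET(DEMO_MEMBER) };

/* The keyword names, as strings, in definition order. */
static const char *const example_demo_names[] = { example_KWSET(DEMO_NAME) };

/* Store each keyword's default and clear its flag. */
static void example_demo_reset(struct example_demo *kw)
{
  assert(kw != NULL);
  example_KWSET(DEMO_RESET)
}
```

Because every item has the same three-part shape, adding a keyword means adding one line to the list macro; the structure, the name table, and the reset function all follow.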
+ +\subsubsection{Function declaration markers} +The following marker macros are intended to be used in both declarations and +definitions of functions which accept keyword arguments. + +\begin{describe}{mac}{KWTAIL} + The @|KWTAIL| marker is expected to be used at the end of a function + parameter type list to indicate that the function accepts keyword + arguments; if there are preceding mandatory arguments then the @|KWTAIL| + marker should be separated from them with a comma @`,'. (It is permitted + for a function parameter type list to contain only a @|KWTAIL| marker.) + + Specifically, the macro declares a mandatory argument @|const char + *kwfirst_| (to collect the first keyword name), and a variable-length + argument tail. + + The \descref{KWPARSE}[macro]{mac} assumes that the enclosing function's + argument list ends with a @|KWTAIL| marker. +\end{describe} + +\begin{describe}{mac}{KWCALL} + The @|KWCALL| macro acts as a declaration specifier for functions which + accept keyword arguments. Its effect is to arrange for the compiler to + check, as far as is possible, that calls to the function are well-formed + according to the keyword-argument rules. The exact checking performed + depends on the compiler's abilities (and how well supported the compiler + is): it may check that every other argument is a string; it may check that + the list is terminated with a null pointer; it may not do anything at all. + Again, this marker should be included in a function's definition and in any + declarations. +\end{describe} + +\subsubsection{Auxiliary definitions} +The following macros define data types and functions used for collecting +keyword arguments. + +\begin{describe}[KWSET_STRUCT]{mac}{KWSET_STRUCT(@);} + The @|KWSET_STRUCT| macro defines a \emph{keyword structure} named @|struct + @{}_kwargs|. 
For each argument defined in the keyword set, this + structure contains two members: one has exactly the @ and @ + listed in the keyword set definition; the other is a 1-bit-wide bitfield of + type @|unsigned int| named @|@{}_suppliedp|. +\end{describe} + +\begin{describe}[KWDECL]{mac} + {@ KWDECL(@, @);} + The macro declares and initializes a keyword argument structure variable + named @ for the named keyword @. The optional + @ may provide additional storage-class, qualifiers, + or other declaration specifiers. The @`_suppliedp' flags are initialized + to zero; the other members are initialized with the corresponding defaults + from the keyword-set definition. +\end{describe} + +\begin{describe}[KWSET_PARSEFN]{mac} + {@ KWSET_PARSEFN(@)} + + The macro @|KWSET_PARSEFN| defines a keyword argument \emph{parser + function} + \begin{prog} + void @{}_kwparse(\=struct @{}_kwargs *@, + const char *@, va_list *@, \+ \\ + const struct kwval *@, size_t @); + \end{prog} + The macro call can (and usually will) be preceded by storage class + specifiers such as @|static|, for example to adjust the linkage of the + name.\footnote{% + I don't recommend declaring parser functions @|inline|: parser functions + are somewhat large, and modern compilers are pretty good at figuring out + whether to inline static functions.} % + + The function's behaviour is as follows. It parses keyword arguments from a + variable-length argument tail, and/or a vector of @|kwval| structures. + When a keyword argument is recognized, for some keyword @, the + keyword argument structure pointed to by @ is updated: the flag + @|@{}_suppliedp| is set to 1; and the argument value is stored (by + simple assignment) in the @ member. + + Hence, if the @`_suppliedp' members are initialized to zero, the caller can + determine which keyword arguments were supplied. 
It is not possible to + discover whether two or more arguments have the same keyword: in this case, + the value from the last such argument is left in the keyword argument + structure, and any values from earlier arguments are lost. (For this + purpose, the argument vector @ is scanned \emph{after} the + variable-length argument tail captured in @.) + + The variable-argument tail is read from the list described by @|* @|. + The argument tail is expected to consist of alternating keyword strings (as + ordinary null-terminated strings) and the corresponding values, terminated + by a null pointer of type @|const char~*| in place of a keyword; except + that the first keyword (or terminating null pointer, if no arguments are + provided) is expected to have been extracted already and provided as the + @ argument; the first argument retrieved using the @|va_list| + cursor object should then be the value corresponding to the keyword named + by @.\footnote{% + This slightly unusual convention makes it possible for a function to + collect the first keyword as a separate mandatory argument, which is + essential if there are no other mandatory arguments. It also means that + the compiler will emit a diagnostic if you attempt to call a function + which expects keyword arguments, but don't supply any and forget the null + pointer which terminates the (empty) list.} % + If @ is a null pointer, then @ need not be a valid pointer; + otherwise, the cursor object @|* @| will be modified as the function + extracts successive arguments from the tail. + + The keyword vector is read from the vector of @|kwval| structures starting + at address @ and containing the following @ items. If @ is zero + then @ need not be a valid pointer. + + The function also handles the special @|kw.valist| and @|kw.tab| arguments + described above (\xref{sec:runtime.keywords.calling}). If an unrecognized + keyword argument is encountered, then \descref{kw_unknown}{fun} is called. 
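
  The behaviour described above might be sketched by hand as follows, for a hypothetical keyword set with an @|int| keyword @|x| and a @|const char~*| keyword @|y|. The names @|demo_kwargs|, @|demo_kwparse|, and @|demo_collect| are assumptions made for illustration; the sketch omits the special @|kw.valist| and @|kw.tab| keywords, and it returns $-1$ on an unknown keyword rather than calling @|kw_unknown|.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

struct kwval { const char *kw; const void *val; };

struct demo_kwargs {
  int x; unsigned x_suppliedp: 1;
  const char *y; unsigned y_suppliedp: 1;
};

/* Hand-written stand-in for a generated parser function. */
static int demo_kwparse(struct demo_kwargs *kw, const char *kwfirst,
                        va_list *ap, const struct kwval *v, size_t n)
{
  const char *k;
  size_t i;

  assert(kw != NULL);
  /* The first keyword arrives in kwfirst; if it is null, *ap is unused. */
  for (k = kwfirst; k; k = va_arg(*ap, const char *)) {
    if (strcmp(k, "x") == 0)
      { kw->x = va_arg(*ap, int); kw->x_suppliedp = 1; }
    else if (strcmp(k, "y") == 0)
      { kw->y = va_arg(*ap, const char *); kw->y_suppliedp = 1; }
    else return -1;
  }
  /* The vector is scanned after the tail, so its values win on a clash. */
  for (i = 0; i < n; i++) {
    if (strcmp(v[i].kw, "x") == 0)
      { kw->x = *(const int *)v[i].val; kw->x_suppliedp = 1; }
    else if (strcmp(v[i].kw, "y") == 0)
      { kw->y = *(const char *const *)v[i].val; kw->y_suppliedp = 1; }
    else return -1;
  }
  return 0;
}

/* Captures a live argument tail and hands it to the parser. */
static int demo_collect(struct demo_kwargs *kw, const struct kwval *v,
                        size_t n, const char *kwfirst, ...)
{
  va_list ap;
  int rc;

  va_start(ap, kwfirst);
  rc = demo_kwparse(kw, kwfirst, &ap, v, n);
  va_end(ap);
  return rc;
}
```

  Note that the vector entries hold the \emph{address} of each value, so the parser must dereference @|val| using the keyword's declared type.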
+\end{describe} + +\subsubsection{Parsing keywords} +The following macros make use of the definitions described above to actually +make a function's keyword arguments available to it. + +\begin{describe}[KW_PARSE]{mac}{KW_PARSE(@, @, @);} + The @|KW_PARSE| macro invokes a keyword argument parsing function. The + @ argument should name a keyword set; @ should be an lvalue of + type @|struct @{}_kwargs|; and @ should be the name of the + enclosing function's last mandatory argument, which must have type @|const + char~*|. + + It calls the function @|@{}_kwparse| with five arguments: the address + of the keyword argument structure @; the string pointer @; the + address of a temporary argument-tail cursor object of type @|va_list|, + constructed on the assumption that @ is the enclosing function's + final keyword argument; a null pointer; and the value zero (signifying an + empty keyword-argument vector). + + If the variable @ was declared using \descref{KWDECL}{mac} and the + function @|@{}_kwparse| has been defined using + \descref{KWSET_PARSEFN}{mac} then the effect is to parse the keyword + arguments passed to the function and set the members of @ + appropriately. +\end{describe} + +\begin{describe}[KWPARSE]{mac}{KWPARSE(@);} + The macro @|KWPARSE| (note the lack of underscore) combines + \descref{KWDECL}{mac} and \descref{KW_PARSE}{mac}. It declares and + initializes a keyword argument structure variable with the fixed name + @|kw|, and parses the keyword arguments provided to the enclosing function, + storing the results in @|kw|. It assumes that the first keyword name is in + an argument named @|kwfirst_|, as set up by the + \descref{KWTAIL}[marker]{mac}. + + The macro expands both to a variable declaration and a statement: in C89, + declarations must precede statements, so under C89 rules this macro must + appear exactly between the declarations at the head of a brace-enclosed + block (typically the function body) and the statements at the end. 
This + restriction was lifted in C99, so the macro may appear anywhere in the + function body. However, it is recommended that callers avoid taking + actions which might require cleanup before attempting to parse their + keyword arguments, since keyword argument parsing functions invoke the + @|kw_unknown| handler (\xref{sec:runtime.keywords.unknown}) if they + encounter an unknown keyword, and the calling function will not get a + chance to tidy up after itself if this happens. +\end{describe} + +As mentioned above, it is not permitted to define an empty keyword set. +(Specifically, invoking \descref{KWSET_STRUCT}{mac} for an empty keyword set +would result in attempting to define a structure with no members, which C +doesn't allow.) On the other hand, keyword arguments are a useful extension +mechanism, and it's useful to be able to define a function which doesn't +currently accept any keywords, but which might in the future be extended to +allow keyword arguments. + +\begin{describe}[KW_PARSE_EMPTY]{mac}{KW_PARSE_EMPTY(@, @);} + This is an analogue to \descref{KW_PARSE}{mac} which checks the keyword + argument list for a function which accepts no keyword arguments. + + It calls the \descref{kw_parseempty}[function]{fun} with five arguments: + the @ name, as a string; the string pointer @; the address of + a temporary argument-tail cursor object of type @|va_list|, constructed on + the assumption that @ is the enclosing function's final keyword + argument; a null pointer; and the value zero (signifying an empty + keyword-argument vector). + + The effect is to check that the argument tail contains no keyword arguments + other than the special predefined ones. +\end{describe} + +\begin{describe}[KWPARSE_EMPTY]{mac}{KWPARSE_EMPTY(@);} + This is an analogue to \descref{KWPARSE}{mac} which checks that the + enclosing function has been passed no keyword arguments other than the + special predefined ones. 
It assumes that the first keyword name is in an + argument named @|kwfirst_|, as set up by the \descref{KWTAIL}[marker]{mac}. +\end{describe} + +\begin{describe}[kw_parseempty]{fun} + {void kw_parseempty(\=const char *@, + const char *@, va_list *@, \+ \\ + const struct kwval *@, size_t @);} + This function checks a keyword argument list to make sure that it contains + no keyword arguments (other than the special ones described in + \xref{sec:runtime.keywords.calling}). + + The @ argument should point to a null-terminated string: this will be + reported as the keyword set name to \descref{kw_unknown}{fun}, though it + need not (and likely will not) refer to any defined keyword set. The + remaining arguments are as for the keyword parsing functions defined by the + \descref{KWSET_PARSEFN}[macro]{mac}. +\end{describe} + +\subsection{Function wrappers} \label{sec:runtime.keywords.wrappers} + +Most users will not need the hairy machinery involving argument vectors. +Their main use is in defining \emph{wrapper functions}. Suppose there is a +function @ which accepts some keyword arguments, and we want to write a +function @ which accepts the same keywords recognized by @ and some +additional ones. Unfortunately @ may behave differently depending on +whether or not a particular keyword argument is supplied at all, but it's not +possible to synthesize a valid @|va_list| other than by simply capturing a +live argument tail, and it's not possible to decide at runtime whether or not +to include some arguments in a function call. It's still possible to write +@, by building a vector of keyword arguments, collected one-by-one +depending on the corresponding @`_suppliedp' flags. + +A few macros are provided to make this task easier. + +\begin{describe}[KW_COUNT]{mac}{KW_COUNT(@)} + Returns the number of keywords defined in a keyword set named @. 
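
  As a rough illustration of how such a count can size the argument vector used by a wrapper, the sketch below counts the items of a hypothetical keyword set @|inner| and then collects the supplied arguments one by one. All of the @|DEMO_|\dots{} macros and the function @|demo_collect| are assumptions made for this example, not the library's definitions.

```c
#include <assert.h>
#include <stddef.h>

struct kwval { const char *kw; const void *val; };

#define inner_KWSET(_) \
  _(int, x, 0) \
  _(const char *, y, NULL)

#define DEMO_MEMBER(type, name, dflt) type name; unsigned name##_suppliedp: 1;
struct inner_demo { inner_KWSET(DEMO_MEMBER) };

/* Each list item contributes `+ 1' to a constant expression. */
#define DEMO_PLUS_ONE(type, name, dflt) + 1
#define DEMO_COUNT(set) (0 set##_KWSET(DEMO_PLUS_ONE))

/* Append one vector entry for each keyword whose flag is set. */
#define DEMO_COLLECT(type, name, dflt) \
  if (kw->name##_suppliedp) { v[n].kw = #name; v[n].val = &kw->name; n++; }

/* Fills v (at least DEMO_COUNT(inner) elements); returns the number used. */
static size_t demo_collect(const struct inner_demo *kw, struct kwval *v)
{
  size_t n = 0;

  assert(kw != NULL && v != NULL);
  inner_KWSET(DEMO_COLLECT)
  return n;
}
```

  A wrapper could declare @|struct kwval v[DEMO_COUNT(inner)]|, fill it with @|demo_collect|, and pass the base address and the returned length on to the wrapped function as a keyword vector.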
+\end{describe} + +\begin{describe}[KW_COPY]{mac} + {KW_COPY(@, @, @, @, @);} + + The macro @|KW_COPY| populates a vector of @|kwval| structures from a + keyword-argument structure. + + The @ and @ arguments should be the names of keyword sets; + @ should be an lvalue of type @|@{}_kwargs|; @ should be + the base address of a sufficiently large vector of @|struct kwval| objects; + and @ should be an lvalue of some appropriate integer type. The + @ must be a subset of @: i.e., for every keyword defined in + @ there is a keyword defined in @ with the same name and + type. + + Successive elements of @, starting at index @, are filled in to refer + to the keyword arguments defined in @ whose @`_suppliedp' flag is + set in the argument structure pointed to by @; for each such argument, + a pointer to the keyword name is stored in the corresponding vector + element's @|kw| member, and a pointer to the argument value, held in the + keyword argument structure, is stored in the vector element's @|val| + member. + + At the end of this, the index @ is advanced so as to contain the index + of the first unused element of @. Hence, at most @|KW_COUNT(@)| + elements of @ will be used. +\end{describe} + + +\subsection{Handling unknown-keyword errors} +\label{sec:runtime.keywords.unknown} + +When parsing a variable-length argument tail, it is not possible to continue +after encountering an unknown keyword name. This is because it is necessary +to know the (promoted) type of the following argument value in order to skip +past it; but the only clue provided as to the type is the keyword name, which +in this case is meaningless. + +In this situation, the parser functions generated by +\descref{KWSET_PARSEFN}{mac} (and the \descref{kw_parseempty}[function]{fun}) +call @|kw_unknown|. 
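The reporting protocol can be pictured with the following self-contained sketch: a global hook variable, a default handler which aborts, and a reporting function which never returns. Every @|demo_|\dots{} name is hypothetical and merely stands in for the real definitions documented in this section; the last two functions show one way a replacement hook might escape using \man{longjmp}{3} instead of aborting.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef void demo_unkhookfn(const char *set, const char *kw);

/* Default handler: report the problem and give up. */
static void demo_defunknown(const char *set, const char *kw)
{
  fprintf(stderr, "unknown keyword `%s' in set `%s'\n", kw, set);
  abort();
}

static demo_unkhookfn *demo_unkhook = demo_defunknown;

/* Reporting function: calls the hook, and aborts if the hook returns. */
static void demo_unknown(const char *set, const char *kw)
{
  demo_unkhook(set, kw);
  fputs("unknown-keyword hook returned unexpectedly\n", stderr);
  abort();
}

static jmp_buf demo_escape;
static char demo_last[32];

/* Replacement hook: remember the keyword, then jump out. */
static void demo_catch(const char *set, const char *kw)
{
  (void)set;
  strncpy(demo_last, kw, sizeof(demo_last) - 1);
  demo_last[sizeof(demo_last) - 1] = 0;
  longjmp(demo_escape, 1);
}

/* Report kw with the replacement hook installed; returns 1 if it was
 * caught and recorded.  The previous hook is restored afterwards.
 */
static int demo_probe(const char *kw)
{
  demo_unkhookfn *old = demo_unkhook;

  assert(kw != NULL);
  demo_unkhook = demo_catch;
  if (setjmp(demo_escape) == 0)
    demo_unknown("demo", kw);           /* does not return */
  demo_unkhook = old;
  return strcmp(demo_last, kw) == 0;
}
```

Restoring the previous hook before returning keeps the change local to the probe; as with any single global variable, this is not safe if several threads parse keyword arguments at once.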
+ +\begin{describe}[kw_unknown]{fun} + {void kw_unknown(const char *@, const char *@);} + + This is a function of two arguments: @ points to the name of the + keyword set expected by the caller, as a null-terminated string; and @ + is the unknown keyword which was encountered. All that @|kw_unknown| does + is invoke the function whose address is stored in the global variable + \descref{kw_unkhook}{var} with the same arguments. + + This function never returns to its caller: if the @|kw_unkhook| function + returns (which it shouldn't) then @|kw_unknown| writes a fatal error + message to the standard error stream and calls \man{abort}{3}. +\end{describe} + +\begin{describe}[kw_unkhookfn]{type} + {typedef void kw_unkhookfn(const char *@, const char *@);} + + The @|kw_unkhookfn| type is the type of unknown-keyword handler functions. + A handler function is given two arguments, both of which are pointers to + null-terminated strings: @ is the name of the keyword set expected; + and @ is the name of the offending unknown keyword. +\end{describe} + +\begin{describe}[kw_unkhook]{var}{kw_unkhookfn *kw_unkhook} + This variable\footnote{% + Having a single global hook variable is obviously inadequate for a modern + library, but dealing with multiple threads isn't currently possible + without writing (moderately complex) system-specific code which would be + out of place in this library. The author's intention is that the hook + variable @|kw_unkhook| be `owned' by some external library which can make + its functionality available to client programs in a safer and more + convenient way. On Unix-like platforms (including Cygwin) that library + will be (a later version of) \textbf{mLib}; other platforms will likely + need different arrangements. The author is willing to coordinate any + such efforts.} % + holds the current unknown-keyword handler function. It will be invoked by + \descref{kw_unknown}{fun}. 
The function may take whatever action seems + appropriate, but should not return to its caller. + + Initially, this variable points to the + \descref{kw_defunknown}[function]{fun}. +\end{describe} + +\begin{describe}[kw_defunknown]{fun} + {void kw_defunknown(const char *@, const char *@);} + This function simply writes a message to standard error, to the effect that + the keyword named by @ is not known in the keyword set @, and + calls \man{abort}{3}. + + This function is the default value of the \descref{kw_unkhook}[hook + variable]{var}. +\end{describe} + +As an example of the kind of special effect which can be achieved using this +hook, the following hack determines whether a function recognizes a +particular keyword argument. + +\begin{prog} + \#define KWARGS_TEST(k, val) KWARGS(K(k, val) K(kw.unknown, 0)) + \\+ + static jmp_buf kw_test_jmp; + \\+ + static void kw_test_unknown(const char *set, const char *kw) \\ + \{ \\ \ind + if (strcmp(kw, "kw.unknown") == 0) longjmp(kw_test_jmp, 1); \\ + else longjmp(kw_test_jmp, 2); \- \\ + \} + \\+ + \#define KW_TEST(flag, call) do \{ @\\ \\ \ind + kw_unkhookfn *oldunk = kw_unkhook; @\\ \\ + kw_unkhook = kw_test_unknown; @\\ \\ + switch (setjmp(kw_test_jmp)) \{ @\\ \\ \ind + case 0: call; abort(); @\\ \\ + case 1: flag = 1; break; @\\ \\ + case 2: flag = 0; break; @\\ \\ + default: abort(); \- @\\ \\ + \} @\\ \\ + kw_unkhook = oldunk; \- @\\ \\ + \} while (0) + \\+ + /* Example of use */ \\ + int f; \\ + KW_TEST(f, somefunc(1, "two", 3, KWARGS_TEST(shiny, 68.7))); \\ + /\=* now f is nonzero if `somefunc' accepts the `shiny' keyword \+ \\ + {}* (which we hope wants a double argument) \\ + {}*/ +\end{prog} + %%%-------------------------------------------------------------------------- \section{Object system support} \label{sec:runtime.object}