doc/: Miscellaneous clarifications and rewordings.

diff --git a/doc/concepts.tex b/doc/concepts.tex

index b9bf6d549e3ca60f3f4b42653ebb11f8f9952ffe..1a84b888802f19ad7099af2aa09a986deb45191b 100644 (file)
--- a/doc/concepts.tex
+++ b/doc/concepts.tex
@@ -23,15 +23,706 @@
  %%% along with SOD; if not, write to the Free Software Foundation,
  %%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
  
-\chapter{Concepts}
+\chapter{Concepts} \label{ch:concepts}
  
-\section{Classes and slots}
+%%%--------------------------------------------------------------------------
+\section{Operational model} \label{sec:concepts.model}
  
-\section{Messages and methods}
+The Sod translator runs as a preprocessor, similar in nature to the
+traditional Unix \man{lex}{1} and \man{yacc}{1} tools.  The translator reads
+a \emph{module} file containing class definitions and other information, and
+writes C~source and header files.  The source files contain function
+definitions and static tables which are fed directly to a C~compiler; the
+header files contain declarations for functions and data structures, and are
+included by source files -- whether hand-written or generated by Sod -- which
+make use of the classes defined in the module.
  
-\section{Metaclasses}
+Sod is not like \Cplusplus: it makes no attempt to `enhance' the C language
+itself.  Sod module files describe classes, messages, methods, slots, and
+other kinds of object-system things, and some of these descriptions need to
+contain C code fragments, but this code is entirely uninterpreted by the Sod
+translator.\footnote{%
+  As long as a code fragment broadly follows C's lexical rules, and properly
+  matches parentheses, brackets, and braces, the Sod translator will copy it
+  into its output unchanged.  It might, in fact, be some other kind of C-like
+  language, such as Objective~C or \Cplusplus.  Or maybe even
+  Objective~\Cplusplus, because if having an object system is good, then
+  having three must be really awesome.} %
  
-\section{Modules}
+The Sod translator is not a closed system.  It is written in Common Lisp, and
+can load extension modules which add new input syntax, output formats, or
+altered behaviour.  The interface for writing such extensions is described in
+\xref{p:lisp}.  Extensions can change almost all details of the Sod object
+system, so the material in this manual must be read with this in mind: this
+manual describes the base system as provided in the distribution.
+
+%%%--------------------------------------------------------------------------
+\section{Modules} \label{sec:concepts.modules}
+
+A \emph{module} is the top-level syntactic unit of input to the Sod
+translator.  As described above, given an input module, the translator
+generates C source and header files.
+
+A module can \emph{import} other modules.  This makes the type names and
+classes defined in those other modules available to class definitions in the
+importing module.  Sod's module system is intentionally very simple.  There
+are no private declarations or attempts to hide things.
+
+As well as importing existing modules, a module can include a number of
+different kinds of \emph{items}:
+\begin{itemize}
+\item \emph{class definitions} describe new classes, possibly in terms of
+  existing classes;
+\item \emph{type name declarations} introduce new type names to Sod's
+  parser;\footnote{%
+    This is unfortunately necessary because C syntax, upon which Sod's input
+    language is based for obvious reasons, needs to treat type names
+    differently from other kinds of identifiers.} %
+  and
+\item \emph{code fragments} contain literal C code to be dropped into an
+  appropriate place in an output file.
+\end{itemize}
+Each kind of item, and, indeed, a module as a whole, can have a collection of
+\emph{properties} associated with it.  A property has a \emph{name} and a
+\emph{value}.  Properties are an open-ended way of attaching additional
+information to module items, so extensions can make use of them without
+having to implement additional syntax.
+
+%%%--------------------------------------------------------------------------
+\section{Classes, instances, and slots} \label{sec:concepts.classes}
+
+For the most part, Sod takes a fairly traditional view of what it means to be
+an object system.
+
+An \emph{object} maintains \emph{state} and exhibits \emph{behaviour}.  An
+object's state is maintained in named \emph{slots}, each of which can store a
+C value of an appropriate (scalar or aggregate) type.  An object's behaviour
+is stimulated by sending it \emph{messages}.  A message has a name, and may
+carry a number of arguments, which are C values; sending a message may result
+in the state of the receiving object (or other objects) being changed, and a C
+value being returned to the sender.
+
+Every object is a (direct) instance of some \emph{class}.  The class
+determines which slots its instances have, which messages its instances can
+be sent, and which methods are invoked when those messages are received.  The
+Sod translator's main job is to read class definitions and convert them into
+appropriate C declarations, tables, and functions.  An object cannot
+(usually) change its direct class, and the direct class of an object is not
+affected by, for example, the static type of a pointer to it.
+
+
+\subsection{Superclasses and inheritance}
+\label{sec:concepts.classes.inherit}
+
+\subsubsection{Class relationships}
+Each class has zero or more \emph{direct superclasses}.
+
+A class with no direct superclasses is called a \emph{root class}.  The Sod
+runtime library includes a root class named @|SodObject|; making new root
+classes is somewhat tricky, and won't be discussed further here.
+
+Classes can have more than one direct superclass, i.e., Sod supports
+\emph{multiple inheritance}.  A Sod class definition for a class~$C$ lists
+the direct superclasses of $C$ in a particular order.  This order is called
+the \emph{local precedence order} of $C$, and the list which consists of $C$
+followed by $C$'s direct superclasses in local precedence order is called
+$C$'s \emph{local precedence list}.
+
+The multiple inheritance in Sod works similarly to multiple inheritance in
+Lisp-like languages, such as Common Lisp, EuLisp, Dylan, and Python, which is
+very different from how multiple inheritance works in \Cplusplus.\footnote{%
+  The latter can be summarized as `badly'.  By default in \Cplusplus, an
+  instance receives an additional copy of a superclass's state for each path
+  through the class graph from the instance's direct class to that
+  superclass, though this behaviour can be overridden by declaring
+  superclasses to be @|virtual|.  Also, \Cplusplus\ offers only trivial
+  method combination (\xref{sec:concepts.methods}), leaving programmers to
+  deal with delegation manually and (usually) statically.} %
+
+If $C$ is a class, then the \emph{superclasses} of $C$ are
+\begin{itemize}
+\item $C$ itself, and
+\item the superclasses of each of $C$'s direct superclasses.
+\end{itemize}
+The \emph{proper superclasses} of a class $C$ are the superclasses of $C$
+except for $C$ itself.  If a class $B$ is a (direct, proper) superclass of
+$C$, then $C$ is a \emph{(direct, proper) subclass} of $B$.  If $C$ is a root
+class then the only superclass of $C$ is $C$ itself, and $C$ has no proper
+superclasses.
+
+If an object is a direct instance of class~$C$ then the object is also an
+(indirect) instance of every superclass of $C$.
+
+If $C$ has a proper superclass $B$, then $B$ is not allowed to have $C$ as a
+direct superclass.  In different terms, if we construct a graph, whose
+vertices are classes, and draw an edge from each class to each of its direct
+superclasses, then this graph must be acyclic.  In yet other terms, the `is a
+superclass of' relation is a partial order on classes.
+
+\subsubsection{The class precedence list}
+This partial order is not quite sufficient for our purposes.  For each class
+$C$, we shall need to extend it into a total order on $C$'s superclasses.
+This calculation is called \emph{superclass linearization}, and the result is
+a \emph{class precedence list}, which lists each of $C$'s superclasses
+exactly once.  If a superclass $B$ precedes (resp.\ follows) some other
+superclass $A$ in $C$'s class precedence list, then we say that $B$ is a more
+(resp.\ less) \emph{specific} superclass of $C$ than $A$ is.
+
+The superclass linearization algorithm isn't fixed, and extensions to the
+translator can introduce new linearizations for special effects, but the
+following properties are expected to hold.
+\begin{itemize}
+\item The first class in $C$'s class precedence list is $C$ itself; i.e.,
+  $C$ is always its own most specific superclass.
+\item If $A$ and $B$ are both superclasses of $C$, and $A$ is a proper
+  superclass of $B$ then $A$ appears after $B$ in $C$'s class precedence
+  list, i.e., $B$ is a more specific superclass of $C$ than $A$ is.
+\end{itemize}
+The default linearization algorithm used in Sod is the \emph{C3} algorithm,
+which has a number of good properties described in~\cite{FIXME:C3}.
+It works as follows.
+\begin{itemize}
+\item A \emph{merge} of some number of input lists is a single list
+  containing each item that is in any of the input lists exactly once, and no
+  other items; if an item $x$ appears before an item $y$ in any input list,
+  then $x$ also appears before $y$ in the merge.  If a collection of lists
+  have no merge then they are said to be \emph{inconsistent}.
+\item The class precedence list of a class $C$ is a merge of the local
+  precedence list of $C$ together with the class precedence lists of each of
+  $C$'s direct superclasses.
+\item If there are no such merges, then the definition of $C$ is invalid.
+\item Suppose that there are multiple candidate merges.  Consider the
+  earliest position in these candidate merges at which they disagree.  The
+  \emph{candidate classes} at this position are the classes appearing at this
+  position in the candidate merges.  Each candidate class must be a
+  superclass of distinct direct superclasses of $C$, since otherwise the
+  candidates would be ordered by their common subclass's class precedence
+  list.  The class precedence list contains, at this position, that candidate
+  class whose subclass appears earliest in $C$'s local precedence order.
+\end{itemize}
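+
+The merge described above is usually computed procedurally: repeatedly take
+the first list head which does not appear in the tail of any input list.
+The following plain C sketch does this for small class graphs.  It is an
+illustration only, not the translator's code (which is written in Common
+Lisp); the names @|struct classtab|, @|MAXCLASS| and @|c3_linearize| are
+invented here, classes are represented as small integers, and the
+procedural formulation is assumed to agree with the declarative description
+given above.

```c
#define MAXCLASS 16

/* Hypothetical class table: supers[c] lists the direct superclasses of
 * class c in local precedence order; nsupers[c] is that list's length. */
struct classtab {
  int nclass;
  int nsupers[MAXCLASS];
  int supers[MAXCLASS][MAXCLASS];
};

/* Return nonzero if x occurs in list[from..n). */
static int in_tail(const int *list, int from, int n, int x)
{
  int i;
  for (i = from; i < n; i++) if (list[i] == x) return (1);
  return (0);
}

/* Write the class precedence list of class c to out (room for MAXCLASS
 * entries); return its length, or -1 if the input lists are inconsistent. */
int c3_linearize(const struct classtab *t, int c, int *out)
{
  int lists[MAXCLASS + 1][MAXCLASS], len[MAXCLASS + 1], pos[MAXCLASS + 1];
  int nlists = 0, nout = 0, i, j, k;

  /* Input lists: the precedence list of each direct superclass, in local
   * precedence order, followed by the local precedence list of c itself. */
  for (i = 0; i < t->nsupers[c]; i++) {
    len[nlists] = c3_linearize(t, t->supers[c][i], lists[nlists]);
    if (len[nlists] < 0) return (-1);
    pos[nlists++] = 0;
  }
  lists[nlists][0] = c;
  for (i = 0; i < t->nsupers[c]; i++) lists[nlists][i + 1] = t->supers[c][i];
  len[nlists] = t->nsupers[c] + 1; pos[nlists++] = 0;

  for (;;) {
    int cand = -1, remaining = 0;

    /* Find the first list head which is in no list's tail. */
    for (i = 0; i < nlists && cand < 0; i++) {
      if (pos[i] >= len[i]) continue;
      remaining = 1;
      cand = lists[i][pos[i]];
      for (j = 0; j < nlists; j++)
        if (in_tail(lists[j], pos[j] + 1, len[j], cand)) { cand = -1; break; }
    }
    if (cand < 0) return (remaining ? -1 : nout);

    /* Emit the chosen class and strip it from the head of every list. */
    out[nout++] = cand;
    for (k = 0; k < nlists; k++)
      if (pos[k] < len[k] && lists[k][pos[k]] == cand) pos[k]++;
  }
}
```

+For a `diamond' in which class~3 lists classes 1 and~2 as its direct
+superclasses, and both of those have the root class~0 as their only direct
+superclass, this sketch produces the precedence list 3, 1, 2, 0.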
+
+\subsubsection{Class links and chains}
+The definition for a class $C$ may distinguish one of its proper superclasses
+as being the \emph{link superclass} for class $C$.  Not every class need have
+a link superclass, and the link superclass of a class $C$, if it exists, need
+not be a direct superclass of $C$.
+
+Superclass links must obey the following rule: if $C$ is a class, then there
+must be no three superclasses $X$, $Y$ and~$Z$ of $C$ such that $Z$ is the
+link superclass of both $X$ and $Y$.  As a consequence of this rule, the
+superclasses of $C$ can be partitioned into linear \emph{chains}, such that
+superclasses $A$ and $B$ are in the same chain if and only if one can trace a
+path from $A$ to $B$ by following superclass links, or \emph{vice versa}.
+
+Since a class links only to one of its proper superclasses, the classes in a
+chain are naturally ordered from most- to least-specific.  The least specific
+class in a chain is called the \emph{chain head}; the most specific class is
+the \emph{chain tail}.  Chains are often named after their chain head
+classes.
+
+\subsection{Names}
+\label{sec:concepts.classes.names}
+
+Classes have a number of other attributes:
+\begin{itemize}
+\item A \emph{name}, which is a C identifier.  Class names must be globally
+  unique.  The class name is used in the names of a number of associated
+  definitions, to be described later.
+\item A \emph{nickname}, which is also a C identifier.  Unlike names,
+  nicknames are not required to be globally unique.  If $C$ is any class,
+  then all the superclasses of $C$ must have distinct nicknames.
+\end{itemize}
+
+
+\subsection{Slots} \label{sec:concepts.classes.slots}
+
+Each class defines a number of \emph{slots}.  Much like a structure member, a
+slot has a \emph{name}, which is a C identifier, and a \emph{type}.  Unlike
+many other object systems, different superclasses of a class $C$ can define
+slots with the same name without ambiguity, since slot references are always
+qualified by the defining class's nickname.
+
+\subsubsection{Slot initializers}
+As well as defining slot names and types, a class can also associate an
+\emph{initial value} with each slot defined by itself or one of its
+superclasses.  A class $C$ provides an \emph{initialization function} (see
+\xref{sec:concepts.classes.c}, and \xref{sec:structures.root.sodclass}) which
+sets the slots of a \emph{direct} instance of the class to the correct
+initial values.  If several of $C$'s superclasses define initializers for the
+same slot then the initializer from the most specific such class is used.  If
+none of $C$'s superclasses define an initializer for some slot then that slot
+will be left uninitialized.
+
+The initializer for a slot with scalar type may be any C expression.  The
+initializer for a slot with aggregate type must contain only constant
+expressions if the generated code is expected to be processed by an
+implementation of C89.  Initializers will be evaluated once each time an
+instance is initialized.
+
+
+\subsection{C language integration} \label{sec:concepts.classes.c}
+
+For each class~$C$, the Sod translator defines a C type, the \emph{class
+type}, with the same name.  This is the usual type used when considering an
+object as an instance of class~$C$.  No entire object will normally have a
+class type,\footnote{%
+  In general, a class type only captures the structure of one of the
+  superclass chains of an instance.  A full instance layout contains multiple
+  chains.  See \xref{sec:structures.layout} for the full details.} %
+so access to instances is almost always via pointers.
+
+\subsubsection{Access to slots}
+The class type for a class~$C$ is actually a structure.  It contains one
+member for each class in $C$'s superclass chain, named with that class's
+nickname.  Each of these members is also a structure, containing the
+corresponding class's slots, one member per slot.  There's nothing special
+about these slot members: C code can access them in the usual way.
+
+For example, if @|MyClass| has the nickname @|mine|, and defines a slot @|x|
+of type @|int|, then the simple function
+\begin{prog}
+  int get_x(MyClass *m) \{ return (m@->mine.x); \}
+\end{prog}
+will extract the value of @|x| from an instance of @|MyClass|.
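+
+To make the nesting concrete, here is a hand-written model of such a
+structure.  It is a sketch only: it is not what the translator generates
+(the real declarations are described in \xref{sec:structures.layout}), the
+superclass @|Base| with nickname @|base| is invented for the illustration,
+and bookkeeping members such as vtable pointers are left out.

```c
/* Hand-written model, not translator output.  Both classes define a slot
 * called `x'; the nickname qualifying each access keeps them apart. */
struct Base_slots { int x; };     /* slots of hypothetical Base, nickname base */
struct MyClass_slots { int x; };  /* slots of MyClass, nickname mine */

typedef struct MyClass {
  struct Base_slots base;         /* one member per class in the chain ... */
  struct MyClass_slots mine;      /* ... each named by that class's nickname */
} MyClass;

/* Ordinary member access reaches either slot. */
int get_x(MyClass *m) { return (m->mine.x); }
int get_base_x(MyClass *m) { return (m->base.x); }
```

+Because every access names the defining class's nickname, the two slots
+called @|x| never collide.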
+
+All of this means that there's no such thing as `private' or `protected'
+slots.  If you want to hide implementation details, the best approach is to
+stash them in a dynamically allocated private structure, and leave a pointer
+to it in a slot.  (This will also help preserve binary compatibility, because
+the private structure can grow more members as needed.  See
+\xref{sec:fixme.compatibility} for more details.)
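+
+The private-structure approach is the ordinary C `opaque pointer' idiom.  A
+minimal sketch in plain C follows; all of the names (@|counter_slots|,
+@|counter_private|, and the three functions) are invented for this
+illustration, and @|struct counter_slots| merely stands in for the slots a
+class would define.

```c
#include <stdlib.h>

/* Public side: clients see only an incomplete type and a pointer slot. */
struct counter_private;
struct counter_slots { struct counter_private *priv; };

/* Implementation side: this layout may gain members in later versions
 * without changing the size of `struct counter_slots'. */
struct counter_private { long n; };

/* Allocate the private state; return 0 on success, -1 on failure. */
int counter_setup(struct counter_slots *s)
{
  s->priv = malloc(sizeof(*s->priv));
  if (!s->priv) return (-1);
  s->priv->n = 0;
  return (0);
}

/* Increment the hidden counter and return its new value. */
long counter_bump(struct counter_slots *s) { return (++s->priv->n); }

/* Release the private state. */
void counter_teardown(struct counter_slots *s) { free(s->priv); s->priv = 0; }
```

+In a real program the definition of @|struct counter_private| would live
+only in the implementation's source file, so client code could not depend
+on its layout.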
+
+\subsubsection{Class objects}
+In Sod's object system, classes are objects too.  Therefore classes are
+themselves instances; the class of a class is called a \emph{metaclass}.  The
+consequences of this are explored in \xref{sec:concepts.metaclasses}.  The
+\emph{class object} has the same name as the class, suffixed with
+`@|__class|'\footnote{%
+  This is not quite true.  @|$C$__class| is actually a macro.  See
+  \xref{sec:structures.layout.additional} for the gory details.} %
+and its type is usually @|SodClass|; @|SodClass|'s nickname is @|cls|.
+
+A class object's slots contain or point to useful information, tables and
+functions for working with that class's instances.  (The @|SodClass| class
+doesn't define any messages, so it doesn't have any methods.  In Sod, a class
+slot containing a function pointer is not at all the same thing as a method.)
+
+\subsubsection{Instance allocation, imprinting, and initialization}
+It is in general not sufficient to declare (or @|malloc|) an object of the
+appropriate class type and fill it in, since the class type only describes an
+instance's layout from the point of view of a single superclass chain.  The
+correct type to allocate for a direct instance of some class is a
+structure whose tag is the class name suffixed with `@|__ilayout|'; e.g., the
+correct layout structure for a direct instance of @|MyClass| would be
+@|struct MyClass__ilayout|.
+
+Instance layouts may be declared as objects with automatic storage duration
+(colloquially, `allocated on the stack') or allocated dynamically, e.g.,
+using @|malloc|.  Sod's runtime system doesn't retain addresses of instances,
+so, for example, Sod doesn't make using a fancy allocator which sometimes
+moves objects around in memory any more difficult than it needs to be.
+
+Once storage for an instance has been allocated, it must be \emph{imprinted}
+before it can be used.  Imprinting an instance stores some metadata about its
+direct class in the instance structure, so that the rest of the program (and
+Sod's runtime library) can tell what sort of object it is, and how to use
+it.\footnote{%
+  Specifically, imprinting an instance's storage involves storing the
+  appropriate vtable pointers in the right places in it.} %
+A class object's @|imprint| slot points to a function which will correctly
+imprint storage for one of that class's instances.
+
+Once an instance's storage has been imprinted, it is possible to send the
+instance messages; however, the instance's slots are uninitialized at this
+point, so most methods are unlikely to do much of any use.  So, usually, you
+don't just want to imprint instance storage, but to \emph{initialize} an
+instance.  Initialization includes imprinting, but also sets the new
+instance's slots to their initial values, as defined by the class.  If
+neither the class nor any of its superclasses defines an initializer for a
+slot then it will not be initialized.
+
+There is currently no facility for providing parameters to the instance
+initialization process (e.g., for use by slot initializer expressions).
+Instance initialization is a complicated matter and for now I want to
+experiment with various approaches before committing to one.  My current
+interim approach is to specify slot initializers where appropriate and send
+class-specific messages for more complicated parametrized initialization.
+
+Automatic-duration instances can be conveniently constructed and initialized
+using the \descref{SOD_DECL}[macro]{mac}.  No special support is currently
+provided for dynamically allocated instances.  A simple function using
+@|malloc| might work as follows.
+\begin{prog}
+  void *new_instance(const SodClass *c) \\
+  \{ \\ \ind
+    void *p = malloc(c@->cls.initsz); \\
+    if (!p) return (0); \\
+    c@->cls.init(p); \\
+    return (p); \- \\
+  \}
+\end{prog}
+
+\subsubsection{Instance finalization and deallocation}
+There is currently no provided assistance for finalization or deallocation.
+It is the programmer's responsibility to decide and implement an appropriate
+protocol.  Note that to free an instance allocated from the heap, one must
+correctly find its base address: the \descref{SOD_INSTBASE}[macro]{mac} will
+do this for you.
+
+The following simple mixin class is suggested.
+\begin{prog}
+  [nick = disposable] \\
+  class DisposableObject : SodObject \{ \\- \ind
+    void release() \{ ; \} \\
+    \quad /* Release resources held by the receiver. */ \- \\-
+  \}
+  \\+
+  code c : user \{ \\- \ind
+    /\=\+* Free object p's instance storage.  If p is a DisposableObject \\
+       {}* then release its resources beforehand. \\
+       {}*/ \- \\
+    void free_instance(SodObject *p) \\
+    \{ \\ \ind
+      DisposableObject *d = SOD_CONVERT(DisposableObject, p); \\
+      if (d) DisposableObject_release(d); \\
+      free(SOD_INSTBASE(p)); \- \\
+    \} \- \\
+  \}
+\end{prog}
+
+\subsubsection{Conversions}
+Suppose one has a value of type pointer to class type of some class~$C$, and
+wants to convert it to a pointer to class type of some other class~$B$.
+There are three main cases to distinguish.
+\begin{itemize}
+\item If $B$ is a superclass of~$C$, in the same chain, then the conversion
+  is an \emph{in-chain upcast}.  The conversion can be performed using the
+  appropriate generated upcast macro (see below), or by simply casting the
+  pointer, using C's usual cast operator (or the \Cplusplus\ @|static_cast<>|
+  operator).
+\item If $B$ is a superclass of~$C$, in a different chain, then the
+  conversion is a \emph{cross-chain upcast}.  The conversion is more than a
+  simple type change: the pointer value must be adjusted.  If the direct
+  class of the instance in question is not known, the conversion will require
+  a lookup at runtime to find the appropriate offset by which to adjust the
+  pointer.  The conversion can be performed using the appropriate generated
+  upcast macro (see below); the general case is handled by the macro
+  \descref{SOD_XCHAIN}{mac}.
+\item If $B$ is a subclass of~$C$ then the conversion is a \emph{downcast};
+  otherwise the conversion is a~\emph{cross-cast}.  In either case, the
+  conversion can fail: the object in question might not be an instance of~$B$
+  at all.  The macro \descref{SOD_CONVERT}{mac} and the function
+  \descref{sod_convert}{fun} perform general conversions.  They return a null
+  pointer if the conversion fails.  (These are therefore your analogue to the
+  \Cplusplus\ @|dynamic_cast<>| operator.)
+\end{itemize}
+The Sod translator generates macros for performing both in-chain and
+cross-chain upcasts.  For each class~$C$, and each proper superclass~$B$
+of~$C$, a macro is defined: given an argument of type pointer to class type
+of~$C$, it returns a pointer to the same instance, only with type pointer to
+class type of~$B$, adjusted as necessary in the case of a cross-chain
+conversion.  The macro is named by concatenating
+\begin{itemize}
+\item the name of class~$C$, in upper case,
+\item the characters `@|__CONV_|', and
+\item the nickname of class~$B$, in upper case;
+\end{itemize}
+e.g., if $C$ is named @|MyClass|, and $B$'s name is @|SuperClass| with
+nickname @|super|, then the macro @|MYCLASS__CONV_SUPER| converts a
+@|MyClass~*| to a @|SuperClass~*|.  See
+\xref{sec:structures.layout.additional} for the formal description.
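+
+The pointer adjustment needed for a cross-chain upcast can be pictured with
+a hand-written model.  This is a sketch under stated assumptions: the
+structures @|chain_a|, @|chain_b| and @|C_ilayout_model| are invented here
+and are much simpler than real instance layouts, and the direct class is
+taken to be known statically, so the offsets are compile-time constants
+rather than values looked up at runtime.

```c
#include <stddef.h>

struct chain_a { int a_slot; };  /* chain headed by a hypothetical class A */
struct chain_b { int b_slot; };  /* chain headed by a hypothetical class B */

/* Model layout for an instance whose superclasses span both chains. */
struct C_ilayout_model { struct chain_a a; struct chain_b b; };

/* Cross-chain upcast: step back to the base of the instance, then forward
 * to the other chain.  Valid only if p really points at the `a' member of
 * a struct C_ilayout_model. */
struct chain_b *a_to_b(struct chain_a *p)
{
  char *base = (char *)p - offsetof(struct C_ilayout_model, a);
  return ((struct chain_b *)(base + offsetof(struct C_ilayout_model, b)));
}
```

+An in-chain upcast needs no such arithmetic, which is why an ordinary cast
+suffices there.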
+
+%%%--------------------------------------------------------------------------
+\section{Keyword arguments} \label{sec:concepts.keywords}
+
+In standard C, the actual arguments provided to a function are matched up
+with the formal arguments given in the function definition according to their
+ordering in a list.  Unless the (rather cumbersome) machinery for dealing
+with variable-length argument tails (@|<stdarg.h>|) is used, exactly the
+correct number of arguments must be supplied, and in the correct order.
+
+A \emph{keyword argument} is matched by its distinctive \emph{name}, rather
+than by its position in a list.  Keyword arguments may be \emph{omitted},
+causing some default behaviour by the function.  A function can detect
+whether a particular keyword argument was supplied: so the default behaviour
+need not be the same as that caused by any specific value of the argument.
+
+Keyword arguments can be provided in three ways.
+\begin{enumerate}
+\item Directly, as a variable-length argument tail, consisting (for the most
+  part) of alternating keyword names, as pointers to null-terminated strings,
+  and argument values, and terminated by a null pointer.  This is somewhat
+  error-prone, and the support library defines some macros which help ensure
+  that keyword argument lists are well formed.
+\item Indirectly, through a @|va_list| object capturing a variable-length
+  argument tail passed to some other function.  Such indirect argument tails
+  have the same structure as the direct argument tails described above.
+  Because @|va_list| objects are hard to copy, the keyword-argument support
+  library consistently passes @|va_list| objects \emph{by reference}
+  throughout its programming interface.
+\item Indirectly, through a vector of @|struct kwval| objects, each of which
+  contains a keyword name, as a pointer to a null-terminated string, and the
+  \emph{address} of a corresponding argument value.  (This indirection is
+  necessary so that the items in the vector can be of uniform size.)
+  Argument vectors are rather inconvenient to use, but are the only practical
+  way in which a caller can decide at runtime which arguments to include in a
+  call, which is useful when writing wrapper functions.
+\end{enumerate}
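+
+The shape of a direct keyword argument tail, and the reason for passing
+@|va_list| objects by reference, can be sketched in standard C.  The sketch
+deliberately does not use the support library's macros or its
+@|struct kwval|; the names @|box_opts|, @|parse_box_opts| and @|box_area|
+are invented, every keyword value is assumed to be an @|int|, and unknown
+keywords are simply ignored.

```c
#include <stdarg.h>
#include <string.h>

struct box_opts { int width, height; int width_given; };

/* Parse a tail of the form: name, int value, ..., null pointer.  The
 * va_list is passed by reference so the caller sees how far we read. */
static void parse_box_opts(struct box_opts *o, va_list *ap)
{
  const char *name;

  o->width = 10; o->height = 5; o->width_given = 0;   /* defaults */
  while ((name = va_arg(*ap, const char *)) != 0) {
    int v = va_arg(*ap, int);
    if (strcmp(name, "width") == 0) { o->width = v; o->width_given = 1; }
    else if (strcmp(name, "height") == 0) o->height = v;
    /* unknown keywords are silently ignored in this sketch */
  }
}

/* One positional argument, then a keyword argument tail. */
int box_area(int scale, ...)
{
  struct box_opts o;
  va_list ap;

  va_start(ap, scale);
  parse_box_opts(&o, &ap);
  va_end(ap);
  return (scale * o.width * o.height);
}
```

+A call such as @|box_area(2, "width", 3, (const char *)0)| omits @|height|
+and so gets its default.  The terminator must be written as a null
+\emph{pointer}: a bare @|0| would be passed as an @|int|.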
+
+Keyword arguments are provided as a general feature for C functions.
+However, Sod has special support for messages which accept keyword arguments
+(\xref{sec:concepts.methods.keywords}).
+
+%%%--------------------------------------------------------------------------
+\section{Messages and methods} \label{sec:concepts.methods}
+
+Objects can be sent \emph{messages}.  A message has a \emph{name}, and
+carries a number of \emph{arguments}.  When an object is sent a message, a
+function, determined by the receiving object's class, is invoked, passing it
+the receiver and the message arguments.  This function is called the
+class's \emph{effective method} for the message.  The effective method can do
+anything a C function can do, including reading or updating program state or
+object slots, sending more messages, calling other functions, issuing system
+calls, or performing I/O; if it finishes, it may return a value, which is
+returned in turn to the message sender.
+
+The set of messages an object can receive, characterized by their names,
+argument types, and return type, is determined by the object's class.  Each
+class can define new messages, which can be received by any instance of that
+class.  The messages defined by a single class must have distinct names:
+there is no `function overloading'.  As with slots
+(\xref{sec:concepts.classes.slots}), messages defined by distinct classes are
+always distinct, even if they have the same names: references to messages are
+always qualified by the defining class's name or nickname.
+
+Messages may take any number of arguments, of any non-array value type.
+Since message sends are effectively function calls, arguments of array type
+are implicitly converted to values of the corresponding pointer type.  While
+message definitions may ascribe an array type to an argument, the formal
+argument will have pointer type, as is usual for C functions.  A message may
+accept a variable-length argument suffix, denoted @|\dots|.
+
+A class definition may include \emph{direct methods} for messages defined by
+it or any of its superclasses.
+
+Like messages, direct methods define argument lists and return types, but
+they may also have a \emph{body}, and a \emph{role}.
+
+A direct method need not have the same argument list or return type as its
+message.  The acceptable argument lists and return types for a method depend
+on the message, in particular its method combination
+(\xref{sec:concepts.methods.combination}), and the method's role.
+
+A direct method body is a block of C code, and the Sod translator usually
+defines, for each direct method, a function with external linkage, whose body
+contains a copy of the direct method body.  Within the body of a direct
+method defined for a class $C$, the variable @|me|, of type pointer to class
+type of $C$, refers to the receiving object.
+
+
+\subsection{Effective methods and method combinations}
+\label{sec:concepts.methods.combination}
+
+For each message a direct instance of a class might receive, there is a set
+of \emph{applicable methods}, which are exactly the direct methods defined on
+the object's class and its superclasses.  These direct methods are combined
+together to form the \emph{effective method} for that particular class and
+message.  Direct methods can be combined into an effective method in
+different ways, according to the \emph{method combination} specified by the
+message.  The method combination determines which direct method roles are
+acceptable, and, for each role, the appropriate argument lists and return
+types.
+
+One direct method, $M$, is said to be more (resp.\ less) \emph{specific} than
+another, $N$, with respect to a receiving class~$C$, if the class defining
+$M$ is a more (resp.\ less) specific superclass of~$C$ than the class
+defining $N$.
+
+\subsubsection{The standard method combination}
+The default method combination is called the \emph{standard method
+combination}; other method combinations are useful occasionally for special
+effects.  The standard method combination accepts four direct method roles,
+called `primary' (the default), @|before|, @|after|, and @|around|.
+
+All direct methods subject to the standard method combination must have
+argument lists which \emph{match} the message's argument list:
+\begin{itemize}
+\item the method's arguments must have the same types as the message, though
+  the arguments may have different names; and
+\item if the message accepts a variable-length argument suffix then the
+  direct method must instead have a final argument of type @|va_list|.
+\end{itemize}
+Primary and @|around| methods must have the same return type as the message;
+@|before| and @|after| methods must return @|void| regardless of the
+message's return type.
+
+If there are no applicable primary methods then no effective method is
+constructed: the vtables contain null pointers in place of pointers to method
+entry functions.
+
+The effective method for a message with standard method combination works as
+follows.
+\begin{enumerate}
+
+\item If any applicable methods have the @|around| role, then the most
+  specific such method, with respect to the class of the receiving object, is
+  invoked.
+
+  Within the body of an @|around| method, the variable @|next_method| is
+  defined, having pointer-to-function type.  The method may call this
+  function, as described below, any number of times.
+
+  If there are any remaining @|around| methods, then @|next_method| invokes the
+  next most specific such method, returning whichever value that method
+  returns; otherwise the behaviour of @|next_method| is to invoke the
+  @|before| methods (if any), followed by the most specific primary method,
+  followed by the @|after| methods (if any), and to return whichever value
+  was returned by the most specific primary method, as described in the
+  following items.
+  That is, the behaviour of the least specific @|around| method's
+  @|next_method| function is exactly the behaviour that the effective method
+  would have if there were no @|around| methods.  Note that if the
+  least-specific @|around| method calls its @|next_method| more than once
+  then the whole sequence of @|before|, primary, and @|after| methods occurs
+  multiple times.
+
+  The value returned by the most specific @|around| method is the value
+  returned by the effective method.
+
+\item If any applicable methods have the @|before| role, then they are all
+  invoked, starting with the most specific.
+
+\item The most specific applicable primary method is invoked.
+
+  Within the body of a primary method, the variable @|next_method| is
+  defined, having pointer-to-function type.  If there are no remaining less
+  specific primary methods, then @|next_method| is a null pointer.
+  Otherwise, the method may call the @|next_method| function any number of
+  times.
+
+  The behaviour of the @|next_method| function, if it is not null, is to
+  invoke the next most specific applicable primary method, and to return
+  whichever value that method returns.
+
+  If there are no applicable @|around| methods, then the value returned by
+  the most specific primary method is the value returned by the effective
+  method; otherwise the value returned by the most specific primary method is
+  returned to the least specific @|around| method, which called it via its
+  own @|next_method| function.
+
+\item If any applicable methods have the @|after| role, then they are all
+  invoked, starting with the \emph{least} specific.  (Hence, the most
+  specific @|after| method is invoked with the most `afterness'.)
+
+\end{enumerate}
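+
+The ordering described above can be illustrated with a small hand-written
+simulation in C.  This is only a sketch: every name in it (@|note|,
+@|inner|, @|sub_primary|, and so on) is invented for illustration, the
+methods take no arguments, and the code is not what the Sod translator
+generates.  Each method appends one letter to a trace, lower case for the
+more specific class and upper case for the less specific one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Trace of invocations: one character per method call. */
static char trace[16];

static void note(char c)
{
  size_t n = strlen(trace);
  assert(n + 1 < sizeof(trace));
  trace[n] = c; trace[n + 1] = '\0';
}

typedef int next_fn(void);

/* Primary methods: next_method is null for the least specific one. */
static int base_primary(next_fn *next_method)
  { note('P'); return next_method ? next_method() : 10; }
static int call_base_primary(void) { return base_primary(NULL); }
static int sub_primary(next_fn *next_method)
  { note('p'); return next_method ? next_method() + 1 : 1; }

/* Before and after methods return nothing. */
static void sub_before(void) { note('b'); }
static void base_before(void) { note('B'); }
static void sub_after(void) { note('a'); }
static void base_after(void) { note('A'); }

/* What the least specific around method's next_method does. */
static int inner(void)
{
  int ret;
  sub_before(); base_before();          /* most specific first */
  ret = sub_primary(call_base_primary); /* most specific primary only */
  base_after(); sub_after();            /* least specific first */
  return ret;
}

/* A single around method bracketing everything else. */
static int around(next_fn *next_method)
{
  int ret;
  note('['); ret = next_method(); note(']');
  return ret;
}

/* Stand-in for the effective method: reset the trace and start the chain. */
static int effective_method(void) { trace[0] = '\0'; return around(inner); }
```

+Calling @|effective_method| returns 11 and leaves @|[bBpPAa]| in @|trace|:
+the @|around| method brackets the @|before| methods, the chain of primary
+methods, and the @|after| methods, in that order.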
+
+A typical use for @|around| methods is to allow a base class to set up the
+dynamic environment appropriately for the primary methods of its subclasses,
+e.g., by claiming a lock, and restore it afterwards.
+
+The @|next_method| function provided to methods with the primary and
+@|around| roles accepts the same arguments, and returns the same type, as the
+message, except that one or two additional arguments are inserted at the
+front of the argument list.  The first additional argument is always the
+receiving object, @|me|.  If the message accepts a variable argument suffix,
+then the second additional argument is a @|va_list|; otherwise there is no
+second additional argument.  In the former case, a variable
+@|sod__master_ap| of type @|va_list| is defined, containing a separate copy
+of the argument pointer (so the method body can process the variable argument
+suffix itself, and still pass a fresh copy on to the next method).
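+
+The reason for keeping a separate copy of the argument pointer can be seen
+in plain C, independently of Sod.  The following sketch assumes a
+hypothetical message taking a count followed by that many @|int| arguments;
+the names @|total|, @|sub_total|, @|base_total|, and @|master_ap| are
+invented here, with @|master_ap| playing the part that @|sod__master_ap|
+plays in a real method body.

```c
#include <assert.h>
#include <stdarg.h>

struct obj { int bias; };

/* Less specific method: sums the n int arguments it is handed. */
static int base_total(struct obj *me, va_list ap, int n)
{
  int sum = me->bias;
  while (n-- > 0) sum += va_arg(ap, int);
  return sum;
}

/* More specific method: scans the arguments itself, then passes a fresh
 * copy of the untouched master pointer to the next method. */
static int sub_total(struct obj *me, va_list ap, int n)
{
  va_list master_ap, fresh;
  int i, max = 0, ret;

  va_copy(master_ap, ap);       /* pristine copy, taken before any va_arg */
  for (i = 0; i < n; i++) {
    int v = va_arg(ap, int);    /* consumes the working pointer only */
    if (i == 0 || v > max) max = v;
  }
  va_copy(fresh, master_ap);    /* next method starts from the beginning */
  ret = base_total(me, fresh, n);
  va_end(fresh);
  va_end(master_ap);
  return ret + max;
}

/* Hypothetical entry point standing in for sending the message. */
static int total(struct obj *me, int n, ...)
{
  va_list ap;
  int ret;

  assert(n >= 0);
  va_start(ap, n);
  ret = sub_total(me, ap, n);
  va_end(ap);
  return ret;
}
```

+Had @|sub_total| passed its own, partly consumed, @|ap| along instead of a
+fresh copy, the next method would have read past the end of the argument
+list.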
+
+A method with the primary or @|around| role may use the convenience macro
+@|CALL_NEXT_METHOD|, which takes no arguments itself, and simply calls
+@|next_method| with appropriate arguments: the receiver @|me| pointer, the
+argument pointer @|sod__master_ap| (if applicable), and the method's
+arguments.  If the method body has overwritten its formal arguments, then
+@|CALL_NEXT_METHOD| will pass along the updated values, rather than the
+original ones.
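+
+The effect of overwriting a formal argument before using
+@|CALL_NEXT_METHOD| can be sketched as follows.  The macro definition shown
+is a stand-in written for this one example, not the definition Sod
+supplies, and @|next_method| is passed explicitly as a parameter only to
+keep the fragment self-contained; the @|counter| structure and both method
+names are likewise hypothetical.

```c
#include <assert.h>

struct counter { int total; };
typedef int next_fn(struct counter *me, int amount);

/* Less specific primary method: adds the amount and reports the total. */
static int base_add(struct counter *me, int amount)
  { me->total += amount; return me->total; }

/* Stand-in for the convenience macro, valid only where `next_method',
 * `me' and `amount' are in scope. */
#define CALL_NEXT_METHOD (next_method(me, amount))

/* More specific method: limits the amount, then extends the behaviour. */
static int clamped_add(struct counter *me, int amount, next_fn *next_method)
{
  assert(next_method);            /* this sketch assumes a next method */
  if (amount > 10) amount = 10;   /* overwrite the formal argument */
  return CALL_NEXT_METHOD;        /* forwards the clamped value */
}
```

+Sending 25 to a fresh counter through @|clamped_add| therefore adds only
+10, because the macro picks up the current value of @|amount| rather than
+the value originally passed.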
+
+A primary or @|around| method which invokes its @|next_method| function is
+said to \emph{extend} the message behaviour; a method which does not invoke
+its @|next_method| is said to \emph{override} the behaviour.  Note that a
+method may make a decision to override or extend at runtime.
+
+\subsubsection{Aggregating method combinations}
+A number of other method combinations are provided.  They are called
+`aggregating' method combinations because, instead of invoking just the most
+specific primary method, as the standard method combination does, they invoke
+the applicable primary methods in turn and aggregate the return values from
+each.
+
+The aggregating method combinations accept the same four roles as the
+standard method combination, and @|around|, @|before|, and @|after| methods
+work in the same way.
+
+The aggregating method combinations provided are as follows.
+\begin{description} \let\makelabel\code
+\item[progn] The message must return @|void|.  The applicable primary methods
+  are simply invoked in turn, most specific first.
+\item[sum] The message must return a numeric type.\footnote{%
+    The Sod translator does not check this, since it doesn't have enough
+    insight into @|typedef| names.} %
+  The applicable primary methods are invoked in turn, and their return values
+  added up.  The final result is the sum of the individual values.
+\item[product] The message must return a numeric type.  The applicable
+  primary methods are invoked in turn, and their return values multiplied
+  together.  The final result is the product of the individual values.
+\item[min] The message must return a scalar type.  The applicable primary
+  methods are invoked in turn.  The final result is the smallest of the
+  individual values.
+\item[max] The message must return a scalar type.  The applicable primary
+  methods are invoked in turn.  The final result is the largest of the
+  individual values.
+\item[and] The message must return a scalar type.  The applicable primary
+  methods are invoked in turn.  If any method returns zero then the final
+  result is zero and no further methods are invoked.  If all of the
+  applicable primary methods return nonzero, then the final result is the
+  result of the last primary method.
+\item[or] The message must return a scalar type.  The applicable primary
+  methods are invoked in turn.  If any method returns nonzero then the final
+  result is that nonzero value and no further methods are invoked.  If all of
+  the applicable primary methods return zero, then the final result is zero.
+\end{description}
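+
+The difference between the accumulating and the short-circuiting
+combinations can be sketched in C as loops over a table of primary methods.
+This is an illustration under stated assumptions rather than Sod's
+generated code: the applicable methods are assumed to be listed in the
+order in which they are invoked, at least one is assumed to exist for
+@|and| and @|or|, and the names @|agg_sum|, @|agg_and|, @|agg_or|, and the
+three toy methods are invented.

```c
#include <assert.h>
#include <stddef.h>

typedef int primary_fn(void *me);

/* sum: every method runs; the results are added up. */
static int agg_sum(void *me, primary_fn *const *methods, size_t n)
{
  int sum = 0;
  size_t i;
  for (i = 0; i < n; i++) sum += methods[i](me);
  return sum;
}

/* and: stop at the first zero; otherwise return the last result. */
static int agg_and(void *me, primary_fn *const *methods, size_t n)
{
  int ret = 0;
  size_t i;
  assert(n > 0);
  for (i = 0; i < n; i++) {
    ret = methods[i](me);
    if (!ret) break;
  }
  return ret;
}

/* or: stop at the first nonzero result and return it; otherwise zero. */
static int agg_or(void *me, primary_fn *const *methods, size_t n)
{
  int ret = 0;
  size_t i;
  assert(n > 0);
  for (i = 0; i < n; i++) {
    ret = methods[i](me);
    if (ret) break;
  }
  return ret;
}

/* Toy primary methods; each counts its own invocation through `me'. */
static int ret_three(void *me) { ++*(int *)me; return 3; }
static int ret_zero(void *me) { ++*(int *)me; return 0; }
static int ret_five(void *me) { ++*(int *)me; return 5; }
```

+With the methods listed as @|ret_three|, @|ret_zero|, @|ret_five|, the
+@|sum| loop calls all three and yields 8, the @|and| loop stops after the
+second call and yields 0, and the @|or| loop stops after the first call and
+yields 3.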
+
+There is also a @|custom| aggregating method combination, which is described
+in \xref{sec:fixme.custom-aggregating-method-combination}.
+
+
+\subsection{Messages with keyword arguments}
+\label{sec:concepts.methods.keywords}
+
+A message or a direct method may declare that it accepts keyword arguments.
+A message which accepts keyword arguments is called a \emph{keyword message};
+a direct method which accepts keyword arguments is called a \emph{keyword
+method}.
+
+While method combinations may set their own rules, usually keyword methods
+can only be defined on keyword messages, and all methods defined on a keyword
+message must be keyword methods.  The direct methods defined on a keyword
+message may differ in the keywords they accept, both from each other, and
+from the message.  If two superclasses of some common class both define
+keyword methods on the same message, and the methods both accept a keyword
+argument with the same name, then these two keyword arguments must also have
+the same type.  Different applicable methods may declare keyword arguments
+with the same name but different defaults; see below.
+
+The keyword arguments acceptable in a message sent to an object are the
+keywords listed in the message definition, together with all of the keywords
+accepted by any applicable method.  There is no easy way to determine at
+runtime whether a particular keyword is acceptable in a message to a given
+instance.
+
+At runtime, a direct method which accepts one or more keyword arguments
+receives an additional argument named @|suppliedp|.  This argument is a small
+structure.  For each keyword argument named $k$ accepted by the direct
+method, @|suppliedp| contains a one-bit-wide bitfield member of type
+@|unsigned|, also named $k$.  If a keyword argument named $k$ was passed in
+the message, then @|suppliedp.$k$| is one, and $k$ contains the argument
+value; otherwise @|suppliedp.$k$| is zero, and $k$ contains the default value
+from the direct method definition if there was one, or an unspecified value
+otherwise.
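+
+From the point of view of a method body, the arrangement just described
+might look like the following sketch.  Everything here is hypothetical: the
+keywords @|width| (with a declared default of 80) and @|colour| (with no
+default), the structure tag @|draw_suppliedp|, and in particular the
+@|send_draw| function, which merely stands in for whatever machinery
+actually delivers keyword arguments and is not part of Sod.

```c
#include <assert.h>
#include <stddef.h>

/* One single-bit flag per keyword argument the method accepts. */
struct draw_suppliedp { unsigned width: 1; unsigned colour: 1; };

/* The method body: `width' is always usable, `colour' only if supplied. */
static int draw(struct draw_suppliedp suppliedp, int width, int colour)
{
  assert(suppliedp.width || width == 80); /* unsupplied: default filled in */
  if (!suppliedp.colour) colour = 7;      /* no default: choose one here */
  return width * 100 + colour;
}

/* Hypothetical glue: a null pointer means the keyword was not passed. */
static int send_draw(const int *width, const int *colour)
{
  struct draw_suppliedp suppliedp = { 0, 0 };
  int w = 80;   /* the method's declared default */
  int c = -1;   /* arbitrary stand-in for an unspecified value */

  if (width) { suppliedp.width = 1; w = *width; }
  if (colour) { suppliedp.colour = 1; c = *colour; }
  return draw(suppliedp, w, c);
}
```

+The point of the sketch is that a method may rely on the value of a keyword
+argument which has a default without consulting @|suppliedp|, but must
+check the corresponding flag before reading one which has none.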
+
+%%%--------------------------------------------------------------------------
+\section{Metaclasses} \label{sec:concepts.metaclasses}
  
  %%%----- That's all, folks --------------------------------------------------