\chapter{Concepts} \label{ch:concepts}
-%%%--------------------------------------------------------------------------
-\section{Operational model} \label{sec:concepts.model}
-
-The Sod translator runs as a preprocessor, similar in nature to the
-traditional Unix \man{lex}{1} and \man{yacc}{1} tools. The translator reads
-a \emph{module} file containing class definitions and other information, and
-writes C~source and header files. The source files contain function
-definitions and static tables which are fed directly to a C~compiler; the
-header files contain declarations for functions and data structures, and are
-included by source files -- whether hand-written or generated by Sod -- which
-makes use of the classes defined in the module.
-
-Sod is not like \Cplusplus: it makes no attempt to `enhance' the C language
-itself. Sod module files describe classes, messages, methods, slots, and
-other kinds of object-system things, and some of these descriptions need to
-contain C code fragments, but this code is entirely uninterpreted by the Sod
-translator.\footnote{%
- As long as a code fragment broadly follows C's lexical rules, and properly
- matches parentheses, brackets, and braces, the Sod translator will copy it
- into its output unchanged. It might, in fact, be some other kind of C-like
- language, such as Objective~C or \Cplusplus. Or maybe even
- Objective~\Cplusplus, because if having an object system is good, then
- having three must be really awesome.} %
-
-The Sod translator is not a closed system. It is written in Common Lisp, and
-can load extension modules which add new input syntax, output formats, or
-altered behaviour. The interface for writing such extensions is described in
-\xref{p:lisp}. Extensions can change almost all details of the Sod object
-system, so the material in this manual must be read with this in mind: this
-manual describes the base system as provided in the distribution.
-
%%%--------------------------------------------------------------------------
\section{Modules} \label{sec:concepts.modules}
\subsubsection{Slot initializers}
As well as defining slot names and types, a class can also associate an
\emph{initial value} with each slot defined by itself or one of its
-subclasses. A class $C$ provides an \emph{initialization function} (see
+superclasses. A class $C$ provides an \emph{initialization message} (see
\xref{sec:concepts.lifecycle.birth}, and \xref{sec:structures.root.sodclass})
-which sets the slots of a \emph{direct} instance of the class to the correct
-initial values. If several of $C$'s superclasses define initializers for the
-same slot then the initializer from the most specific such class is used. If
-none of $C$'s superclasses define an initializer for some slot then that slot
-will be left uninitialized.
+whose methods set the slots of a \emph{direct} instance of the class to the
+correct initial values. If several of $C$'s superclasses define initializers
+for the same slot then the initializer from the most specific such class is
+used. If none of $C$'s superclasses define an initializer for some slot then
+that slot will be left uninitialized.
The initializer for a slot with scalar type may be any C expression. The
initializer for a slot with aggregate type must contain only constant
stash them in a dynamically allocated private structure, and leave a pointer
to it in a slot. (This will also help preserve binary compatibility, because
the private structure can grow more members as needed. See
-\xref{sec:fixme.compatibility} for more details.
+\xref{sec:fixme.compatibility} for more details.)
+
+\subsubsection{Vtables}
+
\subsubsection{Class objects}
In Sod's object system, classes are objects too. Therefore classes are
slot containing a function pointer is not at all the same thing as a method.)
\subsubsection{Conversions}
-Suppose one has a value of type pointer to class type of some class~$C$, and
-wants to convert it to a pointer to class type of some other class~$B$.
+Suppose one has a value of type pointer-to-class-type for some class~$C$, and
+wants to convert it to a pointer-to-class-type for some other class~$B$.
There are three main cases to distinguish.
\begin{itemize}
\item If $B$ is a superclass of~$C$, in the same chain, then the conversion
pointer. The conversion can be performed using the appropriate generated
upcast macro (see below); the general case is handled by the macro
\descref{SOD_XCHAIN}{mac}.
-\item If $B$ is a subclass of~$C$ then the conversion is an \emph{upcast};
+\item If $B$ is a subclass of~$C$ then the conversion is a \emph{downcast};
otherwise the conversion is a~\emph{cross-cast}. In either case, the
conversion can fail: the object in question might not be an instance of~$B$
- at all. The macro \descref{SOD_CONVERT}{mac} and the function
+ after all. The macro \descref{SOD_CONVERT}{mac} and the function
\descref{sod_convert}{fun} perform general conversions. They return a null
pointer if the conversion fails. (These are therefore your analogue to the
- \Cplusplus @|dynamic_cast<>| operator.)
+ \Cplusplus\ @|dynamic_cast<>| operator.)
\end{itemize}
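The failure behaviour of a checked conversion can be illustrated with a deliberately simplified model in plain~C. This is only a sketch: the names @|toy_class|, @|toy_object| and @|toy_convert| are hypothetical, invented for this example; the model handles single inheritance only, and it does not reflect how Sod lays out instances or implements its conversion macros.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model: NOT Sod's real data structures. */
struct toy_class {
  const char *name;
  const struct toy_class *super;  /* direct superclass, or NULL at the root */
};

struct toy_object {
  const struct toy_class *cls;    /* the object's direct class */
};

/* Return OBJ if it is an instance of CLS, either directly or through a
 * superclass; otherwise return a null pointer, as a failed conversion would.
 */
struct toy_object *toy_convert(const struct toy_class *cls,
                               struct toy_object *obj)
{
  const struct toy_class *c;

  assert(cls != NULL);
  if (!obj) return NULL;
  for (c = obj->cls; c; c = c->super)
    if (c == cls) return obj;
  return NULL;
}

/* A small hypothetical hierarchy to try it on. */
const struct toy_class toy_root = { "Root", NULL };
const struct toy_class toy_animal = { "Animal", &toy_root };
const struct toy_class toy_lion = { "Lion", &toy_animal };
const struct toy_class toy_plant = { "Plant", &toy_root };
```

In this model, converting a @|toy_lion| instance to @|toy_animal| succeeds, while converting a @|toy_plant| instance to @|toy_lion| yields a null pointer which the caller must check.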
The Sod translator generates macros for performing both in-chain and
cross-chain upcasts. For each class~$C$, and each proper superclass~$B$
constructed: the vtables contain null pointers in place of pointers to method
entry functions.
+\begin{figure}
+ \begin{tikzpicture}
+ [>=stealth, thick,
+ order/.append style={color=green!70!black},
+ code/.append style={font=\sffamily},
+ action/.append style={font=\itshape},
+ method/.append style={rectangle, draw=black, thin, fill=blue!30,
+ text height=\ht\strutbox, text depth=\dp\strutbox,
+ minimum width=40mm}]
+
+ \def\delgstack#1#2#3{
+ \node (#10) [method, #2] {#3};
+ \node (#11) [method, above=6mm of #10] {#3};
+ \draw [->] ($(#10.north)!.5!(#10.north west) + (0mm, 1mm)$) --
+ ++(0mm, 4mm)
+ node [code, left=4pt, midway] {next_method};
+ \draw [<-] ($(#10.north)!.5!(#10.north east) + (0mm, 1mm)$) --
+ ++(0mm, 4mm)
+ node [action, right=4pt, midway] {return};
+ \draw [->] ($(#11.north)!.5!(#11.north west) + (0mm, 1mm)$) --
+ ++(0mm, 4mm)
+ node [code, left=4pt, midway] {next_method}
+ node (ld) [above] {$\smash\vdots\mathstrut$};
+ \draw [<-] ($(#11.north)!.5!(#11.north east) + (0mm, 1mm)$) --
+ ++(0mm, 4mm)
+ node [action, right=4pt, midway] {return}
+ node (rd) [above] {$\smash\vdots\mathstrut$};
+ \draw [->] ($(ld.north) + (0mm, 1mm)$) -- ++(0mm, 4mm)
+ node [code, left=4pt, midway] {next_method};
+ \draw [<-] ($(rd.north) + (0mm, 1mm)$) -- ++(0mm, 4mm)
+ node [action, right=4pt, midway] {return};
+ \node (p) at ($(ld.north)!.5!(rd.north)$) {};
+ \node (#1n) [method, above=5mm of p] {#3};
+ \draw [->, order] ($(#10.south east) + (4mm, 1mm)$) --
+ ($(#1n.north east) + (4mm, -1mm)$)
+ node [midway, right, align=left]
+ {Most to \\ least \\ specific};}
+
+ \delgstack{a}{}{Around method}
+ \draw [<-] ($(a0.south)!.5!(a0.south west) - (0mm, 1mm)$) --
+ ++(0mm, -4mm);
+ \draw [->] ($(a0.south)!.5!(a0.south east) - (0mm, 1mm)$) --
+ ++(0mm, -4mm)
+ node [action, right=4pt, midway] {return};
+
+ \draw [->] ($(an.north)!.6!(an.north west) + (0mm, 1mm)$) --
+ ++(-8mm, 8mm)
+ node [code, midway, left=3mm] {next_method}
+ node (b0) [method, above left = 1mm + 4mm and -6mm - 4mm] {};
+ \node (b1) [method] at ($(b0) - (2mm, 2mm)$) {};
+ \node (bn) [method] at ($(b1) - (2mm, 2mm)$) {Before method};
+ \draw [->, order] ($(bn.west) - (6mm, 0mm)$) -- ++(12mm, 12mm)
+ node [midway, above left, align=center] {Most to \\ least \\ specific};
+ \draw [->] ($(b0.north east) + (-10mm, 1mm)$) -- ++(8mm, 8mm)
+ node (p) {};
+
+ \delgstack{m}{above right=1mm and 0mm of an.west |- p}{Primary method}
+ \draw [->] ($(mn.north)!.5!(mn.north west) + (0mm, 1mm)$) -- ++(0mm, 4mm)
+ node [code, left=4pt, midway] {next_method}
+ node [above right = 0mm and -8mm]
+ {$\vcenter{\hbox{\Huge\textcolor{red}{!}}}
+ \vcenter{\hbox{\begin{tabular}[c]{l}
+ \textsf{next_method} \\
+ pointer is null
+ \end{tabular}}}$};
+
+ \draw [->, color=blue, dotted]
+ ($(m0.south)!.2!(m0.south east) - (0mm, 1mm)$) --
+ ($(an.north)!.2!(an.north east) + (0mm, 1mm)$)
+ node [midway, sloped, below] {Return value};
+
+ \draw [<-] ($(an.north)!.6!(an.north east) + (0mm, 1mm)$) --
+ ++(8mm, 8mm)
+ node [action, midway, right=3mm] {return}
+ node (f0) [method, above right = 1mm and -6mm] {};
+ \node (f1) [method] at ($(f0) + (-2mm, 2mm)$) {};
+ \node (fn) [method] at ($(f1) + (-2mm, 2mm)$) {After method};
+ \draw [<-, order] ($(f0.east) + (6mm, 0mm)$) -- ++(-12mm, 12mm)
+ node [midway, above right, align=center]
+ {Least to \\ most \\ specific};
+ \draw [<-] ($(fn.north west) + (6mm, 1mm)$) -- ++(-8mm, 8mm);
+
+ \end{tikzpicture}
+
+ \caption{The standard method combination}
+ \label{fig:concepts.methods.stdmeth}
+\end{figure}
+
The effective method for a message with standard method combination works as
-follows.
+follows (see also~\xref{fig:concepts.methods.stdmeth}).
\begin{enumerate}
\item If any applicable methods have the @|around| role, then the most
in \xref{sec:fixme.custom-aggregating-method-combination}.
+\subsection{Sending messages in C} \label{sec:concepts.methods.c}
+
+Each instance is associated with its direct class [FIXME]
+
+The effective methods for each class are determined at translation time, by
+the Sod translator. For each effective method, one or more \emph{method
+entry functions} are constructed. A method entry function has three
+responsibilities.
+\begin{itemize}
+\item It converts the receiver pointer to the correct type. Method entry
+ functions can perform these conversions extremely efficiently: there are
+ separate method entries for each chain of each class which can receive a
+ message, so method entry functions are in the privileged situation of
+ knowing the \emph{exact} class of the receiving object.
+\item If the message accepts a variable-length argument tail, then two method
+ entry functions are created for each chain of each class: one receives a
+ variable-length argument tail, as intended, and captures it in a @|va_list|
+ object; the other accepts an argument of type @|va_list| in place of the
+ variable-length tail and arranges for it to be passed along to the direct
+ methods.
+\item It invokes the effective method with the appropriate arguments. There
+ might or might not be an actual function corresponding to the effective
+ method itself: the translator may instead open-code the effective method's
+ behaviour into each method entry function; and the machinery for handling
+ `delegation chains', such as is used for @|around| methods and primary
+ methods in the standard method combination, is necessarily scattered among
+ a number of small functions.
+\end{itemize}
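The pairing of a variable-argument entry with a @|va_list| entry follows the same idiom as the standard @|printf| and @|vprintf| functions. The following is a minimal sketch in plain~C, assuming a hypothetical message which sums @<n> integer arguments; the names @|toy_sum| and @|toy_sum_v| are invented here and are not what the Sod translator generates.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical `va_list' entry: receives the tail already captured. */
int toy_sum_v(int n, va_list ap)
{
  int i, total = 0;

  assert(n >= 0);
  for (i = 0; i < n; i++) total += va_arg(ap, int);
  return total;
}

/* Hypothetical variadic entry: captures the variable-length tail in a
 * `va_list' object and passes it along.
 */
int toy_sum(int n, ...)
{
  va_list ap;
  int total;

  va_start(ap, n);
  total = toy_sum_v(n, ap);
  va_end(ap);
  return total;
}
```

Providing both forms lets a caller which has already captured its own argument tail forward it without unpacking it.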
+
+
\subsection{Messages with keyword arguments}
\label{sec:concepts.methods.keywords}
The following simple function correctly allocates and returns space for an
instance of a class given a pointer to its class object @<cls>.
\begin{prog}
- void *allocate_instance(const SodClass *cls) \\ \ind
+ void *allocate_instance(const SodClass *cls) \\ \ind
\{ return malloc(cls@->cls.initsz); \}
\end{prog}
The following simple function imprints storage at address @<p> as an instance
of a class, given a pointer to its class object @<cls>.
\begin{prog}
- void imprint_instance(const SodClass *cls, void *p) \\ \ind
+ void imprint_instance(const SodClass *cls, void *p) \\ \ind
\{ cls@->cls.imprint(p); \}
\end{prog}
Details of initialization are necessarily class-specific, but typically it
involves setting the instance's slots to appropriate values, and possibly
-linking it into some larger data structure to keep track of it.
+linking it into some larger data structure to keep track of it. It is
+possible for initialization methods to attempt to allocate resources, but
+this must be done carefully: there is currently no way to report an error
+from object initialization, so the object must be marked as incompletely
+initialized, and left in a state where it will be safe to tear down later.
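One way to follow this advice is sketched below in plain~C, outside of any Sod class definition. The structure @|toy_buffer| and its two functions are hypothetical names chosen for illustration; the @|ok| flag is one possible way of marking an incompletely initialized object, not a convention which Sod itself prescribes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object owning one dynamically allocated resource. */
struct toy_buffer {
  unsigned char *data;  /* NULL if the allocation failed */
  size_t size;          /* usable size, or zero if incomplete */
  int ok;               /* nonzero once fully initialized */
};

/* Initialization cannot report failure directly, so record it in the
 * object and leave every member in a well-defined state.
 */
void toy_buffer_init(struct toy_buffer *b, size_t size)
{
  assert(b != NULL);
  b->data = malloc(size ? size : 1);
  b->size = b->data ? size : 0;
  b->ok = (b->data != NULL);
}

/* Teardown is safe on complete and incomplete objects alike. */
void toy_buffer_teardown(struct toy_buffer *b)
{
  assert(b != NULL);
  free(b->data);        /* `free(NULL)' does nothing */
  b->data = NULL;
  b->size = 0;
  b->ok = 0;
}
```

Code which later uses such an object is expected to check @|ok| before touching @|data|.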
Initialization is performed by sending the imprinted instance an @|init|
message, defined by the @|SodObject| class. This message uses a nonstandard
method combination which works like the standard combination, except that the
\emph{default behaviour}, if there is no overriding method, is to initialize
-the instance's slots using the initializers defined in the class and its
-superclasses, and to invoke each superclass's initialization fragments. This
-default behaviour may be invoked multiple times if some method calls on its
-@|next_method| more than once, unless some other method takes steps to
-prevent this.
+the instance's slots, as described below, and to invoke each superclass's
+initialization fragments. This default behaviour may be invoked multiple
+times if some method calls on its @|next_method| more than once, unless some
+other method takes steps to prevent this.
Slots are initialized in a well-defined order.
\begin{itemize}
or @|goto| for special control-flow effects, but this is not likely to be a
good idea.
-Note that an initialization fragment defined in a class is copied literally
-into each subclass's initialization method. This is fine for simple cases
-but wasteful if the initialization logic is complicated. More complex
-initialization behaviour should be added either by having an initialization
-fragments call functions (necessarily with external linkage), or by defining
-@|after| methods on the @|init| message. These will be run after the slot
-initializers have been applied, in reverse precedence order.
-
-Initialization is \emph{parametrized}, so the caller may select from a space
-of possible initial states for the new instance, or to inform the new
-instance about some other objects known to the caller. Specifically, the
-@|init| message accepts keyword arguments (\xref{sec:concepts.keywords})
-which can be defined and used by methods defined on it.
+The @|init| message accepts keyword arguments
+(\xref{sec:concepts.methods.keywords}). The set of acceptable keywords is
+determined by the applicable methods as usual, but also by the
+\emph{initargs} defined by the receiving instance's class and its
+superclasses, which are made available to slot initializers and
+initialization fragments.
+
+There are two kinds of initarg definitions. \emph{User initargs} are defined
+by an explicit @|initarg| item appearing in a class definition: the item
+defines a name, type, and (optionally) a default value for the initarg.
+\emph{Slot initargs} are defined by attaching an @|initarg| property to a
+slot or slot initializer item: the property's value determines the initarg's name,
+while the type is taken from the underlying slot type; slot initargs do not
+have default values. Both kinds define a \emph{direct initarg} for the
+containing class.
+
+Initargs are inherited. The \emph{applicable} direct initargs for an @|init|
+effective method are those defined by the receiving object's class, and all
+of its superclasses. Applicable direct initargs with the same name are
+merged to form \emph{effective initargs}. An error is reported if two
+applicable direct initargs have the same name but different types. The
+default value of an effective initarg is taken from the most specific
+applicable direct initarg which specifies a default value; if no applicable
+direct initarg specifies a default value then the effective initarg has no
+default.
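
The rule for choosing an effective initarg's default value can be sketched in plain~C as follows. The sketch assumes that the applicable direct initargs of one name have already been collected into an array ordered from most to least specific; the structure and function names are hypothetical and do not correspond to anything in the translator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one applicable direct initarg. */
struct toy_direct_initarg {
  const char *default_expr;  /* text of the default value, or NULL if none */
};

/* V holds the N applicable direct initargs sharing one name, ordered from
 * most to least specific (an assumption of this sketch).  Return the
 * default of the most specific one which has a default, or NULL if none
 * does.
 */
const char *toy_effective_default(const struct toy_direct_initarg *v,
                                  size_t n)
{
  size_t i;

  assert(v != NULL || n == 0);
  for (i = 0; i < n; i++)
    if (v[i].default_expr) return v[i].default_expr;
  return NULL;
}
```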
+
+All initarg values are made available at runtime to user code --
+initialization fragments and slot initializer expressions -- through local
+variables and a @|suppliedp| structure, as in a direct method
+(\xref{sec:concepts.methods.keywords}). Furthermore, slot initarg
+definitions influence the initialization of slots.
+
+The process for deciding how to initialize a particular slot works as
+follows.
+\begin{enumerate}
+\item If there are any slot initargs defined on the slot, or any of its slot
+ initializers, \emph{and} the sender supplied a value for one or more of the
+ corresponding effective initargs, then the value of the most specific slot
+ initarg is stored in the slot.
+\item Otherwise, if there are any slot initializers defined which include an
+ initializer expression, then the initializer expression from the most
+ specific such slot initializer is evaluated and its value stored in the
+ slot.
+\item Otherwise, the slot is left uninitialized.
+\end{enumerate}
+Note that the default values (if any) of effective initargs do \emph{not}
+affect this procedure.
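
The decision procedure above can be summarized by a small plain~C function. This is an illustrative sketch only: the @|toy_slot_info| structure and the @|toy_choose_init| function are hypothetical, and reduce each condition in the list to a single flag. Note that the structure has no member for initarg default values, since these play no part in the decision.

```c
#include <assert.h>
#include <stddef.h>

/* Where the value stored in a slot comes from. */
enum toy_init_source {
  TOY_FROM_INITARG,      /* value supplied by the sender */
  TOY_FROM_INITIALIZER,  /* most specific initializer expression */
  TOY_UNINITIALIZED      /* slot is left alone */
};

/* Hypothetical summary of the facts about one slot. */
struct toy_slot_info {
  int has_slot_initarg;      /* a slot initarg is defined on the slot or
                                one of its slot initializers */
  int initarg_supplied;      /* the sender supplied a value for a
                                corresponding effective initarg */
  int has_initializer_expr;  /* some slot initializer has an expression */
};

/* Apply the three steps in order. */
enum toy_init_source toy_choose_init(const struct toy_slot_info *s)
{
  assert(s != NULL);
  if (s->has_slot_initarg && s->initarg_supplied) return TOY_FROM_INITARG;
  if (s->has_initializer_expr) return TOY_FROM_INITIALIZER;
  return TOY_UNINITIALIZED;
}
```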
\subsection{Destruction}
\descref{sod_destroy}[function]{fun}.
\subsubsection{Teardown}
-Details of initialization are necessarily class-specific, but typically it
-involves setting the instance's slots to appropriate values, and possibly
-linking it into some larger data structure to keep track of it.
+Details of teardown are necessarily class-specific, but typically it
+involves releasing resources held by the instance, and disentangling it from
+any data structures it might be linked into.
Teardown is performed by sending the instance the @|teardown| message,
defined by the @|SodObject| class. The message returns an integer, used as a
This simple protocol can be used, for example, to implement a reference
counting system, as follows.
\begin{prog}
- [nick = ref] \\
- class ReferenceCountedObject \{ \\ \ind
- unsigned nref = 1; \\-
- void inc() \{ me@->ref.nref++; \} \\-
- [role = around] \\
- int obj.teardown() \\
- \{ \\ \ind
- if (--\,--me@->ref.nref) return (1); \\
- else return (CALL_NEXT_METHOD); \- \\
- \} \- \\
+ [nick = ref] \\
+ class ReferenceCountedObject: SodObject \{ \\ \ind
+ unsigned nref = 1; \\-
+ void inc() \{ me@->ref.nref++; \} \\-
+ [role = around] \\
+ int obj.teardown() \\
+ \{ \\ \ind
+ if (--\,--me@->ref.nref) return (1); \\
+ else return (CALL_NEXT_METHOD); \-\\
+ \} \-\\
\}
\end{prog}
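The control flow of the @|around| method can be followed in a plain~C model, with no Sod machinery involved. The names @|toy_refobj|, @|toy_inc| and @|toy_teardown| are hypothetical, and the sketch assumes that the delegated teardown (the @|CALL_NEXT_METHOD| branch) does its work and returns zero; here that work is represented by setting a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted instance. */
struct toy_refobj {
  unsigned nref;   /* reference count; starts at 1, like `nref' above */
  int torn_down;   /* set once the underlying teardown has run */
};

/* Take another reference. */
void toy_inc(struct toy_refobj *o)
{
  assert(o != NULL);
  o->nref++;
}

/* Drop one reference.  While references remain, return nonzero without
 * doing anything else; when the last one goes, run the underlying
 * teardown (assumed to return zero) and pass its result back.
 */
int toy_teardown(struct toy_refobj *o)
{
  assert(o != NULL && o->nref > 0);
  if (--o->nref) return 1;
  o->torn_down = 1;
  return 0;
}
```

With one extra reference taken, the first call to @|toy_teardown| returns 1 and leaves the object intact; only the second call actually tears it down.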
%%%--------------------------------------------------------------------------
\section{Metaclasses} \label{sec:concepts.metaclasses}
+%%%--------------------------------------------------------------------------
+\section{Compatibility considerations} \label{sec:concepts.compatibility}
+
+Sod doesn't make source-level compatibility especially difficult. As long as
+classes, slots, and messages don't change names or disappear, and slots and
+messages retain their approximate types, everything will be fine.
+
+Binary compatibility is much more difficult. Unfortunately, Sod classes have
+rather fragile binary interfaces.\footnote{%
+ Research suggestion: investigate alternative instance and vtable layouts
+ which improve binary compatibility, probably at the expense of instance
+ compactness, and efficiency of slot access and message sending. There may
+ be interesting trade-offs to be made.} %
+
+If instances are allocated [FIXME]
+
%%%----- That's all, folks --------------------------------------------------
%%% Local variables: