Commit | Line | Data |
---|---|---|
1f7d590d MW |
1 | %%% -*-latex-*- |
2 | %%% | |
3 | %%% Conceptual background | |
4 | %%% | |
5 | %%% (c) 2015 Straylight/Edgeware | |
6 | %%% | |
7 | ||
8 | %%%----- Licensing notice --------------------------------------------------- | |
9 | %%% | |
e0808c47 | 10 | %%% This file is part of the Sensible Object Design, an object system for C. |
1f7d590d MW |
11 | %%% |
12 | %%% SOD is free software; you can redistribute it and/or modify | |
13 | %%% it under the terms of the GNU General Public License as published by | |
14 | %%% the Free Software Foundation; either version 2 of the License, or | |
15 | %%% (at your option) any later version. | |
16 | %%% | |
17 | %%% SOD is distributed in the hope that it will be useful, | |
18 | %%% but WITHOUT ANY WARRANTY; without even the implied warranty of | |
19 | %%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
20 | %%% GNU General Public License for more details. | |
21 | %%% | |
22 | %%% You should have received a copy of the GNU General Public License | |
23 | %%% along with SOD; if not, write to the Free Software Foundation, | |
24 | %%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. | |
25 | ||
3cc520db | 26 | \chapter{Concepts} \label{ch:concepts} |
1f7d590d | 27 | |
3cc520db MW |
28 | %%%-------------------------------------------------------------------------- |
29 | \section{Operational model} \label{sec:concepts.model} | |
1f7d590d | 30 | |
3cc520db MW |
31 | The Sod translator runs as a preprocessor, similar in nature to the |
32 | traditional Unix \man{lex}{1} and \man{yacc}{1} tools. The translator reads | |
33 | a \emph{module} file containing class definitions and other information, and | |
34 | writes C~source and header files. The source files contain function | |
35 | definitions and static tables which are fed directly to a C~compiler; the | |
36 | header files contain declarations for functions and data structures, and are | |
37 | included by source files -- whether hand-written or generated by Sod -- which | |
38 | make use of the classes defined in the module. | |
1f7d590d | 39 | |
3cc520db MW |
40 | Sod is not like \Cplusplus: it makes no attempt to `enhance' the C language |
41 | itself. Sod module files describe classes, messages, methods, slots, and | |
42 | other kinds of object-system things, and some of these descriptions need to | |
43 | contain C code fragments, but this code is entirely uninterpreted by the Sod | |
44 | translator.\footnote{% | |
45 | As long as a code fragment broadly follows C's lexical rules, and properly | |
46 | matches parentheses, brackets, and braces, the Sod translator will copy it | |
47 | into its output unchanged. It might, in fact, be some other kind of C-like | |
48 | language, such as Objective~C or \Cplusplus. Or maybe even | |
49 | Objective~\Cplusplus, because if having an object system is good, then | |
50 | having three must be really awesome.} % | |
1f7d590d | 51 | |
3cc520db MW |
52 | The Sod translator is not a closed system. It is written in Common Lisp, and |
53 | can load extension modules which add new input syntax, output formats, or | |
54 | altered behaviour. The interface for writing such extensions is described in | |
55 | \xref{p:lisp}. Extensions can change almost all details of the Sod object | |
56 | system, so the material in this manual must be read with this in mind: this | |
57 | manual describes the base system as provided in the distribution. | |
58 | ||
59 | %%%-------------------------------------------------------------------------- | |
60 | \section{Modules} \label{sec:concepts.modules} | |
61 | ||
62 | A \emph{module} is the top-level syntactic unit of input to the Sod | |
63 | translator. As described above, given an input module, the translator | |
64 | generates C source and header files. | |
65 | ||
66 | A module can \emph{import} other modules. This makes the type names and | |
67 | classes defined in those other modules available to class definitions in the | |
68 | importing module. Sod's module system is intentionally very simple. There | |
69 | are no private declarations or attempts to hide things. | |
70 | ||
71 | As well as importing existing modules, a module can include a number of | |
72 | different kinds of \emph{items}: | |
73 | \begin{itemize} | |
74 | \item \emph{class definitions} describe new classes, possibly in terms of | |
75 | existing classes; | |
76 | \item \emph{type name declarations} introduce new type names to Sod's | |
77 | parser;\footnote{% | |
78 | This is unfortunately necessary because C syntax, upon which Sod's input | |
79 | language is based for obvious reasons, needs to treat type names | |
80 | differently from other kinds of identifiers.} % | |
81 | and | |
82 | \item \emph{code fragments} contain literal C code to be dropped into an | |
83 | appropriate place in an output file. | |
84 | \end{itemize} | |
85 | Each kind of item, and, indeed, a module as a whole, can have a collection of | |
86 | \emph{properties} associated with it. A property has a \emph{name} and a | |
87 | \emph{value}. Properties are an open-ended way of attaching additional | |
88 | information to module items, so extensions can make use of them without | |
89 | having to implement additional syntax. | |
90 | ||
91 | %%%-------------------------------------------------------------------------- | |
92 | \section{Classes, instances, and slots} \label{sec:concepts.classes} | |
93 | ||
94 | For the most part, Sod takes a fairly traditional view of what it means to be | |
95 | an object system. | |
96 | ||
97 | An \emph{object} maintains \emph{state} and exhibits \emph{behaviour}. An | |
98 | object's state is maintained in named \emph{slots}, each of which can store a | |
99 | C value of an appropriate (scalar or aggregate) type. An object's behaviour | |
100 | is stimulated by sending it \emph{messages}. A message has a name, and may | |
101 | carry a number of arguments, which are C values; sending a message may result | |
102 | in the state of the receiving object (or other objects) being changed, and a C | |
103 | value being returned to the sender. | |
104 | ||
105 | Every object is a (direct) instance of some \emph{class}. The class | |
106 | determines which slots its instances have, which messages its instances can | |
107 | be sent, and which methods are invoked when those messages are received. The | |
108 | Sod translator's main job is to read class definitions and convert them into | |
109 | appropriate C declarations, tables, and functions. An object cannot | |
110 | (usually) change its direct class, and the direct class of an object is not | |
111 | affected by, for example, the static type of a pointer to it. | |
112 | ||
113 | \subsection{Superclasses and inheritance} | |
114 | \label{sec:concepts.classes.inherit} | |
115 | ||
116 | \subsubsection{Class relationships} | |
117 | Each class has zero or more \emph{direct superclasses}. | |
118 | ||
119 | A class with no direct superclasses is called a \emph{root class}. The Sod | |
120 | runtime library includes a root class named @|SodObject|; making new root | |
121 | classes is somewhat tricky, and won't be discussed further here. | |
122 | ||
123 | Classes can have more than one direct superclass, i.e., Sod supports | |
124 | \emph{multiple inheritance}. A Sod class definition for a class~$C$ lists | |
125 | the direct superclasses of $C$ in a particular order. This order is called | |
126 | the \emph{local precedence order} of $C$, and the list which consists of $C$ | |
127 | followed by $C$'s direct superclasses in local precedence order is called | |
128 | $C$'s \emph{local precedence list}. | |
129 | ||
130 | The multiple inheritance in Sod works similarly to multiple inheritance in | |
131 | Lisp-like languages, such as Common Lisp, EuLisp, Dylan, and Python, which is | |
132 | very different from how multiple inheritance works in \Cplusplus.\footnote{% | |
133 | The latter can be summarized as `badly'. By default in \Cplusplus, an | |
134 | instance receives an additional copy of a superclass's state for each path | |
135 | through the class graph from the instance's direct class to that | |
136 | superclass, though this behaviour can be overridden by declaring | |
137 | superclasses to be @|virtual|. Also, \Cplusplus\ offers only trivial | |
138 | method combination (\xref{sec:concepts.methods}), leaving programmers to | |
139 | deal with delegation manually and (usually) statically.} % | |
140 | ||
141 | If $C$ is a class, then the \emph{superclasses} of $C$ are | |
142 | \begin{itemize} | |
143 | \item $C$ itself, and | |
144 | \item the superclasses of each of $C$'s direct superclasses. | |
145 | \end{itemize} | |
146 | The \emph{proper superclasses} of a class $C$ are the superclasses of $C$ | |
147 | except for $C$ itself. If a class $B$ is a (direct, proper) superclass of | |
148 | $C$, then $C$ is a \emph{(direct, proper) subclass} of $B$. If $C$ is a root | |
149 | class then the only superclass of $C$ is $C$ itself, and $C$ has no proper | |
150 | superclasses. | |
151 | ||
152 | If an object is a direct instance of class~$C$ then the object is also an | |
153 | (indirect) instance of every superclass of $C$. | |
154 | ||
155 | If $C$ has a proper superclass $B$, then $B$ is not allowed to have $C$ as a | |
156 | direct superclass. In different terms, if we construct a graph, whose | |
157 | vertices are classes, and draw an edge from each class to each of its direct | |
158 | superclasses, then this graph must be acyclic. In yet other terms, the `is a | |
159 | superclass of' relation is a partial order on classes. | |
160 | ||
161 | \subsubsection{The class precedence list} | |
162 | This partial order is not quite sufficient for our purposes. For each class | |
163 | $C$, we shall need to extend it into a total order on $C$'s superclasses. | |
164 | This calculation is called \emph{superclass linearization}, and the result is | |
165 | a \emph{class precedence list}, which lists each of $C$'s superclasses | |
166 | exactly once. If a superclass $B$ precedes (resp.\ follows) some other | |
167 | superclass $A$ in $C$'s class precedence list, then we say that $B$ is a more | |
168 | (resp.\ less) \emph{specific} superclass of $C$ than $A$ is. | |
169 | ||
170 | The superclass linearization algorithm isn't fixed, and extensions to the | |
171 | translator can introduce new linearizations for special effects, but the | |
172 | following properties are expected to hold. | |
173 | \begin{itemize} | |
174 | \item The first class in $C$'s class precedence list is $C$ itself; i.e., | |
175 | $C$ is always its own most specific superclass. | |
176 | \item If $A$ and $B$ are both superclasses of $C$, and $A$ is a proper | |
177 | superclass of $B$ then $A$ appears after $B$ in $C$'s class precedence | |
178 | list, i.e., $B$ is a more specific superclass of $C$ than $A$ is. | |
179 | \end{itemize} | |
180 | The default linearization algorithm used in Sod is the \emph{C3} algorithm, | |
181 | which has a number of good properties described in~\cite{FIXME:C3}. | |
182 | It works as follows. | |
183 | \begin{itemize} | |
184 | \item A \emph{merge} of some number of input lists is a single list | |
185 | containing each item that is in any of the input lists exactly once, and no | |
186 | other items; if an item $x$ appears before an item $y$ in any input list, | |
187 | then $x$ also appears before $y$ in the merge. If a collection of lists | |
188 | have no merge then they are said to be \emph{inconsistent}. | |
189 | \item The class precedence list of a class $C$ is a merge of the local | |
190 | precedence list of $C$ together with the class precedence lists of each of | |
191 | $C$'s direct superclasses. | |
192 | \item If there are no such merges, then the definition of $C$ is invalid. | |
193 | \item Suppose that there are multiple candidate merges. Consider the | |
194 | earliest position in these candidate merges at which they disagree. The | |
195 | \emph{candidate classes} at this position are the classes appearing at this | |
196 | position in the candidate merges. Each candidate class must be a | |
197 | superclass of exactly one of $C$'s direct superclasses, since otherwise the | |
198 | candidates would be ordered by their common subclass's class precedence | |
199 | list. The class precedence list contains, at this position, that candidate | |
200 | class whose subclass appears earliest in $C$'s local precedence order. | |
201 | \end{itemize} | |
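The merge described above can be sketched in plain C. This is only an illustration, not the translator's implementation (which is written in Common Lisp): it uses the common `first head which appears in no tail' formulation of C3, which is assumed here to agree with the candidate-merge description. The integer class identifiers, the @|MAXCLS| limit, @|struct cls| and @|c3_linearize| are all hypothetical names invented for the sketch.

```c
#include <assert.h>

#define MAXCLS 16               /* hypothetical size limit for this sketch */

struct cls {                    /* hypothetical: one entry per class */
  int ndirect;                  /* number of direct superclasses */
  int direct[MAXCLS];           /* direct superclasses, local precedence order */
  int ncpl;                     /* length of the computed precedence list */
  int cpl[MAXCLS];              /* class precedence list, most specific first */
};

/* Does X occur in LIST at any index in [FROM, N)? */
static int in_tail(const int *list, int from, int n, int x)
{
  int i;
  for (i = from; i < n; i++) if (list[i] == x) return (1);
  return (0);
}

/* Compute the class precedence list of class C in table T.  The precedence
 * lists of C's direct superclasses must already have been computed.
 * Returns 0 on success, or -1 if the input lists are inconsistent.
 */
int c3_linearize(struct cls *t, int c)
{
  struct cls *k = &t[c];
  const int *list[MAXCLS + 1];
  int len[MAXCLS + 1], pos[MAXCLS + 1];
  int nlists = 0, out = 0, i, j;

  assert(k->ndirect <= MAXCLS);

  /* Inputs: each direct superclass's precedence list, in local precedence
   * order, followed by the list of direct superclasses itself.
   */
  for (i = 0; i < k->ndirect; i++) {
    list[nlists] = t[k->direct[i]].cpl;
    len[nlists] = t[k->direct[i]].ncpl;
    pos[nlists++] = 0;
  }
  list[nlists] = k->direct; len[nlists] = k->ndirect; pos[nlists++] = 0;

  k->cpl[out++] = c;            /* C is its own most specific superclass */
  for (;;) {
    int pick = -1, remaining = 0;

    /* Take the head of the earliest list which isn't in any list's tail. */
    for (i = 0; i < nlists && pick < 0; i++) {
      int cand, blocked = 0;
      if (pos[i] >= len[i]) continue;
      remaining = 1;
      cand = list[i][pos[i]];
      for (j = 0; j < nlists; j++)
        if (in_tail(list[j], pos[j] + 1, len[j], cand)) blocked = 1;
      if (!blocked) pick = cand;
    }
    if (pick < 0) {
      if (remaining) return (-1);       /* no merge exists */
      break;                            /* all inputs consumed */
    }
    k->cpl[out++] = pick;
    for (j = 0; j < nlists; j++)
      if (pos[j] < len[j] && list[j][pos[j]] == pick) pos[j]++;
  }
  k->ncpl = out;
  return (0);
}
```

To use the sketch, fill in @|ndirect| and @|direct| for every class, and call @|c3_linearize| on each class after all of its direct superclasses have been processed.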
202 | ||
203 | \subsubsection{Class links and chains} | |
204 | The definition for a class $C$ may distinguish one of its proper superclasses | |
205 | as being the \emph{link superclass} for class $C$. Not every class need have | |
206 | a link superclass, and the link superclass of a class $C$, if it exists, need | |
207 | not be a direct superclass of $C$. | |
208 | ||
209 | Superclass links must obey the following rule: if $C$ is a class, then there | |
210 | must be no three superclasses $X$, $Y$ and~$Z$ of $C$ such that $Z$ is | |
211 | the link superclass of both $X$ and $Y$. As a consequence of this rule, the | |
212 | superclasses of $C$ can be partitioned into linear \emph{chains}, such that | |
213 | superclasses $A$ and $B$ are in the same chain if and only if one can trace a | |
214 | path from $A$ to $B$ by following superclass links, or \emph{vice versa}. | |
215 | ||
216 | Since a class links only to one of its proper superclasses, the classes in a | |
217 | chain are naturally ordered from most- to least-specific. The least specific | |
218 | class in a chain is called the \emph{chain head}; the most specific class is | |
219 | the \emph{chain tail}. Chains are often named after their chain head | |
220 | classes. | |
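The link rule and the resulting chains can be illustrated with a short C sketch. It assumes, purely for illustration, that the superclasses of some class are numbered from zero and that a hypothetical array @|link| records each one's link superclass, or @|NO_LINK| if it has none; none of these names comes from Sod itself.

```c
#include <assert.h>

#define NO_LINK (-1)            /* hypothetical marker: no link superclass */

/* Check the link rule over N classes: no two distinct classes may share
 * the same link superclass.  Returns nonzero if the rule holds.
 */
int links_valid(const int *link, int n)
{
  int i, j;
  for (i = 0; i < n; i++)
    for (j = i + 1; j < n; j++)
      if (link[i] != NO_LINK && link[i] == link[j]) return (0);
  return (1);
}

/* Follow superclass links from class C until they run out; the class we
 * arrive at is the least specific one in the chain, i.e., the chain head.
 */
int chain_head(const int *link, int c)
{
  assert(c >= 0);
  while (link[c] != NO_LINK) c = link[c];
  return (c);
}

/* Given valid links, two classes share a chain exactly when they share a
 * chain head.
 */
int same_chain(const int *link, int a, int b)
  { return (chain_head(link, a) == chain_head(link, b)); }
```

For example, with four classes where class~1 links to class~0, class~3 links to class~1, and class~2 has no link, classes 0, 1 and~3 form one chain headed by class~0, and class~2 is the head of its own chain.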
221 | ||
222 | \subsection{Names} | |
223 | \label{sec:concepts.classes.names} | |
224 | ||
225 | Classes have a number of other attributes: | |
226 | \begin{itemize} | |
227 | \item A \emph{name}, which is a C identifier. Class names must be globally | |
228 | unique. The class name is used in the names of a number of associated | |
229 | definitions, to be described later. | |
230 | \item A \emph{nickname}, which is also a C identifier. Unlike names, | |
231 | nicknames are not required to be globally unique. If $C$ is any class, | |
232 | then all the superclasses of $C$ must have distinct nicknames. | |
233 | \end{itemize} | |
234 | ||
235 | \subsection{Slots} \label{sec:concepts.classes.slots} | |
236 | ||
237 | Each class defines a number of \emph{slots}. Much like a structure member, a | |
238 | slot has a \emph{name}, which is a C identifier, and a \emph{type}. Unlike | |
239 | many other object systems, different superclasses of a class $C$ can define | |
240 | slots with the same name without ambiguity, since slot references are always | |
241 | qualified by the defining class's nickname. | |
242 | ||
243 | \subsubsection{Slot initializers} | |
244 | As well as defining slot names and types, a class can also associate an | |
245 | \emph{initial value} with each slot defined by itself or one of its | |
246 | superclasses. A class $C$ provides an \emph{initialization function} (see | |
247 | \xref{sec:concepts.classes.c}, and \xref{sec:structures.root.sodclass}) which | |
248 | sets the slots of a \emph{direct} instance of the class to the correct | |
249 | initial values. If several of $C$'s superclasses define initializers for the | |
250 | same slot then the initializer from the most specific such class is used. If | |
251 | none of $C$'s superclasses define an initializer for some slot then that slot | |
252 | will not be initialized. | |
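The rule for choosing between competing initializers amounts to a walk along the class precedence list. The following C fragment is a sketch only: it assumes classes are identified by small integers, that @|cpl| holds $C$'s class precedence list with the most specific class first, and that @|has_init| records which classes define an initializer for the slot in question. All of these names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Return the class whose initializer is used for a slot, or -1 if no
 * superclass defines one (in which case the slot is left uninitialized).
 * CPL lists class identifiers, most specific first; HAS_INIT is indexed
 * by class identifier.
 */
int pick_initializer(const int *cpl, size_t ncpl,
                     const unsigned char *has_init)
{
  size_t i;

  assert(cpl && has_init);
  for (i = 0; i < ncpl; i++)
    if (has_init[cpl[i]]) return (cpl[i]);      /* most specific wins */
  return (-1);
}
```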
253 | ||
254 | The initializer for a slot with scalar type may be any C expression. The | |
255 | initializer for a slot with aggregate type must contain only constant | |
256 | expressions if the generated code is expected to be processed by an | |
257 | implementation of C89. Initializers will be evaluated once each time an | |
258 | instance is initialized. | |
259 | ||
260 | \subsection{C language integration} \label{sec:concepts.classes.c} | |
261 | ||
262 | For each class~$C$, the Sod translator defines a C type, the \emph{class | |
263 | type}, with the same name. This is the usual type used when considering an | |
264 | object as an instance of class~$C$. No entire object will normally have a | |
265 | class type,\footnote{% | |
266 | In general, a class type only captures the structure of one of the | |
267 | superclass chains of an instance. A full instance layout contains multiple | |
268 | chains. See \xref{sec:structures.layout} for the full details.} % | |
269 | so access to instances is almost always via pointers. | |
270 | ||
271 | \subsubsection{Access to slots} | |
272 | The class type for a class~$C$ is actually a structure. It contains one | |
273 | member for each class in $C$'s superclass chain, named with that class's | |
274 | nickname. Each of these members is also a structure, containing the | |
275 | corresponding class's slots, one member per slot. There's nothing special | |
276 | about these slot members: C code can access them in the usual way. | |
277 | ||
278 | For example, if @|MyClass| has the nickname @|mine|, and defines a slot @|x| | |
279 | of type @|int|, then the simple function | |
280 | \begin{prog} | |
281 | int get_x(MyClass *m) \{ return (m->mine.x); \} | |
282 | \end{prog} | |
283 | will extract the value of @|x| from an instance of @|MyClass|. | |
284 | ||
285 | All of this means that there's no such thing as `private' or `protected' | |
286 | slots. If you want to hide implementation details, the best approach is to | |
287 | stash them in a dynamically allocated private structure, and leave a pointer | |
288 | to it in a slot. (This will also help preserve binary compatibility, because | |
289 | the private structure can grow more members as needed. See | |
290 | \xref{sec:fixme.compatibility} for more details.) | |
291 | ||
292 | \subsubsection{Class objects} | |
293 | In Sod's object system, classes are objects too. Therefore classes are | |
294 | themselves instances; the class of a class is called a \emph{metaclass}. The | |
295 | consequences of this are explored in \xref{sec:concepts.metaclasses}. The | |
296 | \emph{class object} has the same name as the class, suffixed with | |
297 | `@|__class|'\footnote{% | |
298 | This is not quite true. @|$C$__class| is actually a macro. See | |
299 | \xref{sec:structures.layout.additional} for the gory details.} % | |
300 | and its type is usually @|SodClass|; @|SodClass|'s nickname is @|cls|. | |
301 | ||
302 | A class object's slots contain or point to useful information, tables and | |
303 | functions for working with that class's instances. (The @|SodClass| class | |
304 | doesn't define any messages, so it doesn't have any methods. In Sod, a class | |
305 | slot containing a function pointer is not at all the same thing as a method.) | |
306 | ||
307 | \subsubsection{Instance allocation, imprinting, and initialization} | |
308 | It is in general not sufficient to declare (or @|malloc|) an object of the | |
309 | appropriate class type and fill it in, since the class type only describes an | |
310 | instance's layout from the point of view of a single superclass chain. The | |
311 | correct type to allocate, to store a direct instance of some class is a | |
312 | structure whose tag is the class name suffixed with `@|__ilayout|'; e.g., the | |
313 | correct layout structure for a direct instance of @|MyClass| would be | |
314 | @|struct MyClass__ilayout|. | |
315 | ||
316 | Instance layouts may be declared as objects with automatic storage duration | |
317 | (colloquially, `allocated on the stack') or allocated dynamically, e.g., | |
318 | using @|malloc|. Sod's runtime system doesn't retain addresses of instances, | |
319 | so, for example, Sod doesn't make using a fancy allocator which sometimes | |
320 | moves objects around in memory any more difficult than it needs to be. | |
321 | ||
322 | Once storage for an instance has been allocated, it must be \emph{imprinted} | |
323 | before it can be used. Imprinting an instance stores some metadata about its | |
324 | direct class in the instance structure, so that the rest of the program (and | |
325 | Sod's runtime library) can tell what sort of object it is, and how to use | |
326 | it.\footnote{% | |
327 | Specifically, imprinting an instance's storage involves storing the | |
328 | appropriate vtable pointers in the right places in it.} % | |
329 | A class object's @|imprint| slot points to a function which will correctly | |
330 | imprint storage for one of that class's instances. | |
331 | ||
332 | Once an instance's storage has been imprinted, it is possible to send the | |
333 | instance messages; however, the instance's slots are uninitialized at this | |
334 | point, so most methods are unlikely to do much of any use. So, usually, you | |
335 | don't just want to imprint instance storage, but to \emph{initialize} an | |
336 | instance. Initialization includes imprinting, but also sets the new | |
337 | instance's slots to their initial values, as defined by the class. If | |
338 | neither the class nor any of its superclasses defines an initializer for a | |
339 | slot then it will not be initialized. | |
340 | ||
341 | There is currently no facility for providing parameters to the instance | |
342 | initialization process (e.g., for use by slot initializer expressions). | |
343 | Instance initialization is a complicated matter and for now I want to | |
344 | experiment with various approaches before committing to one. My current | |
345 | interim approach is to specify slot initializers where appropriate and send | |
346 | class-specific messages for more complicated parametrized initialization. | |
347 | ||
348 | Automatic-duration instances can be conveniently constructed and initialized | |
349 | using the @|SOD_DECL| macro (page~\pageref{mac:SOD-DECL}). No special | |
350 | support is currently provided for dynamically allocated instances. A simple | |
351 | function using @|malloc| might work as follows. | |
352 | \begin{prog} | |
353 | void *new_instance(const SodClass *c) \\ | |
354 | \{ \\ \ind | |
355 | void *p = malloc(c->cls.initsz); \\ | |
356 | if (!p) return (0); \\ | |
357 | c->cls.init(p); \\ | |
358 | return (p); \- \\ | |
359 | \} | |
360 | \end{prog} | |
361 | ||
362 | \subsubsection{Instance finalization and deallocation} | |
363 | There is currently no provided assistance for finalization or deallocation. | |
364 | It is the programmer's responsibility to decide and implement an appropriate | |
365 | protocol. Note that to free an instance allocated from the heap, one must | |
366 | correctly find its base address: the @|SOD_INSTBASE| macro | |
367 | (page~\pageref{mac:SOD-INSTBASE}) will do this for you. | |
368 | ||
369 | The following simple mixin class is suggested. | |
370 | \begin{prog} | |
371 | [nick = disposable] \\* | |
372 | class DisposableObject : SodObject \{ \\*[\jot] \ind | |
373 | void release() \{ ; \} \\* | |
374 | \quad /\=\+* Release resources held by the receiver. */ \-\- \\*[\jot] | |
375 | \} \\[\bigskipamount] | |
376 | code c : user \{ \\* \ind | |
377 | /\=\+* Free object p's instance storage. If p is a DisposableObject \\* | |
378 | {}* then release its resources beforehand. \\* | |
379 | {}*/ \- \\* | |
380 | void free_instance(void *p) \\* | |
381 | \{ \\* \ind | |
382 | DisposableObject *d = SOD_CONVERT(DisposableObject, p); \\* | |
383 | if (d) DisposableObject_release(d); \\* | |
384 | free(d); \- \\* | |
385 | \} \- \\* | |
386 | \} | |
387 | \end{prog} | |
388 | ||
389 | \subsubsection{Conversions} | |
390 | Suppose one has a value of type pointer to class type of some class~$C$, and | |
391 | wants to convert it to a pointer to class type of some other class~$B$. | |
392 | There are three main cases to distinguish. | |
393 | \begin{itemize} | |
394 | \item If $B$ is a superclass of~$C$, in the same chain, then the conversion | |
395 | is an \emph{in-chain upcast}. The conversion can be performed using the | |
396 | appropriate generated upcast macro (see below), or by simply casting the | |
397 | pointer, using C's usual cast operator (or the \Cplusplus\ @|static_cast<>| | |
398 | operator). | |
399 | \item If $B$ is a superclass of~$C$, in a different chain, then the | |
400 | conversion is a \emph{cross-chain upcast}. The conversion is more than a | |
401 | simple type change: the pointer value must be adjusted. If the direct | |
402 | class of the instance in question is not known, the conversion will require | |
403 | a lookup at runtime to find the appropriate offset by which to adjust the | |
404 | pointer. The conversion can be performed using the appropriate generated | |
405 | upcast macro (see below); the general case is handled by the macro | |
406 | @|SOD_XCHAIN| (page~\pageref{mac:SOD-XCHAIN}). | |
407 | \item If $B$ is a subclass of~$C$ then the conversion is a \emph{downcast}; | |
408 | otherwise the conversion is a~\emph{cross-cast}. In either case, the | |
409 | conversion can fail: the object in question might not be an instance of~$B$ | |
410 | at all. The macro @|SOD_CONVERT| (page~\pageref{mac:SOD-CONVERT}) and the | |
411 | function @|sod_convert| (page~\pageref{fun:sod-convert}) perform general | |
412 | conversions. They return a null pointer if the conversion fails. | |
413 | \end{itemize} | |
414 | The Sod translator generates macros for performing both in-chain and | |
415 | cross-chain upcasts. For each class~$C$, and each proper superclass~$B$ | |
416 | of~$C$, a macro is defined: given an argument of type pointer to class type | |
417 | of~$C$, it returns a pointer to the same instance, only with type pointer to | |
418 | class type of~$B$, adjusted as necessary in the case of a cross-chain | |
419 | conversion. The macro is named by concatenating | |
420 | \begin{itemize} | |
421 | \item the name of class~$C$, in upper case, | |
422 | \item the characters `@|__CONV_|', and | |
423 | \item the nickname of class~$B$, in upper case; | |
424 | \end{itemize} | |
425 | e.g., if $C$ is named @|MyClass|, and $B$'s name is @|SuperClass| with | |
426 | nickname @|super|, then the macro @|MYCLASS__CONV_SUPER| converts a | |
427 | @|MyClass~*| to a @|SuperClass~*|. See | |
428 | \xref{sec:structures.layout.additional} for the formal description. | |
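The naming rule for the upcast macros is mechanical, and can be sketched as a small C helper. The function @|conv_macro_name| below is hypothetical and is not part of Sod's runtime library; it merely builds the name by following the three steps listed above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Write into BUF (of size SZ) the name of the upcast macro from the class
 * named CLS to the superclass nicknamed NICK: the class name in upper
 * case, then `__CONV_', then the nickname in upper case.  Returns 0 on
 * success, or -1 if the buffer is too small.
 */
int conv_macro_name(char *buf, size_t sz, const char *cls, const char *nick)
{
  int n;
  char *p;

  assert(buf && cls && nick);
  n = snprintf(buf, sz, "%s__CONV_%s", cls, nick);
  if (n < 0 || (size_t)n >= sz) return (-1);

  /* The infix is already upper case, so upcasing everything is safe. */
  for (p = buf; *p; p++) *p = (char)toupper((unsigned char)*p);
  return (0);
}
```

Given the class name @|MyClass| and the nickname @|super|, the helper produces @|MYCLASS__CONV_SUPER|, matching the example above.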
429 | ||
430 | %%%-------------------------------------------------------------------------- | |
431 | \section{Messages and methods} \label{sec:concepts.methods} | |
432 | ||
433 | Objects can be sent \emph{messages}. A message has a \emph{name}, and | |
434 | carries a number of \emph{arguments}. When an object is sent a message, a | |
435 | function, determined by the receiving object's class, is invoked, passing it | |
436 | the receiver and the message arguments. This function is called the | |
437 | class's \emph{effective method} for the message. The effective method can do | |
438 | anything a C function can do, including reading or updating program state or | |
439 | object slots, sending more messages, calling other functions, issuing system | |
440 | calls, or performing I/O; if it finishes, it may return a value, which is | |
441 | returned in turn to the message sender. | |
442 | ||
443 | The set of messages an object can receive, characterized by their names, | |
444 | argument types, and return type, is determined by the object's class. Each | |
445 | class can define new messages, which can be received by any instance of that | |
446 | class. The messages defined by a single class must have distinct names: | |
447 | there is no `function overloading'. As with slots | |
448 | (\xref{sec:concepts.classes.slots}), messages defined by distinct classes are | |
449 | always distinct, even if they have the same names: references to messages are | |
450 | always qualified by the defining class's name or nickname. | |
451 | ||
452 | Messages may take any number of arguments, of any non-array value type. | |
453 | Since message sends are effectively function calls, arguments of array type | |
454 | are implicitly converted to values of the corresponding pointer type. While | |
455 | message definitions may ascribe an array type to an argument, the formal | |
456 | argument will have pointer type, as is usual for C functions. A message may | |
457 | accept a variable-length argument suffix, denoted @|\dots|. | |
458 | ||
459 | A class definition may include \emph{direct methods} for messages defined by | |
460 | it or any of its superclasses. | |
461 | ||
462 | Like messages, direct methods define argument lists and return types, but | |
463 | they may also have a \emph{body}, and a \emph{role}. | |
464 | ||
465 | A direct method need not have the same argument list or return type as its | |
466 | message. The acceptable argument lists and return types for a method depend | |
467 | on the message, in particular its method combination | |
468 | (\xref{sec:concepts.methods.combination}), and the method's role. | |
469 | ||
470 | A direct method body is a block of C code, and the Sod translator usually | |
471 | defines, for each direct method, a function with external linkage, whose body | |
472 | contains a copy of the direct method body. Within the body of a direct | |
473 | method defined for a class $C$, the variable @|me|, of type pointer to class | |
474 | type of $C$, refers to the receiving object. | |
475 | ||
476 | \subsection{Effective methods and method combinations} | |
477 | \label{sec:concepts.methods.combination} | |
478 | ||
479 | For each message a direct instance of a class might receive, there is a set | |
480 | of \emph{applicable methods}, which are exactly the direct methods defined on | |
481 | the object's class and its superclasses. These direct methods are combined | |
482 | together to form the \emph{effective method} for that particular class and | |
483 | message. Direct methods can be combined into an effective method in | |
484 | different ways, according to the \emph{method combination} specified by the | |
485 | message. The method combination determines which direct method roles are | |
486 | acceptable, and, for each role, the appropriate argument lists and return | |
487 | types. | |
488 | ||
489 | One direct method, $M$, is said to be more (resp.\ less) \emph{specific} than | |
490 | another, $N$, with respect to a receiving class~$C$, if the class defining | |
491 | $M$ is a more (resp.\ less) specific superclass of~$C$ than the class | |
492 | defining $N$. | |
493 | ||
494 | \subsection{The standard method combination} | |
495 | \label{sec:concepts.methods.standard} | |
496 | ||
497 | The default method combination is called the \emph{standard method | |
498 | combination}; other method combinations are useful occasionally for special | |
499 | effects. The standard method combination accepts four direct method roles, | |
500 | called @|primary| (the default), @|before|, @|after|, and @|around|. | |
501 | ||
502 | All direct methods subject to the standard method combination must have | |
503 | argument lists which \emph{match} the message's argument list: | |
504 | \begin{itemize} | |
505 | \item the method's arguments must have the same types as the message, though | |
506 | the arguments may have different names; and | |
507 | \item if the message accepts a variable-length argument suffix then the | |
508 | direct method must instead have a final argument of type @|va_list|. | |
509 | \end{itemize} | |
b1254eb6 MW |
510 | Primary and @|around| methods must have the same return type as the message; |
511 | @|before| and @|after| methods must return @|void| regardless of the | |
512 | message's return type. | |
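The relationship between a variable-length argument suffix and a @|va_list| argument is the usual C idiom, sketched below. This is not the code which the translator generates: @|send_sum| merely stands in for a message which accepts a suffix, and @|sum_method| for a direct method on that message; both names are invented for the illustration.

```c
#include <assert.h>
#include <stdarg.h>

/* Stand-in for a direct method body: the message's `...' suffix arrives
 * here as a single final argument of type va_list.
 */
static int sum_method(int n, va_list ap)
{
  int total = 0;

  assert(n >= 0);
  while (n-- > 0) total += va_arg(ap, int);
  return (total);
}

/* Stand-in for sending the message: collect the variable-length suffix
 * into a va_list and hand it on to the method.
 */
int send_sum(int n, ...)
{
  va_list ap;
  int r;

  va_start(ap, n);
  r = sum_method(n, ap);
  va_end(ap);
  return (r);
}
```

A caller writes @|send_sum(3, 1, 2, 3)| as for any variadic C function; only the method itself sees the @|va_list|.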
3cc520db MW |
513 | |
514 | If there are no applicable primary methods then no effective method is | |
515 | constructed: the vtables contain null pointers in place of pointers to method | |
516 | entry functions. | |
517 | ||
518 | The effective method for a message with standard method combination works as | |
519 | follows. | |
520 | \begin{enumerate} | |
521 | ||
522 | \item If any applicable methods have the @|around| role, then the most | |
523 | specific such method, with respect to the class of the receiving object, is | |
524 | invoked. | |
525 | ||
b1254eb6 | 526 | Within the body of an @|around| method, the variable @|next_method| is |
3cc520db MW |
527 | defined, having pointer-to-function type. The method may call this |
528 | function, as described below, any number of times. | |
529 | ||
b1254eb6 MW |
530 | If there are any remaining @|around| methods, then @|next_method| invokes the |
531 | next most specific such method, returning whichever value that method | |
532 | returns; otherwise the behaviour of @|next_method| is to invoke the before | |
533 | methods (if any), followed by the most specific primary method, followed by | |
534 | the @|after| methods (if any), and to return whichever value was returned | |
535 | by the most specific primary method. That is, the behaviour of the least | |
536 | specific @|around| method's @|next_method| function is exactly the | |
537 | behaviour that the effective method would have if there were no @|around| | |
538 | methods. | |
3cc520db | 539 | |
b1254eb6 MW |
540 | The value returned by the most specific @|around| method is the value |
541 | returned by the effective method. | |
3cc520db MW |
542 | |
543 | \item If any applicable methods have the @|before| role, then they are all | |
544 | invoked, starting with the most specific. | |
545 | ||
546 | \item The most specific applicable primary method is invoked. | |
547 | ||
548 | Within the body of a primary method, the variable @|next_method| is | |
549 | defined, having pointer-to-function type. If there are no remaining less | |
550 | specific primary methods, then @|next_method| is a null pointer. | |
551 | Otherwise, the method may call the @|next_method| function any number of | |
552 | times. | |
553 | ||
554 | The behaviour of the @|next_method| function, if it is not null, is to | |
555 | invoke the next most specific applicable primary method, and to return | |
556 | whichever value that method returns. | |
557 | ||
b1254eb6 MW |
558 | If there are no applicable @|around| methods, then the value returned by |
559 | the most specific primary method is the value returned by the effective | |
560 | method; otherwise the value returned by the most specific primary method is | |
561 | returned to the least specific @|around| method, which called it via its | |
562 | own @|next_method| function. | |
3cc520db MW |
563 | |
564 | \item If any applicable methods have the @|after| role, then they are all | |
565 | invoked, starting with the \emph{least} specific. (Hence, the most | |
b1254eb6 | 566 | specific @|after| method is invoked with the most `afterness'.) |
3cc520db MW |
567 | |
568 | \end{enumerate} | |
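
The ordering above can be sketched in plain C. This is only an illustration, not the code the Sod translator generates: the classes (a hypothetical @|Base| and a more specific @|Derived|), the function names, and the @|trace| buffer are all assumptions made for the example. It models one @|around| method on the base class, @|before|, primary, and @|after| methods on both classes, and a message returning @|int|.

```c
#include <assert.h>
#include <string.h>

/* Trace of calls, so that the invocation order can be inspected. */
static char trace[256];
static void note(const char *s)
{
  assert(strlen(trace) + strlen(s) + 2 <= sizeof trace);
  strcat(trace, s); strcat(trace, " ");
}

typedef int next_fn(void);

/* Hypothetical direct methods; Derived is more specific than Base. */
static int base_primary(void) { note("base.primary"); return 1; }
static int derived_primary(next_fn *next_method)
{
  note("derived.primary");
  /* next_method is null when no less specific primary method remains. */
  return next_method ? 10 + next_method() : 10;
}
static void base_before(void) { note("base.before"); }
static void derived_before(void) { note("derived.before"); }
static void base_after(void) { note("base.after"); }
static void derived_after(void) { note("derived.after"); }

/* What the least specific around method's next_method does: the before
 * methods, the most specific primary method, then the after methods.
 */
static int inner(void)
{
  int rc;
  derived_before(); base_before();      /* most specific first */
  rc = derived_primary(base_primary);   /* most specific primary */
  base_after(); derived_after();        /* least specific first */
  return rc;
}

/* The only around method: it brackets everything else. */
static int base_around(next_fn *next_method)
{
  int rc;
  note("base.around{");
  rc = next_method();
  note("}");
  return rc;
}

/* The effective method starts at the most specific around method. */
int effective_method(void) { return base_around(inner); }
```

Calling @|effective_method| in this sketch records the @|around| method first and last, with the @|before| methods running from most to least specific and the @|after| methods in the opposite order.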
569 | ||
b1254eb6 MW |
570 | A typical use for @|around| methods is to allow a base class to set up the |
571 | dynamic environment appropriately for the primary methods of its subclasses, | |
572 | e.g., by claiming a lock, and restore it afterwards. | |
3cc520db MW |
573 | |
574 | The @|next_method| function provided to methods with the @|primary| and | |
575 | @|around| roles accepts the same arguments, and returns the same type, as the | |
576 | message, except that one or two additional arguments are inserted at the | |
577 | front of the argument list. The first additional argument is always the | |
578 | receiving object, @|me|. If the message accepts a variable argument suffix, | |
579 | then the second additional argument is a @|va_list|; otherwise there is no | |
580 | second additional argument. In the former case, a variable | |
581 | @|sod__master_ap| of type @|va_list| is defined, containing a separate copy | |
582 | of the argument pointer (so the method body can process the variable argument | |
583 | suffix itself, and still pass a fresh copy on to the next method). | |
584 | ||
585 | A method with the @|primary| or @|around| role may use the convenience macro | |
586 | @|CALL_NEXT_METHOD|, which takes no arguments itself, and simply calls | |
587 | @|next_method| with appropriate arguments: the receiver @|me| pointer, the | |
588 | argument pointer @|sod__master_ap| (if applicable), and the method's | |
589 | arguments. If the method body has overwritten its formal arguments, then | |
590 | @|CALL_NEXT_METHOD| will pass along the updated values, rather than the | |
591 | original ones. | |
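
The reason for keeping a separate copy of the argument pointer can be shown with standard C alone. The sketch below is an assumption-laden model, not Sod output: @|send_message|, @|method_body|, and @|next_sum| are hypothetical names, and the local @|master_ap| merely plays the role that @|sod__master_ap| is described as playing above. The method body scans the variable argument suffix itself, and then hands an untouched copy to the next method, with the receiver first, the argument pointer second, and the method's own arguments after that.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for the next method: sums a suffix of n ints. */
static int next_sum(void *me, va_list ap, int n)
{
  int total = 0;
  (void)me;
  while (n-- > 0) total += va_arg(ap, int);
  return total;
}

/* A direct method body: a va_list replaces the variable-length suffix. */
static int method_body(void *me, int n, va_list ap)
{
  va_list master_ap;                    /* models sod__master_ap */
  int i, biggest = 0, total;

  assert(n >= 0);
  va_copy(master_ap, ap);               /* keep an untouched copy */
  for (i = 0; i < n; i++) {             /* consume our own pointer */
    int x = va_arg(ap, int);
    if (i == 0 || x > biggest) biggest = x;
  }
  total = next_sum(me, master_ap, n);   /* next method sees every argument */
  va_end(master_ap);
  return total + biggest;
}

/* The caller's view: a genuine variable-length argument suffix. */
int send_message(void *me, int n, ...)
{
  va_list ap;
  int rc;
  va_start(ap, n);
  rc = method_body(me, n, ap);
  va_end(ap);
  return rc;
}
```

Without the @|va_copy|, the next method would start reading where the loop in @|method_body| stopped, which is why a fresh copy is passed on instead.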
592 | ||
593 | \subsection{Aggregating method combinations} | |
594 | \label{sec:concepts.methods.aggregating} | |
595 | ||
596 | A number of other method combinations are provided. They are called | |
597 | `aggregating' method combinations because, instead of invoking just the most | |
598 | specific primary method, as the standard method combination does, they invoke | |
599 | the applicable primary methods in turn and aggregate the return values from | |
600 | each. | |
601 | ||
602 | The aggregating method combinations accept the same four roles as the | |
b1254eb6 MW |
603 | standard method combination, and @|around|, @|before|, and @|after| methods |
604 | work in the same way. | |
3cc520db MW |
605 | |
606 | The aggregating method combinations provided are as follows. | |
607 | \begin{description} \let\makelabel\code | |
608 | \item[progn] The message must return @|void|. The applicable primary methods | |
609 | are simply invoked in turn, most specific first. | |
610 | \item[sum] The message must return a numeric type.\footnote{% | |
611 | The Sod translator does not check this, since it doesn't have enough | |
612 | insight into @|typedef| names.} % | |
613 | The applicable primary methods are invoked in turn, and their return values | |
614 | added up. The final result is the sum of the individual values. | |
615 | \item[product] The message must return a numeric type. The applicable | |
616 | primary methods are invoked in turn, and their return values multiplied | |
617 | together. The final result is the product of the individual values. | |
618 | \item[min] The message must return a scalar type. The applicable primary | |
619 | methods are invoked in turn. The final result is the smallest of the | |
620 | individual values. | |
621 | \item[max] The message must return a scalar type. The applicable primary | |
622 | methods are invoked in turn. The final result is the largest of the | |
623 | individual values. | |
665a0455 MW |
624 | \item[and] The message must return a scalar type. The applicable primary |
625 | methods are invoked in turn. If any method returns zero then the final | |
626 | result is zero and no further methods are invoked. If all of the | |
627 | applicable primary methods return nonzero, then the final result is the | |
628 | result of the last primary method. | |
629 | \item[or] The message must return a scalar type. The applicable primary | |
630 | methods are invoked in turn. If any method returns nonzero then the final | |
631 | result is that nonzero value and no further methods are invoked. If all of | |
632 | the applicable primary methods return zero, then the final result is zero. | |
3cc520db MW |
633 | \end{description} |
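
As a rough model of the descriptions above, the following C sketch aggregates over an array of primary methods ordered most specific first. The names @|combine_sum|, @|combine_and|, and @|combine_or|, the @|primary_fn| type, and the sample methods are hypothetical, and the sketch assumes an @|int|-returning message with at least one applicable primary method; the code actually generated by the translator may be organized quite differently.

```c
#include <assert.h>
#include <stddef.h>

typedef int primary_fn(void);

/* sum: invoke every method and add up the results. */
int combine_sum(primary_fn *const *methods, size_t n)
{
  int acc = 0;
  size_t i;
  for (i = 0; i < n; i++) acc += methods[i]();
  return acc;
}

/* and: stop at the first zero; otherwise return the last result. */
int combine_and(primary_fn *const *methods, size_t n)
{
  int rc = 0;
  size_t i;
  assert(n > 0);                        /* assumption of this sketch */
  for (i = 0; i < n; i++) {
    rc = methods[i]();
    if (!rc) return 0;
  }
  return rc;
}

/* or: stop at the first nonzero result and return it. */
int combine_or(primary_fn *const *methods, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++) {
    int rc = methods[i]();
    if (rc) return rc;
  }
  return 0;
}

/* Sample methods which count how many of them were invoked. */
static int calls;
static int m_three(void) { calls++; return 3; }
static int m_zero(void) { calls++; return 0; }
static int m_five(void) { calls++; return 5; }
```

With the methods @|m_three|, @|m_zero|, @|m_five| in that order, the @|and| sketch stops after the second method and the @|or| sketch after the first, while the @|sum| sketch always invokes all three.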
634 | ||
635 | There is also a @|custom| aggregating method combination, which is described | |
636 | in \xref{sec:fixme.custom-aggregating-method-combination}. | |
637 | ||
638 | %%%-------------------------------------------------------------------------- | |
639 | \section{Metaclasses} \label{sec:concepts.metaclasses} | |
1f7d590d MW |
640 | |
641 | %%%----- That's all, folks -------------------------------------------------- | |
642 | ||
643 | %%% Local variables: | |
644 | %%% mode: LaTeX | |
645 | %%% TeX-master: "sod.tex" | |
646 | %%% TeX-PDF-mode: t | |
647 | %%% End: |