Commit | Line | Data |
---|---|---|
1f7d590d MW |
1 | %%% -*-latex-*- |
2 | %%% | |
3 | %%% Conceptual background | |
4 | %%% | |
5 | %%% (c) 2015 Straylight/Edgeware | |
6 | %%% | |
7 | ||
8 | %%%----- Licensing notice --------------------------------------------------- | |
9 | %%% | |
e0808c47 | 10 | %%% This file is part of the Sensible Object Design, an object system for C. |
1f7d590d MW |
11 | %%% |
12 | %%% SOD is free software; you can redistribute it and/or modify | |
13 | %%% it under the terms of the GNU General Public License as published by | |
14 | %%% the Free Software Foundation; either version 2 of the License, or | |
15 | %%% (at your option) any later version. | |
16 | %%% | |
17 | %%% SOD is distributed in the hope that it will be useful, | |
18 | %%% but WITHOUT ANY WARRANTY; without even the implied warranty of | |
19 | %%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
20 | %%% GNU General Public License for more details. | |
21 | %%% | |
22 | %%% You should have received a copy of the GNU General Public License | |
23 | %%% along with SOD; if not, write to the Free Software Foundation, | |
24 | %%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. | |
25 | ||
3cc520db | 26 | \chapter{Concepts} \label{ch:concepts} |
1f7d590d | 27 | |
3cc520db MW |
28 | %%%-------------------------------------------------------------------------- |
29 | \section{Modules} \label{sec:concepts.modules} | |
30 | ||
31 | A \emph{module} is the top-level syntactic unit of input to the Sod | |
32 | translator. As described above, given an input module, the translator | |
33 | generates C source and header files. | |
34 | ||
35 | A module can \emph{import} other modules. This makes the type names and | |
36 | classes defined in those other modules available to class definitions in the | |
37 | importing module. Sod's module system is intentionally very simple. There | |
38 | are no private declarations or attempts to hide things. | |
39 | ||
40 | As well as importing existing modules, a module can include a number of | |
41 | different kinds of \emph{items}: | |
42 | \begin{itemize} | |
43 | \item \emph{class definitions} describe new classes, possibly in terms of | |
44 | existing classes; | |
45 | \item \emph{type name declarations} introduce new type names to Sod's | |
46 | parser;\footnote{% | |
47 | This is unfortunately necessary because C syntax, upon which Sod's input | |
48 | language is based for obvious reasons, needs to treat type names | |
49 | differently from other kinds of identifiers.} % | |
50 | and | |
51 | \item \emph{code fragments} contain literal C code to be dropped into an | |
52 | appropriate place in an output file. | |
53 | \end{itemize} | |
54 | Each kind of item, and, indeed, a module as a whole, can have a collection of | |
55 | \emph{properties} associated with it. A property has a \emph{name} and a | |
56 | \emph{value}. Properties are an open-ended way of attaching additional | |
57 | information to module items, so extensions can make use of them without | |
58 | having to implement additional syntax. | |
59 | ||
60 | %%%-------------------------------------------------------------------------- | |
61 | \section{Classes, instances, and slots} \label{sec:concepts.classes} | |
62 | ||
63 | For the most part, Sod takes a fairly traditional view of what it means to be | |
64 | an object system. | |
65 | ||
46fe5a33 MW |
66 | An \emph{object} maintains \emph{state} and exhibits \emph{behaviour}. |
67 | (Here, we're using the term `object' in the usual sense of `object-oriented | |
68 | programming', rather than that of the ISO~C standard. Once we have defined | |
69 | an `instance' below, we shall generally prefer that term, so as to prevent | |
70 | further confusion between these two uses of the word.) | |
71 | ||
72 | An object's state is maintained in named \emph{slots}, each of which can | |
73 | store a C value of an appropriate (scalar or aggregate) type. An object's | |
74 | behaviour is stimulated by sending it \emph{messages}. A message has a name, | |
75 | and may carry a number of arguments, which are C values; sending a message | |
76 | may result in the state of the receiving object (or other objects) being changed, | |
77 | and a C value being returned to the sender. | |
78 | ||
79 | Every object is a \emph{direct instance} of exactly one \emph{class}. The | |
80 | class determines which slots its instances have, which messages its instances | |
81 | can be sent, and which methods are invoked when those messages are received. | |
82 | The Sod translator's main job is to read class definitions and convert them | |
83 | into appropriate C declarations, tables, and functions. An object cannot | |
3cc520db MW |
84 | (usually) change its direct class, and the direct class of an object is not |
85 | affected by, for example, the static type of a pointer to it. | |
86 | ||
46fe5a33 MW |
87 | If an object~$x$ is a direct instance of some class~$C$, then we say that $C$ |
88 | is \emph{the class of}~$x$. Note that the class of an object is a property | |
89 | of the object's value at runtime, and not of C's compile-time type system. | |
90 | We shall be careful in distinguishing C's compile-time notion of \emph{type} | |
91 | from Sod's run-time notion of \emph{class}. | |
92 | ||
0a2d4b68 | 93 | |
3cc520db MW |
94 | \subsection{Superclasses and inheritance} |
95 | \label{sec:concepts.classes.inherit} | |
96 | ||
97 | \subsubsection{Class relationships} | |
98 | Each class has zero or more \emph{direct superclasses}. | |
99 | ||
100 | A class with no direct superclasses is called a \emph{root class}. The Sod | |
101 | runtime library includes a root class named @|SodObject|; making new root | |
102 | classes is somewhat tricky, and won't be discussed further here. | |
103 | ||
104 | Classes can have more than one direct superclass, i.e., Sod supports | |
105 | \emph{multiple inheritance}. A Sod class definition for a class~$C$ lists | |
106 | the direct superclasses of $C$ in a particular order. This order is called | |
107 | the \emph{local precedence order} of $C$, and the list which consists of $C$ | |
108 | followed by $C$'s direct superclasses in local precedence order is called | |
109 | $C$'s \emph{local precedence list}. | |
110 | ||
111 | The multiple inheritance in Sod works similarly to multiple inheritance in | |
112 | Lisp-like languages, such as Common Lisp, EuLisp, Dylan, and Python, which is | |
113 | very different from how multiple inheritance works in \Cplusplus.\footnote{% | |
114 | The latter can be summarized as `badly'. By default in \Cplusplus, an | |
115 | instance receives an additional copy of a superclass's state for each path | |
116 | through the class graph from the instance's direct class to that | |
117 | superclass, though this behaviour can be overridden by declaring | |
118 | superclasses to be @|virtual|. Also, \Cplusplus\ offers only trivial | |
119 | method combination (\xref{sec:concepts.methods}), leaving programmers to | |
120 | deal with delegation manually and (usually) statically.} % | |
121 | ||
122 | If $C$ is a class, then the \emph{superclasses} of $C$ are | |
123 | \begin{itemize} | |
124 | \item $C$ itself, and | |
125 | \item the superclasses of each of $C$'s direct superclasses. | |
126 | \end{itemize} | |
127 | The \emph{proper superclasses} of a class $C$ are the superclasses of $C$ | |
128 | except for $C$ itself. If a class $B$ is a (direct, proper) superclass of | |
129 | $C$, then $C$ is a \emph{(direct, proper) subclass} of $B$. If $C$ is a root | |
130 | class then the only superclass of $C$ is $C$ itself, and $C$ has no proper | |
131 | superclasses. | |
132 | ||
133 | If an object is a direct instance of class~$C$ then the object is also an | |
46fe5a33 | 134 | (indirect) \emph{instance} of every superclass of $C$. |
3cc520db | 135 | |
054e8f8f | 136 | If $C$ has a proper superclass $B$, then $B$ must not have $C$ as a direct |
e8fd6aea MW |
137 | superclass. In different terms, if we construct a directed graph, whose |
138 | nodes are classes, and draw an arc from each class to each of its direct | |
139 | superclasses, then this graph must be acyclic. In yet other terms, the `is a | |
140 | superclass of' relation is a partial order on classes. | |
3cc520db MW |
141 | |
142 | \subsubsection{The class precedence list} | |
143 | This partial order is not quite sufficient for our purposes. For each class | |
144 | $C$, we shall need to extend it into a total order on $C$'s superclasses. | |
145 | This calculation is called \emph{superclass linearization}, and the result is | |
146 | a \emph{class precedence list}, which lists each of $C$'s superclasses | |
147 | exactly once. If a superclass $B$ precedes (resp.\ follows) some other | |
148 | superclass $A$ in $C$'s class precedence list, then we say that $B$ is a more | |
149 | (resp.\ less) \emph{specific} superclass of $C$ than $A$ is. | |
150 | ||
151 | The superclass linearization algorithm isn't fixed, and extensions to the | |
152 | translator can introduce new linearizations for special effects, but the | |
153 | following properties are expected to hold. | |
154 | \begin{itemize} | |
155 | \item The first class in $C$'s class precedence list is $C$ itself; i.e., | |
156 | $C$ is always its own most specific superclass. | |
157 | \item If $A$ and $B$ are both superclasses of $C$, and $A$ is a proper | |
158 | superclass of $B$ then $A$ appears after $B$ in $C$'s class precedence | |
159 | list, i.e., $B$ is a more specific superclass of $C$ than $A$ is. | |
160 | \end{itemize} | |
161 | The default linearization algorithm used in Sod is the \emph{C3} algorithm, | |
9cd5cf15 | 162 | which has a number of good properties described in~\cite{Barrett:1996:MSL}. |
3cc520db MW |
163 | It works as follows. |
164 | \begin{itemize} | |
165 | \item A \emph{merge} of some number of input lists is a single list | |
166 | containing each item that is in any of the input lists exactly once, and no | |
167 | other items; if an item $x$ appears before an item $y$ in any input list, | |
168 | then $x$ also appears before $y$ in the merge. If a collection of lists | |
169 | have no merge then they are said to be \emph{inconsistent}. | |
170 | \item The class precedence list of a class $C$ is a merge of the local | |
171 | precedence list of $C$ together with the class precedence lists of each of | |
172 | $C$'s direct superclasses. | |
173 | \item If there are no such merges, then the definition of $C$ is invalid. | |
174 | \item Suppose that there are multiple candidate merges. Consider the | |
175 | earliest position in these candidate merges at which they disagree. The | |
176 | \emph{candidate classes} at this position are the classes appearing at this | |
177 | position in the candidate merges. Each candidate class must be a | |
781a8fbd | 178 | superclass of distinct direct superclasses of $C$, since otherwise the |
3cc520db MW |
179 | candidates would be ordered by their common subclass's class precedence |
180 | list. The class precedence list contains, at this position, that candidate | |
181 | class whose subclass appears earliest in $C$'s local precedence order. | |
182 | \end{itemize} | |
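The merge described above can be sketched in C. This is only an illustration, not the Sod translator's own code: it uses the usual head-selection formulation of C3 (repeatedly take the first list head which appears in no list's tail), and the names @|struct class_def|, @|c3_linearize|, and @|MAX_CLASSES| are hypothetical. Classes are identified by their index in a table.

```c
#define MAX_CLASSES 16

/* Hypothetical class table entry; a class is identified by its index. */
struct class_def {
  const char *name;
  int n_supers;                 /* number of direct superclasses */
  int supers[MAX_CLASSES];      /* direct superclasses, in local precedence order */
};

/* Return nonzero if class c appears in list[from], ..., list[n - 1]. */
static int in_list(const int *list, int from, int n, int c)
{
  int i;
  for (i = from; i < n; i++) if (list[i] == c) return 1;
  return 0;
}

/* Write the class precedence list of class c to out[] (room for
 * MAX_CLASSES entries).  Return its length, or -1 if the input lists
 * are inconsistent. */
int c3_linearize(const struct class_def *classes, int c, int *out)
{
  const struct class_def *k = &classes[c];
  int lists[MAX_CLASSES + 1][MAX_CLASSES];
  int len[MAX_CLASSES + 1], pos[MAX_CLASSES + 1];
  int n_lists = 0, n_out = 0, i, j;

  /* Inputs to the merge: the class precedence list of each direct
   * superclass, in local precedence order, followed by the list of
   * direct superclasses itself. */
  for (i = 0; i < k->n_supers; i++) {
    len[n_lists] = c3_linearize(classes, k->supers[i], lists[n_lists]);
    if (len[n_lists] < 0) return -1;
    pos[n_lists++] = 0;
  }
  for (i = 0; i < k->n_supers; i++) lists[n_lists][i] = k->supers[i];
  len[n_lists] = k->n_supers; pos[n_lists++] = 0;

  out[n_out++] = c;             /* a class is its own most specific superclass */
  for (;;) {
    int pick = -1, remaining = 0;

    /* Take the first list head which is not in the tail of any list. */
    for (i = 0; i < n_lists && pick < 0; i++) {
      int head;
      if (pos[i] >= len[i]) continue;
      remaining = 1;
      head = lists[i][pos[i]];
      for (j = 0; j < n_lists; j++)
        if (in_list(lists[j], pos[j] + 1, len[j], head)) break;
      if (j == n_lists) pick = head;
    }
    if (pick < 0) return remaining ? -1 : n_out;

    /* Emit the chosen class and drop it from the head of every list. */
    out[n_out++] = pick;
    for (i = 0; i < n_lists; i++)
      if (pos[i] < len[i] && lists[i][pos[i]] == pick) pos[i]++;
  }
}
```

Scanning the lists in local precedence order is what breaks ties in favour of the candidate whose subclass appears earliest in the local precedence order.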
183 | ||
4075ab40 MW |
184 | \begin{figure} |
185 | \centering | |
186 | \begin{tikzpicture}[x=7.5mm, y=-14mm, baseline=(current bounding box.east)] | |
187 | \node[lit] at ( 0, 0) (R) {SodObject}; | |
188 | \node[lit] at (-3, +1) (A) {A}; \draw[->] (A) -- (R); | |
189 | \node[lit] at (-1, +1) (B) {B}; \draw[->] (B) -- (R); | |
190 | \node[lit] at (+1, +1) (C) {C}; \draw[->] (C) -- (R); | |
191 | \node[lit] at (+3, +1) (D) {D}; \draw[->] (D) -- (R); | |
192 | \node[lit] at (-2, +2) (E) {E}; \draw[->] (E) -- (A); | |
193 | \draw[->] (E) -- (B); | |
194 | \node[lit] at (+2, +2) (F) {F}; \draw[->] (F) -- (A); | |
195 | \draw[->] (F) -- (D); | |
196 | \node[lit] at (-1, +3) (G) {G}; \draw[->] (G) -- (E); | |
197 | \draw[->] (G) -- (C); | |
198 | \node[lit] at (+1, +3) (H) {H}; \draw[->] (H) -- (F); | |
199 | \node[lit] at ( 0, +4) (I) {I}; \draw[->] (I) -- (G); | |
200 | \draw[->] (I) -- (H); | |
201 | \end{tikzpicture} | |
202 | \quad | |
203 | \vrule | |
204 | \quad | |
205 | \begin{minipage}[c]{0.45\hsize} | |
206 | \begin{nprog} | |
207 | class A: SodObject \{ \}\quad\=@/* @|A|, @|SodObject| */ \\ | |
208 | class B: SodObject \{ \}\>@/* @|B|, @|SodObject| */ \\ | |
209 | class C: SodObject \{ \}\>@/* @|C|, @|SodObject| */ \\ | |
210 | class D: SodObject \{ \}\>@/* @|D|, @|SodObject| */ \\+ | |
211 | class E: A, B \{ \}\quad\=@/* @|E|, @|A|, @|B|, \dots */ \\ | |
212 | class F: A, D \{ \}\>@/* @|F|, @|A|, @|D|, \dots */ \\+ | |
213 | class G: E, C \{ \}\>@/* @|G|, @|E|, @|A|, | |
214 | @|B|, @|C|, \dots */ \\ | |
215 | class H: F \{ \}\>@/* @|H|, @|F|, @|A|, @|D|, \dots */ \\+ | |
216 | class I: G, H \{ \}\>@/* @|I|, @|G|, @|E|, @|H|, @|F|, | |
217 | @|A|, @|B|, @|C|, @|D|, \dots */ | |
218 | \end{nprog} | |
219 | \end{minipage} | |
220 | ||
221 | \caption{An example class graph and class precedence lists} | |
222 | \label{fig:concepts.classes.cpl-example} | |
223 | \end{figure} | |
224 | ||
225 | \begin{example} | |
226 | Consider the class relationships shown in | |
227 | \xref{fig:concepts.classes.cpl-example}. | |
228 | ||
229 | \begin{itemize} | |
230 | ||
231 | \item @|SodObject| has no proper superclasses. Its class precedence list | |
232 | is therefore simply $\langle @|SodObject| \rangle$. | |
233 | ||
234 | \item In general, if $X$ is a direct subclass only of $Y$, and $Y$'s class | |
235 | precedence list is $\langle Y, \ldots \rangle$, then $X$'s class | |
236 | precedence list is $\langle X, Y, \ldots \rangle$. This explains $A$, | |
237 | $B$, $C$, $D$, and $H$. | |
238 | ||
239 | \item $E$'s list is found by merging its local precedence list $\langle E, | |
240 | A, B \rangle$ with the class precedence lists of its direct superclasses, | |
241 | which are $\langle A, @|SodObject| \rangle$ and $\langle B, @|SodObject| | |
242 | \rangle$. Clearly, @|SodObject| must be last, and $E$'s local precedence | |
243 | list orders the rest, giving $\langle E, A, B, @|SodObject| \rangle$. | |
244 | $F$ is similar. | |
245 | ||
246 | \item We determine $G$'s class precedence list by merging the three lists | |
247 | $\langle G, E, C \rangle$, $\langle E, A, B, @|SodObject| \rangle$, and | |
248 | $\langle C, @|SodObject| \rangle$. The class precedence list begins | |
249 | $\langle G, E, \ldots \rangle$, but the individual lists don't order $A$ | |
250 | and $C$. Comparing these to $G$'s direct superclasses, we see that $A$ | |
54cf3a30 MW |
251 | is a superclass of $E$, while $C$ is a superclass of -- indeed equal to |
252 | -- $C$; so $A$ must precede $C$, as must $B$, and the final list is | |
253 | $\langle G, E, A, B, C, @|SodObject| \rangle$. | |
4075ab40 MW |
254 | |
255 | \item Finally, we determine $I$'s class precedence list by merging $\langle | |
256 | I, G, H \rangle$, $\langle G, E, A, B, C, @|SodObject| \rangle$, and | |
257 | $\langle H, F, A, D, @|SodObject| \rangle$. The list begins $\langle I, | |
258 | G, \ldots \rangle$, and then we must break a tie between $E$ and $H$; but | |
54cf3a30 | 259 | $E$ is a superclass of $G$, so $E$ wins. Next, $H$ and $F$ must precede |
4075ab40 MW |
260 | $A$, since these are ordered by $H$'s class precedence list. Then $B$ |
261 | and $C$ precede $D$, since the former are superclasses of $G$, and the | |
262 | final list is $\langle I, G, E, H, F, A, B, C, D, @|SodObject| \rangle$. | |
263 | ||
264 | \end{itemize} | |
265 | ||
266 | (This example combines elements from \cite{Barrett:1996:MSL} and | |
267 | \cite{Ducournau:1994:PMM}.) | |
268 | \end{example} | |
269 | ||
3cc520db MW |
270 | \subsubsection{Class links and chains} |
271 | The definition for a class $C$ may distinguish one of its proper superclasses | |
272 | as being the \emph{link superclass} for class $C$. Not every class need have | |
273 | a link superclass, and the link superclass of a class $C$, if it exists, need | |
274 | not be a direct superclass of $C$. | |
275 | ||
276 | Superclass links must obey the following rule: if $C$ is a class, then there | |
756e9293 MW |
277 | must be no three distinct superclasses $X$, $Y$ and~$Z$ of $C$ such that $Z$ |
278 | is the link superclass of both $X$ and $Y$. As a consequence of this rule, | |
279 | the superclasses of $C$ can be partitioned into linear \emph{chains}, such | |
280 | that superclasses $A$ and $B$ are in the same chain if and only if one can | |
281 | trace a path from $A$ to $B$ by following superclass links, or \emph{vice | |
282 | versa}. | |
3cc520db MW |
283 | |
284 | Since a class links only to one of its proper superclasses, the classes in a | |
285 | chain are naturally ordered from most- to least-specific. The least specific | |
286 | class in a chain is called the \emph{chain head}; the most specific class is | |
287 | the \emph{chain tail}. Chains are often named after their chain head | |
288 | classes. | |
289 | ||
c6c9615b | 290 | |
3cc520db MW |
291 | \subsection{Names} |
292 | \label{sec:concepts.classes.names} | |
293 | ||
294 | Classes have a number of other attributes: | |
295 | \begin{itemize} | |
296 | \item A \emph{name}, which is a C identifier. Class names must be globally | |
297 | unique. The class name is used in the names of a number of associated | |
298 | definitions, to be described later. | |
299 | \item A \emph{nickname}, which is also a C identifier. Unlike names, | |
300 | nicknames are not required to be globally unique. If $C$ is any class, | |
301 | then all the superclasses of $C$ must have distinct nicknames. | |
302 | \end{itemize} | |
303 | ||
0a2d4b68 | 304 | |
3cc520db MW |
305 | \subsection{Slots} \label{sec:concepts.classes.slots} |
306 | ||
307 | Each class defines a number of \emph{slots}. Much like a structure member, a | |
308 | slot has a \emph{name}, which is a C identifier, and a \emph{type}. Unlike | |
309 | many other object systems, different superclasses of a class $C$ can define | |
310 | slots with the same name without ambiguity, since slot references are always | |
311 | qualified by the defining class's nickname. | |
312 | ||
313 | \subsubsection{Slot initializers} | |
314 | As well as defining slot names and types, a class can also associate an | |
315 | \emph{initial value} with each slot defined by itself or one of its | |
8eb242b1 | 316 | superclasses. A class $C$ provides an \emph{initialization message} (see |
ca2023b8 MW |
317 | \xref[\instead{sections}]{sec:concepts.lifecycle.birth}, and |
318 | \ref{sec:structures.root.sodobject}) whose methods set the slots of a | |
857c59bd MW |
319 | \emph{direct} instance of the class to the correct initial values. If |
320 | several of $C$'s superclasses define initializers for the same slot then the | |
321 | initializer from the most specific such class is used. If none of $C$'s | |
322 | superclasses define an initializer for some slot then that slot will be left | |
323 | uninitialized. | |
3cc520db MW |
324 | |
325 | The initializer for a slot with scalar type may be any C expression. The | |
326 | initializer for a slot with aggregate type must contain only constant | |
327 | expressions if the generated code is expected to be processed by an | |
328 | implementation of C89. Initializers will be evaluated once each time an | |
329 | instance is initialized. | |
330 | ||
27ec3825 MW |
331 | Slots are initialized in reverse-precedence order of their defining classes; |
332 | i.e., slots defined by a less specific superclass are initialized earlier | |
333 | than slots defined by a more specific superclass. Slots defined by the same | |
334 | class are initialized in the order in which they appear in the class | |
335 | definition. | |
336 | ||
337 | The initializer for a slot may refer to other slots in the same object, via | |
338 | the @|me| pointer: in an initializer for a slot defined by a class $C$, @|me| | |
339 | has type `pointer to $C$'. (Note that the type of @|me| depends only on the | |
340 | class which defined the slot, not the class which defined the initializer.) | |
341 | ||
997b4d2b MW |
342 | A class can also define \emph{class slot initializers}, which provide values |
343 | for a slot defined by its metaclass; see \xref{sec:concepts.metaclasses} for | |
344 | details. | |
345 | ||
0a2d4b68 | 346 | |
3cc520db MW |
347 | \subsection{C language integration} \label{sec:concepts.classes.c} |
348 | ||
c06ba266 MW |
349 | It is very important to distinguish compile-time C \emph{types} from Sod's |
350 | run-time \emph{classes}: see \xref{sec:concepts.classes}. | |
351 | ||
3cc520db MW |
352 | For each class~$C$, the Sod translator defines a C type, the \emph{class |
353 | type}, with the same name. This is the usual type used when considering an | |
354 | object as an instance of class~$C$. No entire object will normally have a | |
355 | class type,\footnote{% | |
356 | In general, a class type only captures the structure of one of the | |
357 | superclass chains of an instance. A full instance layout contains multiple | |
358 | chains. See \xref{sec:structures.layout} for the full details.} % | |
359 | so access to instances is almost always via pointers. | |
360 | ||
c06ba266 MW |
361 | Usually, a value of type pointer-to-class-type of class~$C$ will point into |
362 | an instance of class $C$. However, clever (or foolish) use of pointer | |
363 | conversions can invalidate this relationship. | |
364 | ||
3cc520db MW |
365 | \subsubsection{Access to slots} |
366 | The class type for a class~$C$ is actually a structure. It contains one | |
367 | member for each class in $C$'s superclass chain, named with that class's | |
368 | nickname. Each of these members is also a structure, containing the | |
369 | corresponding class's slots, one member per slot. There's nothing special | |
370 | about these slot members: C code can access them in the usual way. | |
371 | ||
f2309139 MW |
372 | For example, given the definition |
373 | \begin{prog} | |
374 | [nick = mine] \\ | |
375 | class MyClass: SodObject \{ \\ \ind | |
376 | int x; \-\\ | |
377 | \} | |
378 | \end{prog} | |
379 | the simple function | |
3cc520db | 380 | \begin{prog} |
c18d6aba | 381 | int get_x(MyClass *m) \{ return (m@->mine.x); \} |
3cc520db MW |
382 | \end{prog} |
383 | will extract the value of @|x| from an instance of @|MyClass|. | |
384 | ||
385 | All of this means that there's no such thing as `private' or `protected' | |
386 | slots. If you want to hide implementation details, the best approach is to | |
387 | stash them in a dynamically allocated private structure, and leave a pointer | |
388 | to it in a slot. (This will also help preserve binary compatibility, because | |
389 | the private structure can grow more members as needed. See | |
021d9f84 | 390 | \xref{sec:concepts.compatibility} for more details.) |
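The private-structure approach might look something like the following sketch. The @|Counter| type here is a hand-written stand-in for a class type whose class has nickname @|ctr|; it is not translator output, and every name in it is hypothetical. Only the pointer slot is visible to clients; the definition of @|struct counter_private| could live in the implementation file alone.

```c
#include <stdlib.h>

/* Private state: in a real program this definition would be kept out of
 * the public header. */
struct counter_private { unsigned long hits; };

/* Hand-written stand-in for a class type (hypothetical, simplified). */
struct counter_islots { struct counter_private *priv; };
typedef struct {
  const void *_vt;              /* placeholder for the vtable pointer */
  struct counter_islots ctr;    /* slots, under the class's nickname */
} Counter;

/* Allocate and initialize the private structure; return 0 on success. */
int counter_setup(Counter *c)
{
  c->ctr.priv = malloc(sizeof(*c->ctr.priv));
  if (!c->ctr.priv) return -1;
  c->ctr.priv->hits = 0;
  return 0;
}

/* Bump the hidden counter and return its new value. */
unsigned long counter_hit(Counter *c) { return ++c->ctr.priv->hits; }

/* Release the private structure. */
void counter_teardown(Counter *c) { free(c->ctr.priv); c->ctr.priv = NULL; }
```

Because clients see only a pointer, @|struct counter_private| can gain members later without changing the size or layout of the instance itself.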
3cc520db | 391 | |
4b4aec4e MW |
392 | Slots defined by $C$'s link superclass, or any other superclass in the same |
393 | chain, can be accessed in the same way. Slots defined by other superclasses | |
394 | can't be accessed directly: the instance pointer must be \emph{converted} to | |
395 | point to a different chain. See the subsection `Conversions' below. | |
396 | ||
ff06eeb1 | 397 | |
f4e44f7f MW |
398 | \subsubsection{Sending messages} |
399 | Sod defines a macro for each message. If a class $C$ defines a message $m$, | |
400 | then the macro is called @|$C$_$m$|. The macro takes a pointer to the | |
401 | receiving object as its first argument, followed by the message arguments, if | |
402 | any, and returns the value returned by the object's effective method for the | |
403 | message (if any). If you have a pointer to an instance of any of $C$'s | |
404 | subclasses, then you can send it the message; it doesn't matter whether the | |
405 | subclass is on the same chain. Note that the receiver argument is evaluated | |
406 | twice, so it's not safe to write a receiver expression which has | |
407 | side-effects. | |
408 | ||
409 | For example, suppose we defined | |
410 | \begin{prog} | |
411 | [nick = soupy] \\ | |
412 | class Super: SodObject \{ \\ \ind | |
413 | void msg(const char *m); \-\\ | |
414 | \} \\+ | |
415 | class Sub: Super \{ \\ \ind | |
416 | void soupy.msg(const char *m) | |
417 | \{ printf("sub sent `\%s'@\\n", m); \} \-\\ | |
418 | \} | |
419 | \end{prog} | |
420 | then we can send the message like this: | |
421 | \begin{prog} | |
422 | Sub *sub = /* \dots\ */; \\ | |
423 | Super_msg(sub, "hello"); | |
424 | \end{prog} | |
425 | ||
426 | What happens under the covers is as follows. The structure pointed to by the | |
427 | instance pointer has a member named @|_vt|, which points to a structure | |
428 | called a `virtual table', or \emph{vtable}, which contains various pieces of | |
429 | information about the object's direct class and layout, and holds pointers to | |
430 | method entries for the messages which the object can receive. The | |
431 | message-sending macro in the example above expands to something similar to | |
432 | \begin{prog} | |
433 | sub@->_vt@->soupy.msg(sub, "hello"); | |
434 | \end{prog} | |
435 | ||
436 | The vtable contains other useful information, such as a pointer to the | |
437 | instance's direct class's \emph{class object} (described below). The full | |
438 | details of the contents and layout of vtables are given in | |
439 | \xref{sec:structures.layout.vtable}. | |
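To make the dispatch mechanism concrete, here is a small hand-written model of it. None of this is generated by Sod: the structure names, the two method functions, and the @|last_handler| variable are assumptions made for the sake of illustration, and a real vtable holds much more than this one does.

```c
#include <stdio.h>

struct Super_sketch;

/* Method entries for the messages defined by Super (nickname soupy). */
struct vtmsgs_soupy {
  void (*msg)(struct Super_sketch *me, const char *m);
};

/* A drastically simplified vtable: just the method entries. */
struct vtable_sketch { struct vtmsgs_soupy soupy; };

/* Every instance begins with a pointer to its direct class's vtable. */
struct Super_sketch { const struct vtable_sketch *_vt; };

/* Records which method ran most recently, for demonstration only. */
const char *last_handler = "";

void super_msg_method(struct Super_sketch *me, const char *m)
  { (void)me; last_handler = "super"; printf("super sent `%s'\n", m); }
void sub_msg_method(struct Super_sketch *me, const char *m)
  { (void)me; last_handler = "sub"; printf("sub sent `%s'\n", m); }

const struct vtable_sketch super_vt = { { super_msg_method } };
const struct vtable_sketch sub_vt = { { sub_msg_method } };

/* The message-sending macro: note that `me' is evaluated twice. */
#define Super_msg(me, m) ((me)->_vt->soupy.msg((me), (m)))
```

The sender names only the message. Which function runs depends on the vtable the instance points to, and hence on the instance's direct class rather than on the static type of the pointer.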
caa6f4b9 MW |
440 | |
441 | ||
3cc520db MW |
442 | \subsubsection{Class objects} |
443 | In Sod's object system, classes are objects too. Therefore classes are | |
444 | themselves instances; the class of a class is called a \emph{metaclass}. The | |
445 | consequences of this are explored in \xref{sec:concepts.metaclasses}. The | |
446 | \emph{class object} has the same name as the class, suffixed with | |
447 | `@|__class|'\footnote{% | |
448 | This is not quite true. @|$C$__class| is actually a macro. See | |
449 | \xref{sec:structures.layout.additional} for the gory details.} % | |
450 | and its type is usually @|SodClass|; @|SodClass|'s nickname is @|cls|. | |
451 | ||
452 | A class object's slots contain or point to useful information, tables and | |
453 | functions for working with that class's instances. (The @|SodClass| class | |
054e8f8f MW |
454 | doesn't define any messages, so it doesn't have any methods other than for |
455 | the @|SodObject| lifecycle messages @|init| and @|teardown|; see | |
456 | \xref{sec:concepts.lifecycle}. In Sod, a class slot containing a function | |
457 | pointer is not at all the same thing as a method.) | |
3cc520db | 458 | |
3cc520db | 459 | \subsubsection{Conversions} |
e4ea29d8 MW |
460 | Suppose one has a value of type pointer-to-class-type for some class~$C$, and |
461 | wants to convert it to a pointer-to-class-type for some other class~$B$. | |
3cc520db MW |
462 | There are three main cases to distinguish. |
463 | \begin{itemize} | |
464 | \item If $B$ is a superclass of~$C$, in the same chain, then the conversion | |
465 | is an \emph{in-chain upcast}. The conversion can be performed using the | |
466 | appropriate generated upcast macro (see below), or by simply casting the | |
467 | pointer, using C's usual cast operator (or the \Cplusplus\ @|static_cast<>| | |
468 | operator). | |
469 | \item If $B$ is a superclass of~$C$, in a different chain, then the | |
470 | conversion is a \emph{cross-chain upcast}. The conversion is more than a | |
471 | simple type change: the pointer value must be adjusted. If the direct | |
472 | class of the instance in question is not known, the conversion will require | |
473 | a lookup at runtime to find the appropriate offset by which to adjust the | |
474 | pointer. The conversion can be performed using the appropriate generated | |
475 | upcast macro (see below); the general case is handled by the macro | |
58f9b400 | 476 | \descref{SOD_XCHAIN}{mac}. |
e4ea29d8 | 477 | \item If $B$ is a subclass of~$C$ then the conversion is a \emph{downcast}; |
3cc520db MW |
478 | otherwise the conversion is a~\emph{cross-cast}. In either case, the |
479 | conversion can fail: the object in question might not be an instance of~$B$ | |
e4ea29d8 | 480 | after all. The macro \descref{SOD_CONVERT}{mac} and the function |
58f9b400 | 481 | \descref{sod_convert}{fun} perform general conversions. They return a null |
054e8f8f | 482 | pointer if the conversion fails. (These are therefore your analogue to the |
e4ea29d8 | 483 | \Cplusplus\ @|dynamic_cast<>| operator.) |
3cc520db MW |
484 | \end{itemize} |
485 | The Sod translator generates macros for performing both in-chain and | |
486 | cross-chain upcasts. For each class~$C$, and each proper superclass~$B$ | |
487 | of~$C$, a macro is defined: given an argument of type pointer to class type | |
488 | of~$C$, it returns a pointer to the same instance, only with type pointer to | |
489 | class type of~$B$, adjusted as necessary in the case of a cross-chain | |
490 | conversion. The macro is named by concatenating | |
491 | \begin{itemize} | |
492 | \item the name of class~$C$, in upper case, | |
493 | \item the characters `@|__CONV_|', and | |
494 | \item the nickname of class~$B$, in upper case; | |
495 | \end{itemize} | |
496 | e.g., if $C$ is named @|MyClass|, and $B$'s name is @|SuperClass| with | |
497 | nickname @|super|, then the macro @|MYCLASS__CONV_SUPER| converts a | |
498 | @|MyClass~*| to a @|SuperClass~*|. See | |
499 | \xref{sec:structures.layout.additional} for the formal description. | |
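The pointer adjustment involved in a cross-chain upcast can be illustrated with a hand-written layout. This sketch assumes the simplest case, in which the direct class of the instance is known statically, so the offset between chains is a compile-time constant; the macros Sod really generates may instead look the offset up at run time. All of the names below, including @|MYCLASS__CONV_SUPER_SKETCH|, are hypothetical.

```c
#include <stddef.h>

/* Two chains of a hypothetical instance layout, each with its own
 * vtable pointer placeholder. */
struct MyClass_chain { const void *_vt; int x; };
struct SuperClass_chain { const void *_vt; int s; };

/* The full instance contains both chains. */
struct MyClass_layout {
  struct MyClass_chain mine;
  struct SuperClass_chain super;
};

typedef struct MyClass_chain MyClass;
typedef struct SuperClass_chain SuperClass;

/* Cross-chain upcast for a direct instance: step back to the start of
 * the instance, then forward to the other chain. */
#define MYCLASS__CONV_SUPER_SKETCH(p)                                   \
  ((SuperClass *)((char *)(p)                                           \
                  - offsetof(struct MyClass_layout, mine)               \
                  + offsetof(struct MyClass_layout, super)))
```

An in-chain upcast, by contrast, would leave the pointer value alone and change only its type.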
500 | ||
501 | %%%-------------------------------------------------------------------------- | |
9e91c8e7 MW |
502 | \section{Keyword arguments} \label{sec:concepts.keywords} |
503 | ||
504 | In standard C, the actual arguments provided to a function are matched up | |
505 | with the formal arguments given in the function definition according to their | |
506 | ordering in a list. Unless the (rather cumbersome) machinery for dealing | |
507 | with variable-length argument tails (@|<stdarg.h>|) is used, exactly the | |
508 | correct number of arguments must be supplied, and in the correct order. | |
509 | ||
510 | A \emph{keyword argument} is matched by its distinctive \emph{name}, rather | |
511 | than by its position in a list. Keyword arguments may be \emph{omitted}, | |
512 | causing some default behaviour by the function. A function can detect | |
513 | whether a particular keyword argument was supplied: so the default behaviour | |
514 | need not be the same as that caused by any specific value of the argument. | |
515 | ||
516 | Keyword arguments can be provided in three ways. | |
517 | \begin{enumerate} | |
518 | \item Directly, as a variable-length argument tail, consisting (for the most | |
519 | part) of alternating keyword names, as pointers to null-terminated strings, | |
520 | and argument values, and terminated by a null pointer. This is somewhat | |
521 | error-prone, and the support library defines some macros which help ensure | |
522 | that keyword argument lists are well formed. | |
523 | \item Indirectly, through a @|va_list| object capturing a variable-length | |
524 | argument tail passed to some other function. Such indirect argument tails | |
525 | have the same structure as the direct argument tails described above. | |
526 | Because @|va_list| objects are hard to copy, the keyword-argument support | |
527 | library consistently passes @|va_list| objects \emph{by reference} | |
528 | throughout its programming interface. | |
529 | \item Indirectly, through a vector of @|struct kwval| objects, each of which | |
530 | contains a keyword name, as a pointer to a null-terminated string, and the | |
531 | \emph{address} of a corresponding argument value. (This indirection is | |
532 | necessary so that the items in the vector can be of uniform size.) | |
533 | Argument vectors are rather inconvenient to use, but are the only practical | |
534 | way in which a caller can decide at runtime which arguments to include in a | |
535 | call, which is useful when writing wrapper functions. | |
536 | \end{enumerate} | |
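The first of these forms can be sketched in plain C as follows. This is only an illustration of the alternating name/value layout, not the support library's interface: the function @|parse_box_args|, the structure @|struct box_args|, the keywords @|width| and @|height|, and their defaults are all hypothetical, and every argument value is assumed to be an @|int|.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result record; not part of the Sod support library. */
struct box_args {
  int width, height;            /* argument values, or the defaults */
  unsigned width_p : 1;         /* set if `width' was supplied */
  unsigned height_p : 1;        /* set if `height' was supplied */
};

/* Parse a tail of alternating keyword names and `int' values, terminated
 * by a null pointer.  Returns 0 on success, or -1 on an unknown keyword.
 */
int parse_box_args(struct box_args *out, ...)
{
  va_list ap;
  const char *kw;
  int rc = 0;

  assert(out != NULL);
  out->width = 10; out->height = 20;    /* defaults for omitted keywords */
  out->width_p = out->height_p = 0;

  va_start(ap, out);
  while ((kw = va_arg(ap, const char *)) != NULL) {
    if (strcmp(kw, "width") == 0)
      { out->width = va_arg(ap, int); out->width_p = 1; }
    else if (strcmp(kw, "height") == 0)
      { out->height = va_arg(ap, int); out->height_p = 1; }
    else
      { rc = -1; break; }       /* the value's type is unknown: give up */
  }
  va_end(ap);
  return rc;
}
```

A call might look like @|parse_box_args(&a, "height", 5, (const char *)0)|. The explicit cast on the terminator matters, because a bare @|0| or @|NULL| in a variable-length tail need not have pointer type; this is one of the ways in which hand-written tails are error-prone.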
537 | ||
4f634d20 MW |
538 | Perhaps surprisingly, keyword arguments have a relatively small performance |
539 | impact. On the author's aging laptop, a call to a simple function, passing | |
540 | two out of three keyword arguments, takes about 30 cycles longer than calling | |
541 | a standard function which just takes integer arguments. On the other hand, | |
542 | quite a lot of code is involved in decoding keyword arguments, so code size | |
543 | will naturally suffer. | |
544 | ||
9e91c8e7 | 545 | Keyword arguments are provided as a general feature for C functions. |
43073476 | 546 | However, Sod has special support for messages which accept keyword arguments |
8ec911fa | 547 | (\xref{sec:concepts.methods.keywords}); and they play an essential rôle in |
a142609c | 548 | the instance construction protocol (\xref{sec:concepts.lifecycle.birth}). |
9e91c8e7 MW |
549 | |
550 | %%%-------------------------------------------------------------------------- | |
3cc520db MW |
551 | \section{Messages and methods} \label{sec:concepts.methods} |
552 | ||
553 | Objects can be sent \emph{messages}. A message has a \emph{name}, and | |
554 | carries a number of \emph{arguments}. When an object is sent a message, a | |
555 | function, determined by the receiving object's class, is invoked, passing it | |
556 | the receiver and the message arguments. This function is called the | |
557 | class's \emph{effective method} for the message. The effective method can do | |
558 | anything a C function can do, including reading or updating program state or | |
559 | object slots, sending more messages, calling other functions, issuing system | |
560 | calls, or performing I/O; if it finishes, it may return a value, which is | |
561 | returned in turn to the message sender. | |
562 | ||
563 | The set of messages an object can receive, characterized by their names, | |
564 | argument types, and return type, is determined by the object's class. Each | |
565 | class can define new messages, which can be received by any instance of that | |
566 | class. The messages defined by a single class must have distinct names: | |
567 | there is no `function overloading'. As with slots | |
568 | (\xref{sec:concepts.classes.slots}), messages defined by distinct classes are | |
569 | always distinct, even if they have the same names: references to messages are | |
570 | always qualified by the defining class's name or nickname. | |
571 | ||
572 | Messages may take any number of arguments, of any non-array value type. | |
573 | Since message sends are effectively function calls, arguments of array type | |
574 | are implicitly converted to values of the corresponding pointer type. While | |
575 | message definitions may ascribe an array type to an argument, the formal | |
576 | argument will have pointer type, as is usual for C functions. A message may | |
577 | accept a variable-length argument suffix, denoted @|\dots|. | |
578 | ||
579 | A class definition may include \emph{direct methods} for messages defined by | |
580 | it or any of its superclasses. | |
581 | ||
582 | Like messages, direct methods define argument lists and return types, but | |
8ec911fa | 583 | they may also have a \emph{body}, and a \emph{rôle}. |
3cc520db MW |
584 | |
585 | A direct method need not have the same argument list or return type as its | |
586 | message. The acceptable argument lists and return types for a method depend | |
587 | on the message, in particular its method combination | |
8ec911fa | 588 | (\xref{sec:concepts.methods.combination}), and the method's rôle. |
3cc520db MW |
589 | |
590 | A direct method body is a block of C code, and the Sod translator usually | |
591 | defines, for each direct method, a function with external linkage, whose body | |
592 | contains a copy of the direct method body. Within the body of a direct | |
593 | method defined for a class $C$, the variable @|me|, of type pointer to class | |
594 | type of $C$, refers to the receiving object. | |
595 | ||
0a2d4b68 | 596 | |
3cc520db MW |
597 | \subsection{Effective methods and method combinations} |
598 | \label{sec:concepts.methods.combination} | |
599 | ||
600 | For each message a direct instance of a class might receive, there is a set | |
601 | of \emph{applicable methods}, which are exactly the direct methods defined on | |
602 | the object's class and its superclasses. These direct methods are combined | |
603 | together to form the \emph{effective method} for that particular class and | |
604 | message. Direct methods can be combined into an effective method in | |
605 | different ways, according to the \emph{method combination} specified by the | |
8ec911fa MW |
606 | message. The method combination determines which direct method rôles are |
607 | acceptable, and, for each rôle, the appropriate argument lists and return | |
3cc520db MW |
608 | types. |
609 | ||
610 | One direct method, $M$, is said to be more (resp.\ less) \emph{specific} than | |
611 | another, $N$, with respect to a receiving class~$C$, if the class defining | |
612 | $M$ is a more (resp.\ less) specific superclass of~$C$ than the class | |
613 | defining $N$. | |
614 | ||
43073476 | 615 | \subsubsection{The standard method combination} |
3cc520db MW |
616 | The default method combination is called the \emph{standard method |
617 | combination}; other method combinations are useful occasionally for special | |
8ec911fa | 618 | effects. The standard method combination accepts four direct method rôles, |
9761db0d | 619 | called `primary' (the default), @|before|, @|after|, and @|around|. |
3cc520db MW |
620 | |
621 | All direct methods subject to the standard method combination must have | |
622 | argument lists which \emph{match} the message's argument list: | |
623 | \begin{itemize} | |
624 | \item the method's arguments must have the same types as the message, though | |
625 | the arguments may have different names; and | |
626 | \item if the message accepts a variable-length argument suffix then the | |
627 | direct method must instead have a final argument of type @|va_list|. | |
628 | \end{itemize} | |
b1254eb6 MW |
629 | Primary and @|around| methods must have the same return type as the message; |
630 | @|before| and @|after| methods must return @|void| regardless of the | |
631 | message's return type. | |
3cc520db MW |
632 | |
633 | If there are no applicable primary methods then no effective method is | |
634 | constructed: the vtables contain null pointers in place of pointers to method | |
635 | entry functions. | |
636 | ||
f1aa19a8 | 637 | \begin{figure} |
d82d5db5 | 638 | \hbox to\hsize{\hss\hbox{\begin{tikzpicture} |
a4094071 | 639 | [order/.append style={color=green!70!black}, |
f1aa19a8 MW |
640 | code/.append style={font=\sffamily}, |
641 | action/.append style={font=\itshape}, | |
642 | method/.append style={rectangle, draw=black, thin, fill=blue!30, | |
643 | text height=\ht\strutbox, text depth=\dp\strutbox, | |
644 | minimum width=40mm}] | |
645 | ||
646 | \def\delgstack#1#2#3{ | |
647 | \node (#10) [method, #2] {#3}; | |
648 | \node (#11) [method, above=6mm of #10] {#3}; | |
649 | \draw [->] ($(#10.north)!.5!(#10.north west) + (0mm, 1mm)$) -- | |
650 | ++(0mm, 4mm) | |
651 | node [code, left=4pt, midway] {next_method}; | |
652 | \draw [<-] ($(#10.north)!.5!(#10.north east) + (0mm, 1mm)$) -- | |
653 | ++(0mm, 4mm) | |
654 | node [action, right=4pt, midway] {return}; | |
655 | \draw [->] ($(#11.north)!.5!(#11.north west) + (0mm, 1mm)$) -- | |
656 | ++(0mm, 4mm) | |
657 | node [code, left=4pt, midway] {next_method} | |
658 | node (ld) [above] {$\smash\vdots\mathstrut$}; | |
659 | \draw [<-] ($(#11.north)!.5!(#11.north east) + (0mm, 1mm)$) -- | |
660 | ++(0mm, 4mm) | |
661 | node [action, right=4pt, midway] {return} | |
662 | node (rd) [above] {$\smash\vdots\mathstrut$}; | |
663 | \draw [->] ($(ld.north) + (0mm, 1mm)$) -- ++(0mm, 4mm) | |
664 | node [code, left=4pt, midway] {next_method}; | |
665 | \draw [<-] ($(rd.north) + (0mm, 1mm)$) -- ++(0mm, 4mm) | |
666 | node [action, right=4pt, midway] {return}; | |
667 | \node (p) at ($(ld.north)!.5!(rd.north)$) {}; | |
668 | \node (#1n) [method, above=5mm of p] {#3}; | |
669 | \draw [->, order] ($(#10.south east) + (4mm, 1mm)$) -- | |
670 | ($(#1n.north east) + (4mm, -1mm)$) | |
671 | node [midway, right, align=left] | |
672 | {Most to \\ least \\ specific};} | |
673 | ||
dc20d91f | 674 | \delgstack{a}{}{@|around| method} |
f1aa19a8 MW |
675 | \draw [<-] ($(a0.south)!.5!(a0.south west) - (0mm, 1mm)$) -- |
676 | ++(0mm, -4mm); | |
677 | \draw [->] ($(a0.south)!.5!(a0.south east) - (0mm, 1mm)$) -- | |
678 | ++(0mm, -4mm) | |
679 | node [action, right=4pt, midway] {return}; | |
680 | ||
681 | \draw [->] ($(an.north)!.6!(an.north west) + (0mm, 1mm)$) -- | |
682 | ++(-8mm, 8mm) | |
683 | node [code, midway, left=3mm] {next_method} | |
684 | node (b0) [method, above left = 1mm + 4mm and -6mm - 4mm] {}; | |
685 | \node (b1) [method] at ($(b0) - (2mm, 2mm)$) {}; | |
dc20d91f | 686 | \node (bn) [method] at ($(b1) - (2mm, 2mm)$) {@|before| method}; |
f1aa19a8 MW |
687 | \draw [->, order] ($(bn.west) - (6mm, 0mm)$) -- ++(12mm, 12mm) |
688 | node [midway, above left, align=center] {Most to \\ least \\ specific}; | |
689 | \draw [->] ($(b0.north east) + (-10mm, 1mm)$) -- ++(8mm, 8mm) | |
690 | node (p) {}; | |
691 | ||
692 | \delgstack{m}{above right=1mm and 0mm of an.west |- p}{Primary method} | |
693 | \draw [->] ($(mn.north)!.5!(mn.north west) + (0mm, 1mm)$) -- ++(0mm, 4mm) | |
694 | node [code, left=4pt, midway] {next_method} | |
695 | node [above right = 0mm and -8mm] | |
696 | {$\vcenter{\hbox{\Huge\textcolor{red}{!}}} | |
697 | \vcenter{\hbox{\begin{tabular}[c]{l} | |
698 | \textsf{next_method} \\ | |
699 | pointer is null | |
700 | \end{tabular}}}$}; | |
701 | ||
702 | \draw [->, color=blue, dotted] | |
703 | ($(m0.south)!.2!(m0.south east) - (0mm, 1mm)$) -- | |
704 | ($(an.north)!.2!(an.north east) + (0mm, 1mm)$) | |
705 | node [midway, sloped, below] {Return value}; | |
706 | ||
707 | \draw [<-] ($(an.north)!.6!(an.north east) + (0mm, 1mm)$) -- | |
708 | ++(8mm, 8mm) | |
709 | node [action, midway, right=3mm] {return} | |
710 | node (f0) [method, above right = 1mm and -6mm] {}; | |
711 | \node (f1) [method] at ($(f0) + (-2mm, 2mm)$) {}; | |
dc20d91f | 712 | \node (fn) [method] at ($(f1) + (-2mm, 2mm)$) {@|after| method}; |
f1aa19a8 MW |
713 | \draw [<-, order] ($(f0.east) + (6mm, 0mm)$) -- ++(-12mm, 12mm) |
714 | node [midway, above right, align=center] | |
715 | {Least to \\ most \\ specific}; | |
716 | \draw [<-] ($(fn.north west) + (6mm, 1mm)$) -- ++(-8mm, 8mm); | |
717 | ||
d82d5db5 | 718 | \end{tikzpicture}}\hss} |
f1aa19a8 MW |
719 | |
720 | \caption{The standard method combination} | |
721 | \label{fig:concepts.methods.stdmeth} | |
722 | \end{figure} | |
723 | ||
3cc520db | 724 | The effective method for a message with standard method combination works as |
f1aa19a8 | 725 | follows (see also~\xref{fig:concepts.methods.stdmeth}). |
3cc520db MW |
726 | \begin{enumerate} |
727 | ||
8ec911fa | 728 | \item If any applicable methods have the @|around| rôle, then the most |
3cc520db MW |
729 | specific such method, with respect to the class of the receiving object, is |
730 | invoked. | |
731 | ||
b1254eb6 | 732 | Within the body of an @|around| method, the variable @|next_method| is |
3cc520db MW |
733 | defined, having pointer-to-function type. The method may call this |
734 | function, as described below, any number of times. | |
735 | ||
b1254eb6 MW |
736 | If there are any remaining @|around| methods, then @|next_method| invokes the |
737 | next most specific such method, returning whichever value that method | |
dc20d91f MW |
738 | returns; otherwise the behaviour of @|next_method| is to invoke the |
739 | @|before| methods (if any), followed by the most specific primary method, | |
b0563651 | 740 | followed by the @|after| methods (if any), and to return whichever value |
dc20d91f MW |
741 | was returned by the most specific primary method, as described in the |
742 | following items. That is, the behaviour of the least specific @|around| | |
743 | method's @|next_method| function is exactly the behaviour that the | |
744 | effective method would have if there were no @|around| methods. Note that | |
745 | if the least-specific @|around| method calls its @|next_method| more than | |
746 | once then the whole sequence of @|before|, primary, and @|after| methods | |
747 | occurs multiple times. | |
3cc520db | 748 | |
b1254eb6 MW |
749 | The value returned by the most specific @|around| method is the value |
750 | returned by the effective method. | |
3cc520db | 751 | |
8ec911fa | 752 | \item If any applicable methods have the @|before| rôle, then they are all |
3cc520db MW |
753 | invoked, starting with the most specific. |
754 | ||
755 | \item The most specific applicable primary method is invoked. | |
756 | ||
757 | Within the body of a primary method, the variable @|next_method| is | |
758 | defined, having pointer-to-function type. If there are no remaining less | |
759 | specific primary methods, then @|next_method| is a null pointer. | |
760 | Otherwise, the method may call the @|next_method| function any number of | |
761 | times. | |
762 | ||
763 | The behaviour of the @|next_method| function, if it is not null, is to | |
764 | invoke the next most specific applicable primary method, and to return | |
765 | whichever value that method returns. | |
766 | ||
b1254eb6 MW |
767 | If there are no applicable @|around| methods, then the value returned by |
768 | the most specific primary method is the value returned by the effective | |
769 | method; otherwise the value returned by the most specific primary method is | |
770 | returned to the least specific @|around| method, which called it via its | |
771 | own @|next_method| function. | |
3cc520db | 772 | |
8ec911fa | 773 | \item If any applicable methods have the @|after| rôle, then they are all |
3cc520db | 774 | invoked, starting with the \emph{least} specific. (Hence, the most |
b1254eb6 | 775 | specific @|after| method is invoked with the most `afterness'.) |
3cc520db MW |
776 | |
777 | \end{enumerate} | |
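The four steps above can be modelled in ordinary C. The sketch below is a simulation only, not the code the Sod translator generates: the names @|struct combo|, @|effective_method|, @|next_around|, @|next_primary|, and the sample methods are all invented for illustration, method lists are assumed to be stored most specific first, and a missing next primary method is caught by an assertion rather than represented as a null @|next_method| pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct combo;
typedef int around_fn(struct combo *, size_t);   /* size_t: own index */
typedef int primary_fn(struct combo *, size_t);
typedef void side_fn(struct combo *);            /* before/after: void */

struct combo {                  /* every list is most specific first */
  around_fn *const *around;   size_t n_around;
  side_fn *const *before;     size_t n_before;
  primary_fn *const *primary; size_t n_primary;
  side_fn *const *after;      size_t n_after;
  char log[32];                 /* records the order of invocation */
};

static void note(struct combo *c, const char *s)
{
  assert(strlen(c->log) + strlen(s) < sizeof(c->log));
  strcat(c->log, s);
}

/* Steps 2 to 4: before methods, most specific primary, after methods. */
static int inner(struct combo *c)
{
  size_t i;
  int rv;

  assert(c->n_primary > 0);
  for (i = 0; i < c->n_before; i++) c->before[i](c);
  rv = c->primary[0](c, 0);
  for (i = c->n_after; i-- > 0; ) c->after[i](c);  /* least specific first */
  return rv;
}

/* What `next_method' does for the around method at index i. */
static int next_around(struct combo *c, size_t i)
  { return i + 1 < c->n_around ? c->around[i + 1](c, i + 1) : inner(c); }

/* What a non-null `next_method' does for the primary method at index i. */
static int next_primary(struct combo *c, size_t i)
  { assert(i + 1 < c->n_primary); return c->primary[i + 1](c, i + 1); }

/* Step 1: start with the most specific around method, if there is one. */
int effective_method(struct combo *c)
  { return c->n_around ? c->around[0](c, 0) : inner(c); }

/* Sample methods for a hypothetical base class and a derived class. */
static int base_around(struct combo *c, size_t i)
  { int rv; note(c, "["); rv = next_around(c, i); note(c, "]"); return rv; }
static void base_before(struct combo *c) { note(c, "b"); }
static void derived_before(struct combo *c) { note(c, "d"); }
static int base_primary(struct combo *c, size_t i)
  { (void)i; note(c, "P"); return 1; }
static int derived_primary(struct combo *c, size_t i)
  { note(c, "Q"); return 10 + next_primary(c, i); }
static void base_after(struct combo *c) { note(c, "a"); }
static void derived_after(struct combo *c) { note(c, "e"); }

static around_fn *const arounds[] = { base_around };
static side_fn *const befores[] = { derived_before, base_before };
static primary_fn *const primaries[] = { derived_primary, base_primary };
static side_fn *const afters[] = { derived_after, base_after };

struct combo example_combo(void)
{
  struct combo c = { arounds, 1, befores, 2, primaries, 2, afters, 2, "" };
  return c;
}
```

Calling @|effective_method| on the result of @|example_combo| returns 11 and leaves the log as @|[dbQPae]|: the @|around| method brackets everything, the @|before| methods run most specific first, the derived primary method extends the base one, and the @|after| methods run least specific first.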
778 | ||
b1254eb6 MW |
779 | A typical use for @|around| methods is to allow a base class to set up the |
780 | dynamic environment appropriately for the primary methods of its subclasses, | |
756e9293 | 781 | e.g., by claiming a lock, and releasing it afterwards. |
3cc520db | 782 | |
9761db0d | 783 | The @|next_method| function provided to methods with the primary and |
8ec911fa | 784 | @|around| rôles accepts the same arguments, and returns the same type, as the |
3cc520db MW |
785 | message, except that one or two additional arguments are inserted at the |
786 | front of the argument list. The first additional argument is always the | |
787 | receiving object, @|me|. If the message accepts a variable argument suffix, | |
788 | then the second additional argument is a @|va_list|; otherwise there is no | |
789 | second additional argument. In the former case, a variable | |
790 | @|sod__master_ap| of type @|va_list| is defined, containing a separate copy | |
791 | of the argument pointer (so the method body can process the variable argument | |
792 | suffix itself, and still pass a fresh copy on to the next method). | |
793 | ||
8ec911fa | 794 | A method with the primary or @|around| rôle may use the convenience macro |
3cc520db MW |
795 | @|CALL_NEXT_METHOD|, which takes no arguments itself, and simply calls |
796 | @|next_method| with appropriate arguments: the receiver @|me| pointer, the | |
797 | argument pointer @|sod__master_ap| (if applicable), and the method's | |
798 | arguments. If the method body has overwritten its formal arguments, then | |
799 | @|CALL_NEXT_METHOD| will pass along the updated values, rather than the | |
800 | original ones. | |
801 | ||
781a8fbd MW |
802 | A primary or @|around| method which invokes its @|next_method| function is |
803 | said to \emph{extend} the message behaviour; a method which does not invoke | |
804 | its @|next_method| is said to \emph{override} the behaviour. Note that a | |
805 | method may make a decision to override or extend at runtime. | |
806 | ||
43073476 | 807 | \subsubsection{Aggregating method combinations} |
3cc520db MW |
808 | A number of other method combinations are provided. They are called |
809 | `aggregating' method combinations because, instead of invoking just the most | |
810 | specific primary method, as the standard method combination does, they invoke | |
811 | the applicable primary methods in turn and aggregate the return values from | |
812 | each. | |
813 | ||
8ec911fa | 814 | The aggregating method combinations accept the same four rôles as the |
b1254eb6 MW |
815 | standard method combination, and @|around|, @|before|, and @|after| methods |
816 | work in the same way. | |
3cc520db MW |
817 | |
818 | The aggregating method combinations provided are as follows. | |
819 | \begin{description} \let\makelabel\code | |
820 | \item[progn] The message must return @|void|. The applicable primary methods | |
821 | are simply invoked in turn, most specific first. | |
822 | \item[sum] The message must return a numeric type.\footnote{% | |
3e2f441e | 823 | The Sod translator doesn't check this, since it doesn't have enough |
3cc520db MW |
824 | insight into @|typedef| names.} % |
825 | The applicable primary methods are invoked in turn, and their return values | |
826 | added up. The final result is the sum of the individual values. | |
827 | \item[product] The message must return a numeric type. The applicable | |
828 | primary methods are invoked in turn, and their return values multiplied | |
829 | together. The final result is the product of the individual values. | |
830 | \item[min] The message must return a scalar type. The applicable primary | |
831 | methods are invoked in turn. The final result is the smallest of the | |
832 | individual values. | |
833 | \item[max] The message must return a scalar type. The applicable primary | |
834 | methods are invoked in turn. The final result is the largest of the | |
835 | individual values. | |
665a0455 MW |
836 | \item[and] The message must return a scalar type. The applicable primary |
837 | methods are invoked in turn. If any method returns zero then the final | |
838 | result is zero and no further methods are invoked. If all of the | |
839 | applicable primary methods return nonzero, then the final result is the | |
840 | result of the last primary method. | |
841 | \item[or] The message must return a scalar type. The applicable primary | |
842 | methods are invoked in turn. If any method returns nonzero then the final | |
843 | result is that nonzero value and no further methods are invoked. If all of | |
844 | the applicable primary methods return zero, then the final result is zero. | |
3cc520db MW |
845 | \end{description} |
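As a rough illustration of the aggregation step alone, the following C sketch combines the results of a list of primary methods in the manner of @|sum|, @|and|, and @|or|. It is not the translator's output: the function names and sample methods are hypothetical, the methods take no arguments and return @|int|, and the list is assumed to be ordered most specific first (the text above states this order only for @|progn|).

```c
#include <assert.h>
#include <stddef.h>

typedef int primary_fn(void);   /* simplified: no receiver, no arguments */

/* `sum': call every method and add up the results. */
int combine_sum(primary_fn *const *methods, size_t n)
{
  size_t i;
  int acc = 0;

  for (i = 0; i < n; i++) acc += methods[i]();
  return acc;
}

/* `and': stop at the first zero; otherwise return the last result. */
int combine_and(primary_fn *const *methods, size_t n)
{
  size_t i;
  int rv = 0;

  assert(n > 0);                /* no primary methods: no effective method */
  for (i = 0; i < n; i++) { rv = methods[i](); if (!rv) break; }
  return rv;
}

/* `or': stop at the first nonzero result and return it; otherwise zero. */
int combine_or(primary_fn *const *methods, size_t n)
{
  size_t i;
  int rv = 0;

  assert(n > 0);
  for (i = 0; i < n; i++) { rv = methods[i](); if (rv) break; }
  return rv;
}

/* Sample methods which count how often they are invoked. */
int calls = 0;
static int three(void) { calls++; return 3; }
static int zero(void) { calls++; return 0; }
static int four(void) { calls++; return 4; }
primary_fn *const sample_methods[] = { three, zero, four };
```

With the three sample methods, @|combine_sum| invokes all of them, @|combine_and| stops after the second (which returns zero), and @|combine_or| stops after the first (which returns nonzero).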
846 | ||
847 | There is also a @|custom| aggregating method combination, which is described | |
848 | in \xref{sec:fixme.custom-aggregating-method-combination}. | |
849 | ||
43073476 | 850 | |
f4e44f7f | 851 | \subsection{Method entries} \label{sec:concepts.methods.entry} |
caa6f4b9 | 852 | |
caa6f4b9 MW |
853 | The effective methods for each class are determined at translation time, by |
854 | the Sod translator. For each effective method, one or more \emph{method | |
855 | entry functions} are constructed. A method entry function has three | |
856 | responsibilities. | |
857 | \begin{itemize} | |
858 | \item It converts the receiver pointer to the correct type. Method entry | |
859 | functions can perform these conversions extremely efficiently: there are | |
860 | separate method entries for each chain of each class which can receive a | |
861 | message, so method entry functions are in the privileged situation of | |
862 | knowing the \emph{exact} class of the receiving object. | |
863 | \item If the message accepts a variable-length argument tail, then two method | |
864 | entry functions are created for each chain of each class: one receives a | |
865 | variable-length argument tail, as intended, and captures it in a @|va_list| | |
866 | object; the other accepts an argument of type @|va_list| in place of the | |
867 | variable-length tail and arranges for it to be passed along to the direct | |
868 | methods. | |
869 | \item It invokes the effective method with the appropriate arguments. There | |
870 | might or might not be an actual function corresponding to the effective | |
871 | method itself: the translator may instead open-code the effective method's | |
872 | behaviour into each method entry function; and the machinery for handling | |
873 | `delegation chains', such as is used for @|around| methods and primary | |
874 | methods in the standard method combination, is necessarily scattered among | |
875 | a number of small functions. | |
876 | \end{itemize} | |
877 | ||
878 | ||
43073476 MW |
879 | \subsection{Messages with keyword arguments} |
880 | \label{sec:concepts.methods.keywords} | |
881 | ||
882 | A message or a direct method may declare that it accepts keyword arguments. | |
883 | A message which accepts keyword arguments is called a \emph{keyword message}; | |
884 | a direct method which accepts keyword arguments is called a \emph{keyword | |
885 | method}. | |
886 | ||
887 | While method combinations may set their own rules, usually keyword methods | |
888 | can only be defined on keyword messages, and all methods defined on a keyword | |
889 | message must be keyword methods. The direct methods defined on a keyword | |
890 | message may differ in the keywords they accept, both from each other, and | |
bf2e7452 MW |
891 | from the message. If two applicable methods on the same message both accept |
892 | a keyword argument with the same name, then these two keyword arguments must | |
893 | also have the same type. Different applicable methods may declare keyword | |
894 | arguments with the same name but different defaults; see below. | |
43073476 MW |
895 | |
896 | The keyword arguments acceptable in a message sent to an object are the | |
897 | keywords listed in the message definition, together with all of the keywords | |
898 | accepted by any applicable method. There is no easy way to determine at | |
899 | runtime whether a particular keyword is acceptable in a message to a given | |
900 | instance. | |
901 | ||
902 | At runtime, a direct method which accepts one or more keyword arguments | |
903 | receives an additional argument named @|suppliedp|. This argument is a small | |
904 | structure. For each keyword argument named $k$ accepted by the direct | |
905 | method, @|suppliedp| contains a one-bit-wide bitfield member of type | |
906 | @|unsigned|, also named $k$. If a keyword argument named $k$ was passed in | |
907 | the message, then @|suppliedp.$k$| is one, and $k$ contains the argument | |
908 | value; otherwise @|suppliedp.$k$| is zero, and $k$ contains the default value | |
909 | from the direct method definition if there was one, or an unspecified value | |
910 | otherwise. | |
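The following C sketch shows the general shape of a body which consults a @|suppliedp| structure. It is written by hand for illustration, so the calling convention is an assumption: the names @|struct window|, @|struct resize_suppliedp|, and @|resize| are hypothetical, and the way the translator actually passes @|suppliedp| and the keyword values to a direct method may differ.

```c
#include <assert.h>
#include <stddef.h>

/* One single-bit member per keyword, named after the keyword. */
struct resize_suppliedp {
  unsigned width : 1;
  unsigned height : 1;
};

/* Hypothetical object state for the example. */
struct window {
  int width, height;
  unsigned width_pinned : 1;    /* set once a width is given explicitly */
};

/* Hand-written stand-in for a keyword method body: `width' and `height'
 * hold either the supplied values or defaults, and `suppliedp' says which.
 */
void resize(struct window *w, struct resize_suppliedp suppliedp,
            int width, int height)
{
  assert(w != NULL);
  if (suppliedp.width) { w->width = width; w->width_pinned = 1; }
  if (suppliedp.height) w->height = height;
  /* Omitted keywords leave the window unchanged: behaviour which no
   * particular argument value could request.
   */
}
```

Because the body tests @|suppliedp.width| rather than comparing @|width| against its default, it can tell an omitted keyword apart from one explicitly given the default value.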
911 | ||
d24d47f5 MW |
912 | %%%-------------------------------------------------------------------------- |
913 | \section{The object lifecycle} \label{sec:concepts.lifecycle} | |
914 | ||
915 | \subsection{Creation} \label{sec:concepts.lifecycle.birth} | |
916 | ||
917 | Construction of a new instance of a class involves three steps. | |
918 | \begin{enumerate} | |
919 | \item \emph{Allocation} arranges for there to be storage space for the | |
920 | instance's slots and associated metadata. | |
921 | \item \emph{Imprinting} fills in the instance's metadata, associating the | |
922 | instance with its class. | |
923 | \item \emph{Initialization} stores appropriate initial values in the | |
924 | instance's slots, and, where necessary, links it into external data | |
925 | structures. | |
926 | \end{enumerate} | |
927 | The \descref{SOD_DECL}[macro]{mac} handles constructing instances with | |
a42893dd | 928 | automatic storage duration (`on the stack'). Similarly, the |
ea214e5e | 929 | \descref{SOD_MAKE}[macro]{mac} and the \descref*{sod_make}{fun} and |
a42893dd MW |
930 | \descref{sod_makev}{fun} functions construct instances allocated from the |
931 | standard @|malloc| heap. Programmers can add support for other allocation | |
932 | strategies by using the \descref{SOD_INIT}[macro]{mac} and the | |
ea214e5e MW |
933 | \descref*{sod_init}{fun} and \descref{sod_initv}{fun} functions, which |
934 | package up imprinting and initialization. | |
d24d47f5 MW |
935 | |
936 | \subsubsection{Allocation} | |
937 | Instances of most classes (specifically including those classes defined by | |
938 | Sod itself) can be held in any storage of sufficient size. The in-memory | |
939 | layout of an instance of some class~$C$ is described by the type @|struct | |
940 | $C$__ilayout|, and if the relevant class is known at compile time then the | |
941 | best way to discover the layout size is with the @|sizeof| operator. Failing | |
942 | that, the size required to hold an instance of $C$ is available in a slot in | |
99fca9a5 MW |
943 | $C$'s class object, as @|$C$__class@->cls.initsz|. The required alignment, |
944 | in bytes, is provided as @|$C$__class@->cls.align|, should this be needed. |
d24d47f5 MW |
945 | |
946 | It is not in general sufficient to declare, or otherwise allocate, an object | |
947 | of the class type $C$. The class type only describes a single chain of the | |
948 | object's layout. It is nearly always an error to use the class type as if it | |
949 | were a \emph{complete type}, e.g., to declare objects or arrays of the class | |
950 | type, or to enquire about its size or alignment requirements. | |
951 | ||
952 | Instance layouts may be declared as objects with automatic storage duration | |
953 | (colloquially, `allocated on the stack') or allocated dynamically, e.g., | |
954 | using @|malloc|. They may be included as members of structures or unions, or | |
955 | elements of arrays. Sod's runtime system doesn't retain addresses of | |
956 | instances, so, for example, Sod doesn't make using fancy allocators which | |
957 | sometimes move objects around in memory any more difficult than it needs to | |
958 | be. | |
959 | ||
d24d47f5 MW |
960 | The following simple function correctly allocates and returns space for an |
961 | instance of a class given a pointer to its class object @<cls>. | |
962 | \begin{prog} | |
020b9e2b | 963 | void *allocate_instance(const SodClass *cls) \\ \ind |
d24d47f5 MW |
964 | \{ return malloc(cls@->cls.initsz); \} |
965 | \end{prog} | |
966 | ||
967 | \subsubsection{Imprinting} | |
968 | Once storage has been allocated, it must be \emph{imprinted} before it can be | |
969 | used as an instance of a class, e.g., before any messages can be sent to it. | |
970 | ||
971 | Imprinting an instance stores some metadata about its direct class in the | |
972 | instance structure, so that the rest of the program (and Sod's runtime | |
973 | library) can tell what sort of object it is, and how to use it.\footnote{% | |
974 | Specifically, imprinting an instance's storage involves storing the | |
975 | appropriate vtable pointers in the right places in it.} % | |
976 | A class object's @|imprint| slot points to a function which will correctly | |
977 | imprint storage for one of that class's instances. | |
978 | ||
979 | Once an instance's storage has been imprinted, it is technically possible to | |
980 | send messages to the instance; however the instance's slots are still | |
756e9293 MW |
981 | uninitialized at this point, so the applicable methods are unlikely to do |
982 | much of any use unless they've been written specifically for the purpose. | |
d24d47f5 MW |
983 | |
984 | The following simple function imprints storage at address @<p> as an instance | |
985 | of a class, given a pointer to its class object @<cls>. | |
986 | \begin{prog} | |
020b9e2b | 987 | void imprint_instance(const SodClass *cls, void *p) \\ \ind |
d24d47f5 MW |
988 | \{ cls@->cls.imprint(p); \} |
989 | \end{prog} | |
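Allocation and imprinting can also be combined for storage which does not come from @|malloc|. The sketch below places an instance in a caller-provided buffer, checking size and alignment first. It is self-contained only because it uses a mock: @|struct mock_class|, @|struct mock_instance|, @|mock_cls|, and @|place_instance| are stand-ins invented here, with members merely named after the @|initsz|, @|align|, and @|imprint| slots mentioned above; the real class object layout and the signature of the real imprint function may differ. It requires C11 for @|_Alignof|.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the parts of a class object used here (an assumption). */
struct mock_class {
  size_t initsz, align;         /* instance size and alignment, in bytes */
  void (*imprint)(void *p);     /* fills in the instance's metadata */
};

/* Mock instance layout: metadata first, then a single slot. */
struct mock_instance {
  const struct mock_class *cls;
  int slot;
};

static void mock_imprint(void *p);

const struct mock_class mock_cls = {
  sizeof(struct mock_instance), _Alignof(struct mock_instance), mock_imprint
};

static void mock_imprint(void *p)
  { ((struct mock_instance *)p)->cls = &mock_cls; }

/* Imprint an instance in caller-provided storage.  Returns the instance
 * address, or a null pointer if the buffer is too small or misaligned.
 * The slots are still uninitialized on return.
 */
void *place_instance(const struct mock_class *cls, void *buf, size_t sz)
{
  assert(cls != NULL && buf != NULL && cls->align > 0);
  if (sz < cls->initsz) return NULL;
  if ((uintptr_t)buf % cls->align != 0) return NULL;
  cls->imprint(buf);
  return buf;
}
```

A caller would typically obtain suitably aligned storage from a union or from a custom allocator, and would still need to initialize the instance before using it.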
990 | ||
991 | \subsubsection{Initialization} | |
992 | The final step for constructing a new instance is to \emph{initialize} it, to | |
993 | establish the necessary invariants for the instance itself and the | |
994 | environment in which it operates. | |
995 | ||
996 | Details of initialization are necessarily class-specific, but typically it | |
997 | involves setting the instance's slots to appropriate values, and possibly | |
d1b394fa MW |
998 | linking it into some larger data structure to keep track of it. It is |
999 | possible for initialization methods to attempt to allocate resources, but | |
1000 | this must be done carefully: there is currently no way to report an error | |
1001 | from object initialization, so the object must be marked as incompletely | |
1002 | initialized, and left in a state where it will be safe to tear down later. | |
d24d47f5 | 1003 | |
a142609c MW |
1004 | Initialization is performed by sending the imprinted instance an @|init| |
1005 | message, defined by the @|SodObject| class. This message uses a nonstandard | |
1006 | method combination which works like the standard combination, except that the | |
1007 | \emph{default behaviour}, if there is no overriding method, is to initialize | |
b2983f35 MW |
1008 | the instance's slots, as described below, and to invoke each superclass's |
1009 | initialization fragments. This default behaviour may be invoked multiple | |
1010 | times if some method calls its @|next_method| more than once, unless some | |
1011 | other method takes steps to prevent this. | |
a142609c | 1012 | |
27ec3825 MW |
1013 | Slots are initialized in a well-defined order. |
1014 | \begin{itemize} | |
054e8f8f MW |
1015 | \item Slots defined by a more specific superclass are initialized after slots |
1016 | defined by a less specific superclass. | |
27ec3825 MW |
1017 | \item Slots defined by the same class are initialized in the order in which |
1018 | their definitions appear. | |
1019 | \end{itemize} | |
1020 | ||
a42893dd MW |
1021 | A class can define \emph{initialization fragments}: pieces of literal code to |
1022 | be executed to set up a new instance. Each superclass's initialization | |
1023 | fragments are executed with @|me| bound to an instance pointer of the | |
1024 | appropriate superclass type, immediately after that superclass's slots (if | |
1025 | any) have been initialized; therefore, fragments defined by a more specific | |
13cb243a | 1026 | superclass are executed after fragments defined by a less specific |
a42893dd MW |
1027 | superclass. A class may define more than one initialization fragment: the |
1028 | fragments are executed in the order in which they appear in the class | |
1029 | definition. It is possible for an initialization fragment to use @|return| | |
1030 | or @|goto| for special control-flow effects, but this is not likely to be a | |
1031 | good idea. | |
1032 | ||
b2983f35 MW |
1033 | The @|init| message accepts keyword arguments |
1034 | (\xref{sec:concepts.methods.keywords}). The set of acceptable keywords is | |
1035 | determined by the applicable methods as usual, but also by the | |
1036 | \emph{initargs} defined by the receiving instance's class and its | |
1037 | superclasses, which are made available to slot initializers and | |
1038 | initialization fragments. | |
1039 | ||
1040 | There are two kinds of initarg definitions. \emph{User initargs} are defined | |
1041 | by an explicit @|initarg| item appearing in a class definition: the item | |
1042 | defines a name, type, and (optionally) a default value for the initarg. | |
1043 | \emph{Slot initargs} are defined by attaching an @|initarg| property to a | |
756e9293 MW |
1044 | slot or slot initializer item: the property's value determines the initarg's |
1045 | name, while the type is taken from the underlying slot type; slot initargs do | |
1046 | not have default values. Both kinds define a \emph{direct initarg} for the | |
e1f775a8 MW |
1047 | containing class. (Note that a slot may have any number of slot initargs; |
1048 | and any number of slots may have initargs with the same name.) | |
b2983f35 MW |
1049 | |
1050 | Initargs are inherited. The \emph{applicable} direct initargs for an @|init| | |
1051 | effective method are those defined by the receiving object's class, and all | |
1052 | of its superclasses. Applicable direct initargs with the same name are | |
1053 | merged to form \emph{effective initargs}. An error is reported if two | |
1054 | applicable direct initargs have the same name but different types. The | |
1055 | default value of an effective initarg is taken from the most specific | |
1056 | applicable direct initarg which specifies a default value; if no applicable | |
1057 | direct initarg specifies a default value then the effective initarg has no | |
1058 | default. | |
1059 | ||
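The merging rule can be modelled in a few lines of C. This is only an illustrative sketch, not the translator's implementation (which is written in Lisp): the names @|direct_initarg|, @|effective_initarg| and @|merge_initargs| are hypothetical, types are compared as plain strings, default values are reduced to integers, and the input array is assumed to be ordered with the most specific class first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model types: none of these names come from Sod itself. */
struct direct_initarg {
  const char *name, *type;
  int has_default, default_value;
};
struct effective_initarg {
  const char *type;
  int has_default, default_value;
};
enum merge_status { MERGE_OK, MERGE_ABSENT, MERGE_TYPE_CONFLICT };

/* Merge the applicable direct initargs called `name' into one effective
 * initarg.  Assumption: `d' lists initargs from the most specific class
 * to the least specific one.
 */
static enum merge_status merge_initargs(const struct direct_initarg *d,
                                        size_t n, const char *name,
                                        struct effective_initarg *out)
{
  size_t i;

  assert(name && out);
  out->type = NULL; out->has_default = 0; out->default_value = 0;
  for (i = 0; i < n; i++) {
    if (strcmp(d[i].name, name) != 0) continue;
    /* Same name but different type is an error. */
    if (!out->type) out->type = d[i].type;
    else if (strcmp(out->type, d[i].type) != 0) return MERGE_TYPE_CONFLICT;
    /* The first (most specific) default encountered wins. */
    if (d[i].has_default && !out->has_default) {
      out->has_default = 1;
      out->default_value = d[i].default_value;
    }
  }
  return out->type ? MERGE_OK : MERGE_ABSENT;
}
```

In this model, a direct initarg without a default contributes nothing to the effective default, so a less specific class's default can still show through.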
1060 | All initarg values are made available at runtime to user code -- | |
1061 | initialization fragments and slot initializer expressions -- through local | |
1062 | variables and a @|suppliedp| structure, as in a direct method | |
1063 | (\xref{sec:concepts.methods.keywords}). Furthermore, slot initarg | |
1064 | definitions influence the initialization of slots. | |
1065 | ||
1066 | The process for deciding how to initialize a particular slot works as | |
1067 | follows. | |
1068 | \begin{enumerate} | |
e1f775a8 | 1069 | |
b2983f35 MW |
1070 | \item If there are any slot initargs defined on the slot, or any of its slot |
1071 | initializers, \emph{and} the sender supplied a value for one or more of the | |
e1f775a8 MW |
1072 | corresponding effective initargs, then the value of the most specific such |
1073 | initarg is stored in the slot. (For this purpose, initargs defined earlier | |
1074 | in a class definition are more specific than initargs defined later.) | |
1075 | ||
b2983f35 MW |
1076 | \item Otherwise, if there are any slot initializers defined which include an |
1077 | initializer expression, then the initializer expression from the most | |
1078 | specific such slot initializer is evaluated and its value stored in the | |
e1f775a8 MW |
1079 | slot. (A class may define at most one initializer for any particular slot, |
1080 | so no further disambiguation is required.) | |
1081 | ||
b2983f35 | 1082 | \item Otherwise, the slot is left uninitialized. |
e1f775a8 | 1083 | |
b2983f35 MW |
1084 | \end{enumerate} |
1085 | Note that the default values (if any) of effective initargs do \emph{not} | |
1086 | affect this procedure. | |
d24d47f5 | 1087 | |
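The three-step procedure above can be sketched as a small C model. Everything here is an assumption made for illustration: @|slot_plan|, @|initarg| and @|init_slot| are hypothetical names, slot values are reduced to @|int|, and the initarg array is assumed to be sorted with the most specific initarg first. The @|supplied| flag records only whether the sender passed a value; initarg defaults deliberately play no part.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the inputs for initializing one slot. */
struct initarg {
  int supplied;                 /* did the sender pass this keyword? */
  int value;                    /* the value passed, if so */
};
struct slot_plan {
  const struct initarg *initargs; /* slot initargs, most specific first */
  size_t n_initargs;
  int has_initializer;          /* is there an initializer expression? */
  int initializer_value;        /* its value, if so */
};

enum slot_source { FROM_INITARG, FROM_INITIALIZER, UNINITIALIZED };

/* Decide how to initialize a slot, following the three steps in order. */
static enum slot_source init_slot(const struct slot_plan *p, int *slot)
{
  size_t i;

  assert(p != NULL && slot != NULL);
  /* Step 1: the most specific supplied slot initarg wins. */
  for (i = 0; i < p->n_initargs; i++) {
    if (p->initargs[i].supplied) {
      *slot = p->initargs[i].value;
      return FROM_INITARG;
    }
  }
  /* Step 2: otherwise use the most specific initializer expression. */
  if (p->has_initializer) {
    *slot = p->initializer_value;
    return FROM_INITIALIZER;
  }
  /* Step 3: otherwise leave the slot untouched. */
  return UNINITIALIZED;
}
```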
d24d47f5 MW |
1088 | |
1089 | \subsection{Destruction} | |
1090 | \label{sec:concepts.lifecycle.death} | |
1091 | ||
1092 | Destruction of an instance, when it is no longer required, consists of two | |
1093 | steps. | |
1094 | \begin{enumerate} | |
1095 | \item \emph{Teardown} releases any resources held by the instance and | |
1096 | disentangles it from any external data structures. | |
1097 | \item \emph{Deallocation} releases the memory used to store the instance so | |
1098 | that it can be reused. | |
1099 | \end{enumerate} | |
a42893dd MW |
1100 | Teardown alone, for objects which require special deallocation, or for which |
1101 | deallocation occurs automatically (e.g., instances with automatic storage | |
1102 | duration, or instances whose storage will be garbage-collected), is performed | |
1103 | using the \descref{sod_teardown}[function]{fun}. Destruction of instances | |
1104 | allocated from the standard @|malloc| heap is done using the | |
1105 | \descref{sod_destroy}[function]{fun}. | |
d24d47f5 MW |
1106 | |
1107 | \subsubsection{Teardown} | |
7646dc4c MW |
1108 | Details of teardown are necessarily class-specific, but typically it |
1109 | involves releasing resources held by the instance, and disentangling it from | |
1110 | any data structures it might be linked into. | |
a42893dd MW |
1111 | |
1112 | Teardown is performed by sending the instance the @|teardown| message, | |
1113 | defined by the @|SodObject| class. The message returns an integer, used as a | |
1114 | boolean flag. If the message returns zero, then the instance's storage | |
1115 | should be deallocated. If the message returns nonzero, then it is safe for | |
1116 | the caller to forget about the instance, but not to deallocate its storage. | |
1117 | This is \emph{not} an error return: if some teardown method fails then the | |
1118 | program may be in an inconsistent state and should not continue. | |
d24d47f5 | 1119 | |
a42893dd MW |
1120 | This simple protocol can be used, for example, to implement a reference |
1121 | counting system, as follows. | |
d24d47f5 | 1122 | \begin{prog} |
020b9e2b | 1123 | [nick = ref] \\ |
d7451ac3 | 1124 | class ReferenceCountedObject: SodObject \{ \\ \ind |
020b9e2b MW |
1125 | unsigned nref = 1; \\- |
1126 | void inc() \{ me@->ref.nref++; \} \\- | |
1127 | [role = around] \\ | |
1128 | int obj.teardown() \\ | |
1129 | \{ \\ \ind | |
1130 | if (--\,--me@->ref.nref) return (1); \\ | |
1131 | else return (CALL_NEXT_METHOD); \-\\ | |
1132 | \} \-\\ | |
d24d47f5 MW |
1133 | \} |
1134 | \end{prog} | |
1135 | ||
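From the caller's side, the return-value convention might be modelled in plain C as follows. This sketch does not use any code generated by the Sod translator: @|refobj|, @|refobj_teardown| and @|refobj_release| are hypothetical stand-ins, and the @|torn_down| flag merely marks the point where @|CALL_NEXT_METHOD| would run the real teardown behaviour.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted instance. */
struct refobj {
  unsigned nref;                /* outstanding references */
  int torn_down;                /* set once real teardown has happened */
};

/* Allocate an object holding one reference; returns NULL on failure. */
static struct refobj *refobj_new(void)
{
  struct refobj *o = malloc(sizeof(*o));
  if (o) { o->nref = 1; o->torn_down = 0; }
  return o;
}

/* Models the around method: nonzero means the object is still referenced,
 * so the caller may forget it but must not free its storage.
 */
static int refobj_teardown(struct refobj *o)
{
  assert(o->nref > 0);
  if (--o->nref) return 1;
  o->torn_down = 1;             /* stands in for CALL_NEXT_METHOD */
  return 0;
}

/* Models a caller: free the storage only if teardown returned zero.
 * Returns nonzero if the storage was released.
 */
static int refobj_release(struct refobj *o)
{
  if (refobj_teardown(o)) return 0;
  free(o);
  return 1;
}
```

The point of the design is that the caller needs no knowledge of reference counting: it simply obeys the flag.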
fa7e2d72 MW |
1136 | The @|teardown| message uses a nonstandard method combination which works |
1137 | like the standard combination, except that the \emph{default behaviour}, if | |
1138 | there is no overriding method, is to execute the superclass's teardown | |
1139 | fragments, and to return zero. This default behaviour may be invoked | |
1140 | multiple times if some method calls on its @|next_method| more than once, | |
1141 | unless some other method takes steps to prevent this. | |
a42893dd MW |
1142 | |
1143 | A class can define \emph{teardown fragments}: pieces of literal code to be | |
1144 | executed to shut down an instance. Each superclass's teardown fragments are | |
1145 | executed with @|me| bound to an instance pointer of the appropriate | |
1146 | superclass type; fragments defined by a more specific superclass are executed | |
13cb243a | 1147 | before fragments defined by a less specific superclass. A class may define |
a42893dd MW |
1148 | more than one teardown fragment: the fragments are executed in the order in |
1149 | which they appear in the class definition. It is possible for a | |
1150 | teardown fragment to use @|return| or @|goto| for special control-flow | |
1151 | effects, but this is not likely to be a good idea. Similarly, it's probably | |
1152 | a better idea to use an @|around| method to influence the return value than | |
1153 | to write an explicit @|return| statement in a teardown fragment. | |
1154 | ||
d24d47f5 MW |
1155 | \subsubsection{Deallocation} |
1156 | The details of instance deallocation are obviously specific to the allocation | |
1157 | strategy used by the instance, and this is often orthogonal to the object's | |
1158 | class. | |
1159 | ||
1160 | The code which makes the decision to destroy an object may often not be aware | |
1161 | of the object's direct class. Low-level details of deallocation often | |
1162 | require the proper base address of the instance's storage, which can be | |
1163 | determined using the \descref{SOD_INSTBASE}[macro]{mac}. | |
1164 | ||
3cc520db MW |
1165 | %%%-------------------------------------------------------------------------- |
1166 | \section{Metaclasses} \label{sec:concepts.metaclasses} | |
1f7d590d | 1167 | |
71efc524 MW |
1168 | In Sod, every object is an instance of some class, and -- unlike, say, |
1169 | \Cplusplus\ -- classes are proper objects. It follows that, in Sod, every | |
1170 | class~$C$ is itself an instance of some class~$M$, which is called $C$'s | |
1171 | \emph{metaclass}. Metaclass instances are usually constructed statically, at | |
1172 | compile time, and marked read-only. | |
1173 | ||
1174 | As an added complication, Sod classes, and other metaobjects such as | |
1175 | messages, methods, slots and so on, also have classes \emph{at translation | |
1176 | time}. These translation-time metaclasses are not Sod classes; they are CLOS | |
1177 | classes, implemented in Common Lisp. | |
1178 | ||
1179 | ||
1180 | \subsection{Runtime metaclasses} | |
1181 | \label{sec:concepts.metaclasses.runtime} | |
1182 | ||
1183 | Like other classes, metaclasses can declare messages, and define slots and | |
1184 | methods. Slots defined by the metaclass are called \emph{class slots}, as | |
1185 | opposed to \emph{instance slots}. Similarly, messages and methods defined by | |
1186 | the metaclass are termed \emph{class messages} and \emph{class methods} | |
1187 | respectively, though these are used much less frequently. | |
1188 | ||
1189 | \subsubsection{The braid} | |
1190 | Every object is an instance of some class. There are only finitely many | |
1191 | classes. | |
1192 | ||
1193 | \begin{figure} | |
1194 | \centering | |
1195 | \begin{tikzpicture} | |
1196 | \node[lit] (obj) {SodObject}; | |
1197 | \node[lit] (cls) [right=10mm of obj] {SodClass}; | |
1198 | \draw [->, dashed] (obj) to[bend right] (cls); | |
1199 | \draw [->] (cls) to[bend right] (obj); | |
1200 | \draw [->, dashed] (cls) to[loop right] (cls); | |
1201 | \end{tikzpicture} | |
1202 | \qquad | |
1203 | \fbox{\ \begin{tikzpicture} | |
1204 | \node (subclass) {subclass of}; | |
1205 | \node (instance) [below=\jot of subclass] {instance of}; | |
1206 | \draw [->] ($(subclass.west) - (10mm, 0)$) -- ++(8mm, 0); | |
1207 | \draw [->, dashed] ($(instance.west) - (10mm, 0)$) -- ++(8mm, 0); | |
1208 | \end{tikzpicture}} | |
1209 | \caption{The Sod braid} \label{fig:concepts.metaclasses.braid} | |
1210 | \end{figure} | |
1211 | ||
1212 | Consider the directed graph whose nodes are classes, and where there is an | |
1213 | arc from $C$ to $D$ if and only if $C$ is an instance of $D$. There are only | |
1214 | finitely many nodes. Every node has an arc leaving it, because every object | |
1215 | -- and hence every class -- is an instance of some class. Therefore this | |
1216 | graph must contain at least one cycle. | |
1217 | ||
1218 | In Sod, this situation is resolved in the simplest manner possible: | |
1219 | @|SodClass| is the only predefined metaclass, and it is an instance of | |
1220 | itself. The only other predefined class is @|SodObject|, which is also an | |
1221 | instance of @|SodClass|. There is exactly one root class, namely | |
1222 | @|SodObject|; consequently, @|SodClass| is a direct subclass of @|SodObject|. | |
1223 | ||
1224 | \Xref{fig:concepts.metaclasses.braid} shows a diagram of this situation. | |
1225 | ||
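As a rough illustration, the braid can be written down as two statically initialized C records. The @|cls| structure below is a hypothetical miniature and bears no relation to the real @|SodClass| instance layout; only the two relationships from the figure are modelled. The helper relies on the counting argument given earlier: in a graph of $n$ classes, following $n$ instance-of arcs from any class must end on a cycle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature class record; the real layout differs. */
struct cls {
  const char *name;
  const struct cls *meta;       /* instance-of arc (dashed in the figure) */
  const struct cls *super;      /* subclass-of arc (solid in the figure) */
};

static const struct cls sod_class;      /* tentative; completed below */
static const struct cls sod_object = { "SodObject", &sod_class, NULL };
static const struct cls sod_class = { "SodClass", &sod_class, &sod_object };

/* Follow instance-of arcs n_classes times.  Since every class has exactly
 * one such arc, the class reached must lie on a cycle.
 */
static const struct cls *find_cycle_member(const struct cls *c,
                                           size_t n_classes)
{
  size_t i;

  assert(c != NULL);
  for (i = 0; i < n_classes; i++) c = c->meta;
  return c;
}
```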
1226 | \subsubsection{Class slots and initializers} | |
1227 | Instance initializers were described in \xref{sec:concepts.classes.slots}. A | |
1228 | class can also define \emph{class initializers}, which provide values for | |
1229 | slots defined by its metaclass. The initial value for a class slot is | |
1230 | determined as follows. | |
1231 | \begin{itemize} | |
1232 | \item Slots of nonstandard slot classes may be initialized by custom Lisp code. For | |
1233 | example, all of the slots defined by @|SodClass| are of this kind. User | |
1234 | initializers are not permitted for such slots. | |
1235 | \item If the class or any of its superclasses defines a class initializer for | |
1236 | the slot, then the class initializer defined by the most specific such | |
1237 | superclass is used. | |
1238 | \item Otherwise, if the metaclass or one of its superclasses defines an | |
1239 | instance initializer, then the instance initializer defined by the most | |
1240 | specific such class is used. | |
1241 | \item Otherwise there is no initializer, and an error will be reported. | |
1242 | \end{itemize} | |
1243 | Initializers for class slots must be constant expressions (for scalar slots) | |
1244 | or aggregate initializers containing constant expressions. | |
1245 | ||
1246 | \subsubsection{Metaclass selection and consistency} | |
1247 | Sod enforces a \emph{metaclass consistency rule}: if $C$ has metaclass $M$, | |
1248 | then any subclass of $C$ must have a metaclass which is a subclass of $M$. | |
1249 | ||
1250 | The definition of a new class can name the new class's metaclass explicitly, | |
1251 | by defining a @|metaclass| property; the Sod translator will verify that the | |
1252 | choice of metaclass is acceptable. | |
1253 | ||
1254 | If no @|metaclass| property is given, then the translator will select a | |
1255 | default metaclass as follows. Let $C_1$, $C_2$, \dots, $C_n$ be the direct | |
1256 | superclasses of the new class, and let $M_1$, $M_2$, \dots, $M_n$ be their | |
1257 | respective metaclasses (not necessarily distinct). If there exists exactly | |
1258 | one minimal metaclass $M_i$, i.e., there exists an $i$, with $1 \le i \le n$, | |
1259 | such that $M_i$ is a subclass of every $M_j$, for $1 \le j \le n$, then $M_i$ | |
1260 | is selected as the new class's metaclass. Otherwise the situation is | |
1261 | ambiguous and an error will be reported. Usually, the ambiguity can be | |
1262 | resolved satisfactorily by defining a new class $M^*$ as a direct subclass of | |
1263 | the minimal $M_j$. | |
1264 | ||
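The default selection rule can be sketched in C as follows. This is a model under stated assumptions, not the translator's code: @|mcls|, @|subclassp| and @|default_metaclass| are hypothetical names, each class simply lists its direct superclasses, and the subclass test is taken to be reflexive, so that a metaclass counts as a subclass of itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a metaclass and its direct superclasses. */
struct mcls {
  const char *name;
  const struct mcls *const *supers;
  size_t n_supers;
};

/* Reflexive subclass test: is `sub' equal to `sup', or derived from it? */
static int subclassp(const struct mcls *sub, const struct mcls *sup)
{
  size_t i;

  if (sub == sup) return 1;
  for (i = 0; i < sub->n_supers; i++)
    if (subclassp(sub->supers[i], sup)) return 1;
  return 0;
}

/* Given the metaclasses M_1, ..., M_n of the direct superclasses, return
 * the M_i which is a subclass of every M_j, or NULL if there is none
 * (the ambiguous case, where an error would be reported).
 */
static const struct mcls *default_metaclass(const struct mcls *const *metas,
                                            size_t n)
{
  size_t i, j;

  assert(n > 0);
  for (i = 0; i < n; i++) {
    for (j = 0; j < n; j++)
      if (!subclassp(metas[i], metas[j])) break;
    if (j == n) return metas[i];
  }
  return NULL;
}
```

With two unrelated metaclasses the function returns a null pointer, which corresponds to the ambiguous case described above.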
1265 | ||
1266 | \subsection{Translation-time metaobjects} | |
1267 | \label{sec:concepts.metaclasses.compile-time} | |
1268 | ||
5c3d43e7 MW |
1269 | Within the translator, modules, classes, slots and initializers, messages and |
1270 | methods are all represented as instances of classes. Since the translator is | |
1271 | written in Common Lisp, these translation-time metaobject classes are all | |
1272 | CLOS classes. Extensions can influence the translator's behaviour -- and | |
1273 | hence the layout and behaviour of instances at runtime -- by subclassing the | |
1274 | built-in metaobject classes and implementing methods on appropriate generic | |
1275 | functions. | |
1276 | ||
1277 | Metaobject classes are chosen in a fairly standard way. | |
1278 | \begin{itemize} | |
1279 | \item All metaobject definitions support a symbol-valued property, usually | |
1280 | named @|@<thing>_class| (e.g., @|slot_class|, @|method_class|), which sets | |
1281 | the metaobject class explicitly. (The class for a class metaobject is | |
1282 | taken from the @|lisp_class| property, because @|class_class| seems less | |
1283 | meaningful.) | |
1284 | \item Failing that, the metaobject's parents choose a default metaobject | |
1285 | class, based on the new metaobject's properties; i.e., slots and messages | |
1286 | have their metaobject classes chosen by the defining class metaobject; | |
1287 | initializer and initarg classes are chosen by the defining class metaobject | |
1288 | and the direct slot metaobject; and method classes are chosen by the | |
1289 | defining class metaobject and the message metaobject. | |
1290 | \item Classes have no parents; instead, the default is simply to use the | |
1291 | built-in metaobject class @|sod-class|. | |
1292 | \item Modules are a special case because the property syntax is rather | |
1293 | awkward. All modules are initially created as instances of the built-in | |
1294 | metaclass @|module|. Once the module has been parsed completely, the | |
1295 | module metaobject's class is changed, using @|change-class|, to the class | |
1296 | specified in the module's property set. | |
1297 | \end{itemize} | |
71efc524 | 1298 | |
caa6f4b9 MW |
1299 | %%%-------------------------------------------------------------------------- |
1300 | \section{Compatibility considerations} \label{sec:concepts.compatibility} | |
1301 | ||
1302 | Sod doesn't make source-level compatibility especially difficult. As long as | |
1303 | classes, slots, and messages don't change names or disappear, and slots and | |
1304 | messages retain their approximate types, everything will be fine. | |
1305 | ||
1306 | Binary compatibility is much more difficult. Unfortunately, Sod classes have | |
1307 | rather fragile binary interfaces.\footnote{% | |
1308 | Research suggestion: investigate alternative instance and vtable layouts | |
1309 | which improve binary compatibility, probably at the expense of instance | |
1310 | compactness, and efficiency of slot access and message sending. There may | |
1311 | be interesting trade-offs to be made.} % | |
1312 | ||
6390b845 | 1313 | If instances are allocated \fixme{incomplete} |
caa6f4b9 | 1314 | |
1f7d590d MW |
1315 | %%%----- That's all, folks -------------------------------------------------- |
1316 | ||
1317 | %%% Local variables: | |
1318 | %%% mode: LaTeX | |
1319 | %%% TeX-master: "sod.tex" | |
1320 | %%% TeX-PDF-mode: t | |
1321 | %%% End: |