Commit | Line | Data |
---|---|---|
1f7d590d MW |
1 | %%% -*-latex-*- |
2 | %%% | |
3 | %%% Module syntax | |
4 | %%% | |
5 | %%% (c) 2015 Straylight/Edgeware | |
6 | %%% | |
7 | ||
8 | %%%----- Licensing notice --------------------------------------------------- | |
9 | %%% | |
e0808c47 | 10 | %%% This file is part of the Sensible Object Design, an object system for C. |
1f7d590d MW |
11 | %%% |
12 | %%% SOD is free software; you can redistribute it and/or modify | |
13 | %%% it under the terms of the GNU General Public License as published by | |
14 | %%% the Free Software Foundation; either version 2 of the License, or | |
15 | %%% (at your option) any later version. | |
16 | %%% | |
17 | %%% SOD is distributed in the hope that it will be useful, | |
18 | %%% but WITHOUT ANY WARRANTY; without even the implied warranty of | |
19 | %%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
20 | %%% GNU General Public License for more details. | |
21 | %%% | |
22 | %%% You should have received a copy of the GNU General Public License | |
23 | %%% along with SOD; if not, write to the Free Software Foundation, | |
24 | %%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. | |
25 | ||
26 | \chapter{Module syntax} \label{ch:syntax} | |
27 | ||
68a620ab MW |
28 | %%%-------------------------------------------------------------------------- |
29 | \section{Lexical syntax} \label{sec:syntax.lex} | |
1f7d590d MW |
30 | |
31 | Whitespace and comments are discarded. The remaining characters are | |
32 | collected into tokens according to the following syntax. | |
33 | ||
34 | \begin{grammar} | |
35 | <token> ::= <identifier> | |
36 | \alt <string-literal> | |
37 | \alt <char-literal> | |
38 | \alt <integer-literal> | |
39 | \alt <punctuation> | |
40 | \end{grammar} | |
41 | ||
42 | This syntax is slightly ambiguous, and is disambiguated by the \emph{maximal | |
43 | munch} rule: at each stage we take the longest sequence of characters which | |
44 | could be a token. | |
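
As a rough illustration of the maximal-munch rule, here is a minimal Python sketch of a tokenizer for the token classes defined in the rest of this section. It is not the Sod lexer. The function name, the restriction to ASCII letters, and the omission of comment handling are all assumptions made to keep the example short.

```python
import re

# Candidate token patterns, following the grammar in this section.
# Assumption: only ASCII letters count as <alpha-char> in this sketch.
TOKEN_PATTERNS = [
    ("identifier", re.compile(r"[A-Za-z_][A-Za-z_0-9]*")),
    ("integer", re.compile(
        r"0[xX][0-9A-Fa-f]+|0[bB][01]+|0[oO]?[0-7]+|0|[1-9][0-9]*")),
    ("string", re.compile(r'"(?:[^\\"]|\\.)*"', re.DOTALL)),
    ("char", re.compile(r"'(?:[^\\']|\\.)'", re.DOTALL)),
    ("punctuation", re.compile(r"""[^A-Za-z0-9_"'\s]""")),
]


def tokenize(text):
    """Split text into (kind, lexeme) pairs using maximal munch."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # whitespace is discarded
            continue
        best = None
        for kind, pattern in TOKEN_PATTERNS:
            m = pattern.match(text, pos)
            # Keep the longest match found so far.
            if m and (best is None or m.end() > best[1].end()):
                best = (kind, m)
        if best is None:
            raise ValueError(f"no token at offset {pos}")
        tokens.append((best[0], best[1].group()))
        pos = best[1].end()
    return tokens
```

With this sketch, the input `foo0x1F = 0x1F;` yields the single identifier `foo0x1F` first, rather than `foo` followed by an integer, because the longest candidate wins at each position.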
45 | ||
68a620ab MW |
46 | |
47 | \subsection{Identifiers} \label{sec:syntax.lex.id} | |
1f7d590d MW |
48 | |
49 | \begin{grammar} | |
50 | <identifier> ::= <id-start-char> @<id-body-char>^* | |
51 | ||
52 | <id-start-char> ::= <alpha-char> | "_" | |
53 | ||
54 | <id-body-char> ::= <id-start-char> @! <digit-char> | |
55 | ||
56 | <alpha-char> ::= "A" | "B" | \dots\ | "Z" | |
57 | \alt "a" | "b" | \dots\ | "z" | |
58 | \alt <extended-alpha-char> | |
59 | ||
60 | <digit-char> ::= "0" | <nonzero-digit-char> | |
61 | ||
cee29adc | 62 | <nonzero-digit-char> ::= "1" | "2" $| \ldots |$ "9" |
1f7d590d MW |
63 | \end{grammar} |
64 | ||
65 | The precise definition of @<alpha-char> is left to the function | |
66 | \textsf{alpha-char-p} in the hosting Lisp system. For portability, | |
67 | programmers are encouraged to limit themselves to the standard ASCII letters. | |
68 | ||
69 | There are no reserved words at the lexical level, but the higher-level syntax | |
70 | recognizes certain identifiers as \emph{keywords} in some contexts. There is | |
71 | also an ambiguity (inherited from C) in the declaration syntax which is | |
72 | settled by distinguishing type names from other identifiers at a lexical | |
73 | level. | |
74 | ||
68a620ab MW |
75 | |
76 | \subsection{String and character literals} \label{sec:syntax.lex.string} | |
1f7d590d MW |
77 | |
78 | \begin{grammar} | |
79 | <string-literal> ::= "\"" @<string-literal-char>^* "\"" | |
80 | ||
81 | <char-literal> ::= "'" <char-literal-char> "'" | |
82 | ||
83 | <string-literal-char> ::= any character other than "\\" or "\"" | |
84 | \alt "\\" <char> | |
85 | ||
86 | <char-literal-char> ::= any character other than "\\" or "'" | |
87 | \alt "\\" <char> | |
88 | ||
89 | <char> ::= any single character | |
90 | \end{grammar} | |
91 | ||
92 | The syntax for string and character literals differs from~C. In particular, | |
93 | escape sequences such as @`\textbackslash n' are not recognized. The use | |
94 | of string and character literals in Sod, outside of C~fragments, is limited, | |
95 | and the simple syntax seems adequate. For the sake of future compatibility, | |
96 | the use of character sequences which resemble C escape sequences is | |
97 | discouraged. | |
98 | ||
99 | \subsection{Integer literals} \label{sec:syntax.lex.int} | |
100 | ||
101 | \begin{grammar} | |
102 | <integer-literal> ::= <decimal-integer> | |
103 | \alt <binary-integer> | |
104 | \alt <octal-integer> | |
105 | \alt <hex-integer> | |
106 | ||
cc0bcf39 | 107 | <decimal-integer> ::= "0" | <nonzero-digit-char> @<digit-char>^* |
1f7d590d MW |
108 | |
109 | <binary-integer> ::= "0" @("b"|"B"@) @<binary-digit-char>^+ | |
110 | ||
111 | <binary-digit-char> ::= "0" | "1" | |
112 | ||
113 | <octal-integer> ::= "0" @["o"|"O"@] @<octal-digit-char>^+ | |
114 | ||
cee29adc | 115 | <octal-digit-char> ::= "0" | "1" $| \ldots |$ "7" |
1f7d590d MW |
116 | |
117 | <hex-integer> ::= "0" @("x"|"X"@) @<hex-digit-char>^+ | |
118 | ||
119 | <hex-digit-char> ::= <digit-char> | |
120 | \alt "A" | "B" | "C" | "D" | "E" | "F" | |
121 | \alt "a" | "b" | "c" | "d" | "e" | "f" | |
122 | \end{grammar} | |
123 | ||
124 | Sod understands only integers, not floating-point numbers; its integer syntax | |
125 | goes slightly beyond C in allowing a @`0o' prefix for octal and @`0b' for | |
126 | binary. However, length and signedness indicators are not permitted. | |
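
The integer grammar above can be checked with a short Python sketch. The function name is hypothetical, and the sketch only classifies a complete lexeme; it is not taken from the Sod sources. Note that, following the grammar, a leading @"0" followed by octal digits is read as octal even without the @"o".

```python
import re

# One alternative per production of <integer-literal>.
_INTEGER = re.compile(
    r"""(?: 0[xX](?P<hex>[0-9A-Fa-f]+)
          | 0[bB](?P<bin>[01]+)
          | 0[oO]?(?P<oct>[0-7]+)
          | (?P<dec>0|[1-9][0-9]*)
        )""",
    re.VERBOSE,
)


def parse_integer_literal(text):
    """Return the value of a complete integer literal, or raise ValueError."""
    m = _INTEGER.fullmatch(text)
    if m is None:
        raise ValueError(f"not an integer literal: {text!r}")
    for group, base in (("hex", 16), ("bin", 2), ("oct", 8), ("dec", 10)):
        digits = m.group(group)
        if digits is not None:
            return int(digits, base)
    raise AssertionError("unreachable: one group always matches")
```

A lexeme such as `10u` is rejected, reflecting the rule that length and signedness indicators are not permitted.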
127 | ||
68a620ab MW |
128 | |
129 | \subsection{Punctuation} \label{sec:syntax.lex.punct} | |
1f7d590d MW |
130 | |
131 | \begin{grammar} | |
132 | <punctuation> ::= any nonalphanumeric character other than "_", "\"" or "'" | |
133 | \end{grammar} | |
134 | ||
68a620ab MW |
135 | |
136 | \subsection{Comments} \label{sec:syntax.lex.comment} | |
1f7d590d MW |
137 | |
138 | \begin{grammar} | |
139 | <comment> ::= <block-comment> | |
140 | \alt <line-comment> | |
141 | ||
142 | <block-comment> ::= | |
143 | "/*" | |
144 | @<not-star>^* @(@<star>^+ <not-star-or-slash> @<not-star>^*@)^* | |
145 | @<star>^* | |
146 | "*/" | |
147 | ||
148 | <star> ::= "*" | |
149 | ||
150 | <not-star> ::= any character other than "*" | |
151 | ||
152 | <not-star-or-slash> ::= any character other than "*" or "/" | |
153 | ||
20f9c213 | 154 | <line-comment> ::= "/\,/" @<not-newline>^* <newline> |
1f7d590d MW |
155 | |
156 | <newline> ::= a newline character | |
157 | ||
158 | <not-newline> ::= any character other than newline | |
159 | \end{grammar} | |
160 | ||
20f9c213 MW |
161 | Comments are exactly as in C99: both traditional block comments `@|/*| \dots\ |
162 | @|*/|' and \Cplusplus-style `@|/\,/| \dots' comments are permitted and | |
163 | ignored. | |
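
The comment grammar translates almost symbol for symbol into regular expressions. The following Python sketch is an illustration only; the names are hypothetical. As in the grammar, a line comment must end with a newline, so a trailing comment at end of input with no newline is not matched here.

```python
import re

# "/*" not-star* ( star+ not-star-or-slash not-star* )* star* "*/"
BLOCK_COMMENT = re.compile(r"/\*[^*]*(?:\*+[^*/][^*]*)*\**\*/")

# "//" not-newline* newline
LINE_COMMENT = re.compile(r"//[^\n]*\n")


def skip_comment(text, pos=0):
    """Return the offset just past a comment starting at pos, or pos if none."""
    for pattern in (BLOCK_COMMENT, LINE_COMMENT):
        m = pattern.match(text, pos)
        if m:
            return m.end()
    return pos
```

Block comments do not nest: in `/* a */ b */` only the first seven characters form the comment.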
1f7d590d | 164 | |
68a620ab MW |
165 | |
166 | \subsection{Special nonterminals} \label{sec:syntax.lex.special} | |
1f7d590d MW |
167 | |
168 | Aside from the lexical syntax presented above (\xref{sec:syntax.lex}), | |
169 | two special nonterminals occur in the module syntax. | |
170 | ||
68a620ab | 171 | \subsubsection{S-expressions} |
1f7d590d MW |
172 | \begin{grammar} |
173 | <s-expression> ::= an S-expression, as parsed by the Lisp reader | |
174 | \end{grammar} | |
175 | ||
176 | When an S-expression is expected, the Sod parser simply calls the host Lisp | |
68a620ab MW |
177 | system's @|read| function. Sod modules are permitted to modify the read |
178 | table to extend the S-expression syntax. | |
1f7d590d MW |
179 | |
180 | S-expressions are self-delimiting, so no end-marker is needed. | |
181 | ||
68a620ab | 182 | \subsubsection{C fragments} |
1f7d590d MW |
183 | \begin{grammar} |
184 | <c-fragment> ::= a sequence of C tokens, with matching brackets | |
185 | \end{grammar} | |
186 | ||
187 | Sequences of C code are simply stored and written to the output unchanged | |
188 | during translation. They are read using a simple scanner which nonetheless | |
189 | understands C comments and string and character literals. | |
190 | ||
191 | A C fragment is terminated by one of a small number of delimiter characters | |
192 | determined by the immediately surrounding context -- usually a closing brace | |
193 | or bracket. The first such delimiter character which is not enclosed in | |
194 | brackets, braces or parentheses ends the fragment. | |
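
The termination rule can be sketched in Python as follows. This is an assumption-laden illustration, not the scanner Sod actually uses: the function name and error handling are invented, and the set of delimiter characters is passed in by the caller to stand for `the immediately surrounding context'.

```python
def scan_c_fragment(text, pos, delimiters):
    """Return the offset of the first delimiter not nested in brackets.

    Comments and string/character literals are skipped, so brackets
    and delimiters inside them do not count.
    """
    closers = {"(": ")", "[": "]", "{": "}"}
    stack = []  # closing brackets we still expect, innermost last
    i, n = pos, len(text)
    while i < n:
        ch = text[i]
        if text.startswith("/*", i):
            end = text.find("*/", i + 2)
            if end < 0:
                raise ValueError("unterminated block comment")
            i = end + 2
        elif text.startswith("//", i):
            end = text.find("\n", i)
            i = n if end < 0 else end + 1
        elif ch in "\"'":
            i += 1
            while i < n and text[i] != ch:
                i += 2 if text[i] == "\\" else 1  # skip escaped character
            if i >= n:
                raise ValueError("unterminated literal")
            i += 1
        elif not stack and ch in delimiters:
            return i
        elif ch in closers:
            stack.append(closers[ch])
            i += 1
        elif stack and ch == stack[-1]:
            stack.pop()
            i += 1
        elif ch in ")]}":
            raise ValueError(f"unbalanced {ch!r} at offset {i}")
        else:
            i += 1
    raise ValueError("fragment not terminated")
```

For example, scanning `a, f(b, c), d` from offset 2 with `,` as the delimiter stops at the comma after the closing parenthesis, not at the one inside the argument list.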
195 | ||
68a620ab MW |
196 | %%%-------------------------------------------------------------------------- |
197 | \section{Module syntax} \label{sec:syntax.module} | |
1f7d590d MW |
198 | |
199 | \begin{grammar} | |
200 | <module> ::= @<definition>^* | |
201 | ||
202 | <definition> ::= <import-definition> | |
203 | \alt <load-definition> | |
204 | \alt <lisp-definition> | |
205 | \alt <code-definition> | |
206 | \alt <typename-definition> | |
207 | \alt <class-definition> | |
208 | \end{grammar} | |
209 | ||
68a620ab MW |
210 | A @<module> is the top-level syntactic item. A module consists of a sequence |
211 | of definitions. | |
1f7d590d | 212 | |
8399be6f MW |
213 | [FIXME] |
214 | Properties: | |
215 | \begin{description} | |
216 | \item[@"module_class"] A symbol naming the Lisp class to use to | |
217 | represent the module. | |
218 | \item[@"guard"] An identifier to use as the guard symbol used to prevent | |
219 | multiple inclusion in the header file. | |
220 | \end{description} | |
221 | ||
222 | ||
68a620ab | 223 | \subsection{Simple definitions} \label{sec:syntax.module.simple} |
1f7d590d | 224 | |
68a620ab | 225 | \subsubsection{Importing modules} |
1f7d590d MW |
226 | \begin{grammar} |
227 | <import-definition> ::= "import" <string> ";" | |
228 | \end{grammar} | |
229 | ||
230 | The module named @<string> is processed and its definitions made available. | |
231 | ||
232 | A search is made for a module source file as follows. | |
233 | \begin{itemize} | |
234 | \item The module name @<string> is converted into a filename by appending | |
235 | @`.sod', if it has no extension already.\footnote{% | |
236 | Technically, what happens is \textsf{(merge-pathnames name (make-pathname | |
237 | :type "SOD" :case :common))}, so exactly what this means varies | |
238 | according to the host system.} % | |
239 | \item The file is looked for relative to the directory containing the | |
240 | importing module. | |
241 | \item If that fails, then the file is looked for in each directory on the | |
242 | module search path in turn. | |
243 | \item If the file still isn't found, an error is reported and the import | |
244 | fails. | |
245 | \end{itemize} | |
246 | At this point, if the file has previously been imported, nothing further | |
247 | happens.\footnote{% | |
248 | This check is done using \textsf{truename}, so it should see through simple | |
249 | tricks like symbolic links. However, it may be confused by fancy things | |
250 | like bind mounts and so on.} % | |
251 | ||
252 | Recursive imports, either direct or indirect, are an error. | |
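
The search procedure can be sketched in Python under some loud assumptions: the host uses POSIX-style lowercase extensions, \textsf{truename} is approximated by resolving the path, and all function and parameter names are hypothetical. Detection of recursive imports is left out.

```python
from pathlib import Path


def find_module(name, importer, search_path, default_ext=".sod"):
    """Locate the file for an import, following the steps listed above."""
    path = Path(name)
    if path.suffix == "":
        path = path.with_suffix(default_ext)  # add extension if missing
    # First the importing module's directory, then each search-path entry.
    candidates = [Path(importer).parent / path]
    candidates += [Path(directory) / path for directory in search_path]
    for candidate in candidates:
        if candidate.is_file():
            return candidate.resolve()  # stand-in for Lisp's truename
    raise FileNotFoundError(f"module {name!r} not found")


def import_module(name, importer, search_path, already_imported):
    """Return the resolved path on first import, or None if seen before."""
    resolved = find_module(name, importer, search_path)
    if resolved in already_imported:
        return None  # previously imported: nothing further happens
    already_imported.add(resolved)
    return resolved
```

Because the duplicate check compares resolved paths, importing `base` and `base.sod` from the same place counts as one import in this sketch.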
253 | ||
68a620ab | 254 | \subsubsection{Loading extensions} |
1f7d590d MW |
255 | \begin{grammar} |
256 | <load-definition> ::= "load" <string> ";" | |
257 | \end{grammar} | |
258 | ||
259 | The Lisp file named @<string> is loaded and evaluated. | |
260 | ||
261 | A search is made for a Lisp source file as follows. | |
262 | \begin{itemize} | |
263 | \item The name @<string> is converted into a filename by appending @`.lisp', | |
264 | if it has no extension already.\footnote{% | |
265 | Technically, what happens is \textsf{(merge-pathnames name (make-pathname | |
266 | :type "LISP" :case :common))}, so exactly what this means varies | |
267 | according to the host system.} % | |
268 | \item A search is then made in the same manner as for module imports | |
269 | (\xref{sec:syntax.module.simple}). | |
270 | \end{itemize} | |
271 | If the file is found, it is loaded using the host Lisp's \textsf{load} | |
272 | function. | |
273 | ||
274 | Note that Sod doesn't attempt to compile Lisp files, or even to look for | |
275 | existing compiled files. The right way to package a substantial extension to | |
276 | the Sod translator is to provide the extension as a standard ASDF system (or | |
277 | similar) and leave a dropping @"foo-extension.lisp" in the module path saying | |
278 | something like | |
279 | \begin{quote} | |
280 | \textsf{(asdf:load-system :foo-extension)} | |
281 | \end{quote} | |
282 | which will arrange for the extension to be compiled if necessary. | |
283 | ||
284 | (This approach means that the language doesn't need to depend on any | |
285 | particular system definition facility. It's bad enough already that it | |
286 | depends on Common Lisp.) | |
287 | ||
68a620ab | 288 | \subsubsection{Lisp escapes} |
1f7d590d MW |
289 | \begin{grammar} |
290 | <lisp-definition> ::= "lisp" <s-expression> ";" | |
291 | \end{grammar} | |
292 | ||
293 | The @<s-expression> is evaluated immediately. It can do anything it likes. | |
294 | ||
eae50115 MW |
295 | \begin{boxy}[Warning!] |
296 | This means that hostile Sod modules are a security hazard. Lisp code can | |
297 | read and write files, start other programs, and make network connections. | |
298 | Don't install Sod modules from sources that you don't trust.\footnote{% | |
299 | Presumably you were going to run the corresponding code at some point, so | |
300 | this isn't as unusually scary as it sounds. But please be careful.} % | |
301 | \end{boxy} | |
1f7d590d | 302 | |
68a620ab | 303 | \subsubsection{Declaring type names} |
1f7d590d MW |
304 | \begin{grammar} |
305 | <typename-definition> ::= | |
ea08dc56 | 306 | "typename" <list>$[\mbox{@<identifier>}]$ ";" |
1f7d590d MW |
307 | \end{grammar} |
308 | ||
309 | Each @<identifier> is declared as naming a C type. This is important because | |
310 | the C type syntax -- which Sod uses -- is ambiguous, and disambiguation is | |
311 | done by distinguishing type names from other identifiers. | |
312 | ||
313 | Don't declare class names using @"typename"; use @"class" forward | |
314 | declarations instead. | |
315 | ||
68a620ab MW |
316 | |
317 | \subsection{Literal code} \label{sec:syntax.module.literal} | |
1f7d590d MW |
318 | |
319 | \begin{grammar} | |
320 | <code-definition> ::= | |
4fc52153 | 321 | "code" <identifier> ":" <item-name> @[<constraints>@] |
1f7d590d MW |
322 | "{" <c-fragment> "}" |
323 | ||
ea08dc56 | 324 | <constraints> ::= "[" <list>$[\mbox{@<constraint>}]$ "]" |
1f7d590d | 325 | |
4fc52153 MW |
326 | <constraint> ::= @<item-name>^+ |
327 | ||
328 | <item-name> ::= <identifier> @! "(" @<identifier>^+ ")" | |
1f7d590d MW |
329 | \end{grammar} |
330 | ||
331 | The @<c-fragment> will be output unchanged to one of the output files. | |
332 | ||
333 | The first @<identifier> is the symbolic name of an output file. Predefined | |
334 | output file names are @"c" and @"h", which are the implementation code and | |
335 | header file respectively; other output files can be defined by extensions. | |
336 | ||
4fc52153 MW |
337 | Output items are named with a sequence of identifiers, separated by |
338 | whitespace, and enclosed in parentheses. As an abbreviation, a name | |
339 | consisting of a single identifier may be written as just that identifier, | |
340 | without the parentheses. | |
1f7d590d MW |
341 | |
342 | The @<constraints> provide a means for specifying where in the output file | |
343 | the output item should appear. (Note the two kinds of square brackets shown | |
344 | in the syntax: literal square brackets must enclose the constraints if they | |
345 | are present, but the constraints may be omitted entirely.) Each comma-separated @<constraint> | |
4fc52153 MW |
346 | is a sequence of names of output items, and indicates that the output items |
347 | must appear in the order given -- though the translator is free to insert | |
348 | additional items in between them. (The particular output items needn't be | |
349 | defined already -- indeed, they needn't be defined ever.) | |
1f7d590d MW |
350 | |
351 | There is a predefined output item @"includes" in both the @"c" and @"h" | |
352 | output files which is a suitable place for inserting @"\#include" | |
353 | preprocessor directives in order to declare types and functions for use | |
354 | elsewhere in the generated output files. | |
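
One way to picture how constraints determine output order is as a topological sort. The Python sketch below (Python 3.9 or later, for the standard @|graphlib| module) is only a model of the idea, not the translator's algorithm. It assumes item names are plain strings, and that names mentioned only in constraints still take part in the ordering but produce no output.

```python
from graphlib import TopologicalSorter


def order_items(defined, constraints):
    """Order the defined output items so that every constraint holds.

    Each constraint is a sequence of item names which must appear in
    that order; other items may be placed between them.
    """
    sorter = TopologicalSorter()
    for name in defined:
        sorter.add(name)
    for constraint in constraints:
        # Adjacent names in a constraint give one ordering edge each.
        for before, after in zip(constraint, constraint[1:]):
            sorter.add(after, before)  # `after` must follow `before`
    defined_set = set(defined)
    # Names that were never defined are ordered but then dropped.
    return [name for name in sorter.static_order() if name in defined_set]
```

Contradictory constraints, such as one placing `a` before `b` and another placing `b` before `a`, make `static_order` raise `graphlib.CycleError` in this sketch.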
355 | ||
1f7d590d | 356 | |
68a620ab | 357 | \subsection{Property sets} \label{sec:syntax.module.properties} |
1f7d590d | 358 | \begin{grammar} |
ea08dc56 | 359 | <properties> ::= "[" <list>$[\mbox{@<property>}]$ "]" |
1f7d590d MW |
360 | |
361 | <property> ::= <identifier> "=" <expression> | |
362 | \end{grammar} | |
363 | ||
364 | Property sets are a means for associating miscellaneous information with | |
365 | classes and related items. By using property sets, additional information | |
366 | can be passed to extensions without the need to introduce idiosyncratic | |
367 | syntax. | |
368 | ||
369 | A property has a name, given as an @<identifier>, and a value computed by | |
370 | evaluating an @<expression>. The value can be one of a number of types, | |
371 | though the only operators currently defined act on integer values only. | |
372 | ||
68a620ab | 373 | \subsubsection{The expression evaluator} |
1f7d590d | 374 | \begin{grammar} |
20f9c213 | 375 | <expression> ::= <term> | <expression> "+" <term> | <expression> "--" <term> |
1f7d590d MW |
376 | |
377 | <term> ::= <factor> | <term> "*" <factor> | <term> "/" <factor> | |
378 | ||
20f9c213 | 379 | <factor> ::= <primary> | "+" <factor> | "--" <factor> |
1f7d590d MW |
380 | |
381 | <primary> ::= | |
382 | <integer-literal> | <string-literal> | <char-literal> | <identifier> | |
1ad4b33a | 383 | \alt "<" <plain-type> ">" |
1f7d590d MW |
384 | \alt "?" <s-expression> |
385 | \alt "(" <expression> ")" | |
386 | \end{grammar} | |
387 | ||
388 | The arithmetic expression syntax is simple and standard; there are currently | |
389 | no bitwise, logical, or comparison operators. | |
390 | ||
391 | A @<primary> expression may be a literal or an identifier. Note that | |
392 | identifiers stand for themselves: they \emph{do not} denote values. For more | |
393 | fancy expressions, the syntax | |
394 | \begin{quote} | |
395 | @"?" @<s-expression> | |
396 | \end{quote} | |
397 | causes the @<s-expression> to be evaluated using the Lisp \textsf{eval} | |
398 | function. | |
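
The arithmetic part of the grammar can be illustrated with a small recursive-descent evaluator in Python. This is a sketch under stated assumptions, not the Sod evaluator: it handles decimal integers and parentheses only, the minus operator is taken to be a single `-` character, and division is assumed to be integer division because the text does not specify it.

```python
import re


def evaluate(text):
    """Evaluate an integer expression following the grammar above."""
    tokens = re.findall(r"\d+|\S", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expression():  # <expression> ::= <term> | <expression> +/- <term>
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():  # <term> ::= <factor> | <term> * or / <factor>
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value //= factor()  # assumption: integer division
        return value

    def factor():  # <factor> ::= <primary> | + <factor> | - <factor>
        if peek() == "+":
            advance()
            return factor()
        if peek() == "-":
            advance()
            return -factor()
        return primary()

    def primary():
        tok = advance()
        if tok is None:
            raise ValueError("unexpected end of expression")
        if tok.isdigit():
            return int(tok)
        if tok == "(":
            value = expression()
            if advance() != ")":
                raise ValueError("expected ')'")
            return value
        raise ValueError(f"unexpected token {tok!r}")

    value = expression()
    if peek() is not None:
        raise ValueError(f"trailing input at token {peek()!r}")
    return value
```

The loops in `expression` and `term` make the binary operators left-associative, matching the left-recursive productions.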
399 | %%% FIXME crossref to extension docs | |
400 | ||
68a620ab MW |
401 | |
402 | \subsection{C types} \label{sec:syntax.module.types} | |
1f7d590d MW |
403 | |
404 | Sod's syntax for C types closely mirrors the standard C syntax. A C type has | |
405 | two parts: a sequence of @<declaration-specifier>s and a @<declarator>. In | |
406 | Sod, a type must contain at least one @<declaration-specifier> (i.e., | |
407 | `implicit @"int"' is forbidden), and storage-class specifiers are not | |
408 | recognized. | |
409 | ||
68a620ab | 410 | \subsubsection{Declaration specifiers} |
1f7d590d MW |
411 | \begin{grammar} |
412 | <declaration-specifier> ::= <type-name> | |
413 | \alt "struct" <identifier> | "union" <identifier> | "enum" <identifier> | |
414 | \alt "void" | "char" | "int" | "float" | "double" | |
415 | \alt "short" | "long" | |
416 | \alt "signed" | "unsigned" | |
2e01fd8b MW |
417 | \alt "bool" | "_Bool" |
418 | \alt "imaginary" | "_Imaginary" | "complex" | "_Complex" | |
1f7d590d | 419 | \alt <qualifier> |
db56b1d3 | 420 | \alt <storage-specifier> |
ae0f15ee | 421 | \alt <atomic-type> |
1f7d590d | 422 | |
ae0f15ee MW |
423 | <qualifier> ::= <atomic> | "const" | "volatile" | "restrict" |
424 | ||
20f9c213 MW |
425 | <plain-type> ::= @<declaration-specifier>^+ <abstract-declarator> |
426 | ||
ae0f15ee | 427 | <atomic-type> ::= |
20f9c213 | 428 | <atomic> "(" <plain-type> ")" |
ae0f15ee MW |
429 | |
430 | <atomic> ::= "atomic" | "_Atomic" | |
1f7d590d | 431 | |
db56b1d3 MW |
432 | <storage-specifier> ::= <alignas> "(" <c-fragment> ")" |
433 | ||
434 | <alignas> ::= "alignas" | "_Alignas" | |
1f7d590d MW |
435 | |
436 | <type-name> ::= <identifier> | |
437 | \end{grammar} | |
438 | ||
439 | A @<type-name> is an identifier which has been declared as being a type name, | |
2e01fd8b MW |
440 | using the @"typename" or @"class" definitions. The following type names are |
441 | defined in the built-in module. | |
442 | \begin{itemize} | |
443 | \item @"va_list" | |
444 | \item @"size_t" | |
445 | \item @"ptrdiff_t" | |
446 | \item @"wchar_t" | |
447 | \end{itemize} | |
1f7d590d MW |
448 | |
449 | Declaration specifiers may appear in any order. However, not all | |
450 | combinations are permitted. A sequence of declaration specifiers must consist of zero or | |
db56b1d3 MW |
451 | more @<qualifier>s, zero or more @<storage-specifier>s, and one of the |
452 | following, up to reordering. | |
1f7d590d MW |
453 | \begin{itemize} |
454 | \item @<type-name> | |
ae0f15ee | 455 | \item @<atomic-type> |
1f7d590d MW |
456 | \item @"struct" @<identifier>, @"union" @<identifier>, @"enum" @<identifier> |
457 | \item @"void" | |
2e01fd8b | 458 | \item @"_Bool", @"bool" |
1f7d590d MW |
459 | \item @"char", @"unsigned char", @"signed char" |
460 | \item @"short", @"unsigned short", @"signed short" | |
461 | \item @"short int", @"unsigned short int", @"signed short int" | |
462 | \item @"int", @"unsigned int", @"signed int", @"unsigned", @"signed" | |
463 | \item @"long", @"unsigned long", @"signed long" | |
464 | \item @"long int", @"unsigned long int", @"signed long int" | |
465 | \item @"long long", @"unsigned long long", @"signed long long" | |
466 | \item @"long long int", @"unsigned long long int", @"signed long long int" | |
467 | \item @"float", @"double", @"long double" | |
2e01fd8b MW |
468 | \item @"float _Imaginary", @"double _Imaginary", @"long double _Imaginary" |
469 | \item @"float imaginary", @"double imaginary", @"long double imaginary" | |
470 | \item @"float _Complex", @"double _Complex", @"long double _Complex" | |
471 | \item @"float complex", @"double complex", @"long double complex" | |
1f7d590d MW |
472 | \end{itemize} |
473 | All of these have their usual C meanings. | |
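
The phrase `up to reordering' can be made concrete with a small Python sketch. It covers only the keyword combinations from the list above: type names, @"struct", @"union" and @"enum" tags, atomic types and storage specifiers are deliberately left out, and the function name is hypothetical.

```python
QUALIFIERS = {"const", "volatile", "restrict", "atomic", "_Atomic"}

# The keyword combinations listed above, in their conventional order.
BASE_TYPES = [
    "void", "_Bool", "bool",
    "char", "unsigned char", "signed char",
    "short", "unsigned short", "signed short",
    "short int", "unsigned short int", "signed short int",
    "int", "unsigned int", "signed int", "unsigned", "signed",
    "long", "unsigned long", "signed long",
    "long int", "unsigned long int", "signed long int",
    "long long", "unsigned long long", "signed long long",
    "long long int", "unsigned long long int", "signed long long int",
    "float", "double", "long double",
    "float _Imaginary", "double _Imaginary", "long double _Imaginary",
    "float imaginary", "double imaginary", "long double imaginary",
    "float _Complex", "double _Complex", "long double _Complex",
    "float complex", "double complex", "long double complex",
]

# Sorting the words gives a canonical form that ignores ordering.
_CANONICAL = {tuple(sorted(t.split())) for t in BASE_TYPES}


def is_valid_specifiers(specifiers):
    """Check a space-separated specifier list against the permitted set."""
    words = [w for w in specifiers.split() if w not in QUALIFIERS]
    return tuple(sorted(words)) in _CANONICAL
```

For instance, `unsigned long const int` is accepted as a reordering of `unsigned long int` with a qualifier, while `short long` is rejected.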
474 | ||
68a620ab | 475 | \subsubsection{Declarators} |
1f7d590d | 476 | \begin{grammar} |
43073476 | 477 | <declarator>$[k, a]$ ::= @<pointer>^* <primary-declarator>$[k, a]$ |
1f7d590d | 478 | |
43073476 MW |
479 | <primary-declarator>$[k, a]$ ::= $k$ |
480 | \alt "(" <primary-declarator>$[k, a]$ ")" | |
481 | \alt <primary-declarator>$[k, a]$ @<declarator-suffix>$[a]$ | |
1f7d590d MW |
482 | |
483 | <pointer> ::= "*" @<qualifier>^* | |
484 | ||
43073476 MW |
485 | <declarator-suffix>$[a]$ ::= "[" <c-fragment> "]" |
486 | \alt "(" $a$ ")" | |
1f7d590d | 487 | |
20f9c213 MW |
488 | <argument-list> ::= $\epsilon$ | "\dots" |
489 | \alt <list>$[\mbox{@<argument>}]$ @["," "\dots"@] | |
1f7d590d MW |
490 | |
491 | <argument> ::= @<declaration-specifier>^+ <argument-declarator> | |
492 | ||
f64eb323 | 493 | <abstract-declarator> ::= <declarator>$[\epsilon, \mbox{@<argument-list>}]$ |
ae0f15ee | 494 | |
43073476 MW |
495 | <argument-declarator> ::= |
496 | <declarator>$[\mbox{@<identifier> @! $\epsilon$}, \mbox{@<argument-list>}]$ | |
1f7d590d | 497 | |
43073476 MW |
498 | <simple-declarator> ::= |
499 | <declarator>$[\mbox{@<identifier>}, \mbox{@<argument-list>}]$ | |
1f7d590d MW |
500 | \end{grammar} |
501 | ||
502 | The declarator syntax is taken from C, but with some differences. | |
503 | \begin{itemize} | |
504 | \item Array dimensions are uninterpreted @<c-fragment>s, terminated by a | |
505 | closing square bracket. This allows array dimensions to contain arbitrary | |
506 | constant expressions. | |
507 | \item A declarator may have either a single @<identifier> at its centre or a | |
508 | pair of @<identifier>s separated by a @`.'; this is used to refer to | |
509 | slots or messages defined in superclasses. | |
510 | \end{itemize} | |
511 | The remaining differences are (I hope) a matter of presentation rather than | |
512 | substance. | |
513 | ||
43073476 MW |
514 | There is additional syntax to support messages and methods which accept |
515 | keyword arguments. | |
516 | ||
517 | \begin{grammar} | |
518 | <keyword-argument> ::= <argument> @["=" <c-fragment>@] | |
519 | ||
520 | <keyword-argument-list> ::= | |
521 | @[<list>$[\mbox{@<argument>}]$@] | |
522 | "?" @[<list>$[\mbox{@<keyword-argument>}]$@] | |
523 | ||
524 | <method-argument-list> ::= <argument-list> @! <keyword-argument-list> | |
525 | ||
526 | <dotted-name> ::= <identifier> "." <identifier> | |
527 | ||
528 | <keyword-declarator>$[k]$ ::= | |
529 | <declarator>$[k, \mbox{@<method-argument-list>}]$ | |
530 | \end{grammar} | |
531 | ||
68a620ab MW |
532 | |
533 | \subsection{Class definitions} \label{sec:syntax.module.class} | |
1f7d590d MW |
534 | |
535 | \begin{grammar} | |
536 | <class-definition> ::= <class-forward-declaration> | |
537 | \alt <full-class-definition> | |
538 | \end{grammar} | |
539 | ||
68a620ab | 540 | \subsubsection{Forward declarations} |
1f7d590d MW |
541 | \begin{grammar} |
542 | <class-forward-declaration> ::= "class" <identifier> ";" | |
543 | \end{grammar} | |
544 | ||
545 | A @<class-forward-declaration> informs Sod that an @<identifier> will be used | |
546 | to name a class which is currently undefined. Forward declarations are | |
547 | necessary in order to resolve certain kinds of circularity. For example, | |
7119ea4e | 548 | \begin{prog} |
020b9e2b MW |
549 | class Sub; \\+ |
550 | ||
fd040f06 | 551 | class Super: SodObject \{ \\ \ind |
020b9e2b MW |
552 | Sub *sub; \-\\ |
553 | \}; \\+ | |
554 | ||
fd040f06 | 555 | class Sub: Super \{ \\ \ind |
020b9e2b | 556 | /* \dots\ */ \-\\ |
7119ea4e MW |
557 | \}; |
558 | \end{prog} | |
1f7d590d | 559 | |
68a620ab | 560 | \subsubsection{Full class definitions} |
1f7d590d MW |
561 | \begin{grammar} |
562 | <full-class-definition> ::= | |
563 | @[<properties>@] | |
ea08dc56 MW |
564 | "class" <identifier> ":" <list>$[\mbox{@<identifier>}]$ |
565 | "{" @<properties-class-item>^* "}" | |
1f7d590d | 566 | |
391c5a34 MW |
567 | <properties-class-item> ::= @[<properties>@] <class-item> |
568 | ||
569 | <class-item> ::= <slot-item> | |
570 | \alt <initializer-item> | |
b2983f35 | 571 | \alt <initarg-item> |
a42893dd | 572 | \alt <fragment-item> |
1f7d590d MW |
573 | \alt <message-item> |
574 | \alt <method-item> | |
1f7d590d MW |
575 | \end{grammar} |
576 | ||
577 | A full class definition provides a complete description of a class. | |
578 | ||
579 | The first @<identifier> gives the name of the class. It is an error to | |
580 | give the name of an existing class (other than a forward-referenced class), | |
581 | or an existing type name. It is conventional to give classes `MixedCase' | |
582 | names, to distinguish them from other kinds of identifiers. | |
583 | ||
ea08dc56 MW |
584 | The @<list>$[\mbox{@<identifier>}]$ names the direct superclasses for the new |
585 | class. It is an error if any of these @<identifier>s does not name a defined | |
8d952432 MW |
586 | class. The superclass list is required, and must not be empty; listing |
587 | @|SodObject| as your class's superclass is a good choice if nothing else | |
588 | seems suitable. It's not possible to define a \emph{root class} in the Sod | |
589 | language: you must use Lisp to do this, and it's quite involved. | |
1f7d590d MW |
590 | |
591 | The @<properties> provide additional information. The standard class | |
592 | properties are as follows. | |
593 | \begin{description} | |
594 | \item[@"lisp_class"] The name of the Lisp class to use within the translator | |
595 | to represent this class. The property value must be an identifier; the | |
596 | default is @"sod_class". Extensions may define classes with additional | |
597 | behaviour, and may recognize additional class properties. | |
598 | \item[@"metaclass"] The name of the Sod metaclass for this class. In the | |
599 | generated code, a class is itself an instance of another class -- its | |
600 | \emph{metaclass}. The metaclass defines which slots the class will have, | |
601 | which messages it will respond to, and what its behaviour will be when it | |
602 | receives them. The property value must be an identifier naming a defined | |
603 | subclass of @"SodClass". The default metaclass is @"SodClass". | |
9cd46aef | 604 | See \xref{sec:concepts.metaclasses} for more details. |
1f7d590d MW |
605 | \item[@"nick"] A nickname for the class, to be used to distinguish it from |
606 | other classes in various limited contexts. The property value must be an | |
607 | identifier; the default is constructed by forcing the class name to | |
608 | lower-case. | |
609 | \end{description} | |
610 | ||
611 | The class body consists of a sequence of @<class-item>s enclosed in braces. | |
612 | These items are discussed in the following sections. | |
613 | ||
68a620ab | 614 | \subsubsection{Slot items} |
1f7d590d MW |
615 | \begin{grammar} |
616 | <slot-item> ::= | |
ea08dc56 | 617 | @<declaration-specifier>^+ <list>$[\mbox{@<init-declarator>}]$ ";" |
1f7d590d | 618 | |
0bc19f1c | 619 | <init-declarator> ::= <simple-declarator> @["=" <initializer>@] |
1f7d590d MW |
620 | \end{grammar} |
621 | ||
622 | A @<slot-item> defines one or more slots. All instances of the class and any | |
623 | subclass will contain these slots, with the names and types given by the | |
624 | @<declaration-specifier>s and the @<init-declarator>s. Slot declarators may not | |
bc7dff5c | 625 | contain dotted names. |
1f7d590d MW |
626 | |
627 | It is not possible to declare a slot with function type: such an item is | |
628 | interpreted as being a @<message-item> or @<method-item>. Pointers to | |
629 | functions are fine. | |
630 | ||
8399be6f MW |
631 | Properties: |
632 | \begin{description} | |
633 | \item[@"slot_class"] A symbol naming the Lisp class to use to represent the | |
634 | direct slot. | |
635 | \item[@"initarg"] An identifier naming an initialization argument which can | |
636 | be used to provide a value for the slot. See | |
637 | \xref{sec:concepts.lifecycle.birth} for the details. | |
638 | \end{description} | |
639 | ||
1f7d590d MW |
640 | An @<initializer>, if present, is treated as if a separate |
641 | @<initializer-item> containing the slot name and initializer were present. | |
642 | For example, | |
7119ea4e | 643 | \begin{prog} |
020b9e2b | 644 | [nick = eg] \\ |
fd040f06 | 645 | class Example: Super \{ \\ \ind |
020b9e2b | 646 | int foo = 17; \-\\ |
7119ea4e MW |
647 | \}; |
648 | \end{prog} | |
1f7d590d | 649 | means the same as |
7119ea4e | 650 | \begin{prog} |
020b9e2b | 651 | [nick = eg] \\ |
fd040f06 | 652 | class Example: Super \{ \\ \ind |
020b9e2b MW |
653 | int foo; \\ |
654 | eg.foo = 17; \-\\ | |
7119ea4e MW |
655 | \}; |
656 | \end{prog} | |
1f7d590d | 657 | |
68a620ab | 658 | \subsubsection{Initializer items} |
1f7d590d | 659 | \begin{grammar} |
391c5a34 | 660 | <initializer-item> ::= @["class"@] <list>$[\mbox{@<slot-initializer>}]$ ";" |
1f7d590d | 661 | |
b2983f35 | 662 | <slot-initializer> ::= <dotted-name> @["=" <initializer>@] |
1f7d590d | 663 | |
a888e3ac | 664 | <initializer> ::= <c-fragment> |
1f7d590d MW |
665 | \end{grammar} |
666 | ||
667 | An @<initializer-item> provides an initial value for one or more slots. If | |
668 | prefixed by @"class", then the initial values are for class slots (i.e., | |
669 | slots of the class object itself); otherwise they are for instance slots. | |
670 | ||
bc7dff5c MW |
671 | The first component of the @<dotted-name> must be the nickname of one of the |
672 | class's superclasses (including itself); the second must be the name of a | |
673 | slot defined in that superclass. | |
1f7d590d | 674 | |
8399be6f MW |
675 | Properties: |
676 | \begin{description} | |
677 | \item[@"initializer_class"] A symbol naming the Lisp class to use to | |
678 | represent the initializer. | |
679 | \item[@"initarg"] An identifier naming an initialization argument which can | |
680 | be used to provide a value for the slot. See | |
681 | \xref{sec:concepts.lifecycle.birth} for the details. An initializer item | |
682 | must have either an @|initarg| property, or an initializer expression, or | |
683 | both. | |
0e5c0b9e MW |
684 | \item[@"initarg_class"] A symbol naming the Lisp class to use to represent |
685 | the initarg. Only permitted if @"initarg" is also set. | |
8399be6f | 686 | \end{description} |
b2983f35 MW |
687 | |
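As an illustration, the following sketch attaches an @"initarg" property to
an initializer item which also has an initializer expression.  It assumes
that properties are written in square brackets before the item, as with
@"nick" in the earlier example; the choice of @|foo| as the argument name is
arbitrary.  The precise rules for how the argument and the initializer
expression interact are those described in
\xref{sec:concepts.lifecycle.birth}.
\begin{prog}
  [nick = eg] \\
  class Example: Super \{ \\ \ind
    int foo; \\
    {}[initarg = foo] eg.foo = 17; \-\\
  \};
\end{prog}
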
688 | Each class may define at most one initializer item with an explicit | |
689 | initializer expression for a given slot. | |
690 | ||
691 | \subsubsection{Initarg items} | |
692 | \begin{grammar} | |
693 | <initarg-item> ::= | |
694 | "initarg" | |
695 | @<declaration-specifier>^+ | |
696 | <list>$[\mbox{@<init-declarator>}]$ ";" | |
697 | \end{grammar} | |
0e5c0b9e MW |
698 | Properties: |
699 | \begin{description} | |
700 | \item[@"initarg_class"] A symbol naming the Lisp class to use to represent | |
701 | the initarg. | |
702 | \end{description} | |
b2983f35 | 703 | |
a42893dd MW |
704 | \subsubsection{Fragment items} |
705 | \begin{grammar} | |
706 | <fragment-item> ::= <fragment-kind> "{" <c-fragment> "}" | |
707 | ||
708 | <fragment-kind> ::= "init" | "teardown" | |
709 | \end{grammar} | |
710 | ||
68a620ab | 711 | \subsubsection{Message items} |
1f7d590d MW |
712 | \begin{grammar} |
713 | <message-item> ::= | |
391c5a34 MW |
714 | @<declaration-specifier>^+ |
715 | <keyword-declarator>$[\mbox{@<identifier>}]$ | |
716 | @[<method-body>@] | |
1f7d590d | 717 | \end{grammar} |
8399be6f MW |
718 | Properties: |
719 | \begin{description} | |
720 | \item[@"message_class"] A symbol naming the Lisp class to use to represent | |
721 | the message. | |
722 | \item[@"combination"] A keyword naming the aggregating method combination to | |
723 | use. | |
724 | \item[@"most_specific"] A keyword, either @`first' or @`last', according to | |
725 | whether the most specific applicable method should be invoked first or | |
726 | last. | |
727 | \end{description} | |
728 | ||
729 | Properties for the @|custom| aggregating method combination: | |
730 | \begin{description} | |
731 | \item[@"retvar"] An identifier for the return value from the effective | |
732 | method. The default is @|sod__ret|. Only permitted if the message return | |
733 | type is not @|void|. | |
734 | \item[@"valvar"] An identifier naming the variable in the effective method |
735 | which holds each direct method's return value. The default is @|sod__val|. |
736 | Only permitted if the method return type (see @"methty" below) is not @|void|. |
737 | \item[@"methty"] A C type, which is the return type for direct methods of | |
738 | this message. | |
739 | \item[@"decls"] A code fragment containing declarations to be inserted at the | |
740 | head of the effective method body. The default is to insert nothing. | |
741 | \item[@"before"] A code fragment containing initialization to be performed at | |
742 | the beginning of the effective method body. The default is to insert | |
743 | nothing. | |
b07535d8 MW |
744 | \item[@"empty"] A code fragment executed if there are no primary methods; |
745 | it should usually store a suitable (identity) value in @<retvar>. The | |
746 | default is not to emit an effective method at all if there are no primary | |
747 | methods. | |
8399be6f MW |
748 | \item[@"first"] A code fragment to set the return value after calling the |
749 | first applicable direct method. The default is to use the @"each" | |
750 | fragment. | |
751 | \item[@"each"] A code fragment to set the return value after calling a direct | |
752 | method. If @"first" is also set, then it is used after the first direct | |
753 | method instead of this. The default is to insert nothing, which is | |
754 | probably not what you want. | |
755 | \item[@"after"] A code fragment inserted at the end of the effective method | |
756 | body. The default is to insert nothing. | |
757 | \item[@"count"] An identifier naming a variable to be declared in the | |
758 | effective method body, of type @|size_t|, holding the number of applicable | |
759 | methods. The default is not to provide such a variable. | |
760 | \end{description} | |
1f7d590d | 761 | |
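As a rough sketch of how these properties might fit together, the following
message item adds up the values returned by its direct methods.  It assumes
that code fragments are written in braces as property values, and that
direct methods return the same type as the message; the names @|total|,
@|sum| and @|val| are invented for illustration, and the @"extern" form of
@<method-body> is used only so that no C code need be shown.
\begin{prog}
  [combination = custom, retvar = sum, valvar = val, \\ \ind
    first = \{ sum = val; \}, \\
    each = \{ sum += val; \}] \-\\
  int total() extern;
\end{prog}
If the applicable direct methods returned 3 and 4, the effective method
would be expected to return 7: the @"first" fragment stores the first value,
and the @"each" fragment adds each subsequent one.
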
68a620ab | 762 | \subsubsection{Method items} |
1f7d590d MW |
763 | \begin{grammar} |
764 | <method-item> ::= | |
391c5a34 MW |
765 | @<declaration-specifier>^+ |
766 | <keyword-declarator>$[\mbox{@<dotted-name>}]$ | |
ea08dc56 | 767 | <method-body> |
1f7d590d MW |
768 | |
769 | <method-body> ::= "{" <c-fragment> "}" | "extern" ";" | |
770 | \end{grammar} | |
8399be6f MW |
771 | Properties: |
772 | \begin{description} | |
773 | \item[@"method_class"] A symbol naming the Lisp class to use to represent | |
774 | the direct method. | |
775 | \item[@"role"] A keyword naming the direct method's rôle. For the built-in | |
776 | `simple' message classes, the acceptable rôle names are @|before|, | |
777 | @|after|, and @|around|. By default, a primary method is constructed. | |
778 | \end{description} | |
1f7d590d | 779 | |
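For example, a subclass might attach a @|before| method to a message
inherited from a superclass.  This is a sketch only: it assumes a
hypothetical message @|frob|, returning @|void|, defined by the class with
nickname @|eg|, and it assumes that properties are written in square
brackets before the item.
\begin{prog}
  [nick = sub] \\
  class SubExample: Example \{ \\ \ind
    {}[role = before] void eg.frob() extern; \-\\
  \};
\end{prog}
Omitting the @"role" property would define a primary method instead.
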
1f7d590d MW |
780 | %%%----- That's all, folks -------------------------------------------------- |
781 | ||
782 | %%% Local variables: | |
783 | %%% mode: LaTeX | |
784 | %%% TeX-master: "sod.tex" | |
785 | %%% TeX-PDF-mode: t | |
786 | %%% End: |