$Id: config-syntax 5843 2002-11-19 00:08:18Z rra $

This file documents the standardized syntax for INN configuration files.
This is the syntax that the parsing code in libinn will understand and the
syntax towards which all configuration files should move.

The basic structure of a configuration file is a tree of groups. Each
group has a type and an optional tag, and may contain zero or more
parameter settings, each of which associates a name with a value. All
parameter names and group types are simple case-sensitive strings composed
of printable ASCII characters and not containing whitespace or any of the
characters "\:;{}[]<>" or the double-quote. A group may contain another
group (and in fact the top level of the file can be thought of as a
top-level group that isn't allowed to contain parameter settings).

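
As an illustration of the naming rule above, here is a minimal Python
sketch that checks whether a string is acceptable as a parameter name or
group type. The name is_valid_name is made up for this example and is not
part of libinn, and the check follows the prose rule only (printable ASCII,
no whitespace, none of the excluded characters).

```python
# Characters that may not appear in a parameter name or group type.
FORBIDDEN = set('\\:;{}[]<>"')


def is_valid_name(name: str) -> bool:
    """Return True if name is a non-empty run of printable, non-space ASCII
    characters containing none of the forbidden characters."""
    if not name:
        return False
    # 0x21 through 0x7E is printable ASCII excluding the space character.
    return all(0x21 <= ord(ch) <= 0x7E and ch not in FORBIDDEN for ch in name)
```

Because names are case-sensitive, a caller would compare them exactly
rather than folding case first.
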
Supported parameter values are booleans, integers, real numbers, strings,
and lists of strings.

The basic syntax looks like:

    parameter: [ string string ... ]

Tags are strings, with the same syntax as a string value for a parameter;
they are optional and may be omitted. A tag can be thought of as the name
of a particular group, whereas the <group-type> says what that group is
intended to specify; there may be many groups with the same type.

The second parameter example above has as its value a list. The square
brackets are part of the syntax of the configuration file; lists are
enclosed in square brackets and the elements are whitespace-separated.

As seen above, groups may be nested.

Multiple occurrences of the same parameter in the parameter section of a
group are an error. In practice, the later setting will take precedence,
but an error will be reported when such a configuration file is parsed.

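
The behaviour described above (report an error, but let the later setting
win) can be sketched in Python as follows. The function name
collect_parameters and the wording of the error message are assumptions
made for this example; the parser in libinn may structure this differently.

```python
def collect_parameters(pairs):
    """Build a parameter table from (name, value) pairs given in file order.

    Returns (table, errors). A repeated name is recorded as an error, and
    the later value replaces the earlier one.
    """
    table = {}
    errors = []
    for name, value in pairs:
        if name in table:
            errors.append(f"duplicate parameter: {name}")
        table[name] = value
    return table, errors
```
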
Parameter values inherit. In other words, the structure:

    third { third-parameter: 1 }

is parsed into a tree that looks like:

    +-------+   +--------+   +-------+
    | first |-+-| second |---| third |
    +-------+ | +--------+   +-------+

where each box is a group. The type of the group is given in the box;
none of these groups have tags except for the only group of type
"another", which has the tag "tag". The group of type "third" has three
parameters set, namely "third-parameter" (set in the group itself),
"second-parameter" (inherited from the group of type "second"), and
"first-parameter" (inherited from "first" by "second" and then from
"second" by "third").

The practical meaning of this is that enclosing groups can be used to set
default values for a set of subgroups. For example, consider the
following configuration that defines three peers of a news server and the
newsgroups they're allowed to send:

    peer news1.example.com { newsgroups: * }
    peer news2.example.com { newsgroups: * }
    peer news3.example.com { newsgroups: * }

This could instead be written as:

    peer news1.example.com { }
    peer news2.example.com { }
    peer news3.example.com { }

    peer news1.example.com {
        peer news2.example.com { }
        peer news3.example.com { }

and for a client program that only cares about the defined list of peers,
these three structures would be entirely equivalent; all questions about
what parameters are defined in the peer groups would have identical
answers whichever way this configuration was written.

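
To make the equivalence concrete, here is a small Python model of the
group tree with inherited lookup. Everything in it is hypothetical: the
Group class, the "top" type used for the root, and in particular the type
"group" used for the enclosing group in the second form are assumptions
for the sake of the example, not names taken from libinn.

```python
class Group:
    """One node of the configuration tree (illustrative only)."""

    def __init__(self, gtype, tag=None, params=None, parent=None):
        self.gtype = gtype
        self.tag = tag
        self.params = dict(params or {})
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def lookup(self, name):
        """Return the value set here or in the nearest enclosing group."""
        group = self
        while group is not None:
            if name in group.params:
                return group.params[name]
            group = group.parent
        return None


def groups_of_type(root, gtype):
    """Collect every group of the given type below root, in file order."""
    found = []
    for child in root.children:
        if child.gtype == gtype:
            found.append(child)
        found.extend(groups_of_type(child, gtype))
    return found


def peer_answers(root):
    """What a client that only asks about peer groups would learn."""
    peers = groups_of_type(root, "peer")
    return sorted((p.tag, p.lookup("newsgroups")) for p in peers)


HOSTS = ["news1.example.com", "news2.example.com", "news3.example.com"]

# First form: every peer sets newsgroups itself.
flat = Group("top")
for host in HOSTS:
    Group("peer", host, {"newsgroups": "*"}, parent=flat)

# Second form: an enclosing group (assumed type "group") holds the default.
wrapped = Group("top")
defaults = Group("group", params={"newsgroups": "*"}, parent=wrapped)
for host in HOSTS:
    Group("peer", host, parent=defaults)

# Third form: the first peer holds the setting and encloses the other two.
nested = Group("top")
outer = Group("peer", HOSTS[0], {"newsgroups": "*"}, parent=nested)
for host in HOSTS[1:]:
    Group("peer", host, parent=outer)
```

Calling peer_answers on flat, wrapped, and nested gives the same list of
(tag, newsgroups) pairs, which is the sense in which the three forms are
equivalent to such a client.
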
Note that the second form above is preferred as a matter of style to the
third, since otherwise it's tempting to derive some significance from the
nesting structure of the peer groups. Also note that in the second
example above, the enclosing group *must* have a type other than "peer";
to see why, consider a program that asks the configuration parser for a
list of all defined peer groups and uses the resulting list to build some
internal data structures. If the enclosing group in the second example
above had been of type peer, there would be four peer groups instead of
three and one of them wouldn't have a tag, probably provoking an error.

Boolean values may be given as yes, true, or on, or as no, false, or off.
Integers must be between -2,147,483,647 and +2,147,483,647 inclusive (the
minimum range that C99 guarantees for a signed long). Floating point
numbers must be between 0 and 1e37 in absolute magnitude (the same as the
minimums for a C99 double) and can safely expect eight digits of precision.

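
The following Python sketch shows one way to classify an unquoted value
and apply the limits above. It is an illustration, not the libinn parser:
parse_scalar is a made-up name, boolean keywords are assumed to be matched
exactly as written in lower case, and the integer limit is taken to be
symmetric, which is the range C99 guarantees for a long.

```python
import re

BOOLEANS = {
    "yes": True, "true": True, "on": True,
    "no": False, "false": False, "off": False,
}
INT_LIMIT = 2147483647   # symmetric limit: C99's guaranteed range for long
REAL_LIMIT = 1e37        # largest magnitude accepted for real numbers

INTEGER_RE = re.compile(r"-?[0-9]+")
REAL_RE = re.compile(r"-?[0-9]+\.[0-9]+(e-?[0-9]+)?")


def parse_scalar(text):
    """Return a bool, int, float, or str for an unquoted value.

    Raises ValueError if a number is outside the documented range.
    """
    if text in BOOLEANS:
        return BOOLEANS[text]
    if INTEGER_RE.fullmatch(text):
        value = int(text)
        if abs(value) > INT_LIMIT:
            raise ValueError(f"integer out of range: {text}")
        return value
    if REAL_RE.fullmatch(text):
        value = float(text)
        if abs(value) > REAL_LIMIT:
            raise ValueError(f"real number out of range: {text}")
        return value
    # Anything else is treated as a plain string.
    return text
```
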
Strings are optionally enclosed in double quotes, and must be quoted if
they contain any whitespace, double-quote, or any characters in the set
"\:;[]{}<>". Escape sequences in strings (sequences beginning with \) are
parsed the same as they are in C. Strings can be continued on multiple
lines by ending each line in a backslash, and the newline is not
considered part of such a continued string (to embed a literal newline in
a string, use \n).

Lists of strings are delimited by [] and consist of whitespace-separated
strings, which must follow the same quoting rules as all other strings.
Group tags are also strings and follow the same quoting rules.

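
A program that writes configuration files has to apply these quoting rules
in reverse. The Python sketch below is one hedged way to do that; the
function names are invented for this example, only the backslash,
double-quote, and newline escapes are produced, and the input is assumed to
be non-empty.

```python
# Characters that force a string to be written in double quotes.
SPECIAL = set('\\:;[]{}<>"')


def needs_quoting(value: str) -> bool:
    """True if value contains whitespace or any special character."""
    return any(ch.isspace() or ch in SPECIAL for ch in value)


def quote(value: str) -> str:
    """Return value in a form suitable for a configuration file."""
    if not value:
        raise ValueError("empty strings are not handled by this sketch")
    if not needs_quoting(value):
        return value
    # Escape backslashes first so later escapes are not doubled.
    escaped = value.replace("\\", "\\\\")
    escaped = escaped.replace('"', '\\"')
    escaped = escaped.replace("\n", "\\n")
    return '"' + escaped + '"'


def format_list(items):
    """Format a list of strings using the bracketed list syntax."""
    return "[ " + " ".join(quote(item) for item in items) + " ]"
```

The same quote function would serve for group tags, since they follow the
same rules as other strings.
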
There are two more bits of syntax. Normally, parameters must be separated
by newlines, but for convenience it's possible to put multiple parameters
on the same line separated by semicolons:

    parameter: value; parameter: value

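
Since a semicolon may also appear inside a quoted string, a reader of such
a line cannot simply split on every semicolon. The Python sketch below
shows one way to split only on semicolons outside double quotes;
split_parameters is a name invented for this example, and the code handles
a single line only.

```python
def split_parameters(line: str):
    """Split a line into parameter settings at unquoted semicolons."""
    parts = []
    current = []
    in_quotes = False
    escaped = False
    for ch in line:
        if escaped:
            # Character after a backslash is taken literally.
            current.append(ch)
            escaped = False
        elif ch == "\\" and in_quotes:
            current.append(ch)
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == ";" and not in_quotes:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts
```
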
Finally, the body of a group may be defined in a separate file. To do
this, rather than writing the body of the group enclosed in {}, instead
give the file name in <>:

(The filename is also a string and may be double-quoted if necessary, but
since file names rarely contain any of the excluded characters it's rarely
necessary.)

Here is the (almost) complete ABNF for the configuration file syntax.
The syntax is per RFC 2234.

First the basic syntax elements and possible parameter values:

    newline         = %d13 / %d10 / %d13.10
                    ; Any of CR, LF, or CRLF are interpreted
                    ; as a newline.

    comment         = *WSP "#" *(WSP / VCHAR / %x8A-FF) newline

    WHITE           = WSP / newline [comment]

    boolean         = "yes" / "on" / "true" / "no" / "off" / "false"

    integer         = ["-"] 1*DIGIT

    real-number     = ["-"] 1*DIGIT "." 1*DIGIT [ "e" ["-"] 1*DIGIT ]

    non-special     = %x21 / %x23-39 / %x3D / %x3F-5A / %x5E-7A
                    / %x7C / %x7E / %x8A-FF
                    ; All VCHAR except "\:;<>[]{}

    quoted-string   = DQUOTE 1*(WSP / VCHAR / %x8A-FF) DQUOTE
                    ; DQUOTE within the quoted string must be
                    ; written as 0x5C.22 (\"), and backslash
                    ; sequences are interpreted as in C

    string          = 1*non-special / quoted-string

    list-body       = string *( 1*WHITE string )

    list            = "[" *WHITE [ list-body ] *WHITE "]"

Now the general structure:

    parameter-name  = 1*non-special

    parameter-value = boolean / integer / real-number / string / list

    parameter       = parameter-name ":" 1*WSP parameter-value

    parameter-list  = parameter *( *WHITE (";" / newline) *WHITE parameter )

    group-list      = group *( *WHITE group )

    group-body      = parameter-list [ *WHITE newline *WHITE group-list ]

    group-contents  = "{" *WHITE [ group-body ] *WHITE "}"

    group-type      = 1*non-special

    group-tag       = string

    group-name      = group-type [ 1*WHITE group-tag ]

    group           = group-name 1*WHITE group-contents

    file            = *WHITE *( group *WHITE )

One implication of this grammar is that any line outside a quoted string
that begins with "#", optionally preceded by whitespace, is regarded as a
comment and discarded. The line must begin with "#" (and optional
whitespace); comments at the end of lines aren't permitted. "#" has no
special significance in quoted strings, even if it's at the beginning of a
line. Note that comments cannot be continued to the next line in any way;
each comment line must begin with "#".

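
The comment rule lends itself to a simple line-based filter. The Python
sketch below is a simplification rather than the libinn implementation:
strip_comment_lines is an invented name, and the code assumes that no
quoted string spans more than one line, so it does not handle a "#" that
begins a continuation line inside a quoted string.

```python
def strip_comment_lines(text: str) -> str:
    """Drop every line whose first non-blank character is "#".

    A "#" later in a line is left alone, since trailing comments are not
    part of the syntax.
    """
    kept = []
    for line in text.splitlines():
        # Only space and tab count as leading whitespace here.
        if line.lstrip(" \t").startswith("#"):
            continue
        kept.append(line)
    return "\n".join(kept)
```
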
It's unclear what the best thing to do with high-bit characters is (both
literal characters with value > 0x7F in a configuration file and characters
with such values created in quoted strings with \<octal>, \x, \u, or \U).
In the long term, INN should move towards assuming UTF-8 everywhere, as
this is the direction that all of the news standards are heading, but in
the interim various non-Unicode character sets are in widespread use and
there must be some way of encoding those values in INN configuration files
(so that things like the default Organization header value can be set).

As a compromise, the configuration parser will pass unaltered any literal
characters with value > 0x7F to the calling application, and \<octal> and
\x escapes will generate eight-bit characters in the strings (and
therefore cannot be used to generate UTF-8 strings containing code points
greater than U+007F). \u and \U, in contrast, will generate characters