chiark / gitweb /
Infrastructure: Split the files into subdirectories.
[mLib] / struct / sym.3
b6b9d458 1.\" -*-nroff-*- VS
3.sp 1
d66d7727 4.RS
6.ft B
7.. VE
9.ft R
12.sp 1
fbf20b5b 14.TH sym 3 "8 May 1999" "Straylight/Edgeware" "mLib utilities library"
b6b9d458 15.SH NAME
16sym \- symbol table manager
08da152e 17.\" @sym_create
18.\" @sym_destroy
19.\" @sym_find
20.\" @sym_remove
21.\" @sym_mkiter
22.\" @sym_next
24.\" @SYM_NAME
0c404077 25.\" @SYM_LEN
26.\" @SYM_HASH
08da152e 27.\"
b6b9d458 28.SH SYNOPSIS
30.B "#include <mLib/sym.h>"
32.BI "void sym_create(sym_table *" t );
33.BI "void sym_destroy(sym_table *" t );
b6b9d458 35.BI "void *sym_find(sym_table *" t ,
36.BI " const char *" n ", long " l ,
37.BI " size_t " sz ", unsigned *" f );
38.BI "void sym_remove(sym_table *" t ", void *" b );
0c404077 40.BI "const char *SYM_NAME(const void *" p );
41.BI "size_t SYM_LEN(const void *" p );
42.BI "uint32 SYM_HASH(const void *" p );
b6b9d458 44.BI "void sym_mkiter(sym_iter *" i ", sym_table *" t );
45.BI "void *sym_next(sym_iter *" i );
0c404077 47.SH "DESCRIPTION"
b6b9d458 48The
49.B sym
50functions implement a data structure often described as a dictionary, a
51finite map, an associative array, or a symbol table. It associates
52.I values
54.I keys
55such that the value corresponding to a given key can be found quickly.
56Additionally, all stored associations can be enumerated.
58The interface provides an
59.I intrusive
60symbol table. The data objects stored in the table must include a small
61header used by the symbol table manager. This reduces the amount of
62pointer fiddling that needs to be done, and in practice doesn't seem to
63be much of a problem. It's also fairly easy to construct a
64non-intrusive interface if you really want one.
66There are three main data structures involved in the interface:
68.B sym_table
69Keeps track of the information associated with a particular table.
71.B sym_base
72The header which must be attached to the front of all the value
75.B sym_iter
76An iterator object, used for enumerating all of the associations stored
77in a symbol table.
79All of the above data structures should be considered
80.IR opaque :
81don't try looking inside. Representations have changed in the past, and
82they may change again in the future.
0c404077 83.SS "Creation and destruction"
b6b9d458 84The
85.B sym_table
86object itself needs to be allocated by the caller. It is initialized by
87passing it to the function
88.BR sym_create .
89After initialization, the table contains no entries.
91Initializing a symbol table involves allocating some memory. If this
d2a91066 92allocation fails, an
b6b9d458 93.B EXC_NOMEM
94exception is raised.
96When a symbol table is no longer needed, the memory occupied by the
97values and other maintenance structures can be reclaimed by calling
98.BR sym_destroy .
0c404077 99Any bits of user data attached to values should previously have been
101.SS "Adding, searching and removing"
b6b9d458 102Most of the actual work is done by the function
103.BR sym_find .
104It does both lookup and creation, depending on its arguments. To do its
105job, it needs to know the following bits of information:
ff76c38f 107.BI "sym_table *" t
b6b9d458 108A pointer to a symbol table to manipulate.
ff76c38f 110.BI "const char *" n
b6b9d458 111The address of the
112.I key
113to look up or create. Usually this will be a simple text string,
114although it can actually be any arbitrary binary data.
ff76c38f 116.BI "long " l
b6b9d458 117The length of the key. If this is \-1,
118.B sym_find
119assumes that the key is a null-terminated string, and calculates its
0c404077 120length itself. This is entirely equivalent to passing
121.BI strlen( n )\fR.
b6b9d458 122.TP
ff76c38f 123.BI "size_t " sz
b6b9d458 124The size of the value block to allocate if the key could not be found.
125If this is zero, no value is allocated, and a null pointer is returned
126to indicate an unsuccessful lookup.
ff76c38f 128.BI "unsigned *" f
b6b9d458 129The address of a `found' flag to set. This is an output parameter. On
131.B sym_find
132will set the value of
133.BI * f
134to zero if the key could not be found, or nonzero if it was found. This
135can be used to tell whether the value returned has been newly allocated,
136or whether it was already in the table.
0c404077 138A terminating null byte is appended to the copy of the symbol's name in
139memory. This is not considered to be a part of the symbol's name, and
140does not contribute to the name's length as reported by the
b6b9d458 143.PP
144A symbol can be removed from the table by calling
145.BR sym_remove ,
146passing the symbol table itself, and the value block that needs
0c404077 148.SS "Enquiries about symbols"
149Three macros are provided to enable simple enquiries about a symbol.
150Given a pointer
151.I s
152to a symbol table entry,
153.BI SYM_LEN( s )
154returns the length of the symbol's name (excluding any terminating null
d4efbcd9 155byte);
0c404077 156.BI SYM_NAME( s )
157returns a pointer to the symbol's name; and
158.BI SYM_HASH( s )
159returns the symbol's hash value.
160.SS "Enumerating symbols"
b6b9d458 161Enumerating the values in a symbol table is fairly simple. Allocate a
162.B sym_iter
163object from somewhere. Attach it to a symbol table by calling
164.BR sym_mkiter ,
165and passing in the addresses of the iterator and the symbol table.
166Then, each call to
167.B sym_next
168will return a different value from the symbol table, until all of them
169have been enumerated, at which point,
170.B sym_next
171returns a null pointer.
173It's safe to remove the symbol you've just been returned by
174.BR sym_next .
175However, it's not safe to remove any other symbol. So don't do that.
177When you've finished with an iterator, it's safe to just throw it away.
178You don't need to call any functions beforehand.
0c404077 179.SS "Use in practice"
b6b9d458 180In normal use, the keys are simple strings (usually identifiers from
181some language), and the values are nontrivial structures providing
182information about types and values.
184In this case, you'd define something like the following structure for
185your values:
187typedef struct val {
188 sym_base _base; /* Symbol header */
189 unsigned type; /* Type of this symbol */
190 int dispoff; /* Which display variable is in */
191 size_t frameoff; /* Offset of variable in frame */
192} val;
194Given a pointer
195.I v
196to a
197.BR val ,
198you can find the variable's name by calling
199.BI SYM_NAME( v )\fR.
201You can look up a name in the table by saying something like:
203val *v = sym_find(t, name, -1, 0, 0);
204if (!v)
205 error("unknown variable `%s'", name);
207You can add in a new variable by saying something like
209unsigned f;
210val *v = sym_find(t, name, -1, sizeof(val), &f);
211if (f)
212 error("variable `%s' already exists", name);
213/* fill in v */
215You can examine all the variables in your symbol table by saying
216something like:
218sym_iter i;
219val *v;
221for (sym_mkiter(&i, t); (v = sym_next(&i)) != 0; ) {
222 /* ... */
225That ought to be enough examples to be getting on with.
0c404077 226.SS Implementation
6f444bda 227The symbol table is an extensible hashtable, using the universal hash
228function described in
229.BR unihash (3)
230and the global hashing key. The hash chains are kept very short
231(probably too short, actually). Every time a symbol is found, its block
232is promoted to the front of its bin chain so it gets found faster next
b6b9d458 234.SH SEE ALSO
0c404077 235.BR hash (3),
08da152e 236.BR mLib (3).
b6b9d458 237.SH AUTHOR
9b5ac6ff 238Mark Wooding, <>