X-Git-Url: http://www.chiark.greenend.org.uk/ucgi/~mdw/git/mLib/blobdiff_plain/8e94a44e0eacbec394bc93d87db275295e7a670d..b6b9d458c78364bdbbd7fbd7ec543bc364014b45:/man/sym.3 diff --git a/man/sym.3 b/man/sym.3 new file mode 100644 index 0000000..10b8084 --- /dev/null +++ b/man/sym.3 @@ -0,0 +1,214 @@ +.\" -*-nroff-*- +.de VS +.sp 1 +.RS 5 +.nf +.ft B +.. +.de VE +.ft R +.fi +.RE +.sp 1 +.. +.TH sym 3mLib "8 May 1999" mLib +.SH NAME +sym \- symbol table manager +.SH SYNOPSIS +.nf +.B "#include " + +.BI "void sym_create(sym_table *" t ); +.BI "void sym_destroy(sym_table *" t ); + +.BI "char *SYM_NAME(void *" p ); +.BI "void *sym_find(sym_table *" t , +.BI " const char *" n ", long " l , +.BI " size_t " sz ", unsigned *" f ); +.BI "void sym_remove(sym_table *" t ", void *" b ); + +.BI "void sym_mkiter(sym_iter *" i ", sym_table *" t ); +.BI "void *sym_next(sym_iter *" i ); +.fi +.SH "HOW IT WORKS" +The +.B sym +functions implement a data structure often described as a dictionary, a +finite map, an associative array, or a symbol table. It associates +.I values +with +.I keys +such that the value corresponding to a given key can be found quickly. +Additionally, all stored associations can be enumerated. +.PP +The interface provides an +.I intrusive +symbol table. The data objects stored in the table must include a small +header used by the symbol table manager. This reduces the amount of +pointer fiddling that needs to be done, and in practice doesn't seem to +be much of a problem. It's also fairly easy to construct a +non-intrusive interface if you really want one. +.PP +There are three main data structures involved in the interface: +.TP +.B sym_table +Keeps track of the information associated with a particular table. +.TP +.B sym_base +The header which must be attached to the front of all the value +objects. +.TP +.B sym_iter +An iterator object, used for enumerating all of the associations stored +in a symbol table. +.PP +All of the above data structures should be considered +.IR opaque : +don't try looking inside. Representations have changed in the past, and +they may change again in the future. +.SH "CREATION AND DESTRUCTION" +The +.B sym_table +object itself needs to be allocated by the caller. It is initialized by +passing it to the function +.BR sym_create . +After initialization, the table contains no entries. +.PP +Initializing a symbol table involves allocating some memory. If this +allocation failes, an +.B EXC_NOMEM +exception is raised. +.PP +When a symbol table is no longer needed, the memory occupied by the +values and other maintenance structures can be reclaimed by calling +.BR sym_destroy . +Any bits of user data attached to the symbol table values should have +previously been destroyed. +.SH "ADDING, SEARCHING AND REMOVING" +Most of the actual work is done by the function +.BR sym_find . +It does both lookup and creation, depending on its arguments. To do its +job, it needs to know the following bits of information: +.TP +.I t +A pointer to a symbol table to manipulate. +.TP +.I n +The address of the +.I key +to look up or create. Usually this will be a simple text string, +although it can actually be any arbitrary binary data. +.TP +.I l +The length of the key. If this is \-1, +.B sym_find +assumes that the key is a null-terminated string, and calculates its +length itself. +.TP +.I sz +The size of the value block to allocate if the key could not be found. +If this is zero, no value is allocated, and a null pointer is returned +to indicate an unsuccessful lookup. +.TP +.I f +The address of a `found' flag to set. This is an output parameter. On +exit, +.B sym_find +will set the value of +.BI * f +to zero if the key could not be found, or nonzero if it was found. This +can be used to tell whether the value returned has been newly allocated, +or whether it was already in the table. +.PP +The macro +.B SYM_NAME +will return the key associated with a given value block. You must use +this macro rather than fiddling inside the +.B sym_base +structure. +.PP +A symbol can be removed from the table by calling +.BR sym_remove , +passing the symbol table itself, and the value block that needs +removing. +.SH ENUMERATION +Enumerating the values in a symbol table is fairly simple. Allocate a +.B sym_iter +object from somewhere. Attach it to a symbol table by calling +.BR sym_mkiter , +and passing in the addresses of the iterator and the symbol table. +Then, each call to +.B sym_next +will return a different value from the symbol table, until all of them +have been enumerated, at which point, +.B sym_next +returns a null pointer. +.PP +It's safe to remove the symbol you've just been returned by +.BR sym_next . +However, it's not safe to remove any other symbol. So don't do that. +.PP +When you've finished with an iterator, it's safe to just throw it away. +You don't need to call any functions beforehand. +.SH "USE IN PRACTICE" +In normal use, the keys are simple strings (usually identifiers from +some language), and the values are nontrivial structures providing +information about types and values. +.PP +In this case, you'd define something like the following structure for +your values: +.VS +typedef struct val { + sym_base _base; /* Symbol header */ + unsigned type; /* Type of this symbol */ + int dispoff; /* Which display variable is in */ + size_t frameoff; /* Offset of variable in frame */ +} val; +.VE +Given a pointer +.I v +to a +.BR val , +you can find the variable's name by calling +.BI SYM_NAME( v )\fR. +.PP +You can look up a name in the table by saying something like: +.VS +val *v = sym_find(t, name, -1, 0, 0); +if (!v) + error("unknown variable `%s'", name); +.VE +You can add in a new variable by saying something like +.VS +unsigned f; +val *v = sym_find(t, name, -1, sizeof(val), &f); +if (f) + error("variable `%s' already exists", name); +/* fill in v */ +.VE +You can examine all the variables in your symbol table by saying +something like: +.VS +sym_iter i; +val *v; + +for (sym_mkiter(&i, t); (v = sym_next(&i)) != 0; ) { + /* ... */ +} +.VE +That ought to be enough examples to be getting on with. +.SH CAVEATS +The symbol table manager requires the suballocator (see +.BR sub (3mLib) +for details). You must ensure that +.B sub_init +has been called before using any symbol tables in your program. +.SH IMPLEMENTATION +The symbol table is an extensible hashtable, using a 32-bit CRC as the +hash function. The hash chains are kept very short (probably too short, +actually). Every time a symbol is found, its block is promoted to the +front of its bin chain so it gets found faster next time. +.SH SEE ALSO +.BR sub (3mLib). +.SH AUTHOR +Mark Wooding,