.\" -*-nroff-*-
.\" mLib dstr.3, release version 2.1.1
.de VS
.sp 1
.RS
.nf
.ft B
..
.de VE
.ft R
.fi
.RE
.sp 1
..
.de hP
.IP
.ft B
\h'-\w'\\$1\ 'u'\\$1\ \c
.ft P
..
.ie t .ds o \(bu
.el .ds o o
.TH dstr 3 "8 May 1999" "Straylight/Edgeware" "mLib utilities library"
.SH NAME
dstr \- a simple dynamic string type
.\" @dstr_create
.\" @dstr_destroy
.\" @dstr_reset
.\" @dstr_ensure
.\" @dstr_tidy
.\"
.\" @dstr_putc
.\" @dstr_putz
.\" @dstr_puts
.\" @dstr_putf
.\" @dstr_putd
.\" @dstr_putm
.\" @dstr_putline
.\" @dstr_write
.\"
.\" @DSTR_INIT
.\" @DCREATE
.\" @DDESTROY
.\" @DRESET
.\" @DENSURE
.\" @DPUTC
.\" @DPUTZ
.\" @DPUTS
.\" @DPUTD
.\" @DPUTM
.\" @DWRITE
.\"
.SH SYNOPSIS
.nf
.B "#include <mLib/dstr.h>"

.BI "void dstr_create(dstr *" d );
.BI "void dstr_destroy(dstr *" d );
.BI "void dstr_reset(dstr *" d );

.BI "void dstr_ensure(dstr *" d ", size_t " sz );
.BI "void dstr_tidy(dstr *" d );

.BI "void dstr_putc(dstr *" d ", char " ch );
.BI "void dstr_putz(dstr *" d );
.BI "void dstr_puts(dstr *" d ", const char *" s );
.BI "int dstr_vputf(dstr *" d ", const char *" fmt ", va_list *" ap );
.BI "int dstr_putf(dstr *" d ", const char *" fmt ", ...);"
.BI "void dstr_putd(dstr *" d ", const dstr *" p );
.BI "void dstr_putm(dstr *" d ", const void *" p ", size_t " sz );
.BI "int dstr_putline(dstr *" d ", FILE *" fp );
.BI "size_t dstr_write(const dstr *" d ", FILE *" fp );

.BI "dstr " d " = DSTR_INIT;"
.BI "void DCREATE(dstr *" d );
.BI "void DDESTROY(dstr *" d );
.BI "void DRESET(dstr *" d );
.BI "void DENSURE(dstr *" d ", size_t " sz );
.BI "void DPUTC(dstr *" d ", char " ch );
.BI "void DPUTZ(dstr *" d );
.BI "void DPUTS(dstr *" d ", const char *" s );
.BI "void DPUTD(dstr *" d ", const dstr *" p );
.BI "void DPUTM(dstr *" d ", const void *" p ", size_t " sz );
.BI "size_t DWRITE(const dstr *" d ", FILE *" fp );
.fi
.SH DESCRIPTION
The header
.B dstr.h
declares a type for representing dynamically extending strings, and a
small collection of useful operations on them.  None of the operations
returns a failure result on an out-of-memory condition; instead, the
exception
.B EXC_NOMEM
is raised.
.PP
Many of the functions which act on dynamic strings have macro
equivalents.  These equivalent macros may evaluate their arguments
multiple times.
.SS "Underlying type"
A
.B dstr
object is a small structure with the following members:
.VS
typedef struct dstr {
  char *buf;                    /* Pointer to string buffer */
  size_t sz;                    /* Size of the buffer */
  size_t len;                   /* Length of the string */
  arena *a;                     /* Pointer to arena */
} dstr;
.VE
The
.B buf
member points to the actual character data in the string.  The data may
or may not be null-terminated, depending on what operations have
recently been performed on it.  None of the
.B dstr
functions depends on the string being null-terminated; indeed, all of
them work fine on strings containing arbitrary binary data.  You can
force null-termination by calling the
.B dstr_putz
function, or the
.B DPUTZ
macro.
.PP
The
.B sz
member describes the current size of the buffer.  This reflects the
maximum possible length of string that can be represented in
.B buf
without allocating a new buffer.
.PP
The
.B len
member describes the current length of the string.  It is the number of
bytes in the string which are actually interesting.  The length does
.I not
include a null-terminating byte, if there is one.
.PP
The following invariants are maintained by
.B dstr
and must hold when any function is called:
.hP \*o
If
.B sz
is nonzero, then
.B buf
points to a block of memory of length
.BR sz .
If
.B sz
is zero, then
.B buf
is a null pointer.
.hP \*o
At all times,
.BR sz " \(>= " len .
.PP
Note that there is no equivalent of the standard C distinction between
the empty string (a pointer to an array of characters whose first
element is zero) and the nonexistent string (a null pointer).  Any
.B dstr
whose
.B len
is zero is an empty string.
.PP
The
.B a
member refers to the arena from which the string's buffer has been
allocated.  Immediately after creation, this is set to be
.BR arena_stdlib (3);
you can set it to point to any other arena of your choice before the
buffer is allocated.
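.PP
The invariants listed above can be written down as a small checking
function.  The following is only a sketch: the type sketch_dstr and the
function sketch_dstr_ok are hypothetical stand-ins invented for this
example, not part of mLib, and the arena member is reduced to an untyped
pointer so that the fragment compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the structure described above.  The real
 * definition lives in <mLib/dstr.h>; its last member is an `arena *'. */
typedef struct sketch_dstr {
  char *buf;                    /* Pointer to string buffer */
  size_t sz;                    /* Size of the buffer */
  size_t len;                   /* Length of the string */
  void *a;                      /* Placeholder for the arena pointer */
} sketch_dstr;

/* Return nonzero if the documented invariants hold for `d'. */
static int sketch_dstr_ok(const sketch_dstr *d)
{
  /* A zero-sized buffer must be represented by a null pointer. */
  if (d->sz == 0 && d->buf != NULL) return 0;
  /* A nonzero size must come with an actual block of memory. */
  if (d->sz != 0 && d->buf == NULL) return 0;
  /* The string can never be longer than its buffer. */
  return d->sz >= d->len;
}
```

A check like this may be handy in debugging builds of code which
manipulates the members directly.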
.SS "Creation and destruction"
The caller is responsible for allocating the
.B dstr
structure.  It can be initialized:
.hP \*o
using the macro
.B DSTR_INIT
as an initializer in the declaration of the object,
.hP \*o
passing its address to the
.B dstr_create
function, or
.hP \*o
passing its address to the (equivalent)
.B DCREATE
macro.
.PP
The initial value of a
.B dstr
is the empty string.
.PP
The additional storage space for a string's contents may be reclaimed by
passing the string to the
.B dstr_destroy
function, or the
.B DDESTROY
macro.  After destruction, a string's value is reset to the empty
string:
.I "it's still a valid"
.BR dstr .
However, once a string has been destroyed, it's safe to deallocate the
underlying
.B dstr
object.
.PP
The
.B dstr_reset
function empties a string
.I without
deallocating any memory.  Therefore appending more characters is quick,
because the old buffer is still there and doesn't need to be allocated
again.
Calling
.VS
dstr_reset(d);
.VE
is equivalent to directly assigning
.VS
d->len = 0;
.VE
There's also a macro
.B DRESET
which does the same job as the
.B dstr_reset
function.
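.PP
The difference between resetting and destroying a string can be
illustrated with a stand-in type.  Everything named sketch_ or SKETCH_
below is hypothetical and exists only for this example; in particular,
the sketch releases memory with free(3), whereas the real functions go
through the string's arena.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for dstr, without the arena member. */
typedef struct sketch_dstr { char *buf; size_t sz, len; } sketch_dstr;

/* Static initializer, playing the role of DSTR_INIT. */
#define SKETCH_DSTR_INIT { NULL, 0, 0 }

/* Initialize `d' to the empty string with no buffer. */
static void sketch_create(sketch_dstr *d)
{
  d->buf = NULL; d->sz = 0; d->len = 0;
}

/* Empty the string but keep its buffer for later reuse. */
static void sketch_reset(sketch_dstr *d)
{
  d->len = 0;
}

/* Release the buffer; `d' is left as a valid empty string. */
static void sketch_destroy(sketch_dstr *d)
{
  free(d->buf);                 /* free(NULL) is harmless */
  sketch_create(d);
}
```

Because destruction leaves a valid empty string behind, destroying the
same object twice is harmless in this sketch.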
.SS "Extending a string"
All memory allocation for strings is done by the function
.BR dstr_ensure .
Given a pointer
.I d
to a
.B dstr
and a size
.IR sz ,
the function ensures that there are at least
.I sz
unused bytes in the string's buffer.  The current algorithm for
extending the buffer is fairly unsophisticated, but seems to work
relatively well \- see the source if you really want to know what it's
doing.
.PP
Extending a string never returns a failure result.  Instead, if there
isn't enough memory for a longer string, the exception
.B EXC_NOMEM
is raised.  See
.BR exc (3)
for more information about
.BR mLib 's
exception handling system.
.PP
Note that if an ensure operation needs to reallocate a string buffer,
any pointers you've taken into the string become invalid.
.PP
There's a macro
.B DENSURE
which does a quick inline check to see whether there's enough space in
a string's buffer.  This saves a procedure call when no reallocation
needs to be done.  The
.B DENSURE
macro is called in the same way as the
.B dstr_ensure
function.
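.PP
As an illustration of the fast-path check followed by a reallocation, an
ensure operation might be sketched as follows.  This is not mLib's
algorithm: the doubling strategy, the initial size of 32 bytes, and the
names sketch_dstr and sketch_ensure are assumptions made for the
example, and abort(3) stands in for raising EXC_NOMEM.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for dstr, without the arena member. */
typedef struct sketch_dstr { char *buf; size_t sz, len; } sketch_dstr;

/* Make sure at least `want' unused bytes follow the string in `d'. */
static void sketch_ensure(sketch_dstr *d, size_t want)
{
  size_t need, nsz;
  char *p;

  /* Fast path: there is already enough room, so nothing moves. */
  if (want <= d->sz - d->len) return;

  /* Work out the new size, guarding against size_t overflow. */
  if (want > SIZE_MAX - d->len) abort();
  need = d->len + want;
  nsz = d->sz ? d->sz : 32;     /* assumed starting size */
  while (nsz < need) {
    if (nsz > SIZE_MAX / 2) { nsz = need; break; }
    nsz *= 2;                   /* assumed growth policy: doubling */
  }

  /* The block may move here, invalidating pointers into the old one. */
  p = realloc(d->buf, nsz);
  if (!p) abort();              /* the real code raises EXC_NOMEM */
  d->buf = p; d->sz = nsz;
}
```

The early return corresponds to the inline check that a macro such as
DENSURE can perform without a function call.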
.PP
The function
.B dstr_tidy
`trims' a string's buffer so that it's just large enough for the string
contents and a null terminating byte.  This might raise an exception due
to lack of memory.  (There are two possible ways this might happen.
Firstly, the underlying allocator might just be brain-damaged enough to
fail on reducing a block's size.  Secondly, tidying an empty string with no
buffer allocated for it causes allocation of a buffer large enough for
the terminating null byte.)
.SS "Contributing data to a string"
There is a collection of functions which add data to a string.  All of
these functions add their new data to the
.I end
of the string.  This is good, because programs usually build strings
left-to-right.  If you want to do something more clever, that's up to
you.
.PP
Several of these functions have equivalent macros which do the main work
inline.  (There still might need to be a function call if the buffer
needs to be extended.)
.PP
Any of these functions might extend the string, causing pointers into
the string buffer to be invalidated.  If you don't want that to happen,
pre-ensure enough space before you start.
.PP
The simplest function is
.B dstr_putc
which appends a single character
.I ch
to the end of the string.  It has a macro equivalent called
.BR DPUTC .
.PP
The function
.B dstr_putz
places a zero byte at the end of the string.  It does
.I not
affect the string's length, so any other data added to the string will
overwrite the null terminator.  This is useful if you want to pass your
string to one of the standard C library string-handling functions.  The
macro
.B DPUTZ
does the same thing.
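.PP
The interaction between appending a character and writing a terminator
can be shown with a stand-in type.  The names sketch_dstr, sketch_room,
sketch_putc and sketch_putz are hypothetical, and the growth policy in
sketch_room is an arbitrary choice for the example rather than mLib's.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for dstr, without the arena member. */
typedef struct sketch_dstr { char *buf; size_t sz, len; } sketch_dstr;

/* Make room for `want' more bytes (arbitrary growth policy). */
static void sketch_room(sketch_dstr *d, size_t want)
{
  size_t nsz;
  char *p;

  if (want <= d->sz - d->len) return;
  nsz = (d->len + want) * 2;
  p = realloc(d->buf, nsz);
  if (!p) abort();              /* the real code raises EXC_NOMEM */
  d->buf = p; d->sz = nsz;
}

/* Append one character: the length grows by one. */
static void sketch_putc(sketch_dstr *d, char ch)
{
  sketch_room(d, 1);
  d->buf[d->len++] = ch;
}

/* Write a zero byte after the string: the length is unchanged. */
static void sketch_putz(sketch_dstr *d)
{
  sketch_room(d, 1);
  d->buf[d->len] = 0;
}
```

After sketch_putz the buffer can be handed to functions such as
strlen(3); a later sketch_putc simply overwrites the zero byte.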
.PP
The function
.B dstr_puts
writes a C-style null-terminated string to the end of a dynamic string.
A terminating zero byte is also written, as if
.B dstr_putz
were called.  The macro
.B DPUTS
does the same job.
.PP
The function
.B dstr_putf
works similarly to the standard
.BR sprintf (3)
function.  It accepts a
.BR printf (3)-style
format string and an arbitrary number of arguments to format and writes
the resulting text to the end of a dynamic string, returning the number
of characters so written.  A terminating zero byte is also appended.
The formatting is intended to be convenient and safe rather than
efficient, so don't expect blistering performance.  Similarly, there may
be differences between the formatting done by
.B dstr_putf
and
.BR sprintf (3)
because the former has to do most of its work itself.  In particular,
.B dstr_putf
doesn't (and probably never will) understand the
.RB ` n$ '
positional parameter notation accepted by many Unix C libraries.  There
is no macro equivalent of
.BR dstr_putf .
.PP
The function
.B dstr_vputf
provides access to the `guts' of
.BR dstr_putf :
given a format string and a pointer to a
.BR va_list ,
it will format the arguments according to the format string, just as
.B dstr_putf
does.  (Note: that's a
.BR "va_list *" ,
not a plain
.BR va_list ,
so that it gets updated properly on exit.)
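.PP
The calling convention for a function taking a pointer to a va_list can
be sketched with the standard library alone.  Both functions below are
hypothetical: sketch_vputf merely forwards to vsnprintf(3) and writes to
a fixed buffer instead of a dynamic string, and sketch_log shows how a
variadic wrapper would pass the address of its own va_list.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a formatter that takes a `va_list *'. */
static int sketch_vputf(char *buf, size_t sz, const char *fmt, va_list *ap)
{
  return vsnprintf(buf, sz, fmt, *ap);
}

/* A variadic wrapper: start the list, pass its address, end the list. */
static int sketch_log(char *buf, size_t sz, const char *fmt, ...)
{
  va_list ap;
  int n;

  va_start(ap, fmt);
  n = sketch_vputf(buf, sz, fmt, &ap);  /* note: &ap, not ap */
  va_end(ap);
  return n;
}
```

The same shape applies when writing your own printf-like front end on
top of a function that expects a pointer to the argument list.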
.PP
The function
.B dstr_putd
appends the contents of one dynamic string to another.  A null
terminator is also appended.  The macro
.B DPUTD
does the same thing.
.PP
The function
.B dstr_putm
puts an arbitrary block of memory, addressed by
.IR p ,
with length
.I sz
bytes, at the end of a dynamic string.  No terminating null is appended:
it's assumed that if you're playing with arbitrary chunks of memory then
you're probably not going to be using the resulting data as a normal
text string.  The macro
.B DPUTM
works the same way.
.PP
The function
.B dstr_putline
reads a line from an input stream
.I fp
and appends it to a string.  If an error occurs, or end-of-file is
encountered, before any characters have been read, then
.B dstr_putline
returns the value
.B EOF
and does not extend the string.  Otherwise, it reads until it encounters
a newline character, an error, or end-of-file, and returns the number of
characters read.  If reading was terminated by a newline character, the
newline character is
.I not
inserted in the buffer.  A terminating null is appended, as by
.BR dstr_putz .
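.PP
The line-reading behaviour described above might be sketched like this.
The names sketch_dstr, sketch_room and sketch_putline are hypothetical,
and the sketch assumes that the returned count excludes the newline,
which the description above does not spell out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for dstr, without the arena member. */
typedef struct sketch_dstr { char *buf; size_t sz, len; } sketch_dstr;

/* Make room for `want' more bytes (arbitrary growth policy). */
static void sketch_room(sketch_dstr *d, size_t want)
{
  size_t nsz;
  char *p;

  if (want <= d->sz - d->len) return;
  nsz = (d->len + want) * 2;
  p = realloc(d->buf, nsz);
  if (!p) abort();              /* the real code raises EXC_NOMEM */
  d->buf = p; d->sz = nsz;
}

/* Append one line from `fp' to `d', dropping the newline. */
static int sketch_putline(sketch_dstr *d, FILE *fp)
{
  size_t start = d->len;
  int ch = getc(fp);

  /* Nothing at all could be read: report EOF, leave `d' alone. */
  if (ch == EOF) return EOF;

  /* Copy characters until a newline, an error or end-of-file. */
  while (ch != EOF && ch != '\n') {
    sketch_room(d, 1);
    d->buf[d->len++] = (char)ch;
    ch = getc(fp);
  }

  /* Terminate without counting the zero byte in the length. */
  sketch_room(d, 1);
  d->buf[d->len] = 0;
  return (int)(d->len - start);
}
```

An empty line therefore yields zero rather than EOF, so a loop which
reads until EOF will not stop early on blank lines.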
.SS "Other functions"
The
.B dstr_write
function writes a string to an output stream
.IR fp .
It returns the number of characters written, or
.B 0
if an error occurred before the first write.  No newline character is
written to the stream, unless the string actually contains one already.
The macro
.B DWRITE
is equivalent.
.SH "SECURITY CONSIDERATIONS"
The implementation of the
.B dstr
functions is designed to do string handling in security-critical
programs.  However, there may be bugs in the code somewhere.  In
particular, the
.B dstr_putf
functions are quite complicated, and could do with some checking by
independent people who know what they're doing.
.SH "SEE ALSO"
.BR exc (3),
.BR mLib (3).
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>