.\" -*-nroff-*-
.de VS
.sp 1
.RS 5
.nf
.ft B
..
.de VE
.ft R
.fi
.RE
.sp 1
..
.de HP
.IP
.ft B
\h'-\w'\\$1\ 'u'\\$1\ \c
.ft P
..
.ie t .ds o \(bu
.el .ds o o
.TH dstr 3mLib "8 May 1999" "mLib"
.SH NAME
dstr \- a simple dynamic string type
.SH SYNOPSIS
.nf
.B "#include <mLib/dstr.h>"

.BI "void dstr_create(dstr *" d );
.BI "void dstr_destroy(dstr *" d );
.BI "void dstr_reset(dstr *" d );

.BI "void dstr_ensure(dstr *" d ", size_t " sz );
.BI "void dstr_tidy(dstr *" d );

.BI "void dstr_putc(dstr *" d ", char " ch );
.BI "void dstr_putz(dstr *" d );
.BI "void dstr_puts(dstr *" d ", const char *" s );
.BI "int dstr_vputf(dstr *" d ", const char *" p ", va_list " ap );
.BI "int dstr_putf(dstr *" d ", const char *" p ", ...);"
.BI "void dstr_putd(dstr *" d ", const dstr *" p );
.BI "void dstr_putm(dstr *" d ", const void *" p ", size_t " sz );
.BI "int dstr_putline(dstr *" d ", FILE *" fp );
.BI "size_t dstr_write(const dstr *" d ", FILE *" fp );

.BI "void DCREATE(dstr *" d );
.BI "void DDESTROY(dstr *" d );
.BI "void DRESET(dstr *" d );
.BI "void DENSURE(dstr *" d ", size_t " sz );
.BI "void DPUTC(dstr *" d ", char " ch );
.BI "void DPUTZ(dstr *" d );
.BI "void DPUTS(dstr *" d ", const char *" s );
.BI "void DPUTD(dstr *" d ", const dstr *" p );
.BI "void DPUTM(dstr *" d ", const void *" p ", size_t " sz );
.BI "size_t DWRITE(const dstr *" d ", FILE *" fp );
.fi
.SH SUMMARY
The header
.B dstr.h
declares a type for representing dynamically extending strings, and a
small collection of useful operations on them. None of the operations
returns a failure result on an out-of-memory condition; instead, the
exception
.B EXC_NOMEM
is raised.
.PP
Many of the functions which act on dynamic strings have macro
equivalents. These equivalent macros may evaluate their arguments
multiple times.
.SH "UNDERLYING TYPE"
A
.B dstr
object is a small structure with the following members:
.VS
typedef struct dstr {
  char *buf;                    /* Pointer to string buffer */
  size_t sz;                    /* Size of the buffer */
  size_t len;                   /* Length of the string */
} dstr;
.VE
The
.B buf
member points to the actual character data in the string. The data may
or may not be null terminated, depending on what operations have
recently been performed on it. None of the
.B dstr
functions depend on the string being null-terminated; indeed, all of
them work fine on strings containing arbitrary binary data. You can
force null-termination by calling the
.B dstr_putz
function, or the
.B DPUTZ
macro.
.PP
The
.B sz
member describes the current size of the buffer. This reflects the
maximum possible length of string that can be represented in
.B buf
without allocating a new buffer.
.PP
The
.B len
member describes the current length of the string. It is the number of
bytes in the string which are actually interesting. The length does
.I not
include a null-terminating byte, if there is one.
.PP
The following invariants are maintained by
.B dstr
and must hold when any function is called:
.HP \*o
If
.B sz
is nonzero, then
.B buf
points to a block of memory of length
.BR sz .
If
.B sz
is zero, then
.B buf
is a null pointer.
.HP \*o
At all times,
.BI sz " >= " len\fR.
.PP
Note that there is no equivalent of the standard C distinction between
the empty string (a pointer to an array of characters whose first
element is zero) and the nonexistent string (a null pointer). Any
.B dstr
whose
.B len
is zero is an empty string.
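.PP
The invariants above can be illustrated with a minimal sketch. The names
.BR sketch_dstr ,
.B sketch_ok
and
.B sketch_is_empty
are hypothetical stand-ins invented for this example; they are not part of
.BR mLib ,
and the check cannot verify the real size of the block at
.BR buf .
.VS
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in with the same three members as dstr. */
typedef struct sketch_dstr {
  char *buf;                    /* Pointer to string buffer */
  size_t sz;                    /* Size of the buffer */
  size_t len;                   /* Length of the string */
} sketch_dstr;

/* Return nonzero if the documented invariants appear to hold.  The
 * actual size of the block at buf cannot be checked from here. */
static int sketch_ok(const sketch_dstr *d)
{
  if (d->sz == 0 && d->buf != NULL) return 0;   /* sz zero: buf null */
  if (d->sz != 0 && d->buf == NULL) return 0;   /* sz nonzero: buf set */
  return d->sz >= d->len;                       /* sz >= len */
}

/* Any string whose len is zero is empty, with or without a buffer. */
static int sketch_is_empty(const sketch_dstr *d)
{
  assert(sketch_ok(d));
  return d->len == 0;
}
```
.VE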
.SH "CREATION AND DESTRUCTION"
The caller is responsible for allocating the
.B dstr
structure. It can be initialized in any of the following ways:
.HP \*o
Using the macro
.B DSTR_INIT
as an initializer in the declaration of the object.
.HP \*o
Passing its address to the
.B dstr_create
function.
.HP \*o
Passing its address to the (equivalent)
.B DCREATE
macro.
.PP
The initial value of a
.B dstr
is the empty string.
.PP
The additional storage space for a string's contents may be reclaimed by
passing it to the
.B dstr_destroy
function, or the
.B DDESTROY
macro. After destruction, a string's value is reset to the empty
string:
.I "it's still a valid"
.BR dstr .
However, once a string has been destroyed, it's safe to deallocate the
underlying
.B dstr
object.
.PP
The
.B dstr_reset
function empties a string
.I without
deallocating any memory. Therefore appending more characters is quick,
because the old buffer is still there and doesn't need to be allocated
again. Calling
.VS
dstr_reset(d);
.VE
is equivalent to directly assigning
.VS
d->len = 0;
.VE
There's also a macro
.B DRESET
which does the same job as the
.B dstr_reset
function.
.SH "EXTENDING A STRING"
All memory allocation for strings is done by the function
.BR dstr_ensure .
Given a pointer
.I d
to a
.B dstr
and a size
.IR sz ,
the function ensures that there are at least
.I sz
unused bytes in the string's buffer. The current algorithm for
extending the buffer is fairly unsophisticated, but seems to work
relatively well \- see the source if you really want to know what it's
doing.
.PP
Extending a string never returns a failure result. Instead, if there
isn't enough memory for a longer string, the exception
.B EXC_NOMEM
is raised. See
.BR exc (3mLib)
for more information about
.BR mLib 's
exception handling system.
.PP
Note that if an ensure operation needs to reallocate a string buffer,
any pointers you've taken into the string become invalid.
.PP
There's a macro
.B DENSURE
which does a quick inline check to see whether there's enough space in
a string's buffer. This saves a procedure call when no reallocation
needs to be done. The
.B DENSURE
macro is called in the same way as the
.B dstr_ensure
function.
.PP
The function
.B dstr_tidy
`trims' a string's buffer so that it's just large enough for the string
contents and a null terminating byte. This might raise an exception due
to lack of memory. (There are two possible ways this might happen.
Firstly, the underlying allocator might just be braindamaged enough to
fail on reducing a block's size. Secondly, tidying an empty string with no
buffer allocated for it causes allocation of a buffer large enough for
the terminating null byte.)
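.PP
To make the contract of an ensure operation concrete, here is a minimal
sketch on a hypothetical stand-in type. The
.B sketch_
names, the doubling growth policy and the starting size of 16 are all
assumptions made for illustration, not a description of what
.B dstr_ensure
really does, and
.BR abort (3)
stands in for raising
.BR EXC_NOMEM .
Reset and destroy are included to show that the former keeps the buffer
while the latter returns to the empty state.
.VS
```c
#include <assert.h>
#include <stdlib.h>

typedef struct sketch_dstr {
  char *buf;
  size_t sz;
  size_t len;
} sketch_dstr;

/* Make sure at least want unused bytes follow the current contents.
 * Doubling is an assumed growth policy, not necessarily mLib's. */
static void sketch_ensure(sketch_dstr *d, size_t want)
{
  size_t need = d->len + want;
  size_t nsz;
  char *p;

  assert(d->sz >= d->len);
  if (need <= d->sz) return;            /* Enough room already */
  nsz = d->sz ? d->sz : 16;             /* Arbitrary starting size */
  while (nsz < need) nsz *= 2;
  p = realloc(d->buf, nsz);
  if (!p) abort();                      /* mLib raises EXC_NOMEM here */
  d->buf = p;                           /* Old pointers are now invalid */
  d->sz = nsz;
}

/* Empty the string but keep the buffer for later reuse. */
static void sketch_reset(sketch_dstr *d) { d->len = 0; }

/* Release the buffer, leaving a valid empty string behind. */
static void sketch_destroy(sketch_dstr *d)
{
  free(d->buf);
  d->buf = NULL;
  d->sz = d->len = 0;
}
```
.VE
The early return mirrors the cheap inline check described for
.BR DENSURE :
when enough space is already present, nothing is reallocated and existing
pointers into the buffer stay valid.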
.SH "CONTRIBUTING DATA TO A STRING"
There is a collection of functions which add data to a string. All of
these functions add their new data to the
.I end
of the string. This is good, because programs usually build strings
left-to-right. If you want to do something more clever, that's up to
you.
.PP
Several of these functions have equivalent macros which do the main work
inline. (There still might need to be a function call if the buffer
needs to be extended.)
.PP
Any of these functions might extend the string, causing pointers into
the string buffer to be invalidated. If you don't want that to happen,
pre-ensure enough space before you start.
.PP
The simplest function is
.B dstr_putc
which appends a single character
.I ch
to the end of the string. It has a macro equivalent called
.BR DPUTC .
.PP
The function
.B dstr_putz
places a zero byte at the end of the string. It does
.I not
affect the string's length, so any other data added to the string will
overwrite the null terminator. This is useful if you want to pass your
string to one of the standard C library string-handling functions. The
macro
.B DPUTZ
does the same thing.
.PP
The function
.B dstr_puts
writes a C-style null-terminated string to the end of a dynamic string.
A terminating zero byte is also written, as if
.B dstr_putz
were called. The macro
.B DPUTS
does the same job.
.PP
The function
.B dstr_putf
works similarly to the standard
.BR sprintf (3)
function. It accepts a
.BR printf (3)-style
format string and an arbitrary number of arguments to format and writes
the resulting text to the end of a dynamic string, returning the number
of characters so written. A terminating zero byte is also appended.
The formatting is intended to be convenient and safe rather than
efficient, so don't expect blistering performance. Similarly, there may
be differences between the formatting done by
.B dstr_putf
and
.BR sprintf (3)
because the former has to do most of its work itself. In particular,
.B dstr_putf
doesn't (and probably never will) understand the
.RB ` n$ '
positional parameter notation accepted by many Unix C libraries. There
is no macro equivalent of
.BR dstr_putf .
.PP
The function
.B dstr_vputf
provides access to the `guts' of
.BR dstr_putf :
given a format string and a
.B va_list
pointer, it will format the arguments according to the format string,
just as
.B dstr_putf
does.
.PP
The function
.B dstr_putd
appends the contents of one dynamic string to another. A null
terminator is also appended. The macro
.B DPUTD
does the same thing.
.PP
The function
.B dstr_putm
puts an arbitrary block of memory, addressed by
.IR p ,
with length
.I sz
bytes, at the end of a dynamic string. No terminating null is appended:
it's assumed that if you're playing with arbitrary chunks of memory then
you're probably not going to be using the resulting data as a normal
text string. The macro
.B DPUTM
works the same way.
.PP
The function
.B dstr_putline
reads a line from an input stream
.I fp
and appends it to a string. If an error occurs, or end-of-file is
encountered, before any characters have been read, then
.B dstr_putline
returns the value
.BR EOF .
Otherwise, it reads until it encounters a newline character, an error,
or end-of-file, and returns the number of characters read. If reading
was terminated by a newline character, the newline character is
.I not
inserted in the buffer. A terminating null is appended, as by
.BR dstr_putz .
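.PP
The difference between appending raw memory and appending a C string can
be sketched as follows. Everything here uses hypothetical
.B sketch_
names on a stand-in type, with an assumed doubling growth policy; it
illustrates the documented behaviour (the terminator is written but not
counted in
.BR len )
and is not the
.B mLib
implementation.
.VS
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct sketch_dstr {
  char *buf;
  size_t sz;
  size_t len;
} sketch_dstr;

/* Assumed growth helper: guarantee want unused bytes after len. */
static void sketch_ensure(sketch_dstr *d, size_t want)
{
  size_t need = d->len + want;
  size_t nsz = d->sz ? d->sz : 16;
  char *p;

  assert(d->sz >= d->len);
  if (need <= d->sz) return;
  while (nsz < need) nsz *= 2;
  p = realloc(d->buf, nsz);
  if (!p) abort();                      /* Stand-in for EXC_NOMEM */
  d->buf = p;
  d->sz = nsz;
}

/* Append sz raw bytes; no terminator is written. */
static void sketch_putm(sketch_dstr *d, const void *p, size_t sz)
{
  sketch_ensure(d, sz);
  if (sz) memcpy(d->buf + d->len, p, sz);
  d->len += sz;
}

/* Write a zero byte just past the end; len is deliberately unchanged. */
static void sketch_putz(sketch_dstr *d)
{
  sketch_ensure(d, 1);
  d->buf[d->len] = 0;
}

/* Append a C string, then terminate as if by sketch_putz. */
static void sketch_puts(sketch_dstr *d, const char *s)
{
  sketch_putm(d, s, strlen(s));
  sketch_putz(d);
}
```
.VE
Because the terminator sits just past
.BR len ,
the next append overwrites it, so repeated appends build one continuous
string.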
.SH "OTHER FUNCTIONS"
The
.B dstr_write
function writes a string to an output stream
.IR fp .
It returns the number of characters written, or
.B 0
if an error occurred before the first write. No newline character is
written to the stream, unless the string actually contains one already.
The macro
.B DWRITE
is equivalent.
.SH "SECURITY CONSIDERATIONS"
The implementation of the
.B dstr
functions is designed to do string handling in security-critical
programs. However, there may be bugs in the code somewhere. In
particular, the
.B dstr_putf
function is quite complicated, and could do with some checking by
independent people who know what they're doing.
.SH AUTHOR
Mark Wooding, <mdw@nsict.org>