1 | .\" -*-nroff-*- |
2 | .de VS |
3 | .sp 1 |
4 | .RS 5 |
5 | .nf |
6 | .ft B |
7 | .. |
8 | .de VE |
9 | .ft R |
10 | .fi |
11 | .RE |
12 | .sp 1 |
13 | .. |
14 | .de HP |
15 | .IP |
16 | .ft B |
17 | \h'-\w'\\$1\ 'u'\\$1\ \c |
18 | .ft P |
19 | .. |
20 | .ie t .ds o \(bu |
21 | .el .ds o o |
22 | .TH dstr 3mLib "8 May 1999" "mLib" |
23 | dstr \- a simple dynamic string type |
24 | .SH SYNOPSIS |
25 | .nf |
26 | .B "#include <mLib/dstr.h>" |
27 | |
28 | .BI "void dstr_create(dstr *" d ); |
29 | .BI "void dstr_destroy(dstr *" d ); |
30 | .BI "void dstr_reset(dstr *" d ); |
31 | |
32 | .BI "void dstr_ensure(dstr *" d ", size_t " sz ); |
33 | .BI "void dstr_tidy(dstr *" d ); |
34 | |
35 | .BI "void dstr_putc(dstr *" d ", char " ch ); |
36 | .BI "void dstr_putz(dstr *" d ); |
37 | .BI "void dstr_puts(dstr *" d ", const char *" s ); |
38 | .BI "int dstr_vputf(dstr *" d ", const char *" p ", va_list " ap ); |
39 | .BI "int dstr_putf(dstr *" d ", const char *" p ", ...);" |
40 | .BI "void dstr_putd(dstr *" d ", const dstr *" p ); |
41 | .BI "void dstr_putm(dstr *" d ", const void *" p ", size_t " sz ); |
42 | .BI "int dstr_putline(dstr *" d ", FILE *" fp ); |
43 | .BI "size_t dstr_write(const dstr *" d ", FILE *" fp ); |
44 | |
45 | .BI "void DCREATE(dstr *" d ); |
46 | .BI "void DDESTROY(dstr *" d ); |
47 | .BI "void DRESET(dstr *" d ); |
48 | .BI "void DENSURE(dstr *" d ", size_t " sz ); |
49 | .BI "void DPUTZ(dstr *" d ); |
50 | .BI "void DPUTS(dstr *" d ", const char *" s ); |
51 | .BI "void DPUTD(dstr *" d ", const dstr *" p ); |
52 | .BI "void DPUTM(dstr *" d ", const void *" p ", size_t " sz ); |
53 | .BI "size_t DWRITE(const dstr *" d ", FILE *" fp ); |
54 | .fi |
55 | .SH SUMMARY |
56 | The header |
57 | .B dstr.h |
58 | declares a type for representing dynamically extending strings, and a |
59 | small collection of useful operations on them. None of the operations |
60 | returns a failure result on an out-of-memory condition; instead, the |
61 | exception |
62 | .B EXC_NOMEM |
63 | is raised. |
64 | .PP |
65 | Many of the functions which act on dynamic strings have macro |
66 | equivalents. These equivalent macros may evaluate their arguments |
67 | multiple times. |
68 | .SH "UNDERLYING TYPE" |
69 | A |
70 | .B dstr |
71 | object is a small structure with the following members: |
72 | .VS |
73 | typedef struct dstr { |
74 | char *buf; /* Pointer to string buffer */ |
75 | size_t sz; /* Size of the buffer */ |
76 | size_t len; /* Length of the string */ |
77 | } dstr; |
78 | .VE |
79 | The |
80 | .B buf |
81 | member points to the actual character data in the string. The data may |
82 | or may not be null terminated, depending on what operations have |
83 | recently been performed on it. None of the |
84 | .B dstr |
85 | functions depend on the string being null-terminated; indeed, all of |
86 | them work fine on strings containing arbitrary binary data. You can |
87 | force null-termination by calling the |
88 | .B dstr_putz |
89 | function, or the |
90 | .B DPUTZ |
91 | macro. |
92 | .PP |
93 | The |
94 | .B sz |
95 | member describes the current size of the buffer. This reflects the |
96 | maximum possible length of string that can be represented in |
97 | .B buf |
98 | without allocating a new buffer. |
99 | .PP |
100 | The |
101 | .B len |
102 | member describes the current length of the string. It is the number of |
103 | bytes in the string which are actually interesting. The length does |
104 | .I not |
105 | include a null-terminating byte, if there is one. |
106 | .PP |
107 | The following invariants are maintained by |
108 | .B dstr |
109 | and must hold when any function is called: |
110 | .HP \*o |
111 | If |
112 | .B sz |
113 | is nonzero, then |
114 | .B buf |
115 | points to a block of memory of length |
116 | .BR sz . |
117 | If |
118 | .B sz |
119 | is zero, then |
120 | .B buf |
121 | is a null pointer. |
122 | .HP \*o |
123 | At all times, |
124 | .BI sz " >= " len\fR. |
125 | .PP |
126 | Note that there is no equivalent of the standard C distinction between |
127 | the empty string (a pointer to an array of characters whose first |
128 | element is zero) and the nonexistent string (a null pointer). Any |
129 | .B dstr |
130 | whose |
131 | .B len |
132 | is zero is an empty string. |
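The invariants above can be written down as a small checker. This is only a sketch: the type
sketch_dstr and both functions are hypothetical stand-ins invented here so that the example
compiles on its own; they are not part of mLib.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the dstr layout described above.  This is NOT
 * the mLib header, only a stand-in so the sketch is self-contained. */
typedef struct sketch_dstr {
  char *buf;                    /* Pointer to string buffer */
  size_t sz;                    /* Size of the buffer */
  size_t len;                   /* Length of the string */
} sketch_dstr;

/* Return nonzero if both documented invariants hold for d. */
static int sketch_invariants_ok(const sketch_dstr *d)
{
  if (d->sz == 0 && d->buf != NULL) return 0;   /* sz zero => buf null */
  if (d->sz != 0 && d->buf == NULL) return 0;   /* sz nonzero => buf set */
  return d->sz >= d->len;                       /* sz >= len at all times */
}

/* Return nonzero if d is the empty string.  Only len matters: whether
 * a buffer happens to be allocated makes no difference. */
static int sketch_is_empty(const sketch_dstr *d)
{
  assert(sketch_invariants_ok(d));
  return d->len == 0;
}
```

A string with no buffer at all and a string with a buffer but zero length both count as empty
here, which is the point of the note above.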
133 | .SH "CREATION AND DESTRUCTION" |
134 | The caller is responsible for allocating the |
135 | .B dstr |
136 | structure. It can be initialized in any of the following ways: |
137 | .HP \*o |
138 | Using the macro |
139 | .B DSTR_INIT |
140 | as an initializer in the declaration of the object. |
141 | .HP \*o |
142 | Passing its address to the |
143 | .B dstr_create |
144 | function. |
145 | .HP \*o |
146 | Passing its address to the (equivalent) |
147 | .B DCREATE |
148 | macro. |
149 | .PP |
150 | The initial value of a |
151 | .B dstr |
152 | is the empty string. |
153 | .PP |
154 | The additional storage space for a string's contents may be reclaimed by |
155 | passing it to the |
156 | .B dstr_destroy |
157 | function, or the |
158 | .B DDESTROY |
159 | macro. After destruction, a string's value is reset to the empty |
160 | string: |
161 | .I "it's still a valid" |
162 | .BR dstr . |
163 | However, once a string has been destroyed, it's safe to deallocate the |
164 | underlying |
165 | .B dstr |
166 | object. |
167 | .PP |
168 | The |
169 | .B dstr_reset |
170 | function empties a string |
171 | .I without |
172 | deallocating any memory. Therefore appending more characters is quick, |
173 | because the old buffer is still there and doesn't need to be allocated. |
174 | Calling |
175 | .VS |
176 | dstr_reset(d); |
177 | .VE |
178 | is equivalent to directly assigning |
179 | .VS |
180 | d->len = 0; |
181 | .VE |
182 | There's also a macro |
183 | .B DRESET |
184 | which does the same job as the |
185 | .B dstr_reset |
186 | function. |
187 | .SH "EXTENDING A STRING" |
188 | All memory allocation for strings is done by the function |
189 | .BR dstr_ensure . |
190 | Given a pointer |
191 | .I d |
192 | to a |
193 | .B dstr |
194 | and a size |
195 | .IR sz , |
196 | the function ensures that there are at least |
197 | .I sz |
198 | unused bytes in the string's buffer. The current algorithm for |
199 | extending the buffer is fairly unsophisticated, but seems to work |
200 | relatively well \- see the source if you really want to know what it's |
201 | doing. |
202 | .PP |
203 | Extending a string never returns a failure result. Instead, if there |
204 | isn't enough memory for a longer string, the exception |
205 | .B EXC_NOMEM |
206 | is raised. See |
207 | .BR exc (3mLib) |
208 | for more information about |
209 | .BR mLib 's |
210 | exception handling system. |
211 | .PP |
212 | Note that if an ensure operation needs to reallocate a string buffer, |
213 | any pointers you've taken into the string become invalid. |
214 | .PP |
215 | There's a macro |
216 | .B DENSURE |
217 | which does a quick inline check to see whether there's enough space in |
218 | a string's buffer. This saves a procedure call when no reallocation |
219 | needs to be done. The |
220 | .B DENSURE |
221 | macro is called in the same way as the |
222 | .B dstr_ensure |
223 | function. |
224 | .PP |
225 | The function |
226 | .B dstr_tidy |
227 | `trims' a string's buffer so that it's just large enough for the string |
228 | contents and a null terminating byte. This might raise an exception due |
229 | to lack of memory. (There are two possible ways this might happen. |
230 | Firstly, the underlying allocator might just be braindamaged enough to |
231 | fail on reducing a block's size. Secondly, tidying an empty string with no |
232 | buffer allocated for it causes allocation of a buffer large enough for |
233 | the terminating null byte.) |
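To make the contract of an ensure operation concrete, here is a minimal sketch. It assumes a
doubling growth policy and an initial size of 32 bytes; both are guesses made for illustration,
and the real algorithm may well differ (see the source). The names sketch_dstr and sketch_ensure
are hypothetical, and the sketch returns -1 on allocation failure where mLib would raise
EXC_NOMEM.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for dstr; not the mLib type. */
typedef struct sketch_dstr {
  char *buf;                    /* Pointer to string buffer */
  size_t sz;                    /* Size of the buffer */
  size_t len;                   /* Length of the string */
} sketch_dstr;

/* Make sure at least `want' unused bytes follow the current contents.
 * Returns 0 on success, -1 if memory runs out. */
static int sketch_ensure(sketch_dstr *d, size_t want)
{
  size_t need, nsz;
  char *p;

  if (want > (size_t)-1 - d->len) return -1;    /* len + want overflows */
  need = d->len + want;
  if (need <= d->sz) return 0;                  /* already enough room */

  nsz = d->sz ? d->sz : 32;                     /* assumed starting size */
  while (nsz < need) {
    if (nsz > (size_t)-1 / 2) { nsz = need; break; }
    nsz *= 2;                                   /* assumed growth policy */
  }

  p = realloc(d->buf, nsz);                     /* may move the buffer */
  if (!p) return -1;
  d->buf = p;
  d->sz = nsz;
  assert(d->sz - d->len >= want);
  return 0;
}
```

Because realloc may move the block, any pointer taken into the old buffer must be treated as
invalid after a call which had to grow the string.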
234 | .SH "CONTRIBUTING DATA TO A STRING" |
235 | There is a collection of functions which add data to a string. All of |
236 | these functions add their new data to the |
237 | .I end |
238 | of the string. This is good, because programs usually build strings |
239 | left-to-right. If you want to do something more clever, that's up to |
240 | you. |
241 | .PP |
242 | Several of these functions have equivalent macros which do the main work |
243 | inline. (There still might need to be a function call if the buffer |
244 | needs to be extended.) |
245 | .PP |
246 | Any of these functions might extend the string, causing pointers into |
247 | the string buffer to be invalidated. If you don't want that to happen, |
248 | pre-ensure enough space before you start. |
249 | .PP |
250 | The simplest function is |
251 | .B dstr_putc |
252 | which appends a single character |
253 | .I ch |
254 | to the end of the string. It has a macro equivalent called |
255 | .BR DPUTC . |
256 | .PP |
257 | The function |
258 | .B dstr_putz |
259 | places a zero byte at the end of the string. It does |
260 | .I not |
261 | affect the string's length, so any other data added to the string will |
262 | overwrite the null terminator. This is useful if you want to pass your |
263 | string to one of the standard C library string-handling functions. The |
264 | macro |
265 | .B DPUTZ |
266 | does the same thing. |
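The difference between appending a character and placing a terminator is easiest to see in
code. The sketch below is an illustration only: sketch_dstr, sketch_ensure, sketch_putc and
sketch_putz are hypothetical names, the growth policy is an assumption, and failures are
reported by return value where mLib would raise an exception.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for dstr; not the mLib type. */
typedef struct sketch_dstr {
  char *buf;                    /* Pointer to string buffer */
  size_t sz;                    /* Size of the buffer */
  size_t len;                   /* Length of the string */
} sketch_dstr;

/* Simplified ensure: grow by doubling (an assumption, not mLib's rule). */
static int sketch_ensure(sketch_dstr *d, size_t want)
{
  size_t need = d->len + want, nsz = d->sz ? d->sz : 16;
  char *p;

  if (need <= d->sz) return 0;
  while (nsz < need) nsz *= 2;
  p = realloc(d->buf, nsz);
  if (!p) return -1;
  d->buf = p;
  d->sz = nsz;
  return 0;
}

/* Append one character: the length grows by one. */
static int sketch_putc(sketch_dstr *d, char ch)
{
  if (sketch_ensure(d, 1)) return -1;
  d->buf[d->len++] = ch;
  return 0;
}

/* Place a zero byte just past the end: the length does NOT change, so
 * the next append overwrites the terminator. */
static int sketch_putz(sketch_dstr *d)
{
  if (sketch_ensure(d, 1)) return -1;
  assert(d->len < d->sz);
  d->buf[d->len] = 0;
  return 0;
}
```

After appending `h' and `i' and then placing a terminator, the length is still 2, and appending
a further character lands on top of the zero byte.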
267 | .PP |
268 | The function |
269 | .B dstr_puts |
270 | writes a C-style null-terminated string to the end of a dynamic string. |
271 | A terminating zero byte is also written, as if |
272 | .B dstr_putz |
273 | were called. The macro |
274 | .B DPUTS |
275 | does the same job. |
276 | .PP |
277 | The function |
278 | .B dstr_putf |
279 | works similarly to the standard |
280 | .BR sprintf (3) |
281 | function. It accepts a |
282 | .BR printf (3)-style |
283 | format string and an arbitrary number of arguments to format and writes |
284 | the resulting text to the end of a dynamic string, returning the number |
285 | of characters so written. A terminating zero byte is also appended. |
286 | The formatting is intended to be convenient and safe rather than |
287 | efficient, so don't expect blistering performance. Similarly, there may |
288 | be differences between the formatting done by |
289 | .B dstr_putf |
290 | and |
291 | .BR sprintf (3) |
292 | because the former has to do most of its work itself. In particular, |
293 | .B dstr_putf |
294 | doesn't (and probably never will) understand the |
295 | .RB ` n$ ' |
296 | positional parameter notation accepted by many Unix C libraries. There |
297 | is no macro equivalent of |
298 | .BR dstr_putf . |
299 | .PP |
300 | The function |
301 | .B dstr_vputf |
302 | provides access to the `guts' of |
303 | .BR dstr_putf : |
304 | given a format string and a |
305 | .B va_list |
306 | pointer, it will format the arguments according to the format string, |
307 | just as |
308 | .B dstr_putf |
309 | does. |
310 | .PP |
311 | The function |
312 | .B dstr_putd |
313 | appends the contents of one dynamic string to another. A null |
314 | terminator is also appended. The macro |
315 | .B DPUTD |
316 | does the same thing. |
317 | .PP |
318 | The function |
319 | .B dstr_putm |
320 | puts an arbitrary block of memory, addressed by |
321 | .IR p , |
322 | with length |
323 | .I sz |
324 | bytes, at the end of a dynamic string. No terminating null is appended: |
325 | it's assumed that if you're playing with arbitrary chunks of memory then |
326 | you're probably not going to be using the resulting data as a normal |
327 | text string. The macro |
328 | .B DPUTM |
329 | works the same way. |
330 | .PP |
331 | The function |
332 | .B dstr_putline |
333 | reads a line from an input stream |
334 | .I fp |
335 | and appends it to a string. If an error occurs, or end-of-file is |
336 | encountered, before any characters have been read, then |
337 | .B dstr_putline |
338 | returns the value |
339 | .BR EOF . |
340 | Otherwise, it reads until it encounters a newline character, an error, |
341 | or end-of-file, and returns the number of characters read. If reading |
342 | was terminated by a newline character, the newline character is |
343 | .I not |
344 | inserted in the buffer. A terminating null is appended, as by |
345 | .BR dstr_putz . |
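The line-reading behaviour described above can be sketched as follows. This is not the mLib
implementation: the names sketch_dstr, sketch_ensure and sketch_putline are hypothetical, the
sketch assumes that the returned count excludes the newline, and it calls abort on allocation
failure where mLib would raise EXC_NOMEM.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for dstr; not the mLib type. */
typedef struct sketch_dstr {
  char *buf;                    /* Pointer to string buffer */
  size_t sz;                    /* Size of the buffer */
  size_t len;                   /* Length of the string */
} sketch_dstr;

/* Simplified ensure: grow by doubling (an assumption, not mLib's rule). */
static int sketch_ensure(sketch_dstr *d, size_t want)
{
  size_t need = d->len + want, nsz = d->sz ? d->sz : 16;
  char *p;

  if (need <= d->sz) return 0;
  while (nsz < need) nsz *= 2;
  p = realloc(d->buf, nsz);
  if (!p) return -1;
  d->buf = p;
  d->sz = nsz;
  return 0;
}

/* Append one line from fp to d.  Returns EOF if nothing at all could
 * be read, otherwise the number of characters stored (newline excluded,
 * by assumption).  The newline is dropped; a terminator is written
 * without being counted in the length. */
static int sketch_putline(sketch_dstr *d, FILE *fp)
{
  int rd = 0, ch;

  for (;;) {
    ch = getc(fp);
    if (ch == EOF && !rd) return EOF;           /* error or EOF up front */
    if (sketch_ensure(d, 1)) abort();           /* room for ch or for 0 */
    assert(d->len < d->sz);
    if (ch == EOF || ch == '\n') {
      d->buf[d->len] = 0;                       /* terminate, len unchanged */
      return rd;
    }
    d->buf[d->len++] = (char)ch;
    rd++;
  }
}
```

Note that an empty line yields 0 rather than EOF, so a caller can tell blank lines apart from
the end of the input.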
346 | .SH "OTHER FUNCTIONS" |
347 | The |
348 | .B dstr_write |
349 | function writes a string to an output stream |
350 | .IR fp . |
351 | It returns the number of characters written, or |
352 | .B 0 |
353 | if an error occurred before the first write. No newline character is |
354 | written to the stream, unless the string actually contains one already. |
355 | The macro |
356 | .B DWRITE |
357 | is equivalent. |
358 | .SH "SECURITY CONSIDERATIONS" |
359 | The implementation of the |
360 | .B dstr |
361 | functions is designed to do string handling in security-critical |
362 | programs. However, there may be bugs in the code somewhere. In |
363 | particular, the |
364 | .B dstr_putf |
365 | function is quite complicated, and could do with some checking by |
366 | independent people who know what they're doing. |
367 | .SH AUTHOR |
368 | Mark Wooding, <mdw@nsict.org> |