.\" -*-nroff-*-
.de VS
.sp 1
.RS
.nf
.ft B
..
.de VE
.ft R
.fi
.RE
.sp 1
..
.de hP
.IP
.ft B
\h'-\w'\\$1\ 'u'\\$1\ \c
.ft P
..
.ie t .ds o \(bu
.el .ds o o
.TH dstr 3 "8 May 1999" "Straylight/Edgeware" "mLib utilities library"
.SH NAME
dstr \- a simple dynamic string type
.\" @dstr_create
.\" @dstr_destroy
.\" @dstr_reset
.\" @dstr_ensure
.\" @dstr_tidy
.\"
.\" @dstr_putc
.\" @dstr_putz
.\" @dstr_puts
.\" @dstr_putf
.\" @dstr_putd
.\" @dstr_putm
.\" @dstr_putline
.\" @dstr_write
.\"
.\" @DSTR_INIT
.\" @DCREATE
.\" @DDESTROY
.\" @DRESET
.\" @DENSURE
.\" @DPUTC
.\" @DPUTZ
.\" @DPUTS
.\" @DPUTD
.\" @DPUTM
.\" @DWRITE
.\"
.SH SYNOPSIS
.nf
.B "#include "
.BI "void dstr_create(dstr *" d );
.BI "void dstr_destroy(dstr *" d );
.BI "void dstr_reset(dstr *" d );
.BI "void dstr_ensure(dstr *" d ", size_t " sz );
.BI "void dstr_tidy(dstr *" d );
.BI "void dstr_putc(dstr *" d ", char " ch );
.BI "void dstr_putz(dstr *" d );
.BI "void dstr_puts(dstr *" d ", const char *" s );
.BI "int dstr_vputf(dstr *" d ", const char *" fmt ", va_list *" ap );
.BI "int dstr_putf(dstr *" d ", const char *" fmt ", ...);"
.BI "void dstr_putd(dstr *" d ", const dstr *" p );
.BI "void dstr_putm(dstr *" d ", const void *" p ", size_t " sz );
.BI "int dstr_putline(dstr *" d ", FILE *" fp );
.BI "size_t dstr_write(const dstr *" d ", FILE *" fp );
.BI "dstr " d " = DSTR_INIT;"
.BI "void DCREATE(dstr *" d );
.BI "void DDESTROY(dstr *" d );
.BI "void DRESET(dstr *" d );
.BI "void DENSURE(dstr *" d ", size_t " sz );
.BI "void DPUTC(dstr *" d ", char " ch );
.BI "void DPUTZ(dstr *" d );
.BI "void DPUTS(dstr *" d ", const char *" s );
.BI "void DPUTD(dstr *" d ", const dstr *" p );
.BI "void DPUTM(dstr *" d ", const void *" p ", size_t " sz );
.BI "size_t DWRITE(const dstr *" d ", FILE *" fp );
.fi
.SH DESCRIPTION
The header
.B dstr.h
declares a type for representing dynamically extending strings, and a small collection of useful operations on them.
None of the operations returns a failure result on an out-of-memory condition; instead, the exception
.B EXC_NOMEM
is raised.
.PP
Many of the functions which act on dynamic strings have macro equivalents.
These equivalent macros may evaluate their arguments multiple times.
.SS "Underlying type"
A
.B dstr
object is a small structure with the following members:
.VS
typedef struct dstr {
  char *buf;                    /* Pointer to string buffer */
  size_t sz;                    /* Size of the buffer */
  size_t len;                   /* Length of the string */
  arena *a;                     /* Pointer to arena */
} dstr;
.VE
The
.B buf
member points to the actual character data in the string.
The data may or may not be null-terminated, depending on what operations have recently been performed on it.
None of the
.B dstr
functions depend on the string being null-terminated; indeed, all of them work fine on strings containing arbitrary binary data.
You can force null-termination by calling the
.B dstr_putz
function, or the
.B DPUTZ
macro.
.PP
The
.B sz
member describes the current size of the buffer.
This reflects the maximum possible length of string that can be represented in
.B buf
without allocating a new buffer.
.PP
The
.B len
member describes the current length of the string.
It is the number of bytes in the string which are actually interesting.
The length does
.I not
include a null-terminating byte, if there is one.
.PP
The following invariants are maintained by
.B dstr
and must hold when any function is called:
.hP \*o
If
.B sz
is nonzero, then
.B buf
points to a block of memory of length
.BR sz .
If
.B sz
is zero, then
.B buf
is a null pointer.
.hP \*o
At all times,
.BR sz " \(>= " len .
.PP
Note that there is no equivalent of the standard C distinction between the empty string (a pointer to an array of characters whose first element is zero) and the nonexistent string (a null pointer).
Any
.B dstr
whose
.B len
is zero is an empty string.
.PP
The
.B a
member refers to the arena from which the string's buffer has been allocated.
Immediately after creation, this is set to be
.BR arena_stdlib (3);
you can set it to point to any other arena of your choice before the buffer is allocated.
.SS "Creation and destruction"
The caller is responsible for allocating the
.B dstr
structure.
It can be initialized by:
.hP \*o
using the macro
.B DSTR_INIT
as an initializer in the declaration of the object,
.hP \*o
passing its address to the
.B dstr_create
function, or
.hP \*o
passing its address to the (equivalent)
.B DCREATE
macro.
.PP
The initial value of a
.B dstr
is the empty string.
.PP
The additional storage space for a string's contents may be reclaimed by passing the string to the
.B dstr_destroy
function, or the
.B DDESTROY
macro.
After destruction, a string's value is reset to the empty string:
.I "it's still a valid"
.BR dstr .
However, once a string has been destroyed, it's safe to deallocate the underlying
.B dstr
object.
.PP
The
.B dstr_reset
function empties a string
.I without
deallocating any memory.
Therefore appending more characters is quick, because the old buffer is still there and doesn't need to be allocated again.
Calling
.VS
dstr_reset(d);
.VE
is equivalent to directly assigning
.VS
d->len = 0;
.VE
There's also a macro
.B DRESET
which does the same job as the
.B dstr_reset
function.
.SS "Extending a string"
All memory allocation for strings is done by the function
.BR dstr_ensure .
Given a pointer
.I d
to a
.B dstr
and a size
.IR sz ,
the function ensures that there are at least
.I sz
unused bytes in the string's buffer.
The current algorithm for extending the buffer is fairly unsophisticated, but seems to work relatively well \- see the source if you really want to know what it's doing.
.PP
Extending a string never returns a failure result.
Instead, if there isn't enough memory for a longer string, the exception
.B EXC_NOMEM
is raised.
See
.BR exc (3)
for more information about
.BR mLib 's
exception handling system.
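.PP
The ensure, reset and destroy behaviour described above can be sketched in plain C without mLib.
This is an illustrative stand-in only, not the library's code.
The names sketch_dstr, SKETCH_INIT, sketch_ensure, sketch_reset, sketch_destroy and sketch_ok are hypothetical; the arena member is left out; the starting size of 16 and the doubling policy are assumptions, not the real growth algorithm; and abort() stands in for raising EXC_NOMEM.
.VS
```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for dstr; the arena member is omitted. */
typedef struct sketch_dstr {
  char *buf;                    /* Null exactly when sz is zero */
  size_t sz;                    /* Size of the buffer */
  size_t len;                   /* Length of the string */
} sketch_dstr;

#define SKETCH_INIT { 0, 0, 0 }

/* Make sure at least want unused bytes follow the string. */
static void sketch_ensure(sketch_dstr *d, size_t want)
{
  size_t need = d->len + want;
  size_t nsz;
  char *p;

  if (need <= d->sz) return;    /* Enough room already */
  nsz = d->sz ? d->sz : 16;     /* Assumed starting size */
  while (nsz < need) nsz *= 2;  /* Assumed doubling policy */
  p = realloc(d->buf, nsz);     /* May move the buffer */
  if (!p) abort();              /* mLib raises EXC_NOMEM here */
  d->buf = p;
  d->sz = nsz;
}

/* Empty the string but keep the buffer for later appends. */
static void sketch_reset(sketch_dstr *d)
{
  d->len = 0;
}

/* Release the buffer; the object is a valid empty string again. */
static void sketch_destroy(sketch_dstr *d)
{
  free(d->buf);
  d->buf = 0;
  d->sz = d->len = 0;
}

/* Check the two documented invariants; returns nonzero if they hold. */
static int sketch_ok(const sketch_dstr *d)
{
  return (d->sz ? d->buf != 0 : d->buf == 0) && d->sz >= d->len;
}
```
.VE
.PP
In this sketch a freshly initialized object owns no buffer at all, the first call to sketch_ensure allocates one, and sketch_reset leaves sz untouched so that later appends can reuse the space.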
.PP
Note that if an ensure operation needs to reallocate a string buffer, any pointers you've taken into the string become invalid.
.PP
There's a macro
.B DENSURE
which does a quick inline check to see whether there's enough space in a string's buffer.
This saves a procedure call when no reallocation needs to be done.
The
.B DENSURE
macro is called in the same way as the
.B dstr_ensure
function.
.PP
The function
.B dstr_tidy
`trims' a string's buffer so that it's just large enough for the string contents and a null terminating byte.
This might raise an exception due to lack of memory.
(There are two possible ways this might happen.
Firstly, the underlying allocator might just be brain-damaged enough to fail on reducing a block's size.
Secondly, tidying an empty string with no buffer allocated for it causes allocation of a buffer large enough for the terminating null byte.)
.SS "Contributing data to a string"
There is a collection of functions which add data to a string.
All of these functions add their new data to the
.I end
of the string.
This is good, because programs usually build strings left-to-right.
If you want to do something more clever, that's up to you.
.PP
Several of these functions have equivalent macros which do the main work inline.
(There still might need to be a function call if the buffer needs to be extended.)
.PP
Any of these functions might extend the string, causing pointers into the string buffer to be invalidated.
If you don't want that to happen, pre-ensure enough space before you start.
.PP
The simplest function is
.B dstr_putc
which appends a single character
.I ch
to the end of the string.
It has a macro equivalent called
.BR DPUTC .
.PP
The function
.B dstr_putz
places a zero byte at the end of the string.
It does
.I not
affect the string's length, so any other data added to the string will overwrite the null terminator.
This is useful if you want to pass your string to one of the standard C library string-handling functions.
The macro
.B DPUTZ
does the same thing.
.PP
The function
.B dstr_puts
writes a C-style null-terminated string to the end of a dynamic string.
A terminating zero byte is also written, as if
.B dstr_putz
were called.
The macro
.B DPUTS
does the same job.
.PP
The function
.B dstr_putf
works similarly to the standard
.BR sprintf (3)
function.
It accepts a
.BR printf (3)-style
format string and an arbitrary number of arguments to format and writes the resulting text to the end of a dynamic string, returning the number of characters so written.
A terminating zero byte is also appended.
The formatting is intended to be convenient and safe rather than efficient, so don't expect blistering performance.
Similarly, there may be differences between the formatting done by
.B dstr_putf
and
.BR sprintf (3)
because the former has to do most of its work itself.
In particular,
.B dstr_putf
doesn't (and probably never will) understand the
.RB ` n$ '
positional parameter notation accepted by many Unix C libraries.
There is no macro equivalent of
.BR dstr_putf .
.PP
The function
.B dstr_vputf
provides access to the `guts' of
.BR dstr_putf :
given a format string and a pointer to a
.BR va_list ,
it will format the arguments according to the format string, just as
.B dstr_putf
does.
(Note: that's a
.BR "va_list *" ,
not a plain
.BR va_list ,
so that it gets updated properly on exit.)
.PP
The function
.B dstr_putd
appends the contents of one dynamic string to another.
A null terminator is also appended.
The macro
.B DPUTD
does the same thing.
.PP
The function
.B dstr_putm
puts an arbitrary block of memory, addressed by
.IR p ,
with length
.I sz
bytes, at the end of a dynamic string.
No terminating null is appended: it's assumed that if you're playing with arbitrary chunks of memory then you're probably not going to be using the resulting data as a normal text string.
The macro
.B DPUTM
works the same way.
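.PP
The difference between appending data and terminating a string can be seen in a small stand-alone C sketch.
It is an illustration written without mLib, not the library's implementation.
The names sketch_dstr, sketch_ensure, sketch_putm, sketch_putz, sketch_putc and sketch_puts are hypothetical, the growth policy is an assumption, and abort() stands in for raising EXC_NOMEM.
.VS
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for dstr; the arena member is omitted. */
typedef struct sketch_dstr {
  char *buf;                    /* Pointer to string buffer */
  size_t sz;                    /* Size of the buffer */
  size_t len;                   /* Length of the string */
} sketch_dstr;

/* Make sure at least want unused bytes follow the string. */
static void sketch_ensure(sketch_dstr *d, size_t want)
{
  size_t need = d->len + want;
  size_t nsz;
  char *p;

  if (need <= d->sz) return;
  nsz = d->sz ? d->sz : 16;     /* Assumed starting size */
  while (nsz < need) nsz *= 2;  /* Assumed doubling policy */
  p = realloc(d->buf, nsz);
  if (!p) abort();              /* mLib raises EXC_NOMEM here */
  d->buf = p;
  d->sz = nsz;
}

/* Append sz raw bytes; no terminator is written. */
static void sketch_putm(sketch_dstr *d, const void *p, size_t sz)
{
  if (!sz) return;
  sketch_ensure(d, sz);
  memcpy(d->buf + d->len, p, sz);
  d->len += sz;
}

/* Write a zero byte after the string without counting it in len. */
static void sketch_putz(sketch_dstr *d)
{
  sketch_ensure(d, 1);
  d->buf[d->len] = 0;
}

/* Append one character. */
static void sketch_putc(sketch_dstr *d, char ch)
{
  sketch_ensure(d, 1);
  d->buf[d->len++] = ch;
}

/* Append a C string, then terminate as sketch_putz would. */
static void sketch_puts(sketch_dstr *d, const char *s)
{
  sketch_putm(d, s, strlen(s));
  sketch_putz(d);
}
```
.VE
.PP
Because the zero byte written by sketch_putz lies just past len, the next append overwrites it; after a raw sketch_putm the buffer is only safe to hand to the C string functions once it has been terminated again.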
.PP
The function
.B dstr_putline
reads a line from an input stream
.I fp
and appends it to a string.
If an error occurs, or end-of-file is encountered, before any characters have been read, then
.B dstr_putline
returns the value
.B EOF
and does not extend the string.
Otherwise, it reads until it encounters a newline character, an error, or end-of-file, and returns the number of characters read.
If reading was terminated by a newline character, the newline character is
.I not
inserted in the buffer.
A terminating null is appended, as by
.BR dstr_putz .
.SS "Other functions"
The
.B dstr_write
function writes a string to an output stream
.IR fp .
It returns the number of characters written, or
.B 0
if an error occurred before the first write.
No newline character is written to the stream, unless the string actually contains one already.
The macro
.B DWRITE
is equivalent.
.SH "SECURITY CONSIDERATIONS"
The implementation of the
.B dstr
functions is designed to do string handling in security-critical programs.
However, there may be bugs in the code somewhere.
In particular, the
.B dstr_putf
functions are quite complicated, and could do with some checking by independent people who know what they're doing.
.SH "SEE ALSO"
.BR exc (3),
.BR mLib (3).
.SH AUTHOR
Mark Wooding,