.\" -*-nroff-*-
.de VS
.sp 1
.in +5n
.ft B
.nf
..
.de VE
.ft R
.in -5n
.sp 1
.fi
..
.TH str 3 "20 June 1999" "Straylight/Edgeware" "mLib utilities library"
.SH NAME
str \- small string utilities
.\" @str_qword
.\" @str_qsplit
.\" @str_getword
.\" @str_split
.\" @str_match
.\" @str_sanitize
.SH SYNOPSIS
.nf
.B "#include <mLib/str.h>"

.BI "char *str_qword(char **" pp ", unsigned " f );
.BI "size_t str_qsplit(char *" p ", char *" v "[], size_t " c ,
.BI " char **" rest ", unsigned " f );
.BI "char *str_getword(char **" pp );
.BI "size_t str_split(char *" p ", char *" v "[], size_t " c ", char **" rest );
.BI "int str_match(const char *" p ", const char *" s );
.BI "void str_sanitize(char *" d ", const char *" p ", size_t " sz );
.fi
.SH DESCRIPTION
The header file
.B <mLib/str.h>
contains a few small utility functions for manipulating null-terminated
strings.
.PP
The function
.B str_qword
extracts the next whitespace-delimited word from a string. The
function's argument,
.IR pp ,
is the address of a pointer into the string: this pointer is updated by
.B str_qword
so that it can extract the following word on the next call and so on.
The return value is the address of the next word, appropriately null
terminated. A null pointer is returned if the entire remainder of the
string is whitespace. Note that
.B str_qword
modifies the string as it goes, to null-terminate the individual words.
If the flag
.B STRF_QUOTE
is passed, the single- and double-quote characters may be used to quote
whitespace within words, and the backslash can escape quote characters
and whitespace.
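.PP
As a rough illustration of the in-place behaviour described above, the
following sketch extracts unquoted words only. The name
.B sketch_getword
is hypothetical and this is not mLib's implementation; in particular,
it ignores
.B STRF_QUOTE
entirely.
.VS
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of mLib: extract the next
 * whitespace-delimited word from *pp, writing a null terminator
 * after it and updating *pp for the following call. */
char *sketch_getword(char **pp)
{
  char *p, *q;

  assert(pp);
  p = *pp;
  if (!p) return (0);

  /* Skip leading whitespace; give up if nothing else is left. */
  while (isspace((unsigned char)*p)) p++;
  if (!*p) { *pp = 0; return (0); }

  /* Find the end of the word. */
  q = p;
  while (*q && !isspace((unsigned char)*q)) q++;

  /* Terminate the word and remember where to resume. */
  if (*q) *q++ = 0;
  *pp = q;
  return (p);
}
```
.VE
Calling the sketch repeatedly on the same pointer yields each word in
turn, and then a null pointer once only whitespace remains.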
.PP
The function
.B str_qsplit
divides a string into whitespace-separated words. The arguments are as
follows:
.TP
.BI "char *" p
The address of the string to split. The string is modified by having
null terminators written after each word extracted.
.TP
.BI "char *" v []
The address of an array of pointers to characters. This array will be
filled in by
.BR str_qsplit :
the first entry will point to the first word extracted from the string,
and so on. If there aren't enough words in the string, the remaining
array elements are filled with null pointers.
.TP
.BI "size_t " c
The maximum number of words to extract; also, the number of elements in
the array
.IR v .
.TP
.BI "char **" rest
The address of a pointer in which to store the address of the remainder
of the string. Leading whitespace is removed from the remainder before
storing. If the remainder string is empty, a null pointer is stored
instead. If
.I rest
is null, the remainder pointer is discarded.
.TP
.BI "unsigned " f
Flags, as for
.BR str_qword .
.PP
The return value of
.B str_qsplit
is the number of words extracted from the input string.
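.PP
The argument descriptions above can be sketched as follows, assuming a
flags word of zero (no quoting). The name
.B sketch_split
is hypothetical and the code is an illustration of the documented
behaviour, not mLib's implementation.
.VS
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of mLib: split p into at most c
 * words, storing them in v and the remainder in *rest. */
size_t sketch_split(char *p, char *v[], size_t c, char **rest)
{
  size_t n = 0;
  size_t i;

  assert(p && v);

  while (n < c) {
    /* Skip whitespace before the next word; stop if none is left. */
    while (isspace((unsigned char)*p)) p++;
    if (!*p) break;

    /* Record the word and null-terminate it. */
    v[n++] = p;
    while (*p && !isspace((unsigned char)*p)) p++;
    if (*p) *p++ = 0;
  }

  /* Fill any unused array elements with null pointers. */
  for (i = n; i < c; i++) v[i] = 0;

  /* Store the remainder, without leading whitespace, or a null
   * pointer if nothing is left. */
  while (isspace((unsigned char)*p)) p++;
  if (rest) *rest = *p ? p : 0;
  return (n);
}
```
.VE
Passing a null
.I rest
simply discards the remainder, as described above.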
.PP
The functions
.B str_getword
and
.B str_split
are veneers over
.B str_qword
and
.B str_qsplit
respectively; each is equivalent to a call to the corresponding latter
function with a flags word of zero.
.PP
The
.B str_match
function does simple wildcard matching. The first argument is a
pattern, which may contain metacharacters:
.RB ` * '
matches zero or more arbitrary characters;
.RB ` ? '
matches exactly one arbitrary character; and
.RB ` [ ... ] '
matches one of the characters listed. The backslash
.RB ` \e '
escapes the following character. Within square brackets, the
hyphen
.RB ` \- '
may be used to designate ranges of characters. If the initial character
of the set is
.RB ` ! '
or
.RB ` ^ '
then the sense of the match is reversed. To literally match a
.RB ` ] '
character, list it first; to literally match a
.RB ` \- '
character, list it immediately after a range, or at the beginning or end
of the set. The return value is nonzero if the pattern
.I p
matches the given string
.IR s ,
or zero if the pattern doesn't match.
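.PP
To make the matching rules concrete, here is a deliberately reduced
sketch which handles only the
.RB ` * '
and
.RB ` ? '
metacharacters; bracketed sets and backslash escapes are left out. The
name
.B sketch_match
is hypothetical, and the recursive approach shown is an assumption for
illustration rather than mLib's actual algorithm.
.VS
```c
#include <assert.h>

/* Hypothetical helper, not part of mLib: match s against pattern p,
 * supporting only the star and question-mark metacharacters.
 * Returns nonzero on a match and zero otherwise. */
int sketch_match(const char *p, const char *s)
{
  assert(p && s);

  for (;;) {
    switch (*p) {
      case 0:
        /* End of pattern: succeed only at end of string. */
        return (*s == 0);
      case '*':
        /* Collapse a run of stars, then try each possible tail. */
        while (*p == '*') p++;
        if (!*p) return (1);
        for (; *s; s++) {
          if (sketch_match(p, s)) return (1);
        }
        return (0);
      case '?':
        /* Any single character, but not the end of the string. */
        if (!*s) return (0);
        break;
      default:
        /* An ordinary character must match exactly. */
        if (*p != *s) return (0);
        break;
    }
    p++; s++;
  }
}
```
.VE
As with
.BR str_match ,
the pattern comes first and the string to test comes second.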
.PP
The function
.B str_sanitize
copies at most
.I sz \- 1
characters from the string
.I p
to
.IR d .
The result string is null terminated. Any nonprinting characters in
.I p
are replaced by an underscore
.RB ` _ '
when written to
.IR d .
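.PP
The copying rule can be sketched as below. The name
.B sketch_sanitize
is hypothetical; the use of
.BR isprint (3)
to decide what counts as printing, and the choice to write nothing when
.I sz
is zero, are assumptions of this illustration rather than documented
mLib behaviour.
.VS
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of mLib: copy at most sz - 1
 * characters from p to d, replacing nonprinting characters with
 * underscores, and null-terminate the result. */
void sketch_sanitize(char *d, const char *p, size_t sz)
{
  assert(d && p);

  /* With no room even for the terminator, write nothing. */
  if (!sz) return;

  /* Reserve one byte for the terminator. */
  sz--;
  while (*p && sz) {
    int ch = (unsigned char)*p++;
    *d++ = isprint(ch) ? (char)ch : '_';
    sz--;
  }
  *d = 0;
}
```
.VE
The destination buffer therefore needs room for
.I sz
bytes including the terminator.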
.SH EXAMPLES
Given the code
.VS
char p[] = " alpha beta gamma delta ";
char *v[3];
size_t n;
char *q;

n = str_split(p, v, 3, &q);
.VE
following the call to
.BR str_split ,
.B n
will have the value 3,
.B v[0]
will point to
.RB ` alpha ',
.B v[1]
will point to
.RB ` beta ',
.B v[2]
will point to
.RB ` gamma '
and
.B q
will point to
.RB ` delta\ '
(note the trailing space).
.PP
Similarly, given the string
.B """\ alpha\ \ beta\ """
instead,
.B n
will be assigned the value 2,
.B v[0]
and
.B v[1]
will have the same values as last time, and
.B v[2]
and
.B q
will be null.
.SH "SEE ALSO"
.BR mLib (3).
.SH AUTHOR
Mark Wooding, <mdw@nsict.org>