.\" -*-nroff-*-
.de VS
.sp 1
.in +5n
.ft B
.nf
..
.de VE
.ft R
.in -5n
.sp 1
.fi
..
.TH str 3 "20 June 1999" "Straylight/Edgeware" "mLib utilities library"
.SH NAME
str \- small string utilities
.\" @str_qword
.\" @str_qsplit
.\" @str_getword
.\" @str_split
.\" @str_matchx
.\" @str_match
.\" @str_sanitize
.SH SYNOPSIS
.nf
.B "#include <mLib/str.h>"
.PP
.BI "char *str_qword(char **" pp ", unsigned " f );
.BI "size_t str_qsplit(char *" p ", char *" v "[], size_t " c ,
.BI "                  char **" rest ", unsigned " f );
.BI "char *str_getword(char **" pp );
.BI "size_t str_split(char *" p ", char *" v "[], size_t " c ", char **" rest );
.BI "int str_matchx(const char *" p ", const char *" s ", unsigned " f );
.BI "int str_match(const char *" p ", const char *" s );
.BI "void str_sanitize(char *" d ", const char *" p ", size_t " sz );
.fi
.SH DESCRIPTION
The header file
.B <mLib/str.h>
contains a few small utility functions for manipulating null-terminated
strings.
.PP
The function
.B str_qword
extracts the next whitespace-delimited word from a string. The
function's first argument,
.IR pp ,
is the address of a pointer into the string: this pointer is updated by
.B str_qword
so that it can extract the following word on the next call and so on.
The return value is the address of the next word, appropriately null
terminated. A null pointer is returned if the entire remainder of the
string is whitespace. Note that
.B str_qword
modifies the string as it goes, to null-terminate the individual words.
If the quoting flag is set in
.IR f ,
the single- and double-quote characters may be used to quote
whitespace within words, and the backslash can escape quote characters
and whitespace.
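.PP
As an illustration only, the unquoted case described above can be
modelled in a few lines of portable C. The name
.B sketch_getword
is hypothetical and is not part of mLib; the sketch takes no flags word,
handles no quoting, and should not be read as the library's implementation.
.nf
```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of mLib: models the unquoted case only. */
static char *sketch_getword(char **pp)
{
  char *p = *pp, *q;

  /* Skip leading whitespace; report failure if nothing else is left. */
  while (isspace((unsigned char)*p)) p++;
  if (!*p) { *pp = p; return (0); }

  /* Find the end of the word. */
  for (q = p; *q && !isspace((unsigned char)*q); q++)
    ;

  /* Terminate the word in place, stepping past the byte we overwrote. */
  if (*q) *q++ = 0;

  /* Leave the caller's pointer ready for the next call. */
  *pp = q;
  return (p);
}
```
.fi
Calling the sketch repeatedly with the same pointer, until it returns
null, visits each word of the string in turn.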
.PP
The function
.B str_qsplit
divides a string into whitespace-separated words. The arguments are as
follows:
.TP
.BI "char *" p
The address of the string to split. The string is modified by having
null terminators written after each word extracted.
.TP
.BI "char *" v []
The address of an array of pointers to characters. This array will be
filled in by
.BR str_qsplit :
the first entry will point to the first word extracted from the string,
and so on. If there aren't enough words in the string, the remaining
array elements are filled with null pointers.
.TP
.BI "size_t " c
The maximum number of words to extract; also, the number of elements in
the array
.IR v .
.TP
.BI "char **" rest
The address of a pointer in which to store the address of the remainder
of the string. Leading whitespace is removed from the remainder before
storing. If the remainder string is empty, a null pointer is stored
instead. If
.I rest
is null, the remainder pointer is discarded.
.TP
.BI "unsigned " f
Flags, as for
.BR str_qword .
.PP
The return value of
.B str_qsplit
is the number of words extracted from the input string.
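.PP
The argument rules above can be sketched, for the unquoted case only, as
follows. The function
.B sketch_split
is a hypothetical stand-in written for this page, not mLib code; it
takes no flags word and assumes that
.I v
has room for
.I c
pointers.
.nf
```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical model of the unquoted splitting rules; not mLib code. */
static size_t sketch_split(char *p, char *v[], size_t c, char **rest)
{
  size_t n = 0, i;

  while (n < c) {
    /* Skip whitespace before the next word; stop if nothing is left. */
    while (isspace((unsigned char)*p)) p++;
    if (!*p) break;
    v[n++] = p;

    /* Scan to the end of the word and terminate it in place. */
    while (*p && !isspace((unsigned char)*p)) p++;
    if (*p) *p++ = 0;
  }

  /* Fill any unused array slots with null pointers. */
  for (i = n; i < c; i++) v[i] = 0;

  /* Strip leading whitespace from the remainder; store null if empty. */
  while (isspace((unsigned char)*p)) p++;
  if (rest) *rest = *p ? p : 0;

  return (n);
}
```
.fi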
.PP
The functions
.B str_getword
and
.B str_split
are veneers over
.B str_qword
and
.B str_qsplit
respectively; they are equivalent to calls to the latter functions with
a flags word of zero.
.PP
The
.B str_matchx
function does simple wildcard matching. The first argument is a
pattern, which may contain metacharacters:
.RB ` * '
matches zero or more arbitrary characters;
.RB ` ? '
matches exactly one arbitrary character; and
.RB ` [ ... ] '
matches one of the characters listed. The backslash
.RB ` \e '
escapes the following character. Within square brackets, the hyphen
.RB ` \- '
may be used to designate ranges of characters. If the initial character
is
.RB ` ! '
or
.RB ` ^ '
then the sense of the match is reversed. To literally match a
.RB ` ] '
character, list it first; to literally match a
.RB ` \- '
character, list it immediately after a range, or at the beginning or end
of the set. The return value is nonzero if the pattern
.I p
matches the given string
.IR s ,
or zero if the pattern doesn't match. If the prefix-matching flag is set in
.IR f ,
.B str_matchx
returns true if it reaches the end of the target string before finding a
mismatch \(en i.e., if the target string is a prefix of a string which
might match the pattern. The function
.B str_match
is a convenient wrapper for
.B str_matchx
with a zero flags word, which is the normal case.
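.PP
To make the matching rules concrete, here is a minimal sketch of a
matcher covering only the star and question-mark metacharacters, with an
optional prefix mode. It is not mLib's implementation: the names
.B sketch_matchx
and
.B SKETCH_PREFIX
are hypothetical, and bracketed sets and backslash escapes are left out
for brevity.
.nf
```c
#define SKETCH_PREFIX 1u  /* hypothetical flag name, for illustration only */

/* Hypothetical subset matcher: handles '*' and '?' but nothing else. */
static int sketch_matchx(const char *p, const char *s, unsigned f)
{
  for (;;) {
    if (*p == '*') {
      /* Try the rest of the pattern against every suffix of s. */
      for (p++;; s++) {
        if (sketch_matchx(p, s, f)) return (1);
        if (!*s) return (0);
      }
    }

    /* Pattern exhausted: succeed only if the string is too. */
    if (!*p) return (!*s);

    /* String exhausted first: acceptable only in prefix mode. */
    if (!*s) return ((f & SKETCH_PREFIX) != 0);

    /* An ordinary character must match itself; '?' matches anything. */
    if (*p != '?' && *p != *s) return (0);
    p++; s++;
  }
}
```
.fi
The recursion on each star is the simplest correct approach, though it
can be slow for patterns containing many stars.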
.PP
The function
.B str_sanitize
copies at most
.I sz \- 1
characters from the string
.I p
to
.IR d .
The result string is null terminated. Any nonprinting characters in
.I p
are replaced by an underscore
.RB ` _ '
when written to
.IR d .
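.PP
The copying rule above might be sketched as below. The function
.B sketch_sanitize
is hypothetical and is not the library's code; it assumes that the
standard
.B isprint
test in the current locale decides what counts as printing, and that a
zero
.I sz
means nothing at all may be written.
.nf
```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical model of the sanitizing copy; not mLib code. */
static void sketch_sanitize(char *d, const char *p, size_t sz)
{
  /* Assumption: with no room even for the terminator, write nothing. */
  if (!sz) return;

  /* Copy at most sz - 1 characters, replacing nonprinting ones. */
  while (*p && sz > 1) {
    unsigned char ch = (unsigned char)*p++;
    *d++ = isprint(ch) ? (char)ch : '_';
    sz--;
  }

  /* Always terminate the result. */
  *d = 0;
}
```
.fi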
.SH EXAMPLES
Given the code
.VS
char p[] = " alpha beta gamma delta ";
char *v[3];
size_t n;
char *q;

n = str_split(p, v, 3, &q);
.VE
following the call to
.BR str_split ,
.B n
will have the value 3,
.B v[0]
will point to
.RB ` alpha ',
.B v[1]
will point to
.RB ` beta ',
.B v[2]
will point to
.RB ` gamma '
and
.B q
will point to
.RB ` delta\ '
(note the trailing space).
.PP
Similarly, given the string
.B """\ alpha\ \ beta\ """
instead,
.B n
will be assigned the value 2,
.B v[0]
and
.B v[1]
will have the same values as last time, and
.B v[2]
and
.B q
will be null.
.SH "SEE ALSO"
.BR mLib (3).
.SH AUTHOR
Mark Wooding