.TH url 3 "20 June 1999" "Straylight/Edgeware" "mLib utilities library"
.SH NAME
url \- manipulation of form-urlencoded strings
.SH SYNOPSIS
.nf
.B "#include <mLib/url.h>"
.PP
.BI "void url_initenc(url_ectx *" ctx );
.BI "void url_enc(url_ectx *" ctx ", dstr *" d ,
.BI "             const char *" name ", const char *" value );
.PP
.BI "void url_initdec(url_dctx *" ctx ", const char *" p );
.BI "int url_dec(url_dctx *" ctx ", dstr *" n ", dstr *" v );
.fi
.SH DESCRIPTION
The functions declared in
.B <mLib/url.h>
read and write `form-urlencoded' data, as specified in RFC1866. The
encoding represents a sequence of name/value pairs where both the name
and value are arbitrary binary strings (although the format is optimized
for textual data). An encoded string contains no nonprintable
characters or whitespace. This interface is capable of decoding any
urlencoded string; however, it can currently only
.I encode
names and values which do not contain null bytes, because the encoding
interface uses standard C strings.
.PP
Encoding a sequence of name/value pairs is achieved using the
.B url_enc
function. It requires as input an
.IR "encoding context" ,
represented as an object of type
.BR url_ectx .
This must be initialized before use by passing it to the function
.BR url_initenc .
Each call to
.B url_enc
encodes one name/value pair, appending the encoded output to a dynamic
string (see
.BR dstr (3)
for details).
.PP
You can set flags in the encoding context's
.B f
member:
.TP
.B URLF_STRICT
Be strict about escaping non-alphanumeric characters. Without this,
potentially unsafe characters such as
.RB ` / '
and
.RB ` ~ '
will be left unescaped, which makes encoded filenames (for example) more
readable.
.TP
.B URLF_LAX
Be very lax about non-alphanumeric characters. Everything except
obviously-unsafe characters like
.RB ` & '
and
.RB ` = '
is left unescaped.
.TP
.B URLF_SEMI
Use a semicolon
.RB ` ; '
to separate name/value pairs, rather than the ampersand
.RB ` & '.
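.PP
To make the escaping rules above concrete, here is a minimal standalone sketch of form-urlencoding one name/value pair into a fixed buffer. It is not mLib's implementation: the names sk_escape, sk_enc, SK_STRICT and SK_SEMI are hypothetical, the set of characters passed through unescaped is an assumption, and it writes to a plain char array rather than a dynamic string.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define SK_STRICT 1u    /* hypothetical flag: also escape '/' and '~' */
#define SK_SEMI   2u    /* hypothetical flag: separate pairs with ';' */

/* Append the escaped form of s to the null-terminated string in out. */
static void sk_escape(char *out, size_t cap, const char *s, unsigned f)
{
  static const char hex[] = "0123456789ABCDEF";
  size_t n = strlen(out);

  for (; *s; s++) {
    unsigned char ch = (unsigned char)*s;
    /* Assumed safe set: alphanumerics, a little punctuation, and
     * '/' and '~' unless strict escaping was requested. */
    int safe = isalnum(ch) || strchr("-._", ch) != NULL ||
               (!(f & SK_STRICT) && (ch == '/' || ch == '~'));

    assert(n + 4 <= cap);       /* room for "%XX" and the terminator */
    if (ch == ' ')
      out[n++] = '+';
    else if (safe)
      out[n++] = (char)ch;
    else {
      out[n++] = '%';
      out[n++] = hex[ch >> 4];
      out[n++] = hex[ch & 15];
    }
  }
  out[n] = 0;
}

/* Append "name=value" to out, with a separator first if out is nonempty. */
void sk_enc(char *out, size_t cap, const char *name, const char *value,
            unsigned f)
{
  size_t n = strlen(out);

  if (n) {
    assert(n + 2 <= cap);
    out[n] = (f & SK_SEMI) ? ';' : '&';
    out[n + 1] = 0;
  }
  sk_escape(out, cap, name, f);
  n = strlen(out);
  assert(n + 2 <= cap);
  out[n] = '=';
  out[n + 1] = 0;
  sk_escape(out, cap, value, f);
}
```

Starting from an empty buffer, successive calls to sk_enc build up the encoded string one pair at a time, which mirrors the way url_enc appends to its output string.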
.PP
Decoding a sequence of name/value pairs is performed using the
.B url_dec
function. It requires as input a
.IR "decoding context" ,
represented as an object of type
.BR url_dctx .
This must be initialized before use by passing it to the function
.B url_initdec
along with the address of the urlencoded string to decode. The string
is not modified during decoding. Each call to
.B url_dec
extracts a name/value pair. The name and value are appended to the
dynamic strings
.I n
and
.IR v ,
so you probably want to reset them before each call. If there are no
more name/value pairs to read,
.B url_dec
returns zero; otherwise it returns a nonzero value.
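.PP
The calling pattern described above (initialize a context, then call the decoder until it returns zero) can be illustrated with a small standalone sketch. This is not mLib's code: sk_dctx, sk_initdec, sk_field and sk_dec are hypothetical names, the output goes to fixed char buffers that are overwritten on each call, and malformed escapes are simply copied through, which is an assumption rather than documented behaviour.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical decoding context: it only remembers the read position. */
typedef struct sk_dctx { const char *p; } sk_dctx;

void sk_initdec(sk_dctx *c, const char *p) { c->p = p; }

/* Decode one field, stopping at any character in stop or at the end of
 * the input.  Output beyond cap - 1 bytes is silently dropped. */
static const char *sk_field(const char *p, const char *stop,
                            char *out, size_t cap)
{
  size_t n = 0;

  while (*p && !strchr(stop, *p)) {
    char ch = *p++;
    if (ch == '+')
      ch = ' ';
    else if (ch == '%' && isxdigit((unsigned char)p[0]) &&
             isxdigit((unsigned char)p[1])) {
      char h[3] = { p[0], p[1], 0 };
      ch = (char)strtoul(h, NULL, 16);
      p += 2;
    }
    if (n + 1 < cap) out[n++] = ch;
  }
  out[n] = 0;
  return p;
}

/* Extract the next pair.  Returns zero when nothing is left to read. */
int sk_dec(sk_dctx *c, char *name, size_t nsz, char *value, size_t vsz)
{
  const char *p = c->p;

  assert(nsz > 0 && vsz > 0);
  while (*p == '&') p++;                /* skip empty pairs */
  if (!*p) return 0;
  p = sk_field(p, "=&", name, nsz);
  value[0] = 0;
  if (*p == '=') p = sk_field(p + 1, "&", value, vsz);
  c->p = p;                             /* the input itself is untouched */
  return 1;
}
```

A caller would typically write a loop of the form for (sk_initdec(&c, s); sk_dec(&c, n, sizeof n, v, sizeof v); ) and process each pair in the body.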
.PP
You can set flags in the decoding context's
.B f
member:
.TP
.B URLF_SEMI
Allow a semicolon
.RB ` ; '
to separate name/value pairs, in addition to the ampersand
.RB ` & '.
Without this flag, the semicolon is considered an `ordinary' character
which can appear unescaped as part of names and values. (Note the
difference from the same flag's meaning when encoding. When encoding,
it
.I forces
the use of the semicolon, and when decoding, it
.I permits
the use of the semicolon.)
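.PP
The effect of permitting semicolons while decoding can be seen by splitting the same input with and without such a flag. The sketch below is standalone and only locates pair boundaries in the still-encoded text; SKD_SEMI, sk_pairlen and sk_npairs are hypothetical names and are not part of mLib.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKD_SEMI 1u     /* hypothetical flag: ';' also ends a pair */

/* Length of the first raw (still encoded) pair at the start of p. */
size_t sk_pairlen(const char *p, unsigned f)
{
  assert(p != NULL);
  return strcspn(p, (f & SKD_SEMI) ? "&;" : "&");
}

/* Count the nonempty pairs in p under the given flags. */
size_t sk_npairs(const char *p, unsigned f)
{
  size_t n = 0;

  while (*p) {
    size_t len = sk_pairlen(p, f);
    if (len) n++;
    p += len;
    if (*p) p++;                /* step over the separator */
  }
  return n;
}
```

With the input a=1;b=2&c=3, the sketch finds two pairs when the flag is clear, because the semicolon is treated as part of the first value, and three pairs when the flag is set.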
.SH EXAMPLE
The example code below demonstrates converting between a symbol table
and a urlencoded representation. The code is untested.
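.PP
.nf
#include <stdlib.h>
#include <mLib/alloc.h>
#include <mLib/dstr.h>
#include <mLib/sym.h>
#include <mLib/url.h>

/* Assumed layout of the symbol-table entry; the original is elided. */
typedef struct val { sym_base _b; char *v; } val;

void decode(sym_table *t, const char *p)
{
  url_dctx c;
  dstr n = DSTR_INIT, v = DSTR_INIT;

  for (url_initdec(&c, p); url_dec(&c, &n, &v); ) {
    unsigned f;
    val *vv = sym_find(t, n.buf, -1, sizeof(*vv), &f);
    if (f) free(vv->v);         /* name seen before: drop old value */
    vv->v = xstrdup(v.buf);
    DRESET(&n); DRESET(&v);     /* clear both before the next call */
  }
  dstr_destroy(&n); dstr_destroy(&v);
}

void encode(sym_table *t, dstr *d)
{
  sym_iter i; url_ectx c; val *v;

  url_initenc(&c);
  for (sym_mkiter(&i, t); (v = sym_next(&i)) != 0; )
    url_enc(&c, d, SYM_NAME(v), v->v);
}
.fi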
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>.