.\" -*-nroff-*-
.TH lbuf 3 "6 July 1999" mLib
.SH "NAME"
lbuf \- split lines out of asynchronously received blocks
.\" @lbuf_flush
.\" @lbuf_close
.\" @lbuf_free
.\" @lbuf_snarf
.\" @lbuf_init
.SH "SYNOPSIS"
.nf
.B "#include <mLib/lbuf.h>"

.BI "void lbuf_flush(lbuf *" b ", char *" p ", size_t " len );
.BI "void lbuf_close(lbuf *" b );
.BI "size_t lbuf_free(lbuf *" b ", char **" p );
.BI "void lbuf_snarf(lbuf *" b ", const void *" p ", size_t " sz );
.BI "void lbuf_init(lbuf *" b ,
.BI "               void (*" func ")(char *" s ", void *" p ),
.BI "               void *" p );
.fi
.SH "DESCRIPTION"
The declarations in
.B <mLib/lbuf.h>
implement a handy object called a
.IR "line buffer" .
Given unpredictably-sized chunks of data, the line buffer extracts
completed lines of text and passes them to a caller-supplied function.
This is useful in nonblocking network servers, for example: the server
can feed input from a client into a line buffer as it arrives and deal
with completed text lines as they appear, without blocking while it
waits for newline characters.
.PP
The state of a line buffer is stored in an object of type
.BR lbuf .
This is a structure which must be allocated by the caller.  The
structure should normally be considered opaque (see the section on
.B Disablement
for an exception to this).
.SS "Initialization and finalization"
The function
.B lbuf_init
initializes a line buffer ready for use.  It is given three arguments:
.TP
.I b
A pointer to the block of memory to use for the line buffer.  This is
all the memory the line buffer requires.
.TP
.I func
The
.I line-handler
function to which the line buffer should pass completed lines of text.
.TP
.I p
A pointer argument to be passed to the function when a completed line of
text arrives.
.PP
Since the line buffer requires no memory except for the actual
.B lbuf
object, and doesn't hook itself onto anything else, it can just be
thrown away when you don't want it any more.  No explicit finalization
is required.
.SS "Inserting data into the buffer"
There are two interfaces for inserting data into the buffer.  One's much
simpler than the other, although it's less expressive.
.PP
The simple interface is
.BR lbuf_snarf .
This function is given three arguments: a pointer
.I b
to a line buffer structure; a pointer
.I p
to a chunk of data to read; and the size
.I sz
of the chunk of data.  The data is pushed through the line buffer and
any complete lines are passed on to the line handler.
.PP
The complex interface is the pair of functions
.B lbuf_free
and
.BR lbuf_flush .
.PP
The 
.B lbuf_free
function returns the address and size of a free portion of the line
buffer's memory into which data may be written.  The function is passed
the address
.I b
of the line buffer.  Its result is the size of the free area, and it
writes the base address of this free space to the location pointed to by
the argument
.IR p .
The caller's data must be written to ascending memory locations starting
at
.BI * p
and no data may be written beyond the end of the free space.  However,
it isn't necessary to completely fill the buffer.
.PP
Once the free area has had some data written to it,
.B lbuf_flush
is called to examine the new data and break it into text lines.  This is
given three arguments:
.TP
.I b
The address of the line buffer.
.TP
.I p
The address at which the new data has been written.  This must be the
base address returned from
.BR lbuf_free .
.TP
.I len
The number of bytes which have been written to the buffer.
.PP
The
.B lbuf_flush
function breaks the new data into lines as described below, and passes
each one in turn to the line-handler function.
.PP
The
.B lbuf_snarf
function is trivially implemented in terms of the more complex
.BR lbuf_free / lbuf_flush
interface.
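.PP
The relationship between the two interfaces can be sketched with a
self-contained toy.  This is not the mLib implementation: the
.B toy_
names, the 16-byte capacity, the
.B skip
member and the
.B count_line
handler are all assumptions made for illustration, and the linefeed is
written as the ASCII code 10 rather than as a character escape.
.PP
.nf
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_SZ = 16, LF = 10 };  /* toy capacity; ASCII linefeed */

typedef struct toybuf {
  void (*func)(char *s, void *p);  /* line handler */
  void *p;                         /* context passed to the handler */
  size_t len;                      /* bytes held at the start of buf */
  int skip;                        /* discarding the tail of a long line */
  char buf[TOY_SZ];                /* all the storage there is */
} toybuf;

void toy_init(toybuf *b, void (*func)(char *, void *), void *p)
{
  b->func = func; b->p = p; b->len = 0; b->skip = 0;
}

/* Store the base of the free area in *p and return its size.  One byte
 * is held back so that a terminating zero always fits. */
size_t toy_free(toybuf *b, char **p)
{
  *p = b->buf + b->len;
  return TOY_SZ - 1 - b->len;
}

/* Account for len new bytes written at p, then emit complete lines. */
void toy_flush(toybuf *b, char *p, size_t len)
{
  char *start = b->buf, *end, *nl;

  assert(p == b->buf + b->len && len <= TOY_SZ - 1 - b->len);
  b->len += len;
  end = b->buf + b->len;
  while ((nl = memchr(start, LF, (size_t)(end - start))) != 0) {
    *nl = 0;                            /* strip the linefeed */
    if (!b->skip) b->func(start, b->p);
    b->skip = 0;                        /* a long line's tail ends here */
    start = nl + 1;
  }
  b->len = (size_t)(end - start);
  if (b->len == TOY_SZ - 1) {           /* full, and still no linefeed */
    if (!b->skip) { start[b->len] = 0; b->func(start, b->p); }
    b->skip = 1;                        /* ignore the rest of this line */
    b->len = 0;
  } else
    memmove(b->buf, start, b->len);     /* keep the partial line */
}

/* The simple interface: copy the data in, one free area at a time. */
void toy_snarf(toybuf *b, const void *p, size_t sz)
{
  const char *q = p;

  while (sz) {
    char *base;
    size_t n = toy_free(b, &base);
    if (n > sz) n = sz;
    memcpy(base, q, n);
    toy_flush(b, base, n);
    q += n; sz -= n;
  }
}

/* Example handler: count lines and remember the most recent one. */
typedef struct tally { int n; char last[TOY_SZ]; } tally;

void count_line(char *s, void *p)
{
  tally *t = p;
  t->n++; strcpy(t->last, s);
}
```
.fi
.PP
A caller that reads straight from a file descriptor would use the
free/flush pair directly and so avoid the extra copy made by the
snarf-style wrapper.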
.SS "Line breaking"
The line buffer considers a line to end with either a simple linefeed
character (the normal Unix convention) or a carriage-return/linefeed
pair (the Internet convention).
.PP
The line buffer has a fixed amount of memory available to it.  This is
deliberate, to prevent a trivial attack whereby a remote user sends a
stream of data containing no newline markers, wasting the server's
memory.  Instead, the buffer will truncate overly long lines (silently)
and return only the initial portion.  It will ignore the rest of the
line completely.
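.PP
The two line-ending conventions can be recognized with very little
code.  The following is a minimal sketch under stated assumptions, not
the library's own routine: the function name
.B split_line
and its return convention are hypothetical, and the control characters
are given as ASCII codes.
.PP
.nf
```c
#include <stddef.h>
#include <string.h>

enum { LF = 10, CR = 13 };  /* ASCII linefeed and carriage return */

/* Look for one complete line in the n bytes at buf.  If there is one,
 * replace its terminator with a zero byte and return the number of
 * bytes used up (the line plus its terminator).  Return 0 if no
 * complete line has arrived yet. */
size_t split_line(char *buf, size_t n)
{
  char *nl = memchr(buf, LF, n);

  if (!nl) return 0;
  if (nl > buf && nl[-1] == CR)
    nl[-1] = 0;               /* carriage-return/linefeed pair */
  *nl = 0;                    /* plain linefeed */
  return (size_t)(nl - buf) + 1;
}
```
.fi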
.SS "Line-handler functions"
As described above, completed lines are passed to the caller's
line-handler function.  The function is given two arguments:
the address
.I s
of the line which has just been read, and the pointer
.I p
which was set up in the call to
.BR lbuf_init .
The line passed is null-terminated, and has had its trailing newline
stripped.  The area of memory in which the string is located may be
overwritten by the line-handler function, although writing beyond the
terminating zero byte is not permitted.
.PP
The line pointer argument
.I s
may be null to signify end-of-file.  See the next section.
.SS "Flushing the remaining data"
When the client program knows that there's no more data arriving (for
example, an end-of-file condition exists on its data source) it should
call the function
.B lbuf_close
to flush out the remaining data in the buffer as one last (improperly
terminated) line.  This will pass the remaining text to the line
handler, if there is any, and then call the handler one final time with
a null pointer rather than the address of a text line to inform it of
the end-of-file.
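.PP
A line handler therefore has to check for a null line pointer before it
touches the text.  The sketch below shows one way to write such a
handler, together with a stand-in for the end-of-input step so that the
example is self-contained.  The names
.BR session ,
.B on_line
and
.B toy_close
are hypothetical and are not part of the library.
.PP
.nf
```c
#include <stddef.h>

typedef struct session { int lines; int eof; } session;

/* Line handler: s is a line of text, or null once input has ended. */
void on_line(char *s, void *p)
{
  session *ss = p;

  if (!s) { ss->eof = 1; return; }  /* end-of-file: no text to read */
  ss->lines++;
}

/* Stand-in for the end-of-input step.  Pass on any unterminated tail
 * as a final line, then signal end-of-file with a null pointer.  The
 * caller must leave room for one more byte after the len bytes. */
void toy_close(char *tail, size_t len,
               void (*func)(char *, void *), void *p)
{
  if (len) { tail[len] = 0; func(tail, p); }
  func(0, p);
}
```
.fi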
.SS "Disablement"
The line buffer is intended to be used in higher-level program objects,
such as the buffer selector described in
.BR selbuf (3).
Unfortunately, a concept from this high level needs to exist at the line
buffer level, which complicates the description somewhat.  The idea is
that, when a line-handler attached to some higher-level object decides
that it's read enough, it can
.I disable
the object so that it doesn't see any more data.
.PP
Clearly, since an
.B lbuf_flush
call can emit more than one line, the line buffer must be told when the
line handler isn't interested in any more lines.  However, this fact must
also be signalled to the higher-level object so that it can detach
itself from its data source.
.PP
Rather than invent some complex interface for this, the line buffer
exports one of its structure members,
.BR flags .
A higher-level object wishing to disable the line buffer simply clears
the bit
.B LBUF_ENABLE
in the flags word.
.PP
Disabling a buffer causes an immediate return from
.BR lbuf_flush .
However, the functions
.B lbuf_flush
and
.B lbuf_close
must not be called on a disabled buffer.  (This condition isn't checked
for; such a call will just do the wrong thing.)  Furthermore, the
.B lbuf_snarf
function does not handle disablement at all, because it would complicate
the interface so much that it wouldn't have any advantage over the more
general
.BR lbuf_free / lbuf_flush .
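.PP
The effect of clearing the enable bit can be pictured with a small
self-contained model.  It is a sketch under stated assumptions rather
than the library's code: the
.B minibuf
structure, the
.B MINI_ENABLE
bit (standing in for
.BR LBUF_ENABLE ),
and the functions
.B mini_deliver
and
.B first_only
are all hypothetical.
.PP
.nf
```c
#include <stddef.h>
#include <string.h>

enum { MINI_ENABLE = 1, LF = 10 };  /* enable bit; ASCII linefeed */

typedef struct minibuf {
  unsigned flags;                   /* exported: holds MINI_ENABLE */
  void (*func)(char *, void *);     /* line handler */
  void *p;                          /* context for the handler */
} minibuf;

/* Hand complete lines from the n bytes at buf to the handler, checking
 * the enable bit before each one.  Return the number of bytes used. */
size_t mini_deliver(minibuf *b, char *buf, size_t n)
{
  size_t used = 0;
  char *nl;

  while ((b->flags & MINI_ENABLE) &&
         (nl = memchr(buf + used, LF, n - used)) != 0) {
    char *line = buf + used;
    *nl = 0;
    used = (size_t)(nl - buf) + 1;
    b->func(line, b->p);            /* may clear MINI_ENABLE */
  }
  return used;
}

/* A higher-level object which wants exactly one line. */
typedef struct reader { minibuf b; int seen; } reader;

void first_only(char *s, void *p)
{
  reader *r = p;

  (void)s;
  r->seen++;
  r->b.flags &= ~(unsigned)MINI_ENABLE;  /* no more lines, please */
}
```
.fi
.PP
In this model the unread bytes stay where they are, so the higher-level
object can see from the flags word that it should detach itself from
its data source.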
.SH "SEE ALSO"
.BR selbuf (3),
.BR mLib (3).
.SH "AUTHOR"
Mark Wooding, <mdw@nsict.org>