.\" -*-nroff-*-
.TH lbuf 3 "6 July 1999" mLib
.SH "NAME"
lbuf \- split lines out of asynchronously received blocks
.\" @lbuf_flush
.\" @lbuf_close
.\" @lbuf_free
.\" @lbuf_snarf
.\" @lbuf_init
.SH "SYNOPSIS"
.nf
.B "#include <mLib/lbuf.h>"

.BI "void lbuf_flush(lbuf *" b ", char *" p ", size_t " len );
.BI "void lbuf_close(lbuf *" b );
.BI "size_t lbuf_free(lbuf *" b ", char **" p );
.BI "void lbuf_snarf(lbuf *" b ", const void *" p ", size_t " sz );
.BI "void lbuf_init(lbuf *" b ,
.BI "               void (*" func ")(char *" s ", void *" p ),
.BI "               void *" p );
.fi
.SH "DESCRIPTION"
The declarations in
.B <mLib/lbuf.h>
implement a handy object called a
.IR "line buffer" .
Given unpredictably-sized chunks of data, the line buffer extracts
completed lines of text and passes them to a caller-supplied function.
This is useful in nonblocking network servers, for example: the server
can feed input from a client into a line buffer as it arrives and deal
with completed text lines as they appear without having to wait for
newline characters.
.PP
The state of a line buffer is stored in an object of type
.BR lbuf .
This is a structure which must be allocated by the caller.  The
structure should normally be considered opaque (see the section on
.B Disablement
for an exception to this).
.SS "Initialization and finalization"
The function
.B lbuf_init
initializes a line buffer ready for use.  It is given three arguments:
.TP
.I b
A pointer to the block of memory to use for the line buffer.  This is
all the memory the line buffer requires.
.TP
.I func
The
.I line-handler
function to which the line buffer should pass completed lines of text.
.TP
.I p
A pointer argument to be passed to the function when a completed line of
text arrives.
.PP
Since the line buffer requires no memory except for the actual
.B lbuf
object, and doesn't hook itself onto anything else, it can just be
thrown away when you don't want it any more.  No explicit finalization
is required.
.SS "Inserting data into the buffer"
There are two interfaces for inserting data into the buffer.  One is
much simpler than the other, although it is less expressive.
.PP
The simple interface is
.BR lbuf_snarf .
This function is given three arguments: a pointer
.I b
to a line buffer structure; a pointer
.I p
to a chunk of data to read; and the size
.I sz
of the chunk of data.  The data is pushed through the line buffer and
any complete lines are passed on to the line handler.
.PP
The complex interface is the pair of functions
.B lbuf_free
and
.BR lbuf_flush .
.PP
The
.B lbuf_free
function returns the address and size of a free portion of the line
buffer's memory into which data may be written.  The function is passed
the address
.I b
of the line buffer.  Its result is the size of the free area, and it
writes the base address of this free space to the location pointed to by
the argument
.IR p .
The caller's data must be written to ascending memory locations starting
at
.BI * p
and no data may be written beyond the end of the free space.  However,
it isn't necessary to completely fill the buffer.
.PP
Once the free area has had some data written to it,
.B lbuf_flush
is called to examine the new data and break it into text lines.  This is
given three arguments:
.TP
.I b
The address of the line buffer.
.TP
.I p
The address at which the new data has been written.  This must be the
base address returned from
.BR lbuf_free .
.TP
.I len
The number of bytes which have been written to the buffer.
.PP
The
.B lbuf_flush
function breaks the new data into lines as described below, and passes
each one in turn to the line-handler function.
.PP
The
.B lbuf_snarf
function is trivially implemented in terms of the more complex
.BR lbuf_free / lbuf_flush
interface.
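.PP
As an illustration only, the relationship between the two interfaces can
be sketched with a toy model.  This is not mLib's implementation: the
names toy_lbuf, toy_init, toy_free, toy_flush, toy_snarf and
toy_collect, the 16-byte size, and the choice to hand over an overlong
line as soon as the buffer fills are all assumptions made for the
sketch.
.PP
.nf
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_SZ 16                 /* tiny on purpose; not mLib's size */

typedef struct toy_lbuf {
  char buf[TOY_SZ];               /* fixed storage inside the object */
  size_t len;                     /* bytes of the current partial line */
  int trunc;                      /* nonzero while discarding a long line */
  void (*func)(char *s, void *p); /* line handler */
  void *p;                        /* handler's context pointer */
} toy_lbuf;

void toy_init(toy_lbuf *b, void (*func)(char *, void *), void *p)
{
  b->len = 0; b->trunc = 0; b->func = func; b->p = p;
}

/* Report the free area: base through *p, size as the result.
 * One byte is always held back for the terminating zero. */
size_t toy_free(toy_lbuf *b, char **p)
{
  *p = b->buf + b->len;
  return TOY_SZ - 1 - b->len;
}

/* Examine n new bytes written at p and hand over each complete line. */
void toy_flush(toy_lbuf *b, char *p, size_t n)
{
  char *start = b->buf, *end = p + n, *q;

  assert(p == b->buf + b->len);
  assert(n <= TOY_SZ - 1 - b->len);
  while ((q = memchr(p, '\n', (size_t)(end - p))) != 0) {
    char *eol = q;
    if (eol > start && eol[-1] == '\r') eol--;  /* accept CRLF too */
    *eol = 0;
    if (b->trunc) b->trunc = 0;   /* this was the tail of a long line */
    else b->func(start, b->p);
    start = p = q + 1;
  }
  n = (size_t)(end - start);      /* unterminated leftover */
  if (b->trunc)
    n = 0;                        /* still discarding */
  else if (n == TOY_SZ - 1) {     /* buffer full and no newline seen */
    *end = 0;
    b->func(start, b->p);         /* pass on the initial portion only */
    b->trunc = 1;
    n = 0;
  } else
    memmove(b->buf, start, n);    /* keep the partial line for later */
  b->len = n;
}

/* The simple interface, written in terms of the complex one. */
void toy_snarf(toy_lbuf *b, const void *data, size_t sz)
{
  const char *src = data;

  while (sz) {
    char *dst;
    size_t n = toy_free(b, &dst);
    if (n > sz) n = sz;
    memcpy(dst, src, n);
    toy_flush(b, dst, n);
    src += n; sz -= n;
  }
}

/* Example handler: append each line, followed by '|', to a string. */
void toy_collect(char *s, void *p)
{
  strcat((char *)p, s);
  strcat((char *)p, "|");
}
```
.fi
.PP
A caller that reads straight from a file descriptor would use the
free/flush pair directly and so avoid the extra copy made by the snarf
loop.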
.SS "Line breaking"
The line buffer considers a line to end with either a simple linefeed
character (the normal Unix convention) or a carriage-return/linefeed
pair (the Internet convention).
.PP
The line buffer has a fixed amount of memory available to it.  This is
deliberate, to prevent a trivial attack whereby a remote user sends a
stream of data containing no newline markers, wasting the server's
memory.  Instead, the buffer will truncate overly long lines (silently)
and return only the initial portion.  It will ignore the rest of the
line completely.
.SS "Line-handler functions"
As noted above, completed lines are passed to the caller's
line-handler function.  The function is given two arguments:
the address
.I s
of the line which has just been read, and the pointer
.I p
which was set up in the call to
.BR lbuf_init .
The line passed is null-terminated, and has had its trailing newline
stripped.  The area of memory in which the string is located may be
overwritten by the line-handler function, although writing beyond the
terminating zero byte is not permitted.
.PP
The line pointer argument
.I s
may be null to signify end-of-file.  See the next section.
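.PP
A line handler might look something like the following sketch.  The
structure conn_state and the function count_line are hypothetical names
invented for this example; only the handler's argument list is taken
from the synopsis above.  The handler trims trailing blanks in place,
which stays within the string's existing bounds, and treats a null
.I s
as the end-of-file notification.
.PP
.nf
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-connection state, reached through the handler's p. */
struct conn_state {
  unsigned nlines;      /* complete lines seen so far */
  size_t longest;       /* length of the longest line, after trimming */
  int eof;              /* set once the null end-of-file call arrives */
};

/* Has the shape lbuf_init expects: void (*)(char *s, void *p). */
void count_line(char *s, void *p)
{
  struct conn_state *st = p;
  size_t n;

  assert(st != NULL);
  if (!s) {             /* null line pointer: no more data will come */
    st->eof = 1;
    return;
  }
  n = strlen(s);        /* the newline has already been stripped */
  /* Overwriting the string is allowed, up to its terminating zero. */
  while (n && (s[n - 1] == ' ' || s[n - 1] == '\t')) s[--n] = 0;
  st->nlines++;
  if (n > st->longest) st->longest = n;
}
```
.fi
.PP
Going by the synopsis, such a handler would be attached with a call of
the form lbuf_init(&b, count_line, &st), where b is the caller's
.B lbuf
and st is its conn_state.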
.SS "Flushing the remaining data"
When the client program knows that there's no more data arriving (for
example, an end-of-file condition exists on its data source) it should
call the function
.B lbuf_close
to flush out the remaining data in the buffer as one last (improperly
terminated) line.  This will pass the remaining text to the line
handler, if there is any, and then call the handler one final time with
a null pointer rather than the address of a text line to inform it of
the end-of-file.
.SS "Disablement"
The line buffer is intended to be used in higher-level program objects,
such as the buffer selector described in
.BR selbuf (3).
Unfortunately, a concept from this high level needs to exist at the line
buffer level, which complicates the description somewhat.  The idea is
that, when a line-handler attached to some higher-level object decides
that it's read enough, it can
.I disable
the object so that it doesn't see any more data.
.PP
Clearly, since an
.B lbuf_flush
call can emit more than one line, it must be aware that the line
handler isn't interested in any more lines.  However, this fact must
also be signalled to the higher-level object so that it can detach
itself from its data source.
.PP
Rather than invent some complex interface for this, the line buffer
exports one of its structure members,
.BR flags .
A higher-level object wishing to disable the line buffer simply clears
the bit
.B LBUF_ENABLE
in the flags word.
.PP
Disabling a buffer causes an immediate return from
.BR lbuf_flush .
However, it is not permitted for the functions
.B lbuf_flush
or
.B lbuf_close
to be called on a disabled buffer.  (This condition isn't checked for;
it'll just do the wrong thing.)  Furthermore, the
.B lbuf_snarf
function does not handle disablement at all, because it would complicate
the interface so much that it wouldn't have any advantage over the more
general
.BR lbuf_free / lbuf_flush .
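.PP
The effect of clearing the enable bit from inside a handler can be
pictured with a small stand-alone model.  Nothing here is mLib code:
toy_buf, TOY_ENABLE, toy_handler and toy_deliver are invented for the
sketch, and TOY_ENABLE merely stands in for
.BR LBUF_ENABLE ,
whose actual value is not assumed.
.PP
.nf
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_ENABLE 1u   /* stand-in flag bit, not the library's value */

struct toy_buf {
  unsigned flags;       /* exported so that users can clear the bit */
  unsigned delivered;   /* lines actually handed to the handler */
};

/* Handler: a line consisting of "." means it has read enough. */
void toy_handler(char *s, void *p)
{
  struct toy_buf *b = p;

  b->delivered++;
  if (s && strcmp(s, ".") == 0) b->flags &= ~TOY_ENABLE;
}

/* Deliver already-split lines in order, returning at once if the
 * handler disables the buffer.  The result is the number of lines
 * left undelivered. */
size_t toy_deliver(struct toy_buf *b, char **lines, size_t n)
{
  size_t i = 0;

  while (i < n && (b->flags & TOY_ENABLE)) toy_handler(lines[i++], b);
  return n - i;
}
```
.fi
.PP
After such a call returns, the owner can test the flag and, if it is
clear, stop feeding data until it decides to enable the buffer again.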
.SH "SEE ALSO"
.BR selbuf (3),
.BR mLib (3).
.SH "AUTHOR"
Mark Wooding, <mdw@nsict.org>