doc/pod/hook-perl.pod

   1 =head1 INN Perl Filtering and Authentication Support
   2
   3 This is $Revision: 7860 $ dated $Date: 2008-06-07 05:46:49 -0700 (Sat, 07 Jun 2008) $.
   4
   5 This file documents INN's built-in support for Perl filtering and reader
   6 authentication.  The code is based very heavily on work by Christophe
   7 Wolfhugel <wolf@pasteur.fr>, and his work was in turn inspired by the
   8 existing TCL support.  Please send any bug reports to inn-bugs@isc.org,
   9 not to Christophe, as the code has been modified heavily since he
  10 originally wrote it.
  11
  12 The Perl filtering support is described in more detail below.  Basically,
  13 it allows you to supply a Perl function that is invoked on every article
  14 received by innd from a peer (the innd filter) or by nnrpd from a reader
  15 (the nnrpd filter).  This function can decide whether to accept or reject
  16 the article, and can optionally do other, more complicated processing
  17 (such as add history entries, cancel articles, spool local posts into a
  18 holding area, or even modify the headers of locally submitted posts).
  19 The Perl authentication hooks allow you to replace or supplement the
  20 readers.conf mechanism used by nnrpd.
  21
  22 For Perl filtering support, you need to have Perl version 5.004 or newer.
  23 Earlier versions of Perl will fail with a link error at compilation time.
  24 http://language.perl.com/info/software.html should have the latest Perl
  25 version.
  26
  27 To enable Perl support, you have to specify B<--with-perl> when you run
  28 configure.  See F<INSTALL> for more information.
  29
  30 =head1 The innd Perl Filter
  31
  32 When innd starts, it first loads the file _PATH_PERL_STARTUP_INND (defined
  33 in F<include/paths.h>, by default F<startup_innd.pl>) and then loads the
  34 file _PATH_PERL_FILTER_INND (also defined in F<include/paths.h>, by
  35 default F<filter_innd.pl>).  Both of these files must be located in the
  36 directory specified by pathfilter in F<inn.conf>
  37 (F</usr/local/news/bin/filter> by default).  The default directory for
  38 filter code can be specified at configure time by giving the flag
  39 B<--with-filter-dir> to configure.
  40
  41 INN doesn't care what Perl functions you define in which files.  The only
  42 thing that's different about the two files is when they're loaded.
  43 F<startup_innd.pl> is loaded only once, when innd first starts, and is
  44 never reloaded as long as innd is running.  Any modifications to that file
  45 won't be noticed by innd; only stopping and restarting innd can cause it
  46 to be reloaded.
  47
  48 F<filter_innd.pl>, on the other hand, can be reloaded on command (with
  49 C<ctlinnd reload filter.perl 'reason'>).  Whenever F<filter_innd.pl> is loaded,
  50 including the first time at innd startup, the Perl function
  51 filter_before_reload() is called before it's reloaded and the function
  52 filter_after_reload() is called after it's reloaded (if the functions
  53 exist).  Additionally, any code in either F<startup_innd.pl> or
  54 F<filter_innd.pl> at the top level (in other words, not inside a sub { })
  55 is automatically executed by Perl when the files are loaded.
  56
  57 This allows one to do things like write out filter statistics whenever the
  58 filter is reloaded, load a cache into memory, flush cached data to disk,
  59 or other similar operations that should only happen at particular times or
  60 with manual intervention.  Remember, any code not inside functions in
  61 F<startup_innd.pl> is executed when that file is loaded, and it's loaded
  62 only once when innd first starts.  That makes it the ideal place to put
  63 initialization code that should only run once, or code to load data that
  64 was preserved on disk across a stop and restart of innd (perhaps using
  65 filter_mode() -- see below).
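
As a rough sketch of the reload hooks described above, a filter might keep
counters in a global hash and write them out just before each reload.  Only
filter_before_reload() and filter_after_reload() come from INN; the %stats
hash, the $stats_file variable, and its path are hypothetical names chosen
for illustration.

```perl
use strict;
use warnings;

# Hypothetical filter state; INN does not define either variable.
our %stats;
our $stats_file = '/var/tmp/filter_innd.stats';

# Called by innd just before filter_innd.pl is (re)loaded, if defined.
sub filter_before_reload {
    # Fail quietly: a die here would count as a fatal filter error.
    open(my $fh, '>', $stats_file) or return;
    printf {$fh} "%s %d\n", $_, $stats{$_} for sort keys %stats;
    close($fh);
    return;
}

# Called by innd just after filter_innd.pl has been (re)loaded, if defined.
sub filter_after_reload {
    %stats = ();    # start counting afresh for the new filter code
    return;
}
```

Both functions would live in F<filter_innd.pl>; code that must run only once
belongs at the top level of F<startup_innd.pl> instead.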
  66
  67 As mentioned above, C<ctlinnd reload filter.perl 'reason'> (or C<ctlinnd reload
  68 all 'reason'>) will cause F<filter_innd.pl> to be reloaded.  If the function
  69 filter_art() is defined after the file has been reloaded, filtering is
  70 turned on.  Otherwise, filtering is turned off.  (Note that due to the way
  71 Perl stores functions, once you've defined filter_art(), you can't
  72 undefine it just by deleting it from the file and reloading the filter.
  73 You'll need to replace it with an empty sub.)
  74
  75 The Perl function filter_art() is the heart of a Perl filter.  Whenever an
  76 article is received from a peer, via either IHAVE or TAKETHIS,
  77 filter_art() is called if Perl filtering is turned on.  It receives no
  78 arguments, and should return a single scalar value.  That value should be
  79 the empty string to indicate that INN should accept the article, or some
  80 rejection message to indicate that the article should be rejected.
  81
  82 filter_art() has access to a global hash named %hdr, which contains all of
  83 the standard headers present in the article and their values.  The
  84 standard headers are:
  85
  86     Also-Control, Approved, Bytes, Cancel-Key, Cancel-Lock,
  87     Content-Base, Content-Disposition, Content-Transfer-Encoding,
  88     Content-Type, Control, Date, Date-Received, Distribution, Expires,
  89     Face, Followup-To, From, In-Reply-To, Injection-Date, Injection-Info,
  90     Keywords, Lines, List-ID, Message-ID, MIME-Version, Newsgroups,
  91     NNTP-Posting-Date, NNTP-Posting-Host, Organization, Originator,
  92     Path, Posted, Posting-Version, Received, References, Relay-Version,
  93     Reply-To, Sender, Subject, Supersedes, User-Agent,
  94     X-Auth, X-Canceled-By, X-Cancelled-By, X-Complaints-To, X-Face,
  95     X-HTTP-UserAgent, X-HTTP-Via, X-Mailer, X-Modbot, X-Modtrace,
  96     X-Newsposter, X-Newsreader, X-No-Archive, X-Original-Message-ID,
  97     X-Original-Trace, X-Originating-IP, X-PGP-Key, X-PGP-Sig,
  98     X-Poster-Trace, X-Postfilter, X-Proxy-User, X-Submissions-To,
  99     X-Trace, X-Usenet-Provider, Xref.
 100
  101 Note that all of the above headers are as they arrived, not modified by
  102 your INN (in particular, the Xref: header, if present, is the one from
  103 the remote site that sent you the article, not your own).
 104
 105 For example, the Newsgroups: header of the article is accessible
 106 inside the Perl filter as C<$hdr{'Newsgroups'}>.  In addition,
 107 C<$hdr{'__BODY__'}> will contain the full body of the article and
 108 C<$hdr{'__LINES__'}> will contain the number of lines in the body of the
 109 article.
 110
 111 The contents of the %hdr hash for a typical article may therefore look
 112 something like this:
 113
 114     %hdr = (Subject      => 'MAKE MONEY FAST!!',
 115         From         => 'Joe Spamer <him@example.com>',
 116         Date         => '10 Sep 1996 15:32:28 UTC',
 117         Newsgroups   => 'alt.test',
 118         Path         => 'news.example.com!not-for-mail',
 119         Organization => 'Spammers Anonymous',
 120         Lines        => '5',
 121         Distribution => 'usa',
 122         'Message-ID' => '<6.20232.842369548@example.com>',
 123         __BODY__     => 'Send five dollars to the ISC, c/o ...',
 124         __LINES__    => 5
 125     );
 126
 127 Note that the value of C<$hdr{Lines}> is the contents of the Lines: header
  128 of the article and may bear no resemblance to the actual length of the
 129 article.  C<$hdr{__LINES__}> is the line count calculated by INN, and is
 130 guaranteed to be accurate.
 131
 132 The %hdr hash should not be modified inside filter_art().  Instead, if any
 133 of the contents need to be modified temporarily during filtering (smashing
  134 case, for example), copy them into a separate variable first and perform
 135 the modifications on the copy.  Currently, C<$hdr{__BODY__}> is the only
 136 data that will cause your filter to die if you modify it, but in the
 137 future other keys may also contain live data.  Modifying live INN data in
 138 Perl will hopefully only cause a fatal exception in your Perl code that
 139 disables Perl filtering until you fix it, but it's possible for it to
 140 cause article munging or even core dumps in INN.  So always, always make a
 141 copy first.
 142
 143 As mentioned above, if filter_art() returns the empty string (''), the
 144 article is accepted.  Note that this must be the empty string, not 0 or
 145 undef.  Otherwise, the article is rejected, and whatever scalar
 146 filter_art() returns (typically a string) will be taken as the reason why
 147 the article was rejected.  This reason will be returned to the remote peer
 148 as well as logged to the news logs.  (innreport, in its nightly report,
 149 will summarize the number of articles rejected by the Perl filter and
 150 include a count of how many articles were rejected with each reason
 151 string.)
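
A minimal sketch of a filter_art() that follows these rules might look like
the following.  The two rules and the limit of ten newsgroups are arbitrary
placeholders, not recommendations; the point is that header values are copied
before being changed and that acceptance is signalled with the empty string.

```perl
use strict;
use warnings;

our %hdr;    # filled in by innd before each call

sub filter_art {
    # Work on copies; %hdr itself must not be modified.
    my $subject = defined $hdr{'Subject'} ? lc $hdr{'Subject'} : '';
    my $groups  = defined $hdr{'Newsgroups'} ? $hdr{'Newsgroups'} : '';
    my @groups  = split /\s*,\s*/, $groups;

    # Any non-empty return value rejects the article with that reason.
    return 'Make money fast'     if $subject =~ /make money fast/;
    return 'Too many newsgroups' if @groups > 10;

    return '';    # the empty string (not 0 or undef) accepts the article
}
```

Keeping the set of distinct reason strings small makes the per-reason counts
in the innreport summary easier to read.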
 152
 153 One other type of filtering is also supported.  If Perl filtering is
 154 turned on and the Perl function filter_messageid() is defined, that
 155 function will be called for each message ID received from a peer (via
 156 either CHECK or IHAVE).  The function receives a single argument, the
 157 message ID, and like filter_art() should return an empty string to accept
 158 the article or an error string to refuse the article.  This function is
 159 called before any history lookups and for every article offered to innd
 160 with CHECK or IHAVE (before the actual article is sent).  Accordingly, the
 161 message ID is the only information it has about the article (the %hdr hash
 162 will be empty).  This code would sit in a performance-critical hot path in
 163 a typical server, and therefore should be as fast as possible, but it can
 164 do things like refuse articles from certain hosts or cancels for already
 165 rejected articles (if they follow the $alz convention) without having to
 166 take the network bandwidth hit of accepting the entire article first.
 167
 168 Note that you cannot rely on filter_messageid() being called for every
 169 incoming article; articles sent via TAKETHIS without an earlier CHECK will
 170 never pass through filter_messageid() and will only go through
 171 filter_art().
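
As an illustration, a filter_messageid() that refuses offers based on the
right-hand side of the message ID could be sketched as below.  The
%refused_domain table and the domains in it are hypothetical; the lookup is a
single regular expression match and a hash access, in keeping with the need
for this function to be fast.

```perl
use strict;
use warnings;

# Hypothetical list of message ID domains to refuse outright.
our %refused_domain = map { $_ => 1 } qw(spam.example.com bulk.example.net);

sub filter_messageid {
    my ($messageid) = @_;

    # Extract the part between the last '@' and the closing '>'.
    if ($messageid =~ /\@([^\@>]+)>\z/) {
        return 'Refused message ID domain' if $refused_domain{ lc $1 };
    }
    return '';    # accept the offer; filter_art() may still reject it
}
```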
 172
 173 Finally, whenever ctlinnd throttle, ctlinnd pause, or ctlinnd go is run,
 174 the Perl function filter_mode() is called if it exists.  It receives no
 175 arguments and returns no value, but it has access to a global hash %mode
 176 that contains three values:
 177
 178     Mode       The current server mode (throttled, paused, or running)
 179     NewMode    The new mode the server is going to
 180     reason     The reason that was given to ctlinnd
 181
 182 One possible use for this function is to save filter state across a
 183 restart of innd.  There isn't any Perl function which is called when INN
  184 shuts down, but using filter_mode() the Perl filter can dump its state to
 185 disk whenever INN is throttled.  Then, if the news administrator follows
 186 the strongly recommended shutdown procedure of throttling the server
 187 before shutting it down, the filter state will be safely saved to disk and
 188 can be reloaded when innd restarts (possibly by F<startup_innd.pl>).
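
One way to sketch this save-and-restore pattern is with the core Storable
module.  The %state hash, the $state_file path, and the load_state() helper
are hypothetical names for illustration; only filter_mode() and the %mode
hash are provided by INN.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);

our %mode;     # filled in by innd before filter_mode() is called
our %state;    # hypothetical filter state worth keeping
our $state_file = '/var/tmp/filter_innd.state';    # hypothetical path

sub filter_mode {
    # Save only when the server is about to become throttled.
    if (defined $mode{NewMode} && $mode{NewMode} eq 'throttled') {
        # Trap errors so that a failed save cannot disable the filter.
        eval { store(\%state, $state_file); 1 };
    }
    return;
}

# Could be called from the top level of startup_innd.pl.
sub load_state {
    return unless -r $state_file;
    my $saved = eval { retrieve($state_file) };
    %state = %$saved if ref $saved eq 'HASH';
    return;
}
```

The state is only as fresh as the last throttle, so this relies on the
shutdown procedure described above being followed.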
 189
  190 The state of the Perl interpreter in which all of these Perl functions run
 191 is preserved over the lifetime of innd.  In other words, it's permissible for
 192 the Perl code to create its own global Perl variables, data structures,
 193 saved state, and the like, and all of that will be available to
 194 filter_art() and filter_messageid() each time they're called.  The only
 195 variable INN fiddles with (or pays any attention to at all) is %hdr, which
 196 is cleared after each call to filter_art().
 197
 198 Perl filtering can be turned off with C<ctlinnd perl n> and back on again
 199 with C<ctlinnd perl y>.  Perl filtering is turned off automatically if
 200 loading of the filter fails or if the filter code returns any sort of a
 201 fatal error (either due to Perl itself or due to a C<die> in the Perl code).
 202
 203 =head1 Supported innd Callbacks
 204
 205 innd makes seven functions available to any of its embedded Perl code.
 206 Those are:
 207
 208 =over 4
 209
 210 =item INN::addhist(I<messageid>, I<arrival>, I<articledate>, I<expire>, I<paths>)
 211
 212 Adds I<messageid> to the history database.  All of the arguments except
 213 the first one are optional; the times default to the current time and the
 214 paths field defaults to the empty string.  (For those unfamiliar with the
 215 fields of a history(5) database entry, the I<arrival> is normally the time at
 216 which the server accepts the article, the I<articledate> is from the Date
 217 header of the article, the I<expire> is from the Expires header of the
 218 article, and the I<paths> field is the storage API token.  All three times
  219 are measured as a time_t since the epoch.)  Returns true on success, false
 220 otherwise.
 221
 222 =item INN::article(I<messageid>)
 223
 224 Returns the full article (as a simple string) identified by I<messageid>,
 225 or undef if it isn't found.  Each line will end with a simple \n, but
 226 leading periods may still be doubled if the article is stored in wire
 227 format.
 228
 229 =item INN::cancel(I<messageid>)
 230
 231 Cancels I<messageid>.  (This is equivalent to C<ctlinnd cancel>; it
 232 cancels the message on the local server, but doesn't post a cancel message
 233 or do anything else that affects anything other than the local server.)
 234 Returns true on success, false otherwise.
 235
 236 =item INN::filesfor(I<messageid>)
 237
 238 Returns the I<paths> field of the history entry for the given
 239 I<messageid>.  This will be the storage API token for the message.  If
 240 I<messageid> isn't found in the history database, returns undef.
 241
 242 =item INN::havehist(I<messageid>)
 243
 244 Looks up I<messageid> in the history database and returns true if it's
 245 found, false otherwise.
 246
 247 =item INN::head(I<messageid>)
 248
 249 Returns the header (as a simple string) of the article identified by
 250 I<messageid>, or undef if it isn't found.  Each line will end with a
 251 simple \n (in other words, regardless of the format of article storage,
 252 the returned string won't be in wire format).
 253
 254 =item INN::newsgroup(I<newsgroup>)
 255
 256 Returns the status of I<newsgroup> (the last field of the active file
 257 entry for that newsgroup).  See active(5) for a description of the
 258 possible values and their meanings (the most common are "y" for an
 259 unmoderated group and "m" for a moderated group).  If I<newsgroup> isn't
 260 in the active file, returns undef.
 261
 262 =back
 263
 264 These functions can only be used from inside the innd Perl filter; they're
 265 not available in the nnrpd filter.
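
The helpers below are a small sketch of how a few of these callbacks might be
combined inside the innd filter.  The helper names unknown_groups() and
cancel_if_known() are hypothetical; the INN:: functions exist only when the
code runs inside innd, so this cannot be exercised from a standalone Perl
interpreter without stand-ins for them.

```perl
use strict;
use warnings;

# Returns the groups from a Newsgroups: value that are not in the active
# file, using the documented undef return of INN::newsgroup().
sub unknown_groups {
    my ($newsgroups) = @_;
    return grep { !defined(INN::newsgroup($_)) }
           grep { length $_ }
           split /\s*,\s*/, $newsgroups;
}

# Cancels a message locally, but only if history already knows about it.
# Returns 1 if the cancel succeeded and 0 otherwise.
sub cancel_if_known {
    my ($messageid) = @_;
    return 0 unless INN::havehist($messageid);
    return INN::cancel($messageid) ? 1 : 0;
}
```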
 266
 267 =head1 Common Callbacks
 268
 269 The following additional function is available from inside filters
 270 embedded in innd, and is also available from filters embedded in nnrpd
 271 (see below):
 272
 273 =over 4
 274
 275 =item INN::syslog(level, message)
 276
  277 Logs a message via syslog(3).  This is quite a bit more reliable and
 278 portable than trying to use Sys::Syslog from inside the Perl filter.  Only
 279 the first character of the level argument matters; the valid letters are
 280 the first letters of ALERT, CRIT, ERR, WARNING, NOTICE, INFO, and DEBUG
 281 (case-insensitive) and specify the priority at which the message is
 282 logged.  If a level that doesn't match any of those levels is given, the
 283 default priority level is LOG_NOTICE.  The second argument is the message
 284 to log; it will be prefixed by "filter: " and logged to syslog with
 285 facility LOG_NEWS.
 286
 287 =back
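
For instance, a filter might wrap INN::syslog() in a small helper so that
every rejection is logged in the same format.  The helper name
log_rejection() and the message layout are assumptions made for this sketch;
INN::syslog() itself is only defined when the code runs inside innd or nnrpd.

```perl
use strict;
use warnings;

# Hypothetical helper: log a rejection at NOTICE priority and hand the
# reason back so that it can be returned directly from the filter.
sub log_rejection {
    my ($messageid, $reason) = @_;

    # Only the first letter of the level matters, so 'notice' and 'N'
    # select the same priority.
    INN::syslog('notice', "rejected $messageid: $reason");
    return $reason;
}
```

Inside filter_art() this could be used as
C<return log_rejection($hdr{'Message-ID'}, 'Too many newsgroups')>.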
 288
 289 =head1 The nnrpd Posting Filter
 290
 291 Whenever Perl support is needed in nnrpd, it first loads the file
 292 _PATH_PERL_FILTER_NNRPD (defined in F<include/paths.h>, by default
 293 F<filter_nnrpd.pl>).  This file must be located in the directory
 294 specified by pathfilter in F<inn.conf> (F</usr/local/news/bin/filter>
 295 by default).  The default directory for filter code can be specified
 296 at configure time by giving the flag B<--with-filter-dir> to
 297 configure.
 298
 299 If F<filter_nnrpd.pl> loads successfully and defines the Perl function
 300 filter_post(), Perl filtering is turned on.  Otherwise, it's turned off.
 301 If filter_post() ever returns a fatal error (either from Perl or from a
 302 C<die> in the Perl code), Perl filtering is turned off for the life of that
 303 nnrpd process and any further posts made during that session won't go
 304 through the filter.
 305
 306 While Perl filtering is on, every article received by nnrpd via the POST
 307 command is passed to the filter_post() Perl function before it is passed
 308 to INN (or mailed to the moderator of a moderated newsgroup).  If
 309 filter_post() returns an empty string (''), the article is accepted and
 310 normal processing of it continues.  Otherwise, the article is rejected and
 311 the string returned by filter_post() is returned to the client as the
 312 error message (with some exceptions; see below).
 313
 314 filter_post() has access to a global hash %hdr, which contains all of the
 315 headers of the article.  (Unlike the innd Perl filter, %hdr for the nnrpd
 316 Perl filter contains *all* of the headers, not just the standard ones.  If
 317 any of the headers are duplicated, though, %hdr will contain only the
  318 value of the last occurrence of the header.  nnrpd will reject the
 319 article before the filter runs if any of the standard headers are
 320 duplicated.)  It also has access to the full body of the article in the
 321 variable $body, and if the poster authenticated via AUTHINFO (or if either
 322 Perl authentication or a readers.conf authentication method is used and
 323 produces user information), it has access to the authenticated username of
 324 the poster in the variable $user.
 325
 326 Unlike the innd Perl filter, the nnrpd Perl filter can modify the %hdr
 327 hash.  In fact, if the Perl variable $modify_headers is set to true after
 328 filter_post() returns, the contents of the %hdr hash will be written back
 329 to the article replacing the original headers.  filter_post() can
 330 therefore make any modifications it wishes to the headers and those
 331 modifications will be reflected in the article as it's finally posted.
 332 The article body cannot be modified in this way; any changes to $body will
 333 just be ignored.
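
A short sketch of header modification follows.  The policy (supplying an
Organization: header when the poster gave none) and the header value are
purely illustrative.  The sketch resets $modify_headers itself at the start
of each call rather than assuming that nnrpd does so.

```perl
use strict;
use warnings;

our (%hdr, $body, $user, $modify_headers);    # set up by nnrpd

sub filter_post {
    $modify_headers = 0;

    # Hypothetical policy: supply an Organization: header if the poster
    # did not provide one.
    if (!defined $hdr{'Organization'} || $hdr{'Organization'} eq '') {
        $hdr{'Organization'} = 'Example News Service';
        $modify_headers = 1;    # ask nnrpd to write %hdr back
    }

    return '';    # accept the post
}
```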
 334
 335 Be careful when using the ability to modify headers.  filter_post() runs
 336 after all the normal consistency checks on the headers and after server
 337 supplied headers (like Message-ID: and Date:) are filled in.  Deleting
 338 required headers or modifying headers that need to follow a strict format
 339 can result in nnrpd trying to post nonsense articles (which will probably
 340 then be rejected by innd).  If $modify_headers is set, I<everything> in
 341 the %hdr hash is taken to be article headers and added to the article.
 342
 343 If filter_post() returns something other than the empty string, this
 344 message is normally returned to the client as an error.  There are two
 345 exceptions:  If the string returned begins with "DROP", the post will be
 346 silently discarded and success returned to the client.  If the string
 347 begins with "SPOOL", success is returned to the client, but the post is
 348 saved in a directory named "spam" under the directory specified by
 349 pathincoming in F<inn.conf> (in a directory named "spam/mod" if the post
 350 is to a moderated group).  This is intended to allow manual inspection of
 351 the suspect messages; if they should be posted, they can be manually moved
 352 out of the subdirectory to the directory specified by pathincoming in
 353 F<inn.conf>, where they can be posted by running C<rnews -U>.  If you use
 354 this functionality, make sure those directories exist.
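
The three kinds of non-empty return value can be sketched as follows.  The
rules and patterns are placeholders invented for this example; what matters
is the prefix of each returned string.

```perl
use strict;
use warnings;

our ($body, $user);    # set up by nnrpd before each call

sub filter_post {
    my $text = defined $body ? $body : '';

    # Ordinary rejection: the string is returned to the client.
    return 'Empty article body' if $text !~ /\S/;

    # Silent discard: the client is told that the post succeeded.
    return 'DROP known spam phrase' if $text =~ /make money fast/i;

    # Hold links from unauthenticated posters for manual inspection.
    return 'SPOOL link from unauthenticated poster'
        if !defined $user && $text =~ m{https?://\S+};

    return '';
}
```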
 355
 356 =head1 Changes to Perl Authentication Support for nnrpd
 357
 358 The old authentication functionality has been combined with the new
 359 readers.conf mechanism by Erik Klavon <erik@eriq.org>; bug reports
 360 should however go to inn-bugs@isc.org, not Erik.
 361
 362 The remainder of this section is an introduction to the new mechanism
 363 (which uses the perl_auth: and perl_access: F<readers.conf> parameters)
 364 with porting/migration suggestions for people familiar with the old
 365 mechanism (identifiable by the nnrpperlauth: parameter in F<inn.conf>).
 366
 367 Other people should skip this section.
 368
 369 The perl_auth parameter allows the use of Perl to authenticate a user.
 370 Scripts (like those from the old mechanism) are listed in F<readers.conf>
  371 using perl_auth in the same manner as other authenticators are listed using auth:
 372
 373     perl_auth: "/path/to/script/auth1.pl"
 374
 375 The file given as argument to perl_auth should contain the same
 376 procedures as before. The global hash %attributes remains the same,
 377 except for the removal of the "type" entry which is no longer needed
 378 in this modification and the addition of several new entries (port,
 379 intipaddr, intport) described below. The return array now only
 380 contains either two or three elements, the first of which is the NNTP
 381 return code. The second is an error string which is passed to the
 382 client if the error code indicates that the authentication attempt has
 383 failed. This allows a specific error message to be generated by the
  384 perl script in place of "Authentication failed". An optional third
  385 return element, if present, will be used to match the connection with
  386 the users: parameter in access groups and will also be the username
  387 logged. If this element is absent, the username supplied by the client
  388 during authentication will be used, as was the previous behavior.
 389
 390 The perl_access parameter (described below) is also new; it allows the
 391 dynamic generation of an access group for an incoming connection using
 392 a Perl script.  If a connection matches an auth group which has a
 393 perl_access parameter, all access groups in readers.conf are ignored;
 394 instead the procedure described below is used to generate an access group.
 395 This concept is due to Jeffrey M. Vinocur.
 396
 397 The new functionality should provide all of the existing capabilities
 398 of the Perl hook, in combination with the flexibility of readers.conf
 399 and the use of other authentication and resolving programs.  To use
 400 Perl authentication code that predates the readers.conf mechanism, you
 401 would need to modify the code slightly (see below for the new
 402 specification) and supply a simple readers.conf file.  If you don't want
 403 to modify your code, the samples directory has F<nnrpd_auth_wrapper.pl>
 404 and F<nnrpd_access_wrapper.pl> which should allow you to use your old
 405 code without needing to change it.
 406
 407 However, before trying to use your old Perl code, you may want to
 408 consider replacing it entirely with non-Perl authentication.  (With
 409 readers.conf and the regular authenticator and resolver programs, much
 410 of what once required Perl can be done directly.)  Even if the
 411 functionality is not available directly, you may wish to write a new
 412 authenticator or resolver (which can be done in whatever language you
 413 prefer to work in).
 414
 415
 416 =head1 Perl Authentication Support for nnrpd
 417
 418 Support for authentication via Perl is provided in nnrpd by the
 419 inclusion of a perl_auth: parameter in a F<readers.conf> auth
 420 group. perl_auth: works exactly like the auth: parameter in
 421 F<readers.conf>, except that it calls the script given as argument using
  422 the Perl hook rather than treating it as an external program.
 423
 424 If the processing of readers.conf requires that a perl_auth: statement
 425 be used for authentication, Perl is loaded (if it has yet to be) and
 426 the file given as argument to the perl_auth: parameter is loaded as
 427 well. If a Perl function auth_init() is defined by that file, it is called
 428 immediately after the file is loaded.  It takes no arguments and returns
 429 nothing.
 430
 431 Provided the file loads without errors, auth_init() (if present) runs
 432 without fatal errors, and a Perl function authenticate() is defined,
 433 authenticate() will then be called. authenticate() takes no arguments,
 434 but it has access to a global hash %attributes which contains
 435 information about the connection as follows: C<$attributes{hostname}>
 436 will contain the hostname (or the IP address if it doesn't resolve) of
 437 the client machine, C<$attributes{ipaddress}> will contain its IP
 438 address (as a string), C<$attributes{port}> will contain the client
 439 port (as an integer), C<$attributes{interface}> contains the hostname
 440 of the interface the client connected on, C<$attributes{intipaddr}>
 441 contains the IP address (as a string) of the interface the client
 442 connected on, C<$attributes{intport}> contains the port (as an
 443 integer) on the interface the client connected on,
 444 C<$attributes{username}> will contain the provided username and
 445 C<$attributes{password}> the password.
 446
 447 authenticate() should return a two or three element array.  The first
 448 element is the NNTP response code to return to the client, the second
 449 element is an error string which is passed to the client if the
 450 response code indicates that the authentication attempt has failed. An
  451 optional third return element, if present, will be used to match the
 452 connection with the users: parameter in access groups and will also be
 453 the username logged. If this element is absent, the username supplied
 454 by the client during authentication will be used for matching and
 455 logging.
 456
 457 The NNTP response code should probably be either 281 (authentication
 458 successful) or 502 (authentication unsuccessful).  If the code
 459 returned is anything other than 281, nnrpd will print an
 460 authentication error message and drop the connection and exit.
 461
 462 If authenticate() dies (either due to a Perl error or due to calling die),
 463 or if it returns anything other than the two or three element array
 464 described above, an internal error will be reported to the client, the
 465 exact error will be logged to syslog, and nnrpd will drop the
 466 connection and exit.
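
To make the calling convention concrete, here is a minimal sketch of an
authentication script.  The in-memory %password_for table and its contents
are hypothetical and stand in for whatever file or database a real script
would consult; plain-text passwords are used only to keep the example short.

```perl
use strict;
use warnings;

our %attributes;     # filled in by nnrpd before authenticate() is called
my %password_for;    # hypothetical in-memory user table

# Called once, immediately after this file is loaded.
sub auth_init {
    %password_for = (alice => 'wonderland', bob => 'builder');
    return;
}

sub authenticate {
    my $user = $attributes{username};
    my $pass = $attributes{password};

    if (   defined $user
        && defined $pass
        && exists $password_for{$user}
        && $password_for{$user} eq $pass)
    {
        # Optional third element: the name matched against users:.
        return (281, '', lc $user);
    }

    # Two-element form: response code and error string for the client.
    return (502, 'Authentication failed for this server');
}
```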
 467
 468
 469 =head1 Dynamic Generation of Access Groups
 470
 471 A Perl script may be used to dynamically generate an access group
 472 which is then used to determine the access rights of the client. This
  473 occurs whenever the perl_access: parameter is specified in an auth group which
 474 has successfully matched the client. Only one perl_access:
 475 statement is allowed in an auth group. This parameter should not be
 476 mixed with a python_access: statement in the same auth group.
 477
 478 When a perl_access: parameter is encountered, Perl is loaded (if it
 479 has yet to be) and the file given as argument is loaded as
 480 well. Provided the file loads without errors, and a Perl function
 481 access() is defined, access() will then be called. access() takes no
 482 arguments, but it has access to a global hash %attributes which
 483 contains information about the connection as follows:
 484 C<$attributes{hostname}> will contain the hostname (or the IP address
 485 if it doesn't resolve) of the client machine,
 486 C<$attributes{ipaddress}> will contain its IP address (as a string),
 487 C<$attributes{port}> will contain the client port (as an integer),
 488 C<$attributes{interface}> contains the hostname of the interface the
 489 client connected on, C<$attributes{intipaddr}> contains the IP address
 490 (as a string) of the interface the client connected on,
 491 C<$attributes{intport}> contains the port (as an integer) on the
 492 interface the client connected on, C<$attributes{username}> will
 493 contain the provided username and domain (in username@domain form).
 494
 495 access() returns a hash, containing the desired access parameters and
 496 values.  Here is an untested example showing how to dynamically generate a
 497 list of newsgroups based on the client's username and domain.
 498
 499      my %hosts = ( "example.com" => "example.*", "isc.org" => "isc.*" );
 500
 501      sub access {
  502         my %return_hash = (
 503            "max_rate" => "10000",
 504            "addnntppostinghost" => "true",
 505      #     ...
 506         );
 507         if( defined $attributes{username} &&
 508             $attributes{username} =~ /.*@(.*)/ )
 509         {
 510            $return_hash{"virtualhost"} = "true";
 511            $return_hash{"path"} = $1;
 512            $return_hash{"newsgroups"} = $hosts{$1};
 513         } else {
 514            $return_hash{"read"} = "*";
  515            $return_hash{"post"} = "local.*";
 516         }
 517         return %return_hash;
 518      }
 519
 520 Note that both the keys and values are quoted strings. These values
 521 are to be returned to a C program and must be quoted strings. For
 522 values containing one or more spaces, it is not necessary to include
 523 extra quotes inside the string.
 524
 525 While you may include the users: parameter in a dynamically generated
 526 access group, some care should be taken (unless your pattern is just
 527 * which is equivalent to leaving the parameter out). The group created
 528 with the values returned from the Perl script is the only one
 529 considered when nnrpd attempts to find an access group matching the
 530 connection. If a users: parameter is included and it doesn't match the
 531 connection, then the client will be denied access since there are no
 532 other access groups which could match the connection.
 533
 534 If access() dies (either due to a Perl error or due to calling die),
 535 or if it returns anything other than a hash as described
 536 above, an internal error will be reported to the client, the exact error
 537 will be logged to syslog, and nnrpd will drop the connection and exit.
 538
 539 =head1 Notes on Writing Embedded Perl
 540
 541 All Perl evaluation is done inside an implicit eval block, so calling die
 542 in Perl code will not kill the innd or nnrpd process.  Neither will Perl
 543 errors (such as syntax errors).  However, such errors will have negative
 544 effects (fatal errors in the innd or nnrpd filter will cause filtering to
 545 be disabled, and fatal errors in the nnrpd authentication code will cause
 546 the client connection to be terminated).
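
Because a fatal error disables filtering, a filter may prefer to trap errors
from its own rule code instead of letting them reach INN.  The sketch below
shows one way to do that.  The check_article() function, its rule, and the
$filter_errors counter are hypothetical, and accepting the article when the
rules fail is a design choice made for this example, not something INN
requires.

```perl
use strict;
use warnings;

our %hdr;                  # filled in by innd before each call
our $filter_errors = 0;    # hypothetical counter of trapped errors

# Hypothetical rule set; it may die on unexpected input.
sub check_article {
    die "no Message-ID\n" unless defined $hdr{'Message-ID'};
    return $hdr{'Message-ID'} =~ /\@spam\.example\.com>\z/
        ? 'Spam source'
        : '';
}

sub filter_art {
    my $verdict = eval { check_article() };
    if (!defined $verdict) {
        # check_article() died; count the error and accept the article
        # rather than let the error disable filtering altogether.
        $filter_errors++;
        return '';
    }
    return $verdict;
}
```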
 547
 548 Calling exit directly, however, *will* kill the innd or nnrpd process, so
 549 don't do that.  Similarly, you probably don't want to call fork (or any
 550 other function that results in a fork such as system, IPC::Open3::open3(),
 551 or any use of backticks) since there are possibly unflushed buffers that
 552 could get flushed twice, lots of open state that may not get closed
 553 properly, and innumerable other potential problems.  In general, be aware
 554 that all Perl code is running inside a large and complicated C program,
 555 and Perl code that impacts the process as a whole is best avoided.
 556
 557 You can use print and warn inside Perl code to send output to STDOUT or
  558 STDERR, but you probably shouldn't.  Instead, open a log file and print
  559 to it (or use INN::syslog() to write messages via syslog like the rest of
  560 INN).  If you write to STDOUT or STDERR, where
 561 that data will go depends on where the filter is running; inside innd, it
 562 will go to the news log or the errlog, and inside nnrpd it will probably
 563 go nowhere but could go to the client.  The nnrpd filter takes some steps
 564 to try to keep output from going across the network connection to the
 565 client (which would probably result in a very confused client), but best
 566 not to take the chance.
 567
 568 For similar reasons, try to make your Perl code -w clean, since Perl
 569 warnings are written to STDERR.  (INN won't run your code under -w, but
 570 better safe than sorry, and some versions of Perl have some mandatory
 571 warnings you can't turn off.)
 572
 573 You *can* use modules in your Perl code, just like you would in an
 574 ordinary Perl script.  You can even use modules that dynamically load C
 575 code.  Just make sure that none of the modules you use go off behind your
 576 back to do any of the things above that are best avoided.
 577
 578 Whenever you make any modifications to the Perl code, and particularly
 579 before starting INN or reloading filter.perl with new code, you should run
 580 perl -wc on the file.  This will at least make sure you don't have any
 581 glaring syntax errors.  Remember, if there are errors in your code,
 582 filtering will be disabled, which could mean that posts you really wanted
 583 to reject will leak through and authentication of readers may be totally
 584 broken.
 585
 586 The samples directory has example F<startup_innd.pl>, F<filter_innd.pl>,
 587 F<filter_nnrpd.pl>, and F<nnrpd_auth.pl> files that contain some
 588 simplistic examples.  Look them over as a starting point when writing your
 589 own.
 590
 591 =head1 Available Packages
 592
 593 This is an unofficial list of known filtering packages at the time of
 594 publication.  This is not an endorsement of these filters by the ISC or
 595 the INN developers, but is included as assistance in locating packages
 596 which make use of this filter mechanism.
 597
 598   CleanFeed               Jeremy Nixon <jeremy@exit109.com>
 599   <URL:http://www.exit109.com/~jeremy/news/cleanfeed.html>
 600         A spam filter catching excessive multi-posting and a host of
 601         other things.  Uses filter_innd.pl exclusively, requires the MD5
 602         Perl module.  Probably the most popular and widely-used Perl
 603         filter around.
 604
 605   Usenet II Filter        Edward S. Marshall <emarshal@xnet.com>
 606   <URL:http://www.xnet.com/~emarshal/inn/filter_nnrpd.pl>
 607         Checks for "soundness" according to Usenet II guidelines in the
 608         net.* hierarchy.  Designed to use filter_nnrpd.pl.
 609
 610   News Gizmo              Aidan Cully <aidan@panix.com>
 611   <URL:http://www.panix.com/gizmo/>
 612         A posting filter for helping a site enforce Usenet-II soundness,
  613         and for limiting the number of messages any user can post to
 614         Usenet daily.