doc/hook-perl

   1 INN Perl Filtering and Authentication Support
   2
   3     This is $Revision: 7880 $ dated $Date: 2008-06-07 14:46:49 +0200 (Sat,
   4     07 Jun 2008) $.
   5
   6     This file documents INN's built-in support for Perl filtering and reader
   7     authentication.  The code is based very heavily on work by Christophe
   8     Wolfhugel <wolf@pasteur.fr>, and his work was in turn inspired by the
   9     existing TCL support.  Please send any bug reports to inn-bugs@isc.org,
  10     not to Christophe, as the code has been modified heavily since he
  11     originally wrote it.
  12
  13     The Perl filtering support is described in more detail below.
  14     Basically, it allows you to supply a Perl function that is invoked on
  15     every article received by innd from a peer (the innd filter) or by nnrpd
  16     from a reader (the nnrpd filter).  This function can decide whether to
  17     accept or reject the article, and can optionally do other, more
  18     complicated processing (such as add history entries, cancel articles,
  19     spool local posts into a holding area, or even modify the headers of
  20     locally submitted posts).  The Perl authentication hooks allow you to
  21     replace or supplement the readers.conf mechanism used by nnrpd.
  22
  23     For Perl filtering support, you need to have Perl version 5.004 or
  24     newer.  Earlier versions of Perl will fail with a link error at
  25     compilation time.  http://language.perl.com/info/software.html should
  26     have the latest Perl version.
  27
  28     To enable Perl support, you have to specify --with-perl when you run
  29     configure.  See INSTALL for more information.
  30
  31 The innd Perl Filter
  32
  33     When innd starts, it first loads the file _PATH_PERL_STARTUP_INND
  34     (defined in include/paths.h, by default startup_innd.pl) and then loads
  35     the file _PATH_PERL_FILTER_INND (also defined in include/paths.h, by
  36     default filter_innd.pl).  Both of these files must be located in the
  37     directory specified by pathfilter in inn.conf
  38     (/usr/local/news/bin/filter by default).  The default directory for
  39     filter code can be specified at configure time by giving the flag
  40     --with-filter-dir to configure.
  41
  42     INN doesn't care what Perl functions you define in which files.  The
  43     only thing that's different about the two files is when they're loaded.
  44     startup_innd.pl is loaded only once, when innd first starts, and is
  45     never reloaded as long as innd is running.  Any modifications to that
  46     file won't be noticed by innd; only stopping and restarting innd can
  47     cause it to be reloaded.
  48
  49     filter_innd.pl, on the other hand, can be reloaded on command (with
  50     "ctlinnd reload filter.perl 'reason'").  Whenever filter_innd.pl is
  51     loaded, including the first time at innd startup, the Perl function
  52     filter_before_reload() is called before it's reloaded and the function
  53     filter_after_reload() is called after it's reloaded (if the functions
  54     exist).  Additionally, any code in either startup_innd.pl or
  55     filter_innd.pl at the top level (in other words, not inside a sub { })
  56     is automatically executed by Perl when the files are loaded.
  57
  58     This allows one to do things like write out filter statistics whenever
  59     the filter is reloaded, load a cache into memory, flush cached data to
  60     disk, or other similar operations that should only happen at particular
  61     times or with manual intervention.  Remember, any code not inside
  62     functions in startup_innd.pl is executed when that file is loaded, and
  63     it's loaded only once when innd first starts.  That makes it the ideal
  64     place to put initialization code that should only run once, or code to
  65     load data that was preserved on disk across a stop and restart of innd
  66     (perhaps using filter_mode() -- see below).
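As a rough sketch of the reload hooks described above, the following keeps rejection counters in a global hash, appends them to a statistics file just before a reload, and clears them afterwards.  The %rejected hash, the $stats_file variable, and its path are assumptions made for illustration; they are not part of INN.

```perl
use strict;
use warnings;

# Hypothetical filter state: number of rejections per reason string.
our %rejected;

# Hypothetical location of the statistics file.
our $stats_file = '/tmp/filter_innd.stats';

# Called by innd just before filter_innd.pl is (re)loaded, if defined.
sub filter_before_reload {
    open(my $fh, '>>', $stats_file) or return;
    print {$fh} "$_\t$rejected{$_}\n" for sort keys %rejected;
    close($fh);
    return;
}

# Called by innd just after filter_innd.pl is (re)loaded, if defined.
sub filter_after_reload {
    %rejected = ();    # start counting afresh with the new filter code
    return;
}
```

On the very first load, filter_before_reload() can only exist if startup_innd.pl defined it, since filter_innd.pl has not been read yet.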
  67
  68     As mentioned above, "ctlinnd reload filter.perl 'reason'" (or "ctlinnd
  69     reload all 'reason'") will cause filter_innd.pl to be reloaded.  If the
  70     function filter_art() is defined after the file has been reloaded,
  71     filtering is turned on.  Otherwise, filtering is turned off.  (Note that
  72     due to the way Perl stores functions, once you've defined filter_art(),
  73     you can't undefine it just by deleting it from the file and reloading
  74     the filter.  You'll need to replace it with an empty sub.)
  75
  76     The Perl function filter_art() is the heart of a Perl filter.  Whenever
  77     an article is received from a peer, via either IHAVE or TAKETHIS,
  78     filter_art() is called if Perl filtering is turned on.  It receives no
  79     arguments, and should return a single scalar value.  That value should
  80     be the empty string to indicate that INN should accept the article, or
  81     some rejection message to indicate that the article should be rejected.
  82
  83     filter_art() has access to a global hash named %hdr, which contains all
  84     of the standard headers present in the article and their values.  The
  85     standard headers are:
  86
  87         Also-Control, Approved, Bytes, Cancel-Key, Cancel-Lock,
  88         Content-Base, Content-Disposition, Content-Transfer-Encoding,
  89         Content-Type, Control, Date, Date-Received, Distribution, Expires,
  90         Face, Followup-To, From, In-Reply-To, Injection-Date, Injection-Info,
  91         Keywords, Lines, List-ID, Message-ID, MIME-Version, Newsgroups,
  92         NNTP-Posting-Date, NNTP-Posting-Host, Organization, Originator,
  93         Path, Posted, Posting-Version, Received, References, Relay-Version,
  94         Reply-To, Sender, Subject, Supersedes, User-Agent,
  95         X-Auth, X-Canceled-By, X-Cancelled-By, X-Complaints-To, X-Face,
  96         X-HTTP-UserAgent, X-HTTP-Via, X-Mailer, X-Modbot, X-Modtrace,
  97         X-Newsposter, X-Newsreader, X-No-Archive, X-Original-Message-ID,
  98         X-Original-Trace, X-Originating-IP, X-PGP-Key, X-PGP-Sig,
  99         X-Poster-Trace, X-Postfilter, X-Proxy-User, X-Submissions-To,
 100         X-Trace, X-Usenet-Provider, Xref.
 101
 102     Note that all the above headers are as they arrived, not modified by
 103     your INN (especially, the Xref: header, if present, is the one of the
 104     remote site which sent you the article, and not yours).
 105
 106     For example, the Newsgroups: header of the article is accessible inside
 107     the Perl filter as $hdr{'Newsgroups'}.  In addition, $hdr{'__BODY__'}
 108     will contain the full body of the article and $hdr{'__LINES__'} will
 109     contain the number of lines in the body of the article.
 110
 111     The contents of the %hdr hash for a typical article may therefore look
 112     something like this:
 113
 114         %hdr = (Subject      => 'MAKE MONEY FAST!!',
 115             From         => 'Joe Spamer <him@example.com>',
 116             Date         => '10 Sep 1996 15:32:28 UTC',
 117             Newsgroups   => 'alt.test',
 118             Path         => 'news.example.com!not-for-mail',
 119             Organization => 'Spammers Anonymous',
 120             Lines        => '5',
 121             Distribution => 'usa',
 122             'Message-ID' => '<6.20232.842369548@example.com>',
 123             __BODY__     => 'Send five dollars to the ISC, c/o ...',
 124             __LINES__    => 5
 125         );
 126
 127     Note that the value of $hdr{Lines} is the contents of the Lines: header
 128     of the article and may bear no resemblance to the actual length of the
 129     article.  $hdr{__LINES__} is the line count calculated by INN, and is
 130     guaranteed to be accurate.
 131
 132     The %hdr hash should not be modified inside filter_art().  Instead, if
 133     any of the contents need to be modified temporarily during filtering
 134     (smashing case, for example), copy them into a separate variable first
 135     and perform the modifications on the copy.  Currently, $hdr{__BODY__} is
 136     the only data that will cause your filter to die if you modify it, but
 137     in the future other keys may also contain live data.  Modifying live INN
 138     data in Perl will hopefully only cause a fatal exception in your Perl
 139     code that disables Perl filtering until you fix it, but it's possible
 140     for it to cause article munging or even core dumps in INN.  So always,
 141     always make a copy first.
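The copy-first rule above might look like this in practice.  This is only a sketch: the spam phrase and the rejection reason are invented for the example, and only %hdr and filter_art() come from INN.

```perl
use strict;
use warnings;

our %hdr;    # filled in by innd before each call to filter_art()

sub filter_art {
    # Copy the header value; all changes below touch only the copy.
    my $subject = defined $hdr{'Subject'} ? $hdr{'Subject'} : '';
    $subject = lc $subject;      # smash case on the copy
    $subject =~ s/\s+/ /g;       # collapse runs of whitespace on the copy

    return 'Make money fast' if $subject =~ /make money fast/;
    return '';
}
```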
 142
 143     As mentioned above, if filter_art() returns the empty string (''), the
 144     article is accepted.  Note that this must be the empty string, not 0 or
 145     undef.  Otherwise, the article is rejected, and whatever scalar
 146     filter_art() returns (typically a string) will be taken as the reason
 147     why the article was rejected.  This reason will be returned to the
 148     remote peer as well as logged to the news logs.  (innreport, in its
 149     nightly report, will summarize the number of articles rejected by the
 150     Perl filter and include a count of how many articles were rejected with
 151     each reason string.)
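One way to make sure filter_art() always honours this return convention is to funnel every check through a single loop that falls back to the empty string.  The sketch below assumes two made-up policy checks (check_length and check_crosspost) with arbitrary thresholds; keeping the reason strings short and fixed also keeps the innreport summary readable.

```perl
use strict;
use warnings;

our %hdr;    # filled in by innd before each call to filter_art()

# Hypothetical check: reject very long articles.
sub check_length {
    my $lines = $hdr{'__LINES__'} || 0;
    return $lines > 5000 ? 'Too long' : undef;
}

# Hypothetical check: reject articles crossposted to more than 10 groups.
sub check_crosspost {
    my @groups = split /,/, ($hdr{'Newsgroups'} || '');
    return @groups > 10 ? 'Excessive crossposting' : undef;
}

sub filter_art {
    for my $check (\&check_length, \&check_crosspost) {
        my $reason = $check->();
        return $reason if defined $reason && length $reason;
    }
    return '';    # accept: must be the empty string, not 0 or undef
}
```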
 152
 153     One other type of filtering is also supported.  If Perl filtering is
 154     turned on and the Perl function filter_messageid() is defined, that
 155     function will be called for each message ID received from a peer (via
 156     either CHECK or IHAVE).  The function receives a single argument, the
 157     message ID, and like filter_art() should return an empty string to
 158     accept the article or an error string to refuse the article.  This
 159     function is called before any history lookups and for every article
 160     offered to innd with CHECK or IHAVE (before the actual article is sent).
 161     Accordingly, the message ID is the only information it has about the
 162     article (the %hdr hash will be empty).  This code would sit in a
 163     performance-critical hot path in a typical server, and therefore should
 164     be as fast as possible, but it can do things like refuse articles from
 165     certain hosts or cancels for already rejected articles (if they follow
 166     the $alz convention) without having to take the network bandwidth hit of
 167     accepting the entire article first.
 168
 169     Note that you cannot rely on filter_messageid() being called for every
 170     incoming article; articles sent via TAKETHIS without an earlier CHECK
 171     will never pass through filter_messageid() and will only go through
 172     filter_art().
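A minimal filter_messageid() could refuse offers based purely on the domain part of the message ID, as sketched below.  The %refused_domain table and its single entry are assumptions for illustration; a hash lookup is used because this function runs for every CHECK and IHAVE and should stay cheap.

```perl
use strict;
use warnings;

# Hypothetical table of message ID domains to refuse.
our %refused_domain = map { $_ => 1 } qw(spam.example.com);

sub filter_messageid {
    my ($messageid) = @_;

    # Take the text between the last "@" and the closing ">".
    if ($messageid =~ /\@([^\@>]+)>$/) {
        return 'Refused domain' if $refused_domain{ lc $1 };
    }
    return '';
}
```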
 173
 174     Finally, whenever ctlinnd throttle, ctlinnd pause, or ctlinnd go is run,
 175     the Perl function filter_mode() is called if it exists.  It receives no
 176     arguments and returns no value, but it has access to a global hash %mode
 177     that contains three values:
 178
 179         Mode       The current server mode (throttled, paused, or running)
 180         NewMode    The new mode the server is going to
 181         reason     The reason that was given to ctlinnd
 182
 183     One possible use for this function is to save filter state across a
 184     restart of innd.  There isn't any Perl function which is called when INN
 185     shuts down, but using filter_mode() the Perl filter can dump its state
 186     to disk whenever INN is throttled.  Then, if the news administrator
 187     follows the strongly recommended shutdown procedure of throttling the
 188     server before shutting it down, the filter state will be safely saved to
 189     disk and can be reloaded when innd restarts (possibly by
 190     startup_innd.pl).
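The save-on-throttle idea could be sketched as follows.  The %seen hash, the $state_file path, and the load_state() helper are hypothetical; the sketch also assumes that $mode{NewMode} holds the string "throttled" when the server is being throttled, as the table above suggests.

```perl
use strict;
use warnings;

our %mode;    # filled in by innd before filter_mode() is called
our %seen;    # hypothetical filter state worth preserving

# Hypothetical location of the saved state.
our $state_file = '/tmp/filter_innd.state';

# Dump %seen to disk whenever the server is about to be throttled.
sub filter_mode {
    return unless defined $mode{'NewMode'} && $mode{'NewMode'} eq 'throttled';
    open(my $fh, '>', $state_file) or return;
    print {$fh} "$_\t$seen{$_}\n" for sort keys %seen;
    close($fh);
    return;
}

# Counterpart that startup_innd.pl could call at top level.
sub load_state {
    open(my $fh, '<', $state_file) or return;
    while (my $line = <$fh>) {
        chomp $line;
        my ($key, $count) = split /\t/, $line, 2;
        $seen{$key} = $count;
    }
    close($fh);
    return;
}
```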
 191
 192     The state of the Perl interpreter in which all of these Perl functions
 193     run is preserved over the lifetime of innd.  In other words, it's
 194     permissible for the Perl code to create its own global Perl variables,
 195     data structures, saved state, and the like, and all of that will be
 196     available to filter_art() and filter_messageid() each time they're
 197     called.  The only variable INN fiddles with (or pays any attention to at
 198     all) is %hdr, which is cleared after each call to filter_art().
 199
 200     Perl filtering can be turned off with "ctlinnd perl n" and back on again
 201     with "ctlinnd perl y".  Perl filtering is turned off automatically if
 202     loading of the filter fails or if the filter code returns any sort of a
 203     fatal error (either due to Perl itself or due to a "die" in the Perl
 204     code).
 205
 206 Supported innd Callbacks
 207
 208     innd makes seven functions available to any of its embedded Perl code.
 209     Those are:
 210
 211     INN::addhist(*messageid*, *arrival*, *articledate*, *expire*, *paths*)
 212         Adds *messageid* to the history database.  All of the arguments
 213         except the first one are optional; the times default to the current
 214         time and the paths field defaults to the empty string.  (For those
 215         unfamiliar with the fields of a history(5) database entry, the
 216         *arrival* is normally the time at which the server accepts the
 217         article, the *articledate* is from the Date header of the article,
 218         the *expire* is from the Expires header of the article, and the
 219         *paths* field is the storage API token.  All three times are measured
 220         as a time_t since the epoch.)  Returns true on success, false
 221         otherwise.
 222
 223     INN::article(*messageid*)
 224         Returns the full article (as a simple string) identified by
 225         *messageid*, or undef if it isn't found.  Each line will end with a
 226         simple \n, but leading periods may still be doubled if the article
 227         is stored in wire format.
 228
 229     INN::cancel(*messageid*)
 230         Cancels *messageid*.  (This is equivalent to "ctlinnd cancel"; it
 231         cancels the message on the local server, but doesn't post a cancel
 232         message or do anything else that affects anything other than the
 233         local server.) Returns true on success, false otherwise.
 234
 235     INN::filesfor(*messageid*)
 236         Returns the *paths* field of the history entry for the given
 237         *messageid*.  This will be the storage API token for the message.
 238         If *messageid* isn't found in the history database, returns undef.
 239
 240     INN::havehist(*messageid*)
 241         Looks up *messageid* in the history database and returns true if
 242         it's found, false otherwise.
 243
 244     INN::head(*messageid*)
 245         Returns the header (as a simple string) of the article identified by
 246         *messageid*, or undef if it isn't found.  Each line will end with a
 247         simple \n (in other words, regardless of the format of article
 248         storage, the returned string won't be in wire format).
 249
 250     INN::newsgroup(*newsgroup*)
 251         Returns the status of *newsgroup* (the last field of the active file
 252         entry for that newsgroup).  See active(5) for a description of the
 253         possible values and their meanings (the most common are "y" for an
 254         unmoderated group and "m" for a moderated group).  If *newsgroup*
 255         isn't in the active file, returns undef.
 256
 257     These functions can only be used from inside the innd Perl filter;
 258     they're not available in the nnrpd filter.
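As an illustration of calling one of these functions from filter_art(), the sketch below uses INN::newsgroup() to look up each group named in the Newsgroups: header.  The policy itself (rejecting unapproved articles for moderated groups) and the reason string are assumptions chosen for the example, not a description of what innd requires.

```perl
use strict;
use warnings;

our %hdr;    # filled in by innd before each call to filter_art()

sub filter_art {
    return '' if defined $hdr{'Approved'};

    for my $group (split /\s*,\s*/, ($hdr{'Newsgroups'} || '')) {
        my $status = INN::newsgroup($group);
        next unless defined $status;    # group is not in the active file
        return 'Unapproved post to moderated group' if $status eq 'm';
    }
    return '';
}
```

Outside innd, INN::newsgroup() does not exist, so this code can only be exercised standalone with a stub in its place.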
 259
 260 Common Callbacks
 261
 262     The following additional function is available from inside filters
 263     embedded in innd, and is also available from filters embedded in nnrpd
 264     (see below):
 265
 266     INN::syslog(level, message)
 267         Logs a message via syslog(2).  This is quite a bit more reliable and
 268         portable than trying to use Sys::Syslog from inside the Perl filter.
 269         Only the first character of the level argument matters; the valid
 270         letters are the first letters of ALERT, CRIT, ERR, WARNING, NOTICE,
 271         INFO, and DEBUG (case-insensitive) and specify the priority at which
 272         the message is logged.  If a level that doesn't match any of those
 273         levels is given, the default priority level is LOG_NOTICE.  The
 274         second argument is the message to log; it will be prefixed by
 275         "filter: " and logged to syslog with facility LOG_NEWS.
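A small wrapper can log each rejection and hand the reason straight back to the caller.  The helper name log_reject() is hypothetical; only INN::syslog() and its two arguments come from the description above.

```perl
use strict;
use warnings;

# Log a rejection at NOTICE priority and return the reason unchanged, so
# that a filter can write: return log_reject('Too long');
sub log_reject {
    my ($reason) = @_;
    INN::syslog('notice', "rejected: $reason");
    return $reason;
}
```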
 276
 277 The nnrpd Posting Filter
 278
 279     Whenever Perl support is needed in nnrpd, it first loads the file
 280     _PATH_PERL_FILTER_NNRPD (defined in include/paths.h, by default
 281     filter_nnrpd.pl).  This file must be located in the directory specified
 282     by pathfilter in inn.conf (/usr/local/news/bin/filter by default).  The
 283     default directory for filter code can be specified at configure time by
 284     giving the flag --with-filter-dir to configure.
 285
 286     If filter_nnrpd.pl loads successfully and defines the Perl function
 287     filter_post(), Perl filtering is turned on.  Otherwise, it's turned off.
 288     If filter_post() ever returns a fatal error (either from Perl or from a
 289     "die" in the Perl code), Perl filtering is turned off for the life of
 290     that nnrpd process and any further posts made during that session won't
 291     go through the filter.
 292
 293     While Perl filtering is on, every article received by nnrpd via the POST
 294     command is passed to the filter_post() Perl function before it is passed
 295     to INN (or mailed to the moderator of a moderated newsgroup).  If
 296     filter_post() returns an empty string (''), the article is accepted and
 297     normal processing of it continues.  Otherwise, the article is rejected
 298     and the string returned by filter_post() is returned to the client as
 299     the error message (with some exceptions; see below).
 300
 301     filter_post() has access to a global hash %hdr, which contains all of
 302     the headers of the article.  (Unlike the innd Perl filter, %hdr for the
 303     nnrpd Perl filter contains *all* of the headers, not just the standard
 304     ones.  If any of the headers are duplicated, though, %hdr will contain
 305     only the value of the last occurrence of the header.  nnrpd will reject
 306     the article before the filter runs if any of the standard headers are
 307     duplicated.)  It also has access to the full body of the article in the
 308     variable $body, and if the poster authenticated via AUTHINFO (or if
 309     either Perl authentication or a readers.conf authentication method is
 310     used and produces user information), it has access to the authenticated
 311     username of the poster in the variable $user.
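The three inputs described above (%hdr, $body, and $user) can be combined as in the following sketch.  The policy (requiring authentication for a local.* hierarchy and refusing empty bodies) and both error strings are invented for the example.

```perl
use strict;
use warnings;

our (%hdr, $body, $user);    # set by nnrpd before filter_post() is called

sub filter_post {
    my $groups = defined $hdr{'Newsgroups'} ? $hdr{'Newsgroups'} : '';

    # Hypothetical policy: only authenticated users may post to local.*.
    if ($groups =~ /(?:^|,)\s*local\./ && !(defined $user && length $user)) {
        return 'Authentication required to post to local groups';
    }

    return 'Empty article body' unless defined $body && $body =~ /\S/;
    return '';
}
```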
 312
 313     Unlike the innd Perl filter, the nnrpd Perl filter can modify the %hdr
 314     hash.  In fact, if the Perl variable $modify_headers is set to true
 315     after filter_post() returns, the contents of the %hdr hash will be
 316     written back to the article replacing the original headers.
 317     filter_post() can therefore make any modifications it wishes to the
 318     headers and those modifications will be reflected in the article as it's
 319     finally posted.  The article body cannot be modified in this way; any
 320     changes to $body will just be ignored.
 321
 322     Be careful when using the ability to modify headers.  filter_post() runs
 323     after all the normal consistency checks on the headers and after server
 324     supplied headers (like Message-ID: and Date:) are filled in.  Deleting
 325     required headers or modifying headers that need to follow a strict
 326     format can result in nnrpd trying to post nonsense articles (which will
 327     probably then be rejected by innd).  If $modify_headers is set,
 328     *everything* in the %hdr hash is taken to be article headers and added
 329     to the article.
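A cautious use of $modify_headers is to add a header only when it is missing and to leave everything else alone, as sketched here.  The Organization value is a placeholder, and resetting $modify_headers at the top of the function is a defensive assumption rather than something nnrpd is documented to require.

```perl
use strict;
use warnings;

our (%hdr, $modify_headers);    # shared with nnrpd

sub filter_post {
    $modify_headers = 0;    # defensive reset; see the note above

    # Supply a default Organization: header when the poster gave none.
    if (!defined $hdr{'Organization'}) {
        $hdr{'Organization'} = 'Example News Service';    # placeholder value
        $modify_headers = 1;    # ask nnrpd to write %hdr back to the article
    }
    return '';
}
```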
 330
 331     If filter_post() returns something other than the empty string, this
 332     message is normally returned to the client as an error.  There are two
 333     exceptions:  If the string returned begins with "DROP", the post will be
 334     silently discarded and success returned to the client.  If the string
 335     begins with "SPOOL", success is returned to the client, but the post is
 336     saved in a directory named "spam" under the directory specified by
 337     pathincoming in inn.conf (in a directory named "spam/mod" if the post is
 338     to a moderated group).  This is intended to allow manual inspection of
 339     the suspect messages; if they should be posted, they can be manually
 340     moved out of the subdirectory to the directory specified by pathincoming
 341     in inn.conf, where they can be posted by running "rnews -U".  If you use
 342     this functionality, make sure those directories exist.
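The DROP and SPOOL prefixes could be used as follows.  Both heuristics (the spam phrase and the limit of five links) are arbitrary values picked for this sketch.

```perl
use strict;
use warnings;

our $body;    # set by nnrpd before filter_post() is called

sub filter_post {
    my $text = defined $body ? $body : '';

    # Silently discard obvious spam; the client is told the post succeeded.
    return 'DROP obvious spam' if $text =~ /make money fast/i;

    # Hold link-heavy posts in the "spam" directory for manual review.
    my $links = () = $text =~ m{https?://}g;
    return 'SPOOL too many links' if $links > 5;

    return '';
}
```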
 343
 344 Changes to Perl Authentication Support for nnrpd
 345
 346     The old authentication functionality has been combined with the new
 347     readers.conf mechanism by Erik Klavon <erik@eriq.org>; bug reports
 348     should however go to inn-bugs@isc.org, not Erik.
 349
 350     The remainder of this section is an introduction to the new mechanism
 351     (which uses the perl_auth: and perl_access: readers.conf parameters)
 352     with porting/migration suggestions for people familiar with the old
 353     mechanism (identifiable by the nnrpperlauth: parameter in inn.conf).
 354
 355     Other people should skip this section.
 356
 357     The perl_auth parameter allows the use of Perl to authenticate a user.
 358     Scripts (like those from the old mechanism) are listed in readers.conf
 359     using perl_auth, in the same way other authenticators are listed with auth:
 360
 361         perl_auth: "/path/to/script/auth1.pl"
 362
 363     The file given as argument to perl_auth should contain the same
 364     procedures as before. The global hash %attributes remains the same,
 365     except for the removal of the "type" entry which is no longer needed in
 366     this modification and the addition of several new entries (port,
 367     intipaddr, intport) described below. The return array now only contains
 368     either two or three elements, the first of which is the NNTP return
 369     code. The second is an error string which is passed to the client if the
 370     error code indicates that the authentication attempt has failed. This
 371     allows a specific error message to be generated by the perl script in
 372     place of "Authentication failed". An optional third return element, if
 373     present, is used to match the connection with the users: parameter in
 374     access groups and is also the username logged. If this element is
 375     absent, the username supplied by the client during authentication is
 376     used, as in the previous behavior.
 377
 378     The perl_access parameter (described below) is also new; it allows the
 379     dynamic generation of an access group for an incoming connection using a
 380     Perl script.  If a connection matches an auth group which has a
 381     perl_access parameter, all access groups in readers.conf are ignored;
 382     instead the procedure described below is used to generate an access
 383     group.  This concept is due to Jeffrey M. Vinocur.
 384
 385     The new functionality should provide all of the existing capabilities of
 386     the Perl hook, in combination with the flexibility of readers.conf and
 387     the use of other authentication and resolving programs.  To use Perl
 388     authentication code that predates the readers.conf mechanism, you would
 389     need to modify the code slightly (see below for the new specification)
 390     and supply a simple readers.conf file.  If you don't want to modify your
 391     code, the samples directory has nnrpd_auth_wrapper.pl and
 392     nnrpd_access_wrapper.pl which should allow you to use your old code
 393     without needing to change it.
 394
 395     However, before trying to use your old Perl code, you may want to
 396     consider replacing it entirely with non-Perl authentication.  (With
 397     readers.conf and the regular authenticator and resolver programs, much
 398     of what once required Perl can be done directly.)  Even if the
 399     functionality is not available directly, you may wish to write a new
 400     authenticator or resolver (which can be done in whatever language you
 401     prefer to work in).
 402
 403 Perl Authentication Support for nnrpd
 404
 405     Support for authentication via Perl is provided in nnrpd by the
 406     inclusion of a perl_auth: parameter in a readers.conf auth group.
 407     perl_auth: works exactly like the auth: parameter in readers.conf,
 408     except that it calls the script given as argument using the Perl hook
 409     rather than treating it as an external program.
 410
 411     If the processing of readers.conf requires that a perl_auth: statement
 412     be used for authentication, Perl is loaded (if it has yet to be) and the
 413     file given as argument to the perl_auth: parameter is loaded as well. If
 414     a Perl function auth_init() is defined by that file, it is called
 415     immediately after the file is loaded.  It takes no arguments and returns
 416     nothing.
 417
 418     Provided the file loads without errors, auth_init() (if present) runs
 419     without fatal errors, and a Perl function authenticate() is defined,
 420     authenticate() will then be called. authenticate() takes no arguments,
 421     but it has access to a global hash %attributes which contains
 422     information about the connection as follows: $attributes{hostname} will
 423     contain the hostname (or the IP address if it doesn't resolve) of the
 424     client machine, $attributes{ipaddress} will contain its IP address (as a
 425     string), $attributes{port} will contain the client port (as an integer),
 426     $attributes{interface} contains the hostname of the interface the client
 427     connected on, $attributes{intipaddr} contains the IP address (as a
 428     string) of the interface the client connected on, $attributes{intport}
 429     contains the port (as an integer) on the interface the client connected
 430     on, $attributes{username} will contain the provided username and
 431     $attributes{password} the password.
 432
 433     authenticate() should return a two or three element array.  The first
 434     element is the NNTP response code to return to the client, the second
 435     element is an error string which is passed to the client if the response
 436     code indicates that the authentication attempt has failed. An optional
 437     third return element, if present, will be used to match the connection
 438     with the users: parameter in access groups and will also be the username
 439     logged. If this element is absent, the username supplied by the client
 440     during authentication will be used for matching and logging.
 441
 442     The NNTP response code should probably be either 281 (authentication
 443     successful) or 502 (authentication unsuccessful).  If the code returned
 444     is anything other than 281, nnrpd will print an authentication error
 445     message and drop the connection and exit.
 446
 447     If authenticate() dies (either due to a Perl error or due to calling
 448     die), or if it returns anything other than the two or three element
 449     array described above, an internal error will be reported to the client,
 450     the exact error will be logged to syslog, and nnrpd will drop the
 451     connection and exit.
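A skeleton authentication script following the interface above might look like this.  The in-memory %password_for table and its single entry are purely illustrative (a real script would consult a proper password database), and returning the lowercased username as the third element is just one possible use of that element.

```perl
use strict;
use warnings;

our %attributes;    # filled in by nnrpd before authenticate() is called

# Hypothetical user table, for illustration only.
my %password_for = (alice => 'wonderland');

# Optional hook, called once right after this file is loaded.
sub auth_init {
    return;
}

sub authenticate {
    my $user = $attributes{'username'};
    my $pass = $attributes{'password'};

    if (   defined $user
        && defined $pass
        && exists $password_for{$user}
        && $password_for{$user} eq $pass)
    {
        # Response code, error string (unused on success), and the name
        # to match against users: and to log.
        return (281, '', lc $user);
    }
    return (502, 'Invalid username or password');
}
```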
 452
 453 Dynamic Generation of Access Groups
 454
 455     A Perl script may be used to dynamically generate an access group which
 456     is then used to determine the access rights of the client. This occurs
 457     whenever the perl_access: parameter is specified in an auth group which has
 458     successfully matched the client. Only one perl_access: statement is
 459     allowed in an auth group. This parameter should not be mixed with a
 460     python_access: statement in the same auth group.
 461
 462     When a perl_access: parameter is encountered, Perl is loaded (if it has
 463     yet to be) and the file given as argument is loaded as well. Provided
 464     the file loads without errors, and a Perl function access() is defined,
 465     access() will then be called. access() takes no arguments, but it has
 466     access to a global hash %attributes which contains information about the
 467     connection as follows: $attributes{hostname} will contain the hostname
 468     (or the IP address if it doesn't resolve) of the client machine,
 469     $attributes{ipaddress} will contain its IP address (as a string),
 470     $attributes{port} will contain the client port (as an integer),
 471     $attributes{interface} contains the hostname of the interface the client
 472     connected on, $attributes{intipaddr} contains the IP address (as a
 473     string) of the interface the client connected on, $attributes{intport}
 474     contains the port (as an integer) on the interface the client connected
 475     on, $attributes{username} will contain the provided username and domain
 476     (in username@domain form).
 477
 478     access() returns a hash, containing the desired access parameters and
 479     values.  Here is an untested example showing how to dynamically generate
 480     a list of newsgroups based on the client's username and domain.
 481
 482          my %hosts = ( "example.com" => "example.*", "isc.org" => "isc.*" );
 483
 484          sub access {
 485             my %return_hash = (
 486                "max_rate" => "10000",
 487                "addnntppostinghost" => "true",
 488          #     ...
 489             );
 490             if( defined $attributes{username} &&
 491                 $attributes{username} =~ /.*@(.*)/ )
 492             {
 493                $return_hash{"virtualhost"} = "true";
 494                $return_hash{"path"} = $1;
 495                $return_hash{"newsgroups"} = $hosts{$1};
 496             } else {
 497                $return_hash{"read"} = "*";
 498                $return_hash{"post"} = "local.*";
 499             }
 500             return %return_hash;
 501          }
 502
 503     Note that both the keys and values are quoted strings. These values are
 504     to be returned to a C program and must be quoted strings. For values
 505     containing one or more spaces, it is not necessary to include extra
 506     quotes inside the string.
 507
 508     While you may include the users: parameter in a dynamically generated
 509     access group, some care should be taken (unless your pattern is just *
 510     which is equivalent to leaving the parameter out). The group created
 511     with the values returned from the Perl script is the only one considered
 512     when nnrpd attempts to find an access group matching the connection. If
 513     a users: parameter is included and it doesn't match the connection, then
 514     the client will be denied access since there are no other access groups
 515     which could match the connection.
 516
 517     If access() dies (either due to a Perl error or due to calling die), or
 518     if it returns anything other than a hash as described above, an internal
 519     error will be reported to the client, the exact error will be logged to
 520     syslog, and nnrpd will drop the connection and exit.
 521
 522 Notes on Writing Embedded Perl
 523
 524     All Perl evaluation is done inside an implicit eval block, so calling
 525     die in Perl code will not kill the innd or nnrpd process.  Neither will
 526     Perl errors (such as syntax errors).  However, such errors will have
 527     negative effects (fatal errors in the innd or nnrpd filter will cause
 528     filtering to be disabled, and fatal errors in the nnrpd authentication
 529     code will cause the client connection to be terminated).
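
         One way to limit the damage from such errors is to trap them
         yourself before they reach INN.  The following is only a minimal
         sketch, assuming the innd filter entry point filter_art() and the
         %hdr hash described earlier in this file.  check_article() and its
         spam test are hypothetical, and the sample %hdr contents exist only
         so that the fragment runs outside innd.

```perl
use strict;
use warnings;

# Stand-in for the header hash that innd fills in before calling the
# filter; the sample values exist only so this sketch runs on its own.
our %hdr = ('Subject' => 'test', 'Newsgroups' => 'local.test');

# Hypothetical site-specific check that may die on unexpected input.
sub check_article {
    my ($headers) = @_;
    die "no Message-ID header\n" unless defined $headers->{'Message-ID'};
    return $headers->{'Subject'} =~ /make money fast/i ? 'spam' : '';
}

sub filter_art {
    # Trap errors locally so that one bad article does not turn into a
    # fatal error for the filter as a whole.
    my $verdict = eval { check_article(\%hdr) };
    if (!defined $verdict) {
        # $@ holds the error text; record it in your own log here.
        # This sketch accepts the article when the check itself fails.
        return '';
    }
    return $verdict;
}
```

         Whether to accept or reject an article when the check itself fails
         is a site policy decision; the sketch accepts it.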
 530
 531     Calling exit directly, however, *will* kill the innd or nnrpd process,
 532     so don't do that.  Similarly, you probably don't want to call fork (or
 533     any other function that results in a fork such as system,
 534     IPC::Open3::open3(), or any use of backticks) since there are possibly
 535     unflushed buffers that could get flushed twice, lots of open state that
 536     may not get closed properly, and innumerable other potential problems.
 537     In general, be aware that all Perl code is running inside a large and
 538     complicated C program, and Perl code that impacts the process as a whole
 539     is best avoided.
 540
 541     You can use print and warn inside Perl code to send output to STDOUT or
 542     STDERR, but you probably shouldn't.  Instead, open a log file and print
 543     to it (or, in the innd filter, use INN::syslog() to write
 544     messages via syslog like the rest of INN).  If you write to STDOUT or
 545     STDERR, where that data will go depends on where the filter is running;
 546     inside innd, it will go to the news log or the errlog, and inside nnrpd
 547     it will probably go nowhere but could go to the client.  The nnrpd
 548     filter takes some steps to try to keep output from going across the
 549     network connection to the client (which would probably result in a very
 550     confused client), but it is best not to take the chance.
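
         The log file approach can be sketched as follows.  This is an
         illustration rather than code shipped with INN: the filter_log()
         helper and the log path are assumptions, and the path must be
         somewhere the news user can write.

```perl
use strict;
use warnings;
use IO::Handle;
use POSIX qw(strftime);

# Hypothetical log location; pick a path the news user can write to.
our $LOG_PATH = '/tmp/filter_example.log';
my $log_fh;

# Append one timestamped line to the log file instead of printing to
# STDOUT or STDERR.  Returns 1 on success and 0 if the file cannot be
# opened.
sub filter_log {
    my ($message) = @_;
    if (!defined $log_fh) {
        # Open lazily in append mode; give up quietly rather than die.
        open($log_fh, '>>', $LOG_PATH) or do { undef $log_fh; return 0 };
        $log_fh->autoflush(1);
    }
    my $stamp = strftime('%Y-%m-%d %H:%M:%S', localtime);
    print {$log_fh} "$stamp [$$] $message\n";
    return 1;
}
```

         A filter would then call, for example, filter_log("rejected
         $hdr{'Message-ID'}") wherever it would otherwise have used print or
         warn.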
 551
 552     For similar reasons, try to make your Perl code -w clean, since Perl
 553     warnings are written to STDERR.  (INN won't run your code under -w, but
 554     better safe than sorry, and some versions of Perl have some mandatory
 555     warnings you can't turn off.)
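
         As an extra safeguard, run-time warnings can be caught before Perl
         writes them to STDERR.  The sketch below uses the standard
         $SIG{__WARN__} hook; the @captured_warnings array is a hypothetical
         stand-in for whatever log your filter uses.  Warnings issued while
         the file is still being compiled, before the handler is installed,
         are not caught this way.

```perl
use strict;
use warnings;

# Collected warnings; a real filter would write them to its own log
# file rather than keep them in memory.
our @captured_warnings;

# Catch warnings before Perl prints them to STDERR.
$SIG{__WARN__} = sub {
    my ($text) = @_;
    chomp $text;
    push @captured_warnings, $text;
};

warn "header missing\n";    # captured by the handler, not printed
```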
 556
 557     You *can* use modules in your Perl code, just like you would in an
 558     ordinary Perl script.  You can even use modules that dynamically load C
 559     code.  Just make sure that none of the modules you use go off behind
 560     your back to do any of the things above that are best avoided.
 561
 562     Whenever you make any modifications to the Perl code, and particularly
 563     before starting INN or reloading filter.perl with new code, you should
 564     run perl -wc on the file.  This will at least make sure you don't have
 565     any glaring syntax errors.  Remember, if there are errors in your code,
 566     filtering will be disabled, which could mean that posts you really
 567     wanted to reject will leak through and authentication of readers may be
 568     totally broken.
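
         If you maintain several filter files, the check can be wrapped in a
         small standalone script.  This is a sketch under assumptions: the
         syntax_ok() helper is hypothetical, and because it uses system() it
         must be run from a shell, never from inside innd or nnrpd.

```perl
use strict;
use warnings;

# Return true if "perl -wc" accepts the file.  Standalone use only:
# system() forks, which is fine in a shell but not inside INN.
sub syntax_ok {
    my ($file) = @_;
    # $^X is the path of the perl binary running this script.
    return system($^X, '-wc', $file) == 0;
}

# Check every file named on the command line.
for my $file (@ARGV) {
    print "$file: ", (syntax_ok($file) ? "ok" : "FAILED"), "\n";
}
```

         Saved under a name of your choice, it takes the files to check as
         arguments, for example filter_innd.pl and filter_nnrpd.pl.  Keep in
         mind that perl -c still runs BEGIN blocks and use statements.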
 569
 570     The samples directory has example startup_innd.pl, filter_innd.pl,
 571     filter_nnrpd.pl, and nnrpd_auth.pl files that contain some simplistic
 572     examples.  Look them over as a starting point when writing your own.
 573
 574 Available Packages
 575
 576     This is an unofficial list of known filtering packages at the time of
 577     publication.  This is not an endorsement of these filters by the ISC or
 578     the INN developers, but is included as assistance in locating packages
 579     which make use of this filter mechanism.
 580
 581       CleanFeed               Jeremy Nixon <jeremy@exit109.com>
 582       <URL:http://www.exit109.com/~jeremy/news/cleanfeed.html>
 583             A spam filter catching excessive multi-posting and a host of
 584             other things.  Uses filter_innd.pl exclusively and requires the MD5
 585             Perl module.  Probably the most popular and widely-used Perl
 586             filter around.
 587
 588       Usenet II Filter        Edward S. Marshall <emarshal@xnet.com>
 589       <URL:http://www.xnet.com/~emarshal/inn/filter_nnrpd.pl>
 590             Checks for "soundness" according to Usenet II guidelines in the
 591             net.* hierarchy.  Designed to use filter_nnrpd.pl.
 592
 593       News Gizmo              Aidan Cully <aidan@panix.com>
 594       <URL:http://www.panix.com/gizmo/>
 595             A posting filter for helping a site enforce Usenet-II soundness,
 596             and for limiting the number of messages any user can post to
 597             Usenet daily.