=head1 INN Python Filtering and Authentication Support

This file documents INN's built-in optional support for Python article
filtering. It is patterned after the Perl and (now obsolete) TCL hooks
previously added by Bob Heiney and Christophe Wolfhugel.

For this filter to work successfully, you will need to have at least
S<Python 1.5.2> installed. You can obtain it from L<http://www.python.org/>.

The B<innd> Python interface and the original Python filtering documentation
were written by Greg Andruk (nee Fluffy) <gerglery@usa.net>. The Python
authentication and authorization support for B<nnrpd> as well as the original
documentation for it were written by Ilya Etingof <ilya@glas.net> in

Once you have built and installed Python, you can cause INN to use it by
adding the B<--with-python> switch to your C<configure> command. You will
need to have all the headers and libraries required for embedding Python
into INN; they can be found in Python development packages, which include
header files and static libraries.

You will then be able to use Python authentication, dynamic access group
generation and dynamic access control support in B<nnrpd> along with
filtering support in B<innd>.

See the ctlinnd(8) manual page to learn how to enable, disable and reload
Python filters on a running server (especially C<ctlinnd mode>,
C<ctlinnd python y|n> and C<ctlinnd reload filter.python 'reason'>).

Also, see the F<filter_innd.py>, F<nnrpd_auth.py>, F<nnrpd_access.py>
and F<nnrpd_dynamic.py> samples in your filters directory for
a demonstration of how to get all this working.

=head1 Writing an B<innd> Filter

You need to create a F<filter_innd.py> module in INN's filter directory
(see the I<pathfilter> setting in F<inn.conf>). A heavily-commented sample
is provided; you can use it as a template for your own filter. There is
also an F<INN.py> module there which is not actually used by INN; it is
there so you can test your module interactively.

First, define a class containing the methods you want to provide to B<innd>.
Methods B<innd> will use if present are:

=over 4

=item __init__(I<self>)

Not explicitly called by B<innd>, but will run whenever the filter module is
(re)loaded. This is a good place to initialize constants or pick up where
C<filter_before_reload> or C<filter_close> left off.

=item filter_before_reload(I<self>)

This will execute any time a C<ctlinnd reload all 'reason'> or C<ctlinnd reload
filter.python 'reason'> command is issued. You can use it to save statistics or
reports for use after reloading.

=item filter_close(I<self>)

This will run when a C<ctlinnd shutdown 'reason'> command is received.

=item filter_art(I<self>, I<art>)

I<art> is a dictionary containing an article's headers and body. This method
is called every time B<innd> receives an article. The following can be

    Also-Control, Approved, Bytes, Cancel-Key, Cancel-Lock,
    Content-Base, Content-Disposition, Content-Transfer-Encoding,
    Content-Type, Control, Date, Date-Received, Distribution, Expires,
    Face, Followup-To, From, In-Reply-To, Injection-Date, Injection-Info,
    Keywords, Lines, List-ID, Message-ID, MIME-Version, Newsgroups,
    NNTP-Posting-Date, NNTP-Posting-Host, Organization, Originator,
    Path, Posted, Posting-Version, Received, References, Relay-Version,
    Reply-To, Sender, Subject, Supersedes, User-Agent,
    X-Auth, X-Canceled-By, X-Cancelled-By, X-Complaints-To, X-Face,
    X-HTTP-UserAgent, X-HTTP-Via, X-Mailer, X-Modbot, X-Modtrace,
    X-Newsposter, X-Newsreader, X-No-Archive, X-Original-Message-ID,
    X-Original-Trace, X-Originating-IP, X-PGP-Key, X-PGP-Sig,
    X-Poster-Trace, X-Postfilter, X-Proxy-User, X-Submissions-To,
    X-Trace, X-Usenet-Provider, Xref, __BODY__, __LINES__.

Note that all the above values are as they arrived, not modified
by your INN (in particular, the Xref: header, if present, is the one
from the remote site which sent you the article, not yours).

These values will be buffer objects holding the contents of the
same named article headers, except for the special C<__BODY__> and C<__LINES__>
items. Items not present in the article will contain C<None>.

C<art['__BODY__']> is a buffer object containing the article's entire body, and
C<art['__LINES__']> is an int holding B<innd>'s reckoning of the number of lines
in the article. All the other elements will be buffers with the contents
of the same-named article headers.

The Newsgroups: header of the article is accessible inside the Python
filter as C<art['Newsgroups']>.

If you want to accept an article, return C<None> or an empty string. To
reject, return a non-empty string. The rejection strings will be shown to
local clients and your peers, so keep that in mind when phrasing your

=item filter_messageid(I<self>, I<msgid>)

I<msgid> is a buffer object containing the ID of an article being offered by
IHAVE or CHECK. As with C<filter_art>, the message will be refused if
you return a non-empty string. If you use this feature, keep it light
because it is called at a rather busy place in B<innd>'s main loop. Also, do
not rely on this function alone to reject by ID; you should repeat the
tests in C<filter_art> to catch articles sent with TAKETHIS but no CHECK.

=item filter_mode(I<self>, I<oldmode>, I<newmode>, I<reason>)

When the operator issues a B<ctlinnd> C<pause>, C<throttle>, C<go>, C<shutdown>
or C<xexec> command, this function can be used to do something sensible in accordance
with the state change. Stamp a log file, save your state on throttle,
etc. I<oldmode> and I<newmode> will be strings containing one of the values in
(C<running>, C<throttled>, C<paused>, C<shutdown>, C<unknown>). I<oldmode> is
the state B<innd> was in before B<ctlinnd> was run, I<newmode> is the state B<innd>
will be in after the command finishes. I<reason> is the comment string
provided on the B<ctlinnd> command line.

=back

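As a rough illustration of the methods listed above, here is a minimal
sketch of a filter class. The class name, the blocked group name and the
line limit are hypothetical, and plain Python strings stand in for the
buffer objects B<innd> would pass. The sketch deliberately does not register
itself with B<innd>, so it can be run outside the server.

```python
class SketchFilter:
    """Hypothetical filter class; names and thresholds are illustrative."""

    def __init__(self):
        # Runs whenever the module is (re)loaded.
        self.max_lines = 1000                    # assumed local policy
        self.blocked_group = 'example.blocked'   # hypothetical group name
        self.mode_changes = []

    def filter_art(self, art):
        # Headers absent from the article are None.
        newsgroups = art.get('Newsgroups')
        if newsgroups is not None:
            groups = [g.strip() for g in str(newsgroups).split(',')]
            if self.blocked_group in groups:
                return 'Posting to %s is not accepted here' % self.blocked_group
        lines = art.get('__LINES__')
        if lines is not None and lines > self.max_lines:
            return 'Article too long (%d lines)' % lines
        return None                              # None or '' accepts

    def filter_messageid(self, msgid):
        # Keep this cheap: it runs in a busy part of innd's main loop.
        if str(msgid).startswith('<spam.'):
            return 'Message-ID pattern refused'
        return None

    def filter_mode(self, oldmode, newmode, reason):
        # Remember state changes; a real filter might save state on throttle.
        self.mode_changes.append((oldmode, newmode, reason))
```

Because the rejection strings are shown to clients and peers, the sketch
keeps them short and free of local detail.
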
=head2 How to Use these Methods with B<innd>

To register your methods with B<innd>, you need to create an instance of your
class, import the built-in INN module, and pass the instance to
C<INN.set_filter_hook>. For example:

        def filter_art(self, art):

        def filter_messageid(self, id):

    INN.set_filter_hook(myfilter)

When writing and testing your Python filter, don't be afraid to make use
of C<try:>/C<except:> and the provided C<INN.syslog> function. stdout and stderr
will be disabled, so your filter will die silently otherwise.

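One way to follow the C<try:>/C<except:> advice is to wrap each hook so
that an unexpected exception is logged rather than lost. The following is
only a sketch: the C<guarded> decorator, the C<GuardedFilter> class and the
C<logged> list are hypothetical names, and the fallback logger exists solely
so the example runs where the built-in INN module is not available. It also
assumes that B<innd> does nothing more with a hook than call it.

```python
import traceback

try:
    import INN                      # only available inside innd
    log = INN.syslog
except ImportError:
    logged = []                     # stand-in so the sketch runs elsewhere

    def log(level, message):
        logged.append((level, message))


def guarded(method):
    """Wrap a hook so an unexpected exception is logged, not lost."""
    def wrapper(self, *args):
        try:
            return method(self, *args)
        except Exception:
            last_line = traceback.format_exc().strip().splitlines()[-1]
            log('err', '%s failed: %s' % (method.__name__, last_line))
            return None             # accept rather than reject on a filter bug
    return wrapper


class GuardedFilter:
    @guarded
    def filter_art(self, art):
        # A missing header is None, so .lower() raises AttributeError here.
        if 'make money fast' in art['Subject'].lower():
            return 'Spam subject'
        return None
```

Returning C<None> from the wrapper means a bug in the filter lets articles
through instead of rejecting them; whether that is the right default is a
local policy decision.
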
Also, remember to try importing your module interactively before loading
it, to ensure there are no obvious errors. One typo can ruin your whole
filter. A dummy F<INN.py> module is provided to facilitate testing outside
the server. To test, change into your filter directory and use a command

    python -ic 'import INN, filter_innd'

You can define as many or few of the methods listed above as you want in
your filter class (it is fine to define more methods for your own use; B<innd>
will not be using them but your filter can). If you I<do> define the above
methods, GET THE PARAMETER COUNTS RIGHT. There are checks in B<innd> to see
whether the methods exist and are callable, but if you define one and get the
parameter counts wrong, B<innd> WILL DIE. You have been warned. Be careful
with your return values, too. The C<filter_art> and C<filter_messageid>
methods have to return strings, or C<None>. If you return something like an
int, B<innd> will I<not> be happy.

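Since a wrong parameter count is fatal, it may be worth checking the counts
offline before loading a filter. The sketch below is not part of INN; the
names C<EXPECTED_ARGS> and C<check_hooks> are hypothetical, the expected
counts are taken from the method list earlier in this document, and the
check assumes a S<Python 3> interpreter for C<inspect.signature>.

```python
import inspect

# Positional arguments innd passes to each hook, not counting self.
EXPECTED_ARGS = {
    'filter_before_reload': 0,
    'filter_close': 0,
    'filter_art': 1,
    'filter_messageid': 1,
    'filter_mode': 3,
}


def check_hooks(instance):
    """Return a list of problems found in the hook methods of instance."""
    problems = []
    for name, expected in EXPECTED_ARGS.items():
        method = getattr(instance, name, None)
        if method is None:
            continue                # hooks are optional
        if not callable(method):
            problems.append('%s is not callable' % name)
            continue
        try:
            # bind() raises TypeError when the call would not be accepted.
            inspect.signature(method).bind(*([None] * expected))
        except TypeError:
            problems.append('%s does not accept %d argument(s)'
                            % (name, expected))
    return problems
```

Calling C<check_hooks> on a filter instance in an interactive session and
getting an empty list back is a cheap sanity check, not a guarantee that
the filter is correct.
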
=head2 A Note regarding Buffer Objects

Buffer objects are cousins of strings, new in S<Python 1.5.2>. Using buffer
objects may take some getting used to, but we can create buffers much faster
and with less memory than strings.

For most of the operations you will perform in filters (like C<re.search>,
C<string.find>, C<md5.digest>) you can treat buffers just like strings, but
there are a few important differences you should know about:

    # Make a string and two buffers.

    s == bs                # - This is false because the types differ...
    buffer(s) == bs        # - ...but this is true, the types now agree.
    s == str(bs)           # - This is also true, but buffer() is faster.
    s[:2] == bs[:2]        # - True. Buffer slices are strings.

    # While most string methods will take either a buffer or a string,
    # string.join (in the string module) insists on using only strings.

    string.join([str(b), s], '.')   # Returns 'def.abc'.
    '.'.join([str(b), s])           # Returns 'def.abc' too.
    '.'.join([b, s])                # This raises a TypeError.

    e = s + b                       # This raises a TypeError, but...

    # ...these two both return the string 'abcdef'. The first one
    # is faster -- choose buffer() over str() whenever you can.

    g = b + '>'                     # This is legal, returns the string 'def>'.

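Because a value may arrive as a buffer, a string or C<None>, it can be
convenient to normalise it once before comparing or joining. The helper
below is a sketch and not part of INN; the name C<as_text> is hypothetical.
It assumes S<Python 3>, where the C<buffer> type described above does not
exist, and uses C<bytes> and C<memoryview> as stand-ins for buffer-like
values. The Latin-1 decoding is also an assumption, chosen only because it
cannot fail on arbitrary bytes.

```python
def as_text(value, default=''):
    """Return value as a str, whatever string-like type it arrived as."""
    if value is None:
        return default              # e.g. a header absent from the article
    if isinstance(value, str):
        return value
    if isinstance(value, memoryview):
        value = value.tobytes()
    if isinstance(value, (bytes, bytearray)):
        # Latin-1 maps every byte to a character, so this cannot raise.
        return bytes(value).decode('latin-1')
    return str(value)


s = 'abc'
b = memoryview(b'def')              # stand-in for a buffer object
joined = '.'.join([as_text(b), s])
```
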
=head2 Functions Supplied by the Built-in B<innd> Module

Besides C<INN.set_filter_hook>, which is used to register your methods
with B<innd> as explained above, the following functions
are available from Python scripts:

=over 4

=item addhist(I<message-id>)

=item article(I<message-id>)

=item cancel(I<message-id>)

=item havehist(I<message-id>)

=item hashstring(I<string>)

=item head(I<message-id>)

=item newsgroup(I<groupname>)

=item syslog(I<level>, I<message>)

=back

Therefore, not only can B<innd> use Python, but your filter can use some of
B<innd>'s features too. Here is some sample Python code to show what you get
with the previously listed functions.

    # Python's native syslog module isn't compiled in by default,
    # so the INN module provides a replacement. The first parameter
    # tells the Unix syslogger what severity to use; you can
    # abbreviate down to one letter and it's case insensitive.
    # Available levels are (in increasing levels of seriousness)
    # Debug, Info, Notice, Warning, Err, Crit, and Alert. (If you
    # provide any other string, it will be defaulted to Notice.) The
    # second parameter is the message text. The syslog entries will
    # go to the same log files innd itself uses, with a 'python:'
    INN.syslog('warning', 'I will not buy this record. It is scratched.')
    vehicle = 'hovercraft'
    INN.syslog('N', 'My %s is full of %s.' % (vehicle, animals))

    # Let's cancel an article! This only deletes the message on the
    # local server; it doesn't send out a control message or anything
    # scary like that. Returns 1 if successful, else 0.
    if INN.cancel('<meow$123.456@solvangpastries.edu>'):

    # Check if a given message is in history. This doesn't
    # necessarily mean the article is on your spool; cancelled and
    # expired articles hang around in history for a while, and
    # rejected articles will be in there if you have enabled
    # remembertrash in inn.conf. Returns 1 if found, else 0.
    if INN.havehist('<z456$789.abc@isc.org>'):
        comment = "*yawn* I've already seen this article."
        comment = 'Mmm, fresh news.'

    # Here we are running a local spam filter, so why eat all those
    # cancels? We can add fake entries to history so they'll get
    # refused. Returns 1 on success, 0 on failure.
    cancelled_id = buffer('<meow$123.456@isc.org>')
    if INN.addhist("<cancel." + cancelled_id[1:]):
        thought = "Eat my dust, roadkill!"
        thought = "Darn, someone beat me to it."

    # We can look at the header or all of an article already on spool,
    # too. Might be useful for long-memory despamming or
    # authentication things. Each is returned (if present) as a
    # string object; otherwise you'll end up with an empty string.
    artbody = INN.article('<foo$bar.baz@bungmunch.edu>')
    artheader = INN.head('<foo$bar.baz@bungmunch.edu>')

    # As we can compute a hash digest for a string, we can obtain one
    # for artbody. It might be of help to detect spam.
    digest = INN.hashstring(artbody)

    # Finally, do you want to see if a given newsgroup is moderated or
    # whatever? INN.newsgroup returns the last field of a group's
    # entry in active as a string.
    froupflag = INN.newsgroup('alt.fan.karl-malden.nose')
        moderated = 'no such newsgroup'
    elif froupflag == 'y':
    elif froupflag == 'm':
        moderated = "something else"

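The remark that a digest "might be of help to detect spam" can be made a
little more concrete. The sketch below counts how often the same body
digest has been seen and starts refusing after a threshold. The class name,
the threshold and the fallback digest are all hypothetical; the fallback
uses C<hashlib> only so the example runs outside B<innd>, and it is not
claimed to produce the same values as C<INN.hashstring>.

```python
import hashlib


def fallback_digest(text):
    # Stand-in used outside innd; not claimed to match INN.hashstring.
    return hashlib.sha256(text.encode('utf-8')).hexdigest()


class BodyCounter:
    """Count how often the same body digest has been seen."""

    def __init__(self, digest=fallback_digest, limit=3):
        self.digest = digest        # inside innd one might pass INN.hashstring
        self.limit = limit          # hypothetical threshold
        self.seen = {}

    def check(self, body):
        key = self.digest(body)
        self.seen[key] = self.seen.get(key, 0) + 1
        if self.seen[key] > self.limit:
            return 'Body seen %d times' % self.seen[key]
        return None                 # under the limit: accept
```

A C<filter_art> method could call C<check> on the article body and return
its result directly, since both use C<None> to mean acceptance.
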
=head1 Writing an B<nnrpd> Filter

=head2 Changes to Python Authentication and Access Control Support for B<nnrpd>

The old authentication and access control functionality has been
combined with the new F<readers.conf> mechanism by Erik Klavon
<erik@eriq.org>; bug reports should however go to <inn-bugs@isc.org>,

The remainder of this section is an introduction to the new mechanism
(which uses the I<python_auth>, I<python_access>, and I<python_dynamic>
F<readers.conf> parameters) with porting/migration suggestions for
people familiar with the old mechanism (identifiable by the now
deprecated I<nnrppythonauth> parameter in F<inn.conf>).

Other people should skip this section.

The I<python_auth> parameter allows the use of Python to authenticate a
user. Authentication scripts (like those from the old mechanism) are
listed in F<readers.conf> using I<python_auth> in the same manner other
authenticators are using I<auth>:

    python_auth: "nnrpd_auth"

It uses the script named F<nnrpd_auth.py> (note that C<.py> is not present
in the I<python_auth> value).

Scripts should be placed as before in the filter directory (see the
I<pathfilter> setting in F<inn.conf>). The new hook method C<authen_init>
takes no arguments and its return value is ignored; its purpose is to
provide a means for authentication specific initialization. The hook
method C<authen_close> is the more specific analogue to the old C<close>
method. These two method hooks are not required, contrary to
C<authenticate>, the main method.

The argument dictionary passed to C<authenticate> remains the same,
except for the removal of the I<type> entry which is no longer needed
in this modification and the addition of several new entries (I<port>,
I<intipaddr>, I<intport>) described below. The return tuple now only
contains either two or three elements, the first of which is the NNTP
response code. The second is an error string which is passed to the
client if the response code indicates that the authentication attempt
has failed. This allows a specific error message to be generated by
the Python script in place of the generic message C<Authentication
failed>. An optional third return element, if present, will be used to
match the connection with the I<user> parameter in access groups and
will also be the username logged. If this element is absent, the
username supplied by the client during authentication will be used, as
was the previous behaviour.

The I<python_access> parameter (described below) is new; it allows the
dynamic generation of an access group of an incoming connection using
a Python script. If a connection matches an auth group which has a
I<python_access> parameter, all access groups in F<readers.conf> are
ignored; instead the procedure described below is used to generate an
access group. This concept is due to Jeffrey S<M. Vinocur> and you can
add this line to F<readers.conf> in order to use the F<nnrpd_access.py>
Python script in I<pathfilter>:

    python_access: "nnrpd_access"

In the old implementation, the authorization method allowed for access
control on a per-group basis. That functionality is preserved in the
new implementation by the inclusion of the I<python_dynamic> parameter in
F<readers.conf>. The only change is the corresponding method name of
C<dynamic> as opposed to C<authorize>. Additionally, the associated
optional housekeeping methods C<dynamic_init> and C<dynamic_close>
may be implemented if needed. In order to use F<nnrpd_dynamic.py> in
I<pathfilter>, you can add this line to F<readers.conf>:

    python_dynamic: "nnrpd_dynamic"

This new implementation should provide all of the previous
capabilities of the Python hooks, in combination with the flexibility
of F<readers.conf> and the use of other authentication and resolving
programs (including the Perl hooks!). To use Python code that predates
the new mechanism, you would need to modify the code slightly (see
below for the new specification) and supply a simple F<readers.conf>
file. If you do not want to modify your code, the sample directory has
F<nnrpd_auth_wrapper.py>, F<nnrpd_access_wrapper.py> and
F<nnrpd_dynamic_wrapper.py> which should allow you to use your old
code without needing to change it.

However, before trying to use your old Python code, you may want to
consider replacing it entirely with non-Python authentication. (With
F<readers.conf> and the regular authenticator and resolver programs, much
of what once required Python can be done directly.) Even if the
functionality is not available directly, you may wish to write a new
authenticator or resolver (which can be done in whatever language you

=head2 Python Authentication Support for B<nnrpd>

Support for authentication via Python is provided in B<nnrpd> by the
inclusion of a I<python_auth> parameter in a F<readers.conf> auth
group. I<python_auth> works exactly like the I<auth> parameter in
F<readers.conf>, except that it calls the script given as argument
using the Python hook rather than treating it as an external
program. Multiple, mixed use of I<python_auth> with other I<auth>
statements including I<perl_auth> is permitted. Each I<auth> statement
will be tried in the order they appear in the auth group until either
one succeeds or all are exhausted.

If the processing of F<readers.conf> requires that a I<python_auth>
statement be used for authentication, Python is loaded (if it has yet
to be) and the file given as argument to the I<python_auth> parameter is
loaded as well (do not include the C<.py> extension of this file in
the value of I<python_auth>). If a Python object with a method
C<authen_init> is hooked in during the loading of that file, then
that method is called immediately after the file is loaded. If no
errors have occurred, the method C<authenticate> is called. Depending
on the NNTP response code returned by C<authenticate>, the authentication
hook either succeeds or fails, after which the processing of the
auth group continues as usual. When the connection with the client
is closed, the method C<authen_close> is called if it exists.

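The following sketch shows the shape of an C<authenticate> method that
returns the two- or three-element tuple described earlier. It makes several
assumptions: the class name and the in-memory user table are hypothetical,
the dictionary keys C<user> and C<pass> are assumed names for the username
and password entries, plain strings stand in for buffer objects, and the
codes 281 and 481 are the standard NNTP responses for accepted and rejected
AUTHINFO credentials. A real module would not keep clear-text passwords in
the source.

```python
class SketchAuth:
    """Hypothetical authentication class with an in-memory user table."""

    def __init__(self):
        self.users = {'alice': 'secret'}    # illustrative only

    def authen_init(self):
        pass                                # optional; return value ignored

    def authenticate(self, attributes):
        user = attributes.get('user')       # assumed key name
        password = attributes.get('pass')   # assumed key name
        if user is None or password is None:
            return (481, 'Username and password required')
        name = str(user).lower()
        if self.users.get(name) == str(password):
            # Third element: the name to log and to match against users.
            return (281, 'Authentication succeeded', name)
        return (481, 'Authentication failed for this server')

    def authen_close(self):
        pass                                # optional
```

Returning the lower-cased name as the third element illustrates how a
script can normalise the username that is logged and matched against
access groups.
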
=head2 Dynamic Generation of Access Groups

A Python script may be used to dynamically generate an access group
which is then used to determine the access rights of the client. This
occurs whenever the I<python_access> parameter is specified in an auth group
which has successfully matched the client. Only one I<python_access>
statement is allowed in an auth group. This parameter should not be
mixed with a I<perl_access> statement in the same auth group.

When a I<python_access> parameter is encountered, Python is loaded (if
it has yet to be) and the file given as argument is loaded as well (do not
include the C<.py> extension of this file in the value of I<python_access>).
If a Python object with a method C<access_init> is hooked in during the
loading of that file, then that method is called immediately after the
file is loaded. If no errors have occurred, the method C<access> is
called. The dictionary returned by C<access> is used to generate an
access group that is then used to determine the access rights of the
client. When the connection with the client is closed, the method
C<access_close> is called, if it exists.

While you may include the I<users> parameter in a dynamically generated
access group, some care should be taken (unless your pattern is just
C<*> which is equivalent to leaving the parameter out). The group created
with the values returned from the Python script is the only one
considered when B<nnrpd> attempts to find an access group matching the
connection. If a I<users> parameter is included and it does not match the
connection, then the client will be denied access since there are no
other access groups which could match the connection.

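A minimal sketch of an C<access> method follows. The class name, the
C<admin> username and the C<local.*> pattern are hypothetical, the C<user>
key is an assumed name, and plain strings stand in for buffer objects. The
returned keys C<read> and C<post> are ordinary access group parameters; the
sketch leaves out I<users> on purpose, for the reason given above.

```python
class SketchAccess:
    """Hypothetical class generating an access group per connection."""

    def access_init(self):
        pass                                # optional

    def access(self, attributes):
        user = attributes.get('user')       # assumed key name
        if user is not None and str(user) == 'admin':
            return {'read': '*', 'post': '*'}
        # Everyone else may read everything but post only locally.
        return {'read': '*', 'post': 'local.*'}

    def access_close(self):
        pass                                # optional
```
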
=head2 Dynamic Access Control

If you need to have access control rules applied immediately without
having to restart all the B<nnrpd> processes, you may apply access
control on a per-newsgroup basis using the Python dynamic hooks (as
opposed to F<readers.conf>, which does the same on a per-user
basis). These hooks are activated through the inclusion of the
I<python_dynamic> parameter in a F<readers.conf> auth group. Only one
I<python_dynamic> statement is allowed in an auth group.

When a I<python_dynamic> parameter is encountered, Python is loaded (if
it has yet to be) and the file given as argument is loaded as well (do not
include the C<.py> extension of this file in the value of I<python_dynamic>).
If a Python object with a method C<dynamic_init> is hooked in during the
loading of that file, then that method is called immediately after the
file is loaded. Every time a reader asks B<nnrpd> to read or post an
article, the Python method C<dynamic> is invoked before proceeding with
the requested operation. Based on the value returned by C<dynamic>, the
operation is either permitted or denied. When the connection with the
client is closed, the method C<dynamic_close> is called if it exists.

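As an illustration, a C<dynamic> method might refuse posting to a set of
read-only groups while allowing everything else. This is a sketch under
assumptions: the class name and the group name are hypothetical, the
C<newsgroup> key is an assumed name for the requested group, C<type> is
taken to hold C<read> or C<post>, and plain strings stand in for buffer
objects.

```python
class SketchDynamic:
    """Hypothetical per-newsgroup access control."""

    def __init__(self):
        self.read_only_groups = {'local.announce'}   # hypothetical

    def dynamic(self, attributes):
        kind = attributes.get('type')
        group = attributes.get('newsgroup')          # assumed key name
        if kind is None or group is None:
            return None                              # nothing to check
        if str(kind) == 'post' and str(group) in self.read_only_groups:
            return 'Posting to %s is not permitted' % group
        return None                                  # None grants access
```

Because the method runs on every read or post request, the sketch keeps
the decision to a set lookup.
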
=head2 Writing a Python B<nnrpd> Authentication Module

You need to create a F<nnrpd_auth.py> module in INN's filter directory
(see the I<pathfilter> setting in F<inn.conf>) where you should define a
class holding certain methods depending on which hooks you want to use.

Note that you will have to use different Python scripts for authentication
and access: the values of I<python_auth>, I<python_access> and I<python_dynamic>
have to be distinct for your scripts to work.

The following methods are known to B<nnrpd>:

=over 4

=item __init__(I<self>)

Not explicitly called by B<nnrpd>, but will run whenever the auth module is
loaded. Use this method to initialize any general variables or open
a common database connection. This method may be omitted.

=item authen_init(I<self>)

Initialization function specific to authentication. This method may be

=item authenticate(I<self>, I<attributes>)

Called when a I<python_auth> statement is reached in the processing of
F<readers.conf>. Connection attributes are passed in the I<attributes>
dictionary. Returns a response code, an error string, and an optional
string to be used in place of the client-supplied username (both for
logging and for matching the connection with an access group).

=item authen_close(I<self>)

This method is invoked on B<nnrpd> termination. You can use it to save
state information or close a database connection. This method may be omitted.

=item access_init(I<self>)

Initialization function specific to generation of an access group. This
method may be omitted.

=item access(I<self>, I<attributes>)

Called when a I<python_access> statement is reached in the processing of
F<readers.conf>. Connection attributes are passed in the I<attributes>
dictionary. Returns a dictionary of values representing statements to
be included in an access group.

=item access_close(I<self>)

This method is invoked on B<nnrpd> termination. You can use it to save
state information or close a database connection. This method may be omitted.

=item dynamic_init(I<self>)

Initialization function specific to dynamic access control. This
method may be omitted.

=item dynamic(I<self>, I<attributes>)

Called when a client requests a newsgroup, an article or attempts to
post. Connection attributes are passed in the I<attributes> dictionary.
Returns C<None> to grant access, or a non-empty string (which will be
reported back to the client) otherwise.

=item dynamic_close(I<self>)

This method is invoked on B<nnrpd> termination. You can use it to save
state information or close a database connection. This method may be omitted.

=back

=head2 The I<attributes> Dictionary

The keys and associated values of the I<attributes> dictionary are

C<read> or C<post> values specify the authentication type; only valid
for the C<dynamic> method.

It is the resolved hostname (or IP address if resolution fails) of
the connected reader.

The IP address of the connected reader.

The port of the connected reader.

The hostname of the local endpoint of the NNTP connection.

The IP address of the local endpoint of the NNTP connection.

The port of the local endpoint of the NNTP connection.

The username as passed with AUTHINFO command, or C<None> if not

The password as passed with AUTHINFO command, or C<None> if not

The name of the newsgroup to which the reader requests read or post access;
only valid for the C<dynamic> method.

All the above values are buffer objects (see the notes above on what

=head2 How to Use these Methods with B<nnrpd>

To register your methods with B<nnrpd>, you need to create an instance of
your class, import the built-in B<nnrpd> module, and pass the instance to
C<nnrpd.set_auth_hook>. For example:

        def authen_init(self):

        def authenticate(self, attributes):

    nnrpd.set_auth_hook(myauth)

When writing and testing your Python filter, don't be afraid to make use
of C<try:>/C<except:> and the provided C<nnrpd.syslog> function. stdout and stderr
will be disabled, so your filter will die silently otherwise.

Also, remember to try importing your module interactively before loading
it, to ensure there are no obvious errors. One typo can ruin your whole
filter. A dummy F<nnrpd.py> module is provided to facilitate testing outside
the server. It is not actually used by B<nnrpd> but provides the same set
of functions as the built-in B<nnrpd> module. This stub module may be used
when debugging your own module. To test, change into your filter directory
and use a command like:

    python -ic 'import nnrpd, nnrpd_auth'

=head2 Functions Supplied by the Built-in B<nnrpd> Module

Besides C<nnrpd.set_auth_hook> used to pass a reference to the instance
of authentication and authorization class to B<nnrpd>, the B<nnrpd> built-in
module exports the following function:

=over 4

=item syslog(I<level>, I<message>)

It is intended to be a replacement for a Python native syslog. It works
like C<INN.syslog>, seen above.

=back

$Id: hook-python.pod 7926 2008-06-29 08:27:41Z iulius $