INN Python Filtering and Authentication Support
This file documents INN's built-in optional support for Python article
filtering. It is patterned after the Perl and (now obsolete) TCL hooks
previously added by Bob Heiney and Christophe Wolfhugel.
For this filter to work successfully, you will need to have at least
Python 1.5.2 installed. You can obtain it from
.
The innd Python interface and the original Python filtering
documentation were written by Greg Andruk (nee Fluffy)
. The Python authentication and authorization support
for nnrpd as well as the original documentation for it were written by
Ilya Etingof in December 1999.
Installation
Once you have built and installed Python, you can cause INN to use it by
adding the --with-python switch to your "configure" command. You will
need to have all the headers and libraries required for embedding Python
into INN; they can be found in Python development packages, which
include header files and static libraries.
You will then be able to use Python authentication, dynamic access group
generation and dynamic access control support in nnrpd along with
filtering support in innd.
See the ctlinnd(8) manual page to learn how to enable, disable and
reload Python filters on a running server (especially "ctlinnd mode",
"ctlinnd python y|n" and "ctlinnd reload filter.python 'reason'").
Also, see the filter_innd.py, nnrpd_auth.py, nnrpd_access.py and
nnrpd_dynamic.py samples in your filters directory for a demonstration
of how to get all this working.
Writing an innd Filter
Introduction
You need to create a filter_innd.py module in INN's filter directory
(see the *pathfilter* setting in inn.conf). A heavily-commented sample
is provided; you can use it as a template for your own filter. There is
also an INN.py module there which is not actually used by INN; it is
there so you can test your module interactively.
First, define a class containing the methods you want to provide to
innd. Methods innd will use if present are:
__init__(*self*)
Not explicitly called by innd, but will run whenever the filter
module is (re)loaded. This is a good place to initialize constants
or pick up where "filter_before_reload" or "filter_close" left off.
filter_before_reload(*self*)
This will execute any time a "ctlinnd reload all 'reason'" or
"ctlinnd reload filter.python 'reason'" command is issued. You can
use it to save statistics or reports for use after reloading.
filter_close(*self*)
This will run when a "ctlinnd shutdown 'reason'" command is
received.
filter_art(*self*, *art*)
*art* is a dictionary containing an article's headers and body.
This method is called every time innd receives an article. The
following can be defined:
Also-Control, Approved, Bytes, Cancel-Key, Cancel-Lock,
Content-Base, Content-Disposition, Content-Transfer-Encoding,
Content-Type, Control, Date, Date-Received, Distribution, Expires,
Face, Followup-To, From, In-Reply-To, Injection-Date, Injection-Info,
Keywords, Lines, List-ID, Message-ID, MIME-Version, Newsgroups,
NNTP-Posting-Date, NNTP-Posting-Host, Organization, Originator,
Path, Posted, Posting-Version, Received, References, Relay-Version,
Reply-To, Sender, Subject, Supersedes, User-Agent,
X-Auth, X-Canceled-By, X-Cancelled-By, X-Complaints-To, X-Face,
X-HTTP-UserAgent, X-HTTP-Via, X-Mailer, X-Modbot, X-Modtrace,
X-Newsposter, X-Newsreader, X-No-Archive, X-Original-Message-ID,
X-Original-Trace, X-Originating-IP, X-PGP-Key, X-PGP-Sig,
X-Poster-Trace, X-Postfilter, X-Proxy-User, X-Submissions-To,
X-Trace, X-Usenet-Provider, Xref, __BODY__, __LINES__.
Note that all the above values are as they arrived, not modified by
your INN (especially, the Xref: header, if present, is the one of
the remote site which sent you the article, and not yours).
These values will be buffer objects holding the contents of the same
named article headers, except for the special "__BODY__" and
"__LINES__" items. Items not present in the article will contain
"None".
"art('__BODY__')" is a buffer object containing the article's entire
body, and "art('__LINES__')" is an int holding innd's reckoning of
the number of lines in the article. All the other elements will be
buffers with the contents of the same-named article headers.
The Newsgroups: header of the article is accessible inside the
Python filter as "art['Newsgroups']".
If you want to accept an article, return "None" or an empty string.
To reject, return a non-empty string. The rejection strings will be
shown to local clients and your peers, so keep that in mind when
phrasing your rejection responses.
filter_messageid(*self*, *msgid*)
*msgid* is a buffer object containing the ID of an article being
offered by IHAVE or CHECK. Like with "filter_art", the message will
be refused if you return a non-empty string. If you use this
feature, keep it light because it is called at a rather busy place
in innd's main loop. Also, do not rely on this function alone to
reject by ID; you should repeat the tests in "filter_art" to catch
articles sent with TAKETHIS but no CHECK.
filter_mode(*self*, *oldmode*, *newmode*, *reason*)
When the operator issues a ctlinnd "pause", "throttle", "go",
"shutdown" or "xexec" command, this function can be used to do
something sensible in accordance with the state change. Stamp a log
file, save your state on throttle, etc. *oldmode* and *newmode*
will be strings containing one of the values in ("running",
"throttled", "paused", "shutdown", "unknown"). *oldmode* is the
state innd was in before ctlinnd was run, *newmode* is the state
innd will be in after the command finishes. *reason* is the comment
string provided on the ctlinnd command line.
How to Use these Methods with innd
To register your methods with innd, you need to create an instance of
your class, import the built-in INN module, and pass the instance to
"INN.set_filter_hook". For example:
class Filter:
def filter_art(self, art):
...
blah blah
...
def filter_messageid(self, id):
...
yadda yadda
...
import INN
myfilter = Filter()
INN.set_filter_hook(myfilter)
When writing and testing your Python filter, don't be afraid to make use
of "try:"/"except:" and the provided "INN.syslog" function. stdout and
stderr will be disabled, so your filter will die silently otherwise.
Also, remember to try importing your module interactively before loading
it, to ensure there are no obvious errors. One typo can ruin your whole
filter. A dummy INN.py module is provided to facilitate testing outside
the server. To test, change into your filter directory and use a
command like:
python -ic 'import INN, filter_innd'
You can define as many or few of the methods listed above as you want in
your filter class (it is fine to define more methods for your own use;
innd will not be using them but your filter can). If you *do* define
the above methods, GET THE PARAMETER COUNTS RIGHT. There are checks in
innd to see whether the methods exist and are callable, but if you
define one and get the parameter counts wrong, innd WILL DIE. You have
been warned. Be careful with your return values, too. The "filter_art"
and "filter_messageid" methods have to return strings, or "None". If
you return something like an int, innd will *not* be happy.
A Note regarding Buffer Objects
Buffer objects are cousins of strings, new in Python 1.5.2. Using
buffer objects may take some getting used to, but we can create buffers
much faster and with less memory than strings.
For most of the operations you will perform in filters (like
"re.search", "string.find", "md5.digest") you can treat buffers just
like strings, but there are a few important differences you should know
about:
# Make a string and two buffers.
s = "abc"
b = buffer("def")
bs = buffer("abc")
s == bs # - This is false because the types differ...
buffer(s) == bs # - ...but this is true, the types now agree.
s == str(bs) # - This is also true, but buffer() is faster.
s[:2] == bs[:2] # - True. Buffer slices are strings.
# While most string methods will take either a buffer or a string,
# string.join (in the string module) insists on using only strings.
import string
string.join([str(b), s], '.') # Returns 'def.abc'.
'.'.join([str(b), s]) # Returns 'def.abc' too.
'.'.join([b, s]) # This raises a TypeError.
e = s + b # This raises a TypeError, but...
# ...these two both return the string 'abcdef'. The first one
# is faster -- choose buffer() over str() whenever you can.
e = buffer(s) + b
f = s + str(b)
g = b + '>' # This is legal, returns the string 'def>'.
Functions Supplied by the Built-in innd Module
Besides "INN.set_filter_hook" which is used to register your methods
with innd as it has already been explained above, the following
functions are available from Python scripts:
addhist(*message-id*)
article(*message-id*)
cancel(*message-id*)
havehist(*message-id*)
hashstring(*string*)
head(*message-id*)
newsgroup(*groupname*)
syslog(*level*, *message*)
Therefore, not only can innd use Python, but your filter can use some of
innd's features too. Here is some sample Python code to show what you
get with the previously listed functions.
import INN
# Python's native syslog module isn't compiled in by default,
# so the INN module provides a replacement. The first parameter
# tells the Unix syslogger what severity to use; you can
# abbreviate down to one letter and it's case insensitive.
# Available levels are (in increasing levels of seriousness)
# Debug, Info, Notice, Warning, Err, Crit, and Alert. (If you
# provide any other string, it will be defaulted to Notice.) The
# second parameter is the message text. The syslog entries will
# go to the same log files innd itself uses, with a 'python:'
# prefix.
syslog('warning', 'I will not buy this record. It is scratched.')
animals = 'eels'
vehicle = 'hovercraft'
syslog('N', 'My %s is full of %s.' % (vehicle, animals))
# Let's cancel an article! This only deletes the message on the
# local server; it doesn't send out a control message or anything
# scary like that. Returns 1 if successful, else 0.
if INN.cancel(''):
cancelled = "yup"
else:
cancelled = "nope"
# Check if a given message is in history. This doesn't
# necessarily mean the article is on your spool; cancelled and
# expired articles hang around in history for a while, and
# rejected articles will be in there if you have enabled
# remembertrash in inn.conf. Returns 1 if found, else 0.
if INN.havehist(''):
comment = "*yawn* I've already seen this article."
else:
comment = 'Mmm, fresh news.'
# Here we are running a local spam filter, so why eat all those
# cancels? We can add fake entries to history so they'll get
# refused. Returns 1 on success, 0 on failure.
cancelled_id = buffer('')
if INN.addhist("')
artheader = INN.head('')
# As we can compute a hash digest for a string, we can obtain one
# for artbody. It might be of help to detect spam.
digest = INN.hashstring(artbody)
# Finally, do you want to see if a given newsgroup is moderated or
# whatever? INN.newsgroup returns the last field of a group's
# entry in active as a string.
froupflag = INN.newsgroup('alt.fan.karl-malden.nose')
if froupflag == '':
moderated = 'no such newsgroup'
elif froupflag == 'y':
moderated = "nope"
elif froupflag == 'm':
moderated = "yep"
else:
moderated = "something else"
Writing an nnrpd Filter
Changes to Python Authentication and Access Control Support for nnrpd
The old authentication and access control functionality has been
combined with the new readers.conf mechanism by Erik Klavon
; bug reports should however go to ,
not Erik.
The remainder of this section is an introduction to the new mechanism
(which uses the *python_auth*, *python_access*, and *python_dynamic*
readers.conf parameters) with porting/migration suggestions for people
familiar with the old mechanism (identifiable by the now deprecated
*nnrpperlauth* parameter in inn.conf).
Other people should skip this section.
The *python_auth* parameter allows the use of Python to authenticate a
user. Authentication scripts (like those from the old mechanism) are
listed in readers.conf using *python_auth* in the same manner other
authenticators are using *auth*:
python_auth: "nnrpd_auth"
It uses the script named nnrpd_auth.py (note that ".py" is not present
in the *python_auth* value).
Scripts should be placed as before in the filter directory (see the
*pathfilter* setting in inn.conf). The new hook method "authen_init"
takes no arguments and its return value is ignored; its purpose is to
provide a means for authentication specific initialization. The hook
method "authen_close" is the more specific analogue to the old "close"
method. These two method hooks are not required, contrary to
"authenticate", the main method.
The argument dictionary passed to "authenticate" remains the same,
except for the removal of the *type* entry which is no longer needed in
this modification and the addition of several new entries (*port*,
*intipaddr*, *intport*) described below. The return tuple now only
contains either two or three elements, the first of which is the NNTP
response code. The second is an error string which is passed to the
client if the response code indicates that the authentication attempt
has failed. This allows a specific error message to be generated by the
Python script in place of the generic message "Authentication failed".
An optional third return element, if present, will be used to match the
connection with the *user* parameter in access groups and will also be
the username logged. If this element is absent, the username supplied
by the client during authentication will be used, as was the previous
behaviour.
The *python_access* parameter (described below) is new; it allows the
dynamic generation of an access group of an incoming connection using a
Python script. If a connection matches an auth group which has a
*python_access* parameter, all access groups in readers.conf are
ignored; instead the procedure described below is used to generate an
access group. This concept is due to Jeffrey M. Vinocur and you can add
this line to readers.conf in order to use the nnrpd_access.py Python
script in *pathfilter*:
python_access: "nnrpd_access"
In the old implementation, the authorization method allowed for access
control on a per-group basis. That functionality is preserved in the
new implementation by the inclusion of the *python_dynamic* parameter in
readers.conf. The only change is the corresponding method name of
"dynamic" as opposed to "authorize". Additionally, the associated
optional housekeeping methods "dynamic_init" and "dynamic_close" may be
implemented if needed. In order to use nnrpd_dynamic.py in
*pathfilter*, you can add this line to readers.conf:
python_dynamic: "nnrpd_dynamic"
This new implementation should provide all of the previous capabilities
of the Python hooks, in combination with the flexibility of readers.conf
and the use of other authentication and resolving programs (including
the Perl hooks!). To use Python code that predates the new mechanism,
you would need to modify the code slightly (see below for the new
specification) and supply a simple readers.conf file. If you do not
want to modify your code, the sample directory has
nnrpd_auth_wrapper.py, nnrpd_access_wrapper.py and
nnrpd_dynamic_wrapper.py which should allow you to use your old code
without needing to change it.
However, before trying to use your old Python code, you may want to
consider replacing it entirely with non-Python authentication. (With
readers.conf and the regular authenticator and resolver programs, much
of what once required Python can be done directly.) Even if the
functionality is not available directly, you may wish to write a new
authenticator or resolver (which can be done in whatever language you
prefer).
Python Authentication Support for nnrpd
Support for authentication via Python is provided in nnrpd by the
inclusion of a *python_auth* parameter in a readers.conf auth group.
*python_auth* works exactly like the *auth* parameter in readers.conf,
except that it calls the script given as argument using the Python hook
rather then treating it as an external program. Multiple, mixed use of
*python_auth* with other *auth* statements including *perl_auth* is
permitted. Each *auth* statement will be tried in the order they appear
in the auth group until either one succeeds or all are exhausted.
If the processing of readers.conf requires that a *python_auth*
statement be used for authentication, Python is loaded (if it has yet to
be) and the file given as argument to the *python_auth* parameter is
loaded as well (do not include the ".py" extension of this file in the
value of *python_auth*). If a Python object with a method "authen_init"
is hooked in during the loading of that file, then that method is called
immediately after the file is loaded. If no errors have occurred, the
method "authenticate" is called. Depending on the NNTP response code
returned by "authenticate", the authentication hook either succeeds or
fails, after which the processing of the auth group continues as usual.
When the connection with the client is closed, the method "authen_close"
is called if it exists.
Dynamic Generation of Access Groups
A Python script may be used to dynamically generate an access group
which is then used to determine the access rights of the client. This
occurs whenever the *python_access* parameter is specified in an auth
group which has successfully matched the client. Only one
*python_access* statement is allowed in an auth group. This parameter
should not be mixed with a *perl_access* statement in the same auth
group.
When a *python_access* parameter is encountered, Python is loaded (if it
has yet to be) and the file given as argument is loaded as well (do not
include the ".py" extension of this file in the value of
*python_access*). If a Python object with a method "access_init" is
hooked in during the loading of that file, then that method is called
immediately after the file is loaded. If no errors have occurred, the
method "access" is called. The dictionary returned by "access" is used
to generate an access group that is then used to determine the access
rights of the client. When the connection with the client is closed,
the method "access_close" is called, if it exists.
While you may include the *users* parameter in a dynamically generated
access group, some care should be taken (unless your pattern is just "*"
which is equivalent to leaving the parameter out). The group created
with the values returned from the Python script is the only one
considered when nnrpd attempts to find an access group matching the
connection. If a *users* parameter is included and it does not match
the connection, then the client will be denied access since there are no
other access groups which could match the connection.
Dynamic Access Control
If you need to have access control rules applied immediately without
having to restart all the nnrpd processes, you may apply access control
on a per newsgroup basis using the Python dynamic hooks (as opposed to
readers.conf, which does the same on per user basis). These hooks are
activated through the inclusion of the *python_dynamic* parameter in a
readers.conf auth group. Only one *python_dynamic* statement is allowed
in an auth group.
When a *python_dynamic* parameter is encountered, Python is loaded (if
it has yet to be) and the file given as argument is loaded as well (do
not include the ".py" extension of this file in the value of
*python_dynamic*). If a Python object with a method "dynamic_init" is
hooked in during the loading of that file, then that method is called
immediately after the file is loaded. Every time a reader asks nnrpd to
read or post an article, the Python method "dynamic" is invoked before
proceeding with the requested operation. Based on the value returned by
"dynamic", the operation is either permitted or denied. When the
connection with the client is closed, the method "access_close" is
called if it exists.
Writing a Python nnrpd Authentication Module
You need to create a nnrpd_auth.py module in INN's filter directory (see
the *pathfilter* setting in inn.conf) where you should define a class
holding certain methods depending on which hooks you want to use.
Note that you will have to use different Python scripts for
authentication and access: the values of *python_auth*, *python_access*
and *python_dynamic* have to be distinct for your scripts to work.
The following methods are known to nnrpd:
__init__(*self*)
Not explicitly called by nnrpd, but will run whenever the auth
module is loaded. Use this method to initialize any general
variables or open a common database connection. This method may be
omitted.
authen_init(*self*)
Initialization function specific to authentication. This method may
be omitted.
authenticate(*self*, *attributes*)
Called when a *python_auth* statement is reached in the processing
of readers.conf. Connection attributes are passed in the
*attributes* dictionary. Returns a response code, an error string,
and an optional string to be used in place of the client-supplied
username (both for logging and for matching the connection with an
access group).
authen_close(*self*)
This method is invoked on nnrpd termination. You can use it to save
state information or close a database connection. This method may
be omitted.
access_init(*self*)
Initialization function specific to generation of an access group.
This method may be omitted.
access(*self*, *attributes*)
Called when a *python_access* statement is reached in the processing
of readers.conf. Connection attributes are passed in the
*attributes* dictionary. Returns a dictionary of values
representing statements to be included in an access group.
access_close(*self*)
This method is invoked on nnrpd termination. You can use it to save
state information or close a database connection. This method may
be omitted.
dynamic_init(*self*)
Initialization function specific to dynamic access control. This
method may be omitted.
dynamic(*self*, *attributes*)
Called when a client requests a newsgroup, an article or attempts to
post. Connection attributes are passed in the *attributes*
dictionary. Returns "None" to grant access, or a non-empty string
(which will be reported back to the client) otherwise.
dynamic_close(*self*)
This method is invoked on nnrpd termination. You can use it to save
state information or close a database connection. This method may
be omitted.
The *attributes* Dictionary
The keys and associated values of the *attributes* dictionary are
described below.
*type*
"read" or "post" values specify the authentication type; only valid
for the "dynamic" method.
*hostname*
It is the resolved hostname (or IP address if resolution fails) of
the connected reader.
*ipaddress*
The IP address of the connected reader.
*port*
The port of the connected reader.
*interface*
The hostname of the local endpoint of the NNTP connection.
*intipaddr*
The IP address of the local endpoint of the NNTP connection.
*intport*
The port of the local endpoint of the NNTP connection.
*user*
The username as passed with AUTHINFO command, or "None" if not
applicable.
*pass*
The password as passed with AUTHINFO command, or "None" if not
applicable.
*newsgroup*
The name of the newsgroup to which the reader requests read or post
access; only valid for the "dynamic" method.
All the above values are buffer objects (see the notes above on what
buffer objects are).
How to Use these Methods with nnrpd
To register your methods with nnrpd, you need to create an instance of
your class, import the built-in nnrpd module, and pass the instance to
"nnrpd.set_auth_hook". For example:
class AUTH:
def authen_init(self):
...
blah blah
...
def authenticate(self, attributes):
...
yadda yadda
...
import nnrpd
myauth = AUTH()
nnrpd.set_auth_hook(myauth)
When writing and testing your Python filter, don't be afraid to make use
of "try:"/"except:" and the provided "nnrpd.syslog" function. stdout
and stderr will be disabled, so your filter will die silently otherwise.
Also, remember to try importing your module interactively before loading
it, to ensure there are no obvious errors. One typo can ruin your whole
filter. A dummy nnrpd.py module is provided to facilitate testing
outside the server. It is not actually used by nnrpd but provides the
same set of functions as built-in nnrpd module. This stub module may be
used when debugging your own module. To test, change into your filter
directory and use a command like:
python -ic 'import nnrpd, nnrpd_auth'
Functions Supplied by the Built-in nnrpd Module
Besides "nnrpd.set_auth_hook" used to pass a reference to the instance
of authentication and authorization class to nnrpd, the nnrpd built-in
module exports the following function:
syslog(*level*, *message*)
It is intended to be a replacement for a Python native syslog. It
works like "INN.syslog", seen above.
$Id: hook-python 7926 2008-06-29 08:27:41Z iulius $