+++ /dev/null
-This is a rough and informal list of suggested improvements to INN, parts
-of INN that need work, and other tasks yet undone. Some of these may be
-in progress, in which case the person working on them will be noted in
-square brackets and should be contacted if you want to help. Otherwise,
-let inn-workers@isc.org know if you'd like to work on any item listed
-below.
-
-The list is divided into changes already tentatively scheduled for a
-particular release, higher priority changes that will hopefully be done in
-the near future, small or medium-scale projects for the future, and
-long-term, large-scale problems. Note that just because a particular
-feature is scheduled for a later release doesn't mean it can't be
-completed earlier if someone decides to take it on. The association of
-features with releases is intended to be a rough guide for prioritization
-and a set of milestones to use to judge when a new major release is
-justified.
-
-Also, one major thing that is *always* welcome is additions to the test
-suite, which is currently very minimal. Any work done on the test suite
-to allow more portions of INN to be automatically tested will make all
-changes easier and will be *greatly* appreciated.
-
-Last modified $Id: TODO 7575 2006-09-11 22:59:38Z eagle $.
-
-
-Scheduled for INN 2.5
-
-* Rewrite configure, breaking all of the tests out into separate files
- using the new capabilities in autoconf 2.5x. Replace our local macros
- with the more general features provided by autoconf. At the same time,
- configure.in and Makefile.global.in should be fixed to use the same
- names as each other for various parameters. [Russ plans to work on
- this.]
-
-* Add support for groups, nesting, and vectors to the new configuration
- parsing code. [Russ plans on doing this.]
-
-* Convert readers.conf and storage.conf (and related configuration files)
- to use the new parsing system and break out program-specific sections
- of inn.conf into their own groups.
-
-* The current WIP cache and history cache should be integrated into the
- history API, things like message ID hashing should become a selectable
- property of the history file, and the history API should support
- multiple backend storage formats and automatically select the right one
- for an existing history file based on stored metainformation.
-
-* The interface to embedded filters needs to be reworked. The information
- about which filters are enabled should be isolated in the filtering API,
- and there should be standard API calls for filtering message IDs, remote
- posts, and local posts. As part of this revision, all of the Perl
- callbacks should be defined before any of the user code is loaded, and
- the Perl loading code needs considerable cleanup. At the same time as
- this is done, the implementation should really be documented; we do some
- interesting things with embedded filters and it would be nice to have a
- general document describing how we do it. [Russ is planning on working
- on this at some point, but won't get upset if someone starts first.]
-
-* All of INN's documentation should be written in POD, with text and man
- pages generated from the POD source. Anyone is encouraged to work on
- this by taking any existing documentation in man format and converting
- it to POD while checking that it's still accurate and adding any
- additional useful information that was missed.
-
-* Replace the current innshellvars.pl file with a real INN Perl module for
- Perl programs, and include the necessary glue so that other Perl modules
- can be added to INN's build tree and installed with INN, allowing their
- capabilities to be available to the portions of INN written in Perl.
-
-* Switch nnrpd over to using the new wildmat routines rather than breaking
- apart strings on commas and matching each expression separately. This
- involves a lot of surgery, since PERMmatch is used all over the place,
- and may change the interpretation of ! and @ in group permission
- wildmats.
-
-* Rework and clean up the storage API. The major change is that the
- initialization function should return a pointer to an opaque struct
- which stores all of the state of the storage subsystem, rather than all
- of that being stored in static variables, and then all other functions
- should take that pointer. More of the structures should also be opaque,
- all-caps structure names should be avoided in favor of named structures,
- SMsetup and SMinit should be combined into one function that takes
- flags, SMerrno and SMerrorstr should be replaced with functions that
- return that information, and the wire format utilities should be moved
- into libinn.
-
-* Rework and clean up the overview API. The major change is that the
- initialization function should return a pointer to an opaque struct
- which stores all of the state of the overview subsystem, rather than all
- of that being stored in static variables, and then all other functions
- should take that pointer. OVctl possibly should instead take and return
- a struct rather than using an ioctl-style interface. Currently, the
- overview functions do a lot of breaking apart of Xref headers and
- parsing them, which is very ugly; consider having the overview interface
- always key off a newsgroup name and article number, even for storing.
- OVadd should probably take a structure and OVsearch should probably
- return a structure.
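Both API rework items above (storage and overview) ask for the same shape: one initialization call that returns an opaque handle, every other call taking that handle, and error state read through functions rather than globals. A minimal sketch of that shape follows. All names here (struct storage, storage_open, storage_set_error, storage_error, storage_error_string, storage_close, and the flag values) are hypothetical and are not existing INN interfaces.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag values for a combined setup/init call. */
enum storage_flags {
    STORAGE_READWRITE = 1 << 0,
    STORAGE_PREOPEN   = 1 << 1
};

/* In a real header only the forward declaration "struct storage;" would
   be public; the definition would live in the implementation file. */
struct storage {
    unsigned int flags;
    int error;              /* per-handle stand-in for a global like SMerrno */
    char errorstr[128];     /* per-handle stand-in for SMerrorstr */
};

/* One call replaces separate setup and init steps.  Returns a handle
   that holds all subsystem state, or NULL if allocation fails. */
struct storage *
storage_open(unsigned int flags)
{
    struct storage *s = calloc(1, sizeof(*s));

    if (s == NULL)
        return NULL;
    s->flags = flags;
    return s;
}

/* Record an error in the handle rather than in static variables. */
void
storage_set_error(struct storage *s, int code, const char *message)
{
    assert(s != NULL);
    s->error = code;
    strncpy(s->errorstr, message, sizeof(s->errorstr) - 1);
    s->errorstr[sizeof(s->errorstr) - 1] = '\0';
}

/* Accessors replace direct reads of global error variables. */
int
storage_error(const struct storage *s)
{
    return s->error;
}

const char *
storage_error_string(const struct storage *s)
{
    return s->errorstr;
}

/* Release the handle and everything it owns. */
void
storage_close(struct storage *s)
{
    free(s);
}
```

Because nothing is static, two handles can be open at once with independent error state, which is the main practical gain over the current design.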
-
-
-Scheduled for INN 2.6
-
-* Add a generic, modular anti-spam and anti-abuse filter, off by default,
- but coming with INN and prominently mentioned in the INSTALL
- documentation. [Andrew Gierth has work in progress that may be usable
- for this.]
-
-* A unified configuration file combining the facilities of newsfeeds,
- incoming.conf, and innfeed.conf, but hopefully more readable and easier
- for new INN users to edit. This should have all of the capabilities of
- the existing configuration files, but specifying common things (such as
- file feeds or innfeed feeds) should be very simple and straightforward.
- This configuration file should use the new parsing infrastructure.
-
-* Convert all remaining INN configuration files to the new parsing
- infrastructure.
-
-* INN really should be capable of both sending and receiving a
- headers-only feed (or even an overview-only feed) similar to Diablo and
- using it for the same things that Diablo does, namely clustering,
- pull-on-demand for articles, and the like. This should be implementable
- as a new backend, although the API may need a few more hooks. Both a
- straight headers-only feed that only pulls articles down via NNTP from a
- remote server and a caching feed where some articles are pre-fed, some
- articles are pulled down at first read, and some articles are never
- stored locally should be possible. [Patches for a header-only feed have
- already been written and submitted to inn-workers.]
-
-* The libinn, libstorage, and other library interfaces should be treated
- as stable libraries and properly versioned using libtool's
- recommendation for library versioning when changes are made so that they
- can be installed as shared libraries and work properly through releases
- of INN. This is currently waiting on a systematic review of the
- interface and removal of things that we don't want to support long-term.
-
-* The include files necessary to use libinn, libstorage, and other
- libraries should be installed in a suitable directory so that other
- programs can link against them. All such include files should be under
- include/inn and included with <inn/header.h>. All such include files
- should only depend on other inn/* header files and not on, e.g.,
- config.h. All such include files should be careful about namespace to
- avoid conflicts with other include files used by applications.
-
-
-High Priority Projects
-
-* Modulo warnings from system headers and warnings where the compiler is
- simply wrong and there's no equally readable way to rewrite the code,
- INN should compile cleanly under "make warnings". It should be possible
- for maintainers to routinely compile INN with make warnings to catch
- problems. Note that -Wcast-qual warnings cannot be avoided entirely
- because we don't want to write redundant functions for regular and const
- strings and because of such things as struct iovec; -Wcast-qual will be
- removed from make warnings when this task is reasonably complete.
-
-* INN shouldn't flush all feeds (particularly all program feeds) on
- newgroup or rmgroup. Currently it reloads newsfeeds to reparse all of
- the wildmat patterns and rebuild the peer lists associated with the
- active file on group changes, and this forces a flush of all feeds.
- The best fix is probably to stash the wildmat pattern (and flags) for
- each peer when newsfeeds is read and then just use the stashed copy on
- newgroup or rmgroup, since otherwise the newsfeeds loading code would
- need significant modification. But in general, innd is too
- reload-happy; it should be better at making incremental changes without
- reloading everything.
-
-* Add authenticated Path support, based on the current USEFOR draft or the
- behavior of some other servers (such as Diablo). [Andrew Gierth wrote a
- patch for part of this a while back, which Russ has. Marco d'Itri
- expressed some interest in working on this.]
-
-* Various parts of INN are using write or writev; they should all use
- xwrite or xwritev instead. Even for writes that are unlikely to ever be
- partial, on some systems system calls aren't restartable and xwrite and
- xwritev properly handle EINTR returns.
-
-* Apparently on Solaris open can also be interrupted by a signal; we may
- need to have an xopen wrapper that checks for EINTR and retries.
-
-* tradspool has a few annoying problems. Deleted newsgroups never have
- their last articles expired, and there is no way of forcibly
- resynchronizing the articles stored on disk with what overview knows
- about unless tradindexed is used. Some sort of utility program to take
- care of these and to do things like analyze the tradspool.map file
- should be provided.
-
-* Rewrite inndstart as a helper program that only binds the relevant
- sockets and then returns them to innd. Since file descriptors are
- shared by child processes, this can be done with a program spawned by
- innd. This may have gotten more complicated with IPv6. Drop
- startinnfeed entirely in favor of recommending people use ulimit in the
- news init script.
-
-* contrib/mkbuf and contrib/reset-cnfs.c should be combined into a utility
- for creating and clearing cycbuffs, perhaps combined with cnfsheadconf,
- and the whole thing moved into storage/cnfs rather than frontends (along
- with cnfsstat). pullart.c may also stand to be merged into the same
- utility (cnfs-util might not be a bad name).
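The write/writev and Solaris open items above both come down to retrying system calls that fail with EINTR. A rough sketch of the idea follows, using only POSIX calls. The names write_fully and open_retry are hypothetical stand-ins chosen for illustration; they are not the existing xwrite or the proposed xopen, whose exact signatures are not given here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer, retrying on EINTR and continuing after
   partial writes.  Returns size on success, or -1 on any other error. */
ssize_t
write_fully(int fd, const void *buffer, size_t size)
{
    const char *p = buffer;
    size_t left = size;

    assert(buffer != NULL || size == 0);
    while (left > 0) {
        ssize_t n = write(fd, p, left);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing; try again */
            return -1;
        }
        p += n;
        left -= (size_t) n;
    }
    return (ssize_t) size;
}

/* Open a file, retrying if the call is interrupted by a signal. */
int
open_retry(const char *path, int flags, mode_t mode)
{
    int fd;

    do {
        fd = open(path, flags, mode);
    } while (fd < 0 && errno == EINTR);
    return fd;
}
```

A caller would use these exactly like write and open; the only behavioral difference is that EINTR never reaches the caller.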
-
-
-Documentation Projects
-
-* Add man pages for all libinn interfaces. There should be a subdirectory
- of doc/pod for this since there will be a lot of them; installing them
- as libinn_<section>.3 seems to make the most sense (so, for example,
- error handling routines would be documented in libinn_error.3).
-
-* Better documentation of and support for UUCP feeds. send-uucp is now
- easier to use, but there's still a paucity of documentation covering the
- whole theory and mechanisms of UUCP feeding.
-
-* Everything installed by INN should have a man page. Currently, there
- are several binaries and configuration files that don't have man pages.
- (In some cases, the best thing to do with the configuration file may be
- to merge it into another one or find a way to eliminate it.)
-
-* Document the internal formats of the various overview methods, CNFS,
- timehash, and timecaf. A lot of this documentation already exists in
- various forms, but it needs to be cleaned up and collected in one place
- for each format, preferably as a man page.
-
-* Add documentation for slave servers. [Russ has articles from
- inn-workers that can be used as a beginning.]
-
-* Write complete documentation for all of our extensions to RFC 977 or RFC
- 1036, preferably in a format that could be suitable for future
- inclusion into new revisions of the RFCs.
-
-* Audit readers.conf.5 against perm.c for missing options ("include" at
- least is missing from the documentation).
-
-* The distributions file is undocumented.
-
-
-Code Cleanup Projects
-
-* Eliminate everything in the LEGACY section of config.h.
-
-* Move all compile-time configuration in config.h either into a separate
- header (such as inn/options.h) or turn it into a configuration file
- directive or a command-line option. In particular, the rnews
- configuration should probably be an rnews-specific section of inn.conf.
-
-* Move include/paths.h to include/inn/paths.h and change _PATH as a prefix
- to INN_PATH to move the identifiers out of the C reserved namespace.
- Check to be sure we still need all of the #defines and look at adding
- anything needed by innfeed (and eliminating the separate innfeed header
- serving the same purpose).
-
-* Move include/nntp.h to include/inn/nntp.h and at the same time look at
- standardizing the names of all of the #defines it provides, including
- the message class. [Russ has a start on this.]
-
-* Get rid of GetTimeInfo and TIMEINFO. The struct is just a struct
- timeval plus time zone information. All of the parts of INN that deal
- with time zone information are isolated in lib/date.c. The rest of INN
- uses GetTimeInfo where a plain call to time would often work fine, or
- at most gettimeofday, and there's no reason to compute the time zone
- everywhere. Plus, it makes the code more readable to use standard
- functions and data types.
-
-* putman.sh should be merged into support/install-sh (which would mean
- giving up any pretext of using the standard install-sh script, but that
- should be fine).
-
-* Use vectors or cvectors everywhere that argify and friends are currently
- used and eliminate the separate implementation in nnrpd/misc.c.
-
-* Break up the remainder of libinn.h into multiple inn/* include files for
- specific functions (such as memory management, wildmat, date handling,
- NNTP commands, etc.), with an inn/util.h header to collect the remaining
- random utilities. Consider adding some sort of prefix, like inn_, to all
- functions that aren't part of some other logical set with its own prefix.
-
-* Break the CNFS and tradspool code into multiple source files to make it
- easier to understand the logical divisions of the code and consider
- doing the same with the other overview and storage methods.
-
-* Examine the (mostly socket) code that currently should probably be
- compiled with -fno-strict-aliasing on gcc and move the relevant casts
- to within function calls. [Russ knows about this.]
-
-* Clean up the use of #ifdef for sockets and IPv6, perhaps involving
- addition of more to include/portable/socket.h.
-
-
-Needed Bug Fixes
-
-* tradspool currently uses stdio to write out tradspool.map, which can
- cause problems if more than 256 file descriptors are in use for other
- things (such as incoming connections or tradindexed overview cache).
- It should use write() instead.
-
-* LIST NEWSGROUPS should probably only list newsgroups that are marked in
- the active file as valid groups.
-
-* INN's startup script should be sure to clean out old lock files and PID
- files for innfeed. Be careful, though, since innfeed may still be
- running, spawned from a previous innd.
-
-* makedbz should be more robust in the presence of malformed history
- lines, discarding them or otherwise dealing with them.
-
-* CNFS, if the cycbuff is larger than 2GB and it doesn't have large file
- support, reports a mysterious file not found error because it assumes
- all errors from stat are the result of the cycbuff not being found.
-
-* Some servers reject some IHAVE, TAKETHIS, or CHECK commands with 500
- syntax errors (particularly for long message IDs), and innfeed doesn't
- handle this particularly well at the moment. It really should have an
- error handler for this case. [Sven Paulus has a preliminary patch that
- needs testing.]
-
-* Editing the active file by hand can currently munge it fairly badly even
- if the server is throttled unless you reload active before restarting
- the server. This could be avoidable for at least that particular case
- by checking the mtime of active before and after the server was
- throttled.
-
-* innreport silently discards news.notice entries about most of the errors
- innfeed generates. It should ideally generate some summary, or at least
- note that some error has occurred and the logs should be examined.
-
-* INN's message ID parser should be more forgiving about surrounding
- whitespace. Right now, it will reject messages with a trailing space in
- the Message-ID header.
-
-* nnrpd doesn't check the message ID of a posted article for syntactic
- validity before remailing it to the moderator, since normally it relies
- on innd to check the message ID. The message ID checking code from
- innd/art.c should be moved into lib so that nnrpd can use it as well.
-
-* Currently, if the list of newsgroups on an Xref slave is out of sync
- with the newsgroups on the master, receiving an article crossposted to
- one of the groups that doesn't exist on the slave will cause the slave
- to throttle. This isn't the best behavior; the server should either
- optionally create the missing newsgroup or just ignore that crossposted
- group (and modify Xref accordingly?).
-
-* Handling of compressed batches needs to be thoroughly reviewed by
- someone who understands how they're supposed to work. It's not clear
- that _PATH_GZIP is being used correctly at the moment and that
- compressed batch handling will work right now on systems that don't have
- gzip installed (but that do have uncompress).
-
-* innfeed's statistics don't add up properly all the time. All of the
- article dispositions don't add up to the offered count like they should.
- Some article handling must not be recorded properly.
-
-* innd's counting of article size doesn't always work properly, and it can
- accept articles that are larger than its configured limit. It's not
- clear exactly where this is happening.
-
-* If a channel feed exits immediately, innd respawns it immediately,
- causing thrashing of the system and a huge spew of errors in syslog. It
- should mark the channel as dormant for some period of time before
- respawning it, perhaps only if it's already died multiple times in a
- short interval.
-
-* ctlinnd begin <site-name> was causing innd to core dump.
-
-* Handling of innfeed's dropped batches needs looking at. There are three
- places where articles can fall between the cracks: an innfeed.togo file
- written by innd when the feed can't be spawned, a batch file named after
- the feed name which can be created under similar circumstances, and the
- dropped files written by innfeed itself. procbatch can clean these up,
- but has to be run by hand.
-
-* When using tradspool, groups are not immediately added to tradspool.map
- when created, making innfeed unable to find the articles until after
- some period of time. Part of the problem here is that tradspool only
- updates tradspool.map on a lazy basis, when it sees an article in that
- group, since there is no storage hook for creation of a new group.
-
-* nntpget doesn't handle long lines in messages.
-
-* WP feeds break if there are spaces in the Path header, and the inn.conf
- parser doesn't check for this case and will allow people to configure
- their server that way. (It's not clear that the latter is actually a
- bug, given the new USEFOR attempt to allow folding of Path headers, but
- the space needs to be removed for WP feeds.)
-
-* Error handling in the history backend needs to be reviewed, since it
- currently is always printing out errno regardless of whether it's
- meaningful. The error handling needs to record errno if it's useful and
- the reporting function should only print it out if it's useful for that
- error.
-
-* innd returns 437 for articles that were accepted but filed in the junk
- group. It should probably return the appropriate 2xx status code in
- that case instead.
-
-* Someone should go through the BUGS sections of all of the manpages and
- fix those for which the current behavior is unacceptable.
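For the channel feed respawn item above, one possible policy is to count deaths inside a short window and go dormant only once a burst limit is exceeded. The sketch below is an illustration of that idea only; the struct, the function name respawn_delay, and all three constants are assumptions, not anything innd currently does.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical tuning values. */
#define RESPAWN_WINDOW   60     /* seconds in which deaths are counted */
#define RESPAWN_BURST     3     /* deaths tolerated within the window */
#define RESPAWN_DORMANT 300     /* seconds to stay dormant afterwards */

/* Per-channel bookkeeping; zero-initialize before first use. */
struct respawn_state {
    time_t window_start;
    unsigned int deaths;
};

/* Call when the channel's child exits.  Returns how many seconds to
   wait before respawning: 0 for an immediate respawn, RESPAWN_DORMANT
   once the child has died more than RESPAWN_BURST times in the window. */
unsigned int
respawn_delay(struct respawn_state *st, time_t now)
{
    assert(st != NULL);
    if (st->deaths == 0 || now - st->window_start > RESPAWN_WINDOW) {
        /* First death, or the last burst is old: start a new window. */
        st->window_start = now;
        st->deaths = 0;
    }
    st->deaths++;
    return (st->deaths > RESPAWN_BURST) ? RESPAWN_DORMANT : 0;
}
```

An occasional crash still gets an immediate respawn under this policy; only a child that exits repeatedly within the window is held back.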
-
-
-Requested New Features
-
-* Consider implementing the HEADERS command as discussed rather
- extensively in news.software.nntp. [Greg Andruk has a preliminary
- patch.]
-
-* There have been a few requests for the ability to programmatically set
- the subject of the report generated by news.daily, with escapes that are
- filled in by the various pieces of information that might be useful.
-
-* A bulk cancel command using the MODE CANCEL interface. Possibly through
- ctlinnd, although it may be a bit afield of what ctlinnd is currently
- for.
-
-* Sven Paulus's patch for nnrpd volume reports should be integrated. See
- <ftp://ftp.tin.org/pub/news/servers/inn/unofficial-patches/
- patch-inn-2.2.x-artstat+list+overstat>.
-
-* Lots of people encrypt X-Trace in various ways. Should that be offered
- as a standard option? The first data element should probably remain
- unencrypted so that the O flag in newsfeeds doesn't break.
-
- Should there also be an option not to generate X-Trace? And this whole
- area may change if USEFOR ever standardizes poster trace information;
- it's been proposed to put it in the path tail instead. The current
- USEFOR trend as of January, 2001 appears to be towards an Injector-Info
- header with this information, allowing a token or an injecting hostname.
- For a token, one really wants it to be hierarchically structured for
- spam filtering even if it's encrypted (in other words, to get a "group"
- of clients, one could just match the first n bytes of the token instead
- of the whole thing).
-
- Olaf Titz suggests:
-
- This can be done by formatting the (rest of) the header in a way
- that fields are always a multiple of 8 bytes and applying a 64 bit
- block cipher in ECB mode on it. But then we would be better off
- using binary fields, as the timestamp is 9 bytes and an IP address
- 10-12 bytes.
-
- Combining the timestamp and PID into one block, adding an
- authenticated user field and omitting the redundant formatted time
- would give the following format:
-
-     X-Trace: g212.hadiko.de [395109AA000016FF] [AC14302A00000000] [...]
-                              time    | pid      ip      |reserved user
-
-* ctlinnd flushlogs currently renames all of the log files. It would be
- nice to support the method of log rotation that most other daemons
- support, namely to move the logs aside and then tell innd to reopen its
- log files. Ideally, that behavior would be triggered with a SIGHUP.
- scanlogs would have to be modified to handle this.
-
- The best way to support this seems to be to leave scanlogs as is by
- default, but also add two additional modes. One would flush all the
- logs and prepare for the syslog logs to be rotated, and the other would
- do all the work needed after the logs have been rotated. That way, if
- someone wanted to plug in a separate log rotation handler, they could do
- so and just call scanlogs on either side of it. The reporting portions
- of scanlogs should be in a separate program.
-
-* Several people have Perl interfaces to pieces of INN that should ideally
- be part of the INN source tree in some fashion. Greg Andruk has a bunch
- of stuff that Russ has copies of, for example.
-
-* Investigate using the new, stricter date parsing code in libinn for
- nnrpd rather than the extremely lenient parsedate routine.
-
-* There are various available patches for Cancel-Lock and an Internet
- draft; support should be added to INN for both generation and
- verification (definitely optional and not on by default at this point).
-
-* It would be nice to be able to reload inn.conf (although difficult, due
- to the amount of data that's generated from it and stashed in various
- places). This will need to wait for the new configuration parsing
- library and an inn.conf parser that uses it.
-
-* remembertrash currently rejects and remembers articles with syntax
- errors as well as things like unwanted newsgroups and unwanted
- distributions, which means that if a peer sends you a bunch of mangled
- articles, you'll then also reject the correct versions of the articles
- from other peers. This should probably be rethought.
-
-* Additional limits for readers.conf: Limit on concurrent parallel reader
- streams, limit on KB/second download (preliminary support for this is
- already in), and a limit on maximum posted articles per day (tied in
- with the backoff stuff?). These should be per-IP or per-user, but
- possibly also per-access group. (Consider pulling the -H, -T, -X, and
- -i code out from innd and using it here.)
-
-* timecaf should have more configurable parameters (at the least, how
- frequently to switch to a new CAF file should be an option).
- storage.conf should really be extended to allow method-specific
- configuration for things like this (and to allow the cycbuff.conf file
- to be merged into storage.conf).
-
-* Allow generation of arbitrary additional information that could go in
- overview by using embedded Perl or Python code. This might be a cleaner
- way to do the keywords code, which really wants Perl's regex engine
- ideally. It would also let one do something like doing MD5 hashes of
- each article and putting that in the overview if you care a lot about
- making sure that articles aren't corrupted.
-
-* Allow some way of accepting articles regardless of the Date header, even
- if it's far into the future. Some people are running into articles that
- are dated years into the future for some reason that they still want to
- store on the server.
-
-* There was a request to make --program-suffix and the other name
- transformation options to autoconf work. The standard GNU package does
- this with really ugly sed commands in the Makefile rules; we could
- probably do better, perhaps by substituting the autoconf results into
- support/install-sh.
-
-* INN currently uses hash tables to store the active file internally. It
- would be worth trying ternary search trees to see if they're faster; the
- data structure is simpler, performance may be comparable for hits and
- significantly better for misses, sizing and resizing becomes a non-issue,
- and the space penalty isn't too bad. A generic implementation is already
- available in libinn. (An even better place to use ternary search trees
- may be the configuration parser.)
-
-* Provide an innshellvars equivalent for Python.
-
-* inncheck should check the syntax of all the various files that are
- returned by LIST commands, since having those files present with the
- wrong syntax could result in non-compliant responses from the server.
- Possibly the server should also refuse to send malformatted lines to
- the client.
-
-* ctlinnd reload incoming.conf could return a count of the hosts that
- failed, or even better a list of them. This would make pruning old
- stuff out of incoming.conf much easier.
-
-* nnrpd could use sendfile(2), if available, to send articles directly
- to the socket (for those storage methods where to-wire conversion is
- not needed). This would need to be added to the storage API.
-
-* Somebody should look at keeping the "newsgroups" file more accurate
- (e.g. newgroups for existing groups should change description, better
- checkgroups handling, checking for duplicates).
-
-* The by-domain statistics innreport generates for nnrpd count all local
- connections (those with no "." in the hostname) in with the errors as
- just "?". The host2dom function could be updated to group these as
- something like "Local".
-
-* news.daily could detect if expire segfaults and unpause the server.
-
-* When using SSL, track the amount of data that's been transferred to the
- client and periodically renegotiate the session key.
-
-* When using SSL, use SSL_get_peer_certificate to get a verified client
- certificate, if available, and use it to create an additional header
- line when posting articles (X-Auth-Poster?). This header could use:
-
- X509_NAME_oneline(X509_get_subject_name(peer),...)
-
- for the full distinguished name, or
-
- X509_NAME_get_text_by_NID(X509_get_subject_name(peer),
- NID_commonName, ...)
-
- for the client's "common name" alone.
-
-* When using SSL, use the server's key to generate an HMAC of the body of
- the message (and most headers?), then include that digest in the
- headers. This allows a news administrator to determine if a complaint
- about the content of a message is fraudulent since the message was
- changed after transmission.
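The X-Trace layout suggested by Olaf Titz above packs the data into 8-byte blocks: time and PID in the first, IP address and a reserved word in the second. The sketch below only formats the plaintext blocks as hex so the field boundaries are visible. It assumes IPv4, 32-bit fields, and the bracketed hex rendering seen in the sample header line; the function name format_trace_blocks is hypothetical, and the block cipher step is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Render two 64-bit blocks as "[timepid] [ipreserved]" in uppercase hex.
   Each 32-bit field occupies exactly 8 hex digits, so each bracketed
   group is one 8-byte cipher block.  Returns the snprintf result. */
int
format_trace_blocks(char *out, size_t outsize, uint32_t when, uint32_t pid,
                    uint32_t ip)
{
    assert(out != NULL);
    return snprintf(out, outsize, "[%08lX%08lX] [%08lX%08lX]",
                    (unsigned long) when, (unsigned long) pid,
                    (unsigned long) ip, 0UL /* reserved */);
}
```

Calling it with time 0x395109AA, PID 0x16FF, and address 172.20.48.42 (0xAC14302A) reproduces the first two bracketed groups of the sample header above.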
-
-
-General Projects
-
-* All the old packages in unoff-contrib should be reviewed for integration
- into INN.
-
-* It may be better for INN on SysV-derived systems to use poll rather than
- select. The semantics are better, and on some systems (such as Solaris)
- select is limited to 1024 file descriptors whereas poll can handle any
- number. Unfortunately, the API is drastically different between the
- two and poll isn't portable, so supporting both cleanly would require a
- bit of thought.
-
-* Currently only innd and innfeed increase their file descriptor limits.
- Other parts of INN, notably makehistory, may benefit from doing the same
- thing if they can without root privileges.
-
-* The Tcl filtering support code has undergone serious bitrot and needs
- some work to fix it and make it work with modern versions of Tcl and the
- current version of INN. It also lacks a lot of the functionality of the
- Perl and Python filters, if anyone cares.
-
-* Revisit support for aliased groups and what nnrpd does with them.
- Should posts to the alias automatically be redirected to the real group?
- Regardless, the error return should provide useful information about
- where to post instead. Also, the new overview API, for at least some of
- the overview methods, truncated the group status at one character and
- lost the name of the group to which a group is aliased; that needs to be
- fixed.
-
-* More details as to why a message ID is bad would be useful to return to
- the user, particularly for rnews, inews, etc. innd also rejects message
- IDs with trailing spaces, which can be hard to check.
-
-* Support putting the active file and history file in different
- directories without hand-editing a bunch of files.
-
-* nnrpd's NNTP command parsing interacts poorly with AUTHINFO and
- passwords containing spaces. The correct solution isn't clear; check
- with the current NNTP RFC draft and how existing clients handle it?
-
-* frontends/pullnews and contrib/backupfeed solve the same problem; the
- best ideas of both should be unified into one script.
-
-* actsyncd could stand a rewrite and cleaner handling of both
- configuration and syncing against multiple sources which are canonical
- for different sets of groups.
-
-* send-nntp and nntpsend basically do the same thing; send-nntp could
- probably be removed (possibly with some extra support in nntpsend for
- doing simpler things).
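For the poll versus select item above, the poll side of a wrapper could look roughly like the sketch below. The function name wait_readable and its return convention are assumptions for illustration; a real abstraction would need to cover multiple descriptors and a select fallback for systems without poll.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Wait for activity on a single descriptor using poll, which takes an
   array of descriptors rather than a fixed-size fd_set.  Returns 1 if
   poll reported an event on fd (data, EOF, or error), 0 on timeout,
   and -1 on failure.  timeout_ms follows poll's convention: a negative
   value blocks indefinitely. */
int
wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int status;

    assert(fd >= 0);
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    do {
        status = poll(&pfd, 1, timeout_ms);
    } while (status < 0 && errno == EINTR);
    if (status <= 0)
        return status;
    return 1;
}
```

Hiding the call behind a small function like this is one way to keep the rest of the code independent of which of the two APIs is available.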
-
-
-Long-Term Projects
-
-* Look at turning header parsing into a library of some sort. Lots of INN
- does this, but different parts of INN need subtly different things, so
- the best API is unclear.
-
-* INN's header handling needs to be checked against the current USEFOR
- draft. This may want to wait until after we have a header parsing library.
-
-* The innd filter should be able to specify additional or replacement
- groups into which an article should be filed, or even spool the article
- to a local disk file rather than storing it. (See the stuff that the
- nnrpd filter can already do.)
-
-* Add authentication via SASL to nnrpd. This is a boatload of additional
- issues, particularly if we want to add authentication methods like
- Kerberos that require their own separate libraries (although we should
- use Cyrus's SASL libraries, which will simplify a lot of that).
- [Jeffrey Vinocur is working on a standard for this.]
-
-* When articles expire out of a storage method with self-expire
- functionality, the overview and history entries for those articles
- should also be expired immediately. Otherwise, things like the GROUP
- command don't give the correct results. This will likely require a
- callback that can be passed to CNFS that is called to do the overview
- and history cleanup for each article overwritten. It will also require
- the new history API.
-
-* Feed control, namely allowing your peers to set policy on what articles
- you feed them (not just newsgroups but max article size and perhaps even
- filter properties like "non-binary"). Every site does this a bit
- differently. Some people have web interfaces, some people use GUP, some
- people roll their own alternate things. It would really be nice to have
- some good way of doing this as part of INN. It's worth considering an
- NNTP extension for this purpose, although the first step is to build a
- generic interface that an NNTP extension, a web page, etc. could all
- use. (An alternate way of doing this would be to extend IHAVE to pass
- the list of newsgroups as part of the command, although this doesn't
- seem as generally useful.)
-
-* Traffic classification as an extension of filtering. The filter should
- be able to label traffic as binary (e.g.) without rejecting it, and
- newsfeeds should be extended to allow feeding only non-binary articles
- (e.g.) to a peer.
-
-* External authenticators should also be able to do things like return a
- list of groups that a person is allowed to read or post to. Currently,
- maintaining a set of users and a set of groups, each of which some
- subset of the users is allowed to access, is far too difficult. For a
- good starting list of additional functionality that should be made
- available, look at everything the Perl authentication hooks can do.
- This should probably wait for the configuration file parsing rewrite.
-
-* Allow nnrpd to spawn long-running helper processes. Not only would this
- be useful for handling authentication (so that the auth hooks could work
- without execing a program on every connection), but it may allow for
- other architectures for handling requests (such as a pool of helpers
- that deal only with overview requests). More than that, nnrpd should
- *be* a long-running helper process that innd can feed open file
- descriptors to. [Aidan Culley has ideas along these lines.]
-
-* The tradspool storage method requires assigning a number to every
- newsgroup (for use in a token). Currently this is maintained in a
- separate tradspool.map file, but it would be much better to keep that
- information in the active file where it can't drop out of sync. A code
- assigned to each newsgroup would be useful for other things as well,
- such as hashing the directories for the tradindexed overview. For use
- for that purpose, though, the active file would have to be extended to
- include removed groups, since they'd need to be kept in the active file
- to reserve their numbers until the last articles expired.
-
-* The locking of the active file leaves something to be desired; in
- general, the locking in INN (for the active file, the history file,
- spool updates, overview updates, and the like) needs a thorough
- inspection and some cleanup. A good place to start would be tracing
- through the pause and throttle code and writing up a clear description
- of what gets locked where and what is safely restarted and what isn't.
- Long term, there needs to be a library locking routine used by
- *everything* that needs to write to the history file, active file, etc.
- and that keeps track of the PID of the process locking things and is
- accessible via ctlinnd.
-
-* There is a fundamental problem with the current design of the
- control.ctl file. It combines two things: A database of hierarchies,
- their maintainers, and related information, and a list of which
- hierarchies the local server should honor. These should be separated
- out into the database (which could mostly be updated from a remote
- source like ftp.isc.org and then combined with local additions) and a
- configured list of hierarchies (or sub-hierarchies within hierarchies)
- that control messages should be honored for. This should be reasonably
- simple although correct handling of checkgroups could get a mite tricky.
-
-* Possible NNTP extension: Compression of the protocol, using gzip,
- bzip2, or some other technique. Particularly useful for long lists like
- the active file information or the overview information, but possibly
- useful in general for other things.
-
-* Install wizards. Configuring INN is currently very complex even for an
- experienced news admin, and there are several fairly standard
- configurations that shouldn't be nearly that complicated to get running
- out of the box. A little interactive Perl script asking some simple
- questions could probably handle a lot of these cases correctly.
-
-* One ideally wants to be able to easily convert between different
- overview formats or storage methods, refiling articles in place. This
- should be possible once we have a history API that allows changing the
- storage location of an article in-place.
-
-* Set up the infrastructure required so that INN can use alloca. This
- would significantly decrease the number of calls to malloc needed and
- would be a lot more convenient.
-
-* A serious investigation into whether INN could use a garbage collector
- is probably a good idea. The network buffers probably need to be
- handled with dedicated code, but there are a lot of other incidental
- allocations and deallocations that may be much more efficient and safer
- using a garbage collector.
-
-* Look at integrating asprintf and vasprintf. Russ already tried this
- once and couldn't see a good way of doing it (particularly vasprintf)
- without hooking deep into an sprintf implementation, because the simple
- hack of calling vsnprintf first, allocating that much memory, and then
- calling it again on the new buffer doesn't work for vasprintf (you can't
- reprocess the arguments).
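-
- Where C99's va_copy is available, the two-pass approach does become
- workable, since the argument list can be duplicated before the first
- pass. A minimal sketch, assuming both va_copy and a C99-conforming
- vsnprintf (neither of which INN can necessarily rely on everywhere);
- the sketch_ names are hypothetical and not existing INN functions:
-
```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for vasprintf; assumes C99 va_copy and vsnprintf. */
static int
sketch_vasprintf(char **strp, const char *fmt, va_list args)
{
    va_list copy;
    int needed, written;

    /* First pass runs on a copy so that args stays usable afterwards. */
    va_copy(copy, args);
    needed = vsnprintf(NULL, 0, fmt, copy);
    va_end(copy);
    if (needed < 0)
        return -1;

    *strp = malloc((size_t) needed + 1);
    if (*strp == NULL)
        return -1;

    /* Second pass formats into the buffer using the original list. */
    written = vsnprintf(*strp, (size_t) needed + 1, fmt, args);
    if (written < 0 || written > needed) {
        free(*strp);
        *strp = NULL;
        return -1;
    }
    return written;
}

/* Hypothetical stand-in for asprintf, built on the function above. */
static int
sketch_asprintf(char **strp, const char *fmt, ...)
{
    va_list args;
    int status;

    va_start(args, fmt);
    status = sketch_vasprintf(strp, fmt, args);
    va_end(args);
    return status;
}
```
-
- On platforms without va_copy, the problem described above remains, so
- a fallback would still be needed there.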
-
-* Support building in a directory separate from the source tree. It may
- be best to just support this via lndir rather than try to do it in
- configure, but it would be ideal to add support for this to the autoconf
- system. Unfortunately, the standard method requires letting configure
- generate all of the makefiles, which would make running configure and
- config.status take much longer than they do currently.
-
-* Look at adding some kind of support for MODE CANCEL via network sockets
- and fixing up the protocol so that it could possibly be standardized
- (the easiest thing to do would probably be to change it into a CANCEL
- command). If we want to get to the point where INN can accept and even
- propagate such feeds from dedicated spam filters or the like, there must
- also be some mechanism of negotiating policy in order to decide what
- cancels the server wants to be fed.
-
-* The "possibly signed" char data type is one of the inherent flaws of C.
- Some other projects have successfully gotten completely away from this
- by declaring all of their strings to be unsigned char, defining a macro
- like U that casts strings to unsigned char for use with literal strings,
- and always using unsigned char everywhere. Unfortunately, this also
- requires wrapping all of the standard libc string functions, since
- they're prototyped as taking char * rather than unsigned char *. The
- benefits include cleaner and consistent handling of characters over 127,
- better warnings from the compiler, consistent behavior across platforms
- with different notions about the signedness of char, and the elimination
- of warnings from the <ctype.h> macros on platforms like Solaris where
- those macros can't handle signed characters. We should look at doing
- this for INN.
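-
- As a rough illustration of the approach, assuming a macro along the
- lines of the U described above (the exact definition here and the
- count_alpha function are hypothetical, not taken from any particular
- project):
-
```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical macro: view a string literal as unsigned char data. */
#define U(s) ((const unsigned char *) (s))

/*
 * Count the alphabetic bytes in a string.  Because the string is unsigned
 * char, every value passed to isalpha() is in the range 0..UCHAR_MAX, which
 * is what the <ctype.h> functions require.
 */
static size_t
count_alpha(const unsigned char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
        if (isalpha(*s))
            n++;
    return n;
}
```
-
- A caller would write count_alpha(U("abc 123")); bytes over 127 are
- passed through as positive values rather than as negative chars.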
-
-* It would clean up a lot of code considerably if we could just use mmap
- semantics regardless of whether the system has mmap. It may be possible
- to emulate mmap on systems that don't have it by reading the entirety of
- the file into memory and setting the flags that require things to call
- mmap_flush and mmap_invalidate on a regular basis, but it's not clear
- where to stash the file descriptor that corresponds to the mapped file.
-
-* Figure out some Samba library that we can link against for the Samba
- authenticator so that we can get all the Samba code back out of INN's
- source tree; we don't want to maintain it.
-
-* Consider replacing the awkward access: parameter in readers.conf with
- separate commands (e.g. "allow_newnews: true") or otherwise cleaning up
- the interaction between access: and read:/post:. Note that at least
- allownewnews: can be treated as a setting for overriding inn.conf and
- should be very easy to add.
-
-* Add a localport: parameter (similar to localaddress:) to readers.conf
- auth groups. With those two parameters (and ssl_required:) we
- essentially eliminate the need to run multiple instances of nnrpd just to
- use different configurations.
-
-* Various things may break when trying to use data written while compiled
- with large file support using a server that wasn't so compiled (and vice
- versa). The main one is the history file, but tradindexed is also
- affected and buffindexed has been reported to have problems with this
- as well. Ideally, all of INN's data files should be as portable as
- possible.
-
-
-Complete Code Reorganization
-
-At some point, we should probably abandon and archive the current CVS
-repository, reimport all of the current source files, and start with a
-fresh repository with a better revision control system such as Subversion.
-A better revision control system would let us rename and move things
-around arbitrarily, something CVS doesn't handle at all well. Should this
-ever be done, we should consider doing all of the following at the same
-time:
-
-* Don't include any generated files in the CVS tree. Maintainers should
- have autoconf and friends, pod2text and pod2man, and bison around anyway.
- This would save a bunch of extra check-ins, remove the danger of the
- generated files getting out of sync, and drastically reduce the
- repository size in the case of configure.
-
-* Don't include any of the generated man pages in the CVS tree, as an
- additional case of the above. All of the documentation should be in POD
- and we can generate the man pages as part of the snapshot process.
-
-* storage should be reserved just for article storage; the overview
- methods should be in a separate overview tree.
-
-* The split between frontends and backends is highly non-intuitive. A
- better organization scheme is needed. Perhaps something
- related to incoming and outgoing, with programs like cnfsstat moved into
- the storage directory with the other storage-related code?
-
-* Add a separate utils directory for things like convdate, shlock,
- shrinkfile, and the like. Some of the scripts may possibly want to go
- into that directory too.
-
-* The lib directory possibly should be split so that it contains only code
- always compiled and part of INN, and the various replacements for
- possibly missing system routines are in a separate directory (such as
- replace). These should possibly be separate libraries; there are things
- that currently link against libinn that only need the portability
- pieces.
-
-* The doc directory really should be broken down further by type of
- documentation or section or something; it's getting a bit unwieldy.
-
-* Untabify and reformat all of the code according to a consistent coding
- style which would then be enforced for all future check-ins.