1 This is a rough and informal list of suggested improvements to INN, parts
2 of INN that need work, and other tasks yet undone. Some of these may be
3 in progress, in which case the person working on them will be noted in
4 square brackets and should be contacted if you want to help. Otherwise,
5 let inn-workers@isc.org know if you'd like to work on any item listed
8 The list is divided into changes already tentatively scheduled for a
9 particular release, higher priority changes that will hopefully be done in
10 the near future, small or medium-scale projects for the future, and
11 long-term, large-scale problems. Note that just because a particular
12 feature is scheduled for a later release doesn't mean it can't be
13 completed earlier if someone decides to take it on. The association of
14 features with releases is intended to be a rough guide for prioritization
15 and a set of milestones to use to judge when a new major release is
18 Also, one major thing that is *always* welcome is additions to the test
19 suite, which is currently very minimal. Any work done on the test suite
20 to allow more portions of INN to be automatically tested will make all
21 changes easier and will be *greatly* appreciated.
23 Last modified $Id: TODO 7575 2006-09-11 22:59:38Z eagle $.
28 * Rewrite configure, breaking all of the tests out into separate files
29 using the new capabilities in autoconf 2.5x. Replace our local macros
30 with the more general features provided by autoconf. At the same time,
31 configure.in and Makefile.global.in should be fixed to use the same
32 names as each other for various parameters. [Russ plans to work on
35 * Add support for groups, nesting, and vectors to the new configuration
36 parsing code. [Russ plans on doing this.]
38 * Convert readers.conf and storage.conf (and related configuration files)
39 to use the new parsing system and break out program-specific sections
40 of inn.conf into their own groups.
42 * The current WIP cache and history cache should be integrated into the
43 history API, things like message ID hashing should become a selectable
44 property of the history file, and the history API should support
45 multiple backend storage formats and automatically select the right one
46 for an existing history file based on stored metainformation.
48 * The interface to embedded filters needs to be reworked. The information
49 about which filters are enabled should be isolated in the filtering API,
50 and there should be standard API calls for filtering message IDs, remote
51 posts, and local posts. As part of this revision, all of the Perl
52 callbacks should be defined before any of the user code is loaded, and
53 the Perl loading code needs considerable cleanup. At the same time as
54 this is done, the implementation should really be documented; we do some
55 interesting things with embedded filters and it would be nice to have a
56 general document describing how we do it. [Russ is planning on working
57 on this at some point, but won't get upset if someone starts first.]
* All of INN's documentation should be written in POD, with text and man
pages generated from the POD source. Anyone is encouraged to work on
this by taking any existing documentation in man format and converting
it to POD while checking that it's still accurate and adding any
additional useful information that was missed.
65 * Replace the current innshellvars.pl file with a real INN Perl module for
66 Perl programs, and include the necessary glue so that other Perl modules
67 can be added to INN's build tree and installed with INN, allowing their
68 capabilities to be available to the portions of INN written in Perl.
70 * Switch nnrpd over to using the new wildmat routines rather than breaking
71 apart strings on commas and matching each expression separately. This
72 involves a lot of surgery, since PERMmatch is used all over the place,
73 and may change the interpretation of ! and @ in group permission
76 * Rework and clean up the storage API. The major change is that the
77 initialization function should return a pointer to an opaque struct
78 which stores all of the state of the storage subsystem, rather than all
79 of that being stored in static variables, and then all other functions
80 should take that pointer. More of the structures should also be opaque,
81 all-caps structure names should be avoided in favor of named structures,
82 SMsetup and SMinit should be combined into one function that takes
83 flags, SMerrno and SMerrorstr should be replaced with functions that
84 return that information, and the wire format utilities should be moved
87 * Rework and clean up the overview API. The major change is that the
88 initialization function should return a pointer to an opaque struct
89 which stores all of the state of the overview subsystem, rather than all
90 of that being stored in static variables, and then all other functions
91 should take that pointer. OVctl possibly should instead take and return
92 a struct rather than using an ioctl-style interface. Currently, the
93 overview functions do a lot of breaking apart of Xref headers and
94 parsing them, which is very ugly; consider having the overview interface
95 always key off a newsgroup name and article number, even for storing.
96 OVadd should probably take a structure and OVsearch should probably
Scheduled for INN 2.6
102 * Add a generic, modular anti-spam and anti-abuse filter, off by default,
103 but coming with INN and prominently mentioned in the INSTALL
104 documentation. [Andrew Gierth has work in progress that may be usable
107 * A unified configuration file combining the facilities of newsfeeds,
108 incoming.conf, and innfeed.conf, but hopefully more readable and easier
109 for new INN users to edit. This should have all of the capabilities of
110 the existing configuration files, but specifying common things (such as
111 file feeds or innfeed feeds) should be very simple and straightforward.
112 This configuration file should use the new parsing infrastructure.
114 * Convert all remaining INN configuration files to the new parsing
117 * INN really should be capable of both sending and receiving a
118 headers-only feed (or even an overview-only feed) similar to Diablo and
119 using it for the same things that Diablo does, namely clustering,
120 pull-on-demand for articles, and the like. This should be implementable
121 as a new backend, although the API may need a few more hooks. Both a
122 straight headers-only feed that only pulls articles down via NNTP from a
123 remote server and a caching feed where some articles are pre-fed, some
124 articles are pulled down at first read, and some articles are never
125 stored locally should be possible. [Patches for a header-only feed have
126 already been written and submitted to inn-workers.]
128 * The libinn, libstorage, and other library interfaces should be treated
129 as stable libraries and properly versioned using libtool's
130 recommendation for library versioning when changes are made so that they
131 can be installed as shared libraries and work properly through releases
132 of INN. This is currently waiting on a systematic review of the
133 interface and removal of things that we don't want to support long-term.
135 * The include files necessary to use libinn, libstorage, and other
136 libraries should be installed in a suitable directory so that other
137 programs can link against them. All such include files should be under
138 include/inn and included with <inn/header.h>. All such include files
139 should only depend on other inn/* header files and not on, e.g.,
140 config.h. All such include files should be careful about namespace to
141 avoid conflicts with other include files used by applications.
High Priority Projects
146 * Modulo warnings from system headers and warnings where the compiler is
147 simply wrong and there's no equally readable way to rewrite the code,
148 INN should compile cleanly under "make warnings". It should be possible
149 for maintainers to routinely compile INN with make warnings to catch
150 problems. Note that -Wcast-qual warnings cannot be avoided entirely
151 because we don't want to write redundant functions for regular and const
152 strings and because of such things as struct iovec; -Wcast-qual will be
153 removed from make warnings when this task is reasonably complete.
* INN shouldn't flush all feeds (particularly all program feeds) on
newgroup or rmgroup. Currently it reloads newsfeeds to reparse all of
the wildmat patterns and rebuild the peer lists associated with the
active file on group changes, and this forces a flush of all feeds.
The best fix is probably to stash the wildmat pattern (and flags) for
each peer when newsfeeds is read and then just use the stashed copy on
newgroup or rmgroup, since otherwise the newsfeeds loading code would
need significant modification. But in general, innd is too
reload-happy; it should be better at making incremental changes without
reloading everything.
166 * Add authenticated Path support, based on the current USEFOR draft or the
167 behavior of some other servers (such as Diablo). [Andrew Gierth wrote a
168 patch for part of this a while back, which Russ has. Marco d'Itri
169 expressed some interest in working on this.]
171 * Various parts of INN are using write or writev; they should all use
172 xwrite or xwritev instead. Even for writes that are unlikely to ever be
173 partial, on some systems system calls aren't restartable and xwrite and
174 xwritev properly handle EINTR returns.
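A minimal sketch of the kind of loop this item has in mind, assuming plain POSIX write semantics. The name sketch_xwrite is hypothetical and is not INN's actual xwrite; it only illustrates retrying on EINTR and continuing after a partial write.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for an xwrite-style wrapper: write all of buf
   or fail.  Returns size on success, -1 on error with errno set. */
ssize_t
sketch_xwrite(int fd, const void *buf, size_t size)
{
    const char *p = buf;
    size_t left = size;

    assert(buf != NULL || size == 0);
    while (left > 0) {
        ssize_t n = write(fd, p, left);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved */
            return -1;          /* real error; errno is left for caller */
        }
        p += n;                 /* partial write: advance and continue */
        left -= (size_t) n;
    }
    return (ssize_t) size;
}
```

Callers that currently check the return value of write against the requested length could then treat any return other than the full size as a hard error.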
176 * Apparently on Solaris open can also be interrupted by a signal; we may
177 need to have an xopen wrapper that checks for EINTR and retries.
179 * tradspool has a few annoying problems. Deleted newsgroups never have
180 their last articles expired, and there is no way of forcibly
181 resynchronizing the articles stored on disk with what overview knows
182 about unless tradindexed is used. Some sort of utility program to take
183 care of these and to do things like analyze the tradspool.map file
186 * Rewrite inndstart as a helper program that only binds the relevant
187 sockets and then returns them to innd. Since file descriptors are
188 shared by child processes, this can be done with a program spawned by
189 innd. This may have gotten more complicated with IPv6. Drop
190 startinnfeed entirely in favor of recommending people use ulimit in the
193 * contrib/mkbuf and contrib/reset-cnfs.c should be combined into a utility
194 for creating and clearing cycbuffs, perhaps combined with cnfsheadconf,
195 and the whole thing moved into storage/cnfs rather than frontends (along
196 with cnfsstat). pullart.c may also stand to be merged into the same
197 utility (cnfs-util might not be a bad name).
Documentation Projects
202 * Add man pages for all libinn interfaces. There should be a subdirectory
203 of doc/pod for this since there will be a lot of them; installing them
204 as libinn_<section>.3 seems to make the most sense (so, for example,
205 error handling routines would be documented in libinn_error.3).
207 * Better documentation of and support for UUCP feeds. send-uucp is now
208 easier to use, but there's still a paucity of documentation covering the
209 whole theory and mechanisms of UUCP feeding.
211 * Everything installed by INN should have a man page. Currently, there
212 are several binaries and configuration files that don't have man pages.
213 (In some cases, the best thing to do with the configuration file may be
214 to merge it into another one or find a way to eliminate it.)
* Document the internal formats of the various overview methods, CNFS,
timehash, and timecaf. A lot of this documentation already exists in
various forms, but it needs to be cleaned up and collected in one place
for each format, preferably as a man page.
221 * Add documentation for slave servers. [Russ has articles from
222 inn-workers that can be used as a beginning.]
* Write complete documentation for all of our extensions to RFC 977 or RFC
1036, preferably in a format that could be suitable for future
inclusion into new revisions of the RFCs.
228 * Audit readers.conf.5 against perm.c for missing options ("include" at
229 least is missing from the documentation).
231 * The distributions file is undocumented.
Code Cleanup Projects
236 * Eliminate everything in the LEGACY section of config.h.
238 * Move all compile-time configuration in config.h either into a separate
239 header (such as inn/options.h) or turn it into a configuration file
240 directive or a command-line option. In particular, the rnews
241 configuration should probably be an rnews-specific section of inn.conf.
243 * Move include/paths.h to include/inn/paths.h and change _PATH as a prefix
244 to INN_PATH to move the identifiers out of the C reserved namespace.
245 Check to be sure we still need all of the #defines and look at adding
246 anything needed by innfeed (and eliminating the separate innfeed header
247 serving the same purpose).
249 * Move include/nntp.h to include/inn/nntp.h and at the same time look at
250 standardizing the names of all of the #defines it provides, including
251 the message class. [Russ has a start on this.]
* Get rid of GetTimeInfo and TIMEINFO. The struct is nothing more than a
struct timeval plus time zone information. All of the parts of INN that
deal with time zone information are isolated in lib/date.c. The rest of
INN uses GetTimeInfo where a plain call to time would often work fine,
or at most gettimeofday, and there's no reason to compute the time zone
everywhere. Plus, it makes the code more readable to use standard
functions and data types.
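As a small illustration of what code looks like with only the standard types, here is a sketch of an interval calculation on two gettimeofday samples. The helper name elapsed_ms is hypothetical and not part of INN; callers that need only whole seconds would just use time(NULL).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper: milliseconds between two gettimeofday()
   samples, using struct timeval directly with no time zone data. */
long
elapsed_ms(const struct timeval *start, const struct timeval *end)
{
    assert(start != NULL && end != NULL);
    return (long) (end->tv_sec - start->tv_sec) * 1000
           + (long) (end->tv_usec - start->tv_usec) / 1000;
}
```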
261 * putman.sh should be merged into support/install-sh (which would mean
262 giving up any pretext of using the standard install-sh script, but that
265 * Use vectors or cvectors everywhere that argify and friends are currently
266 used and eliminate the separate implementation in nnrpd/misc.c.
268 * Break up the remainder of libinn.h into multiple inn/* include files for
269 specific functions (such as memory management, wildmat, date handling,
270 NNTP commands, etc.), with an inn/util.h header to collect the remaining
271 random utilities. Consider adding some sort of prefix, like inn_, to all
272 functions that aren't part of some other logical set with its own prefix.
274 * Break the CNFS and tradspool code into multiple source files to make it
275 easier to understand the logical divisions of the code and consider
276 doing the same with the other overview and storage methods.
278 * Examine the (mostly socket) code that currently should probably be
279 compiled with -fno-strict-aliasing on gcc and move the relevant casts
280 to within function calls. [Russ knows about this.]
282 * Clean up the use of #ifdef for sockets and IPv6, perhaps involving
283 addition of more to include/portable/socket.h.
288 * tradspool currently uses stdio to write out tradspool.map, which can
289 cause problems if more than 256 file descriptors are in use for other
290 things (such as incoming connections or tradindexed overview cache).
291 It should use write() instead.
293 * LIST NEWSGROUPS should probably only list newsgroups that are marked in
294 the active file as valid groups.
296 * INN's startup script should be sure to clean out old lock files and PID
297 files for innfeed. Be careful, though, since innfeed may still be
298 running, spawned from a previous innd.
* makedbz should be more robust in the presence of malformed history
lines, discarding them or otherwise dealing with them.
303 * CNFS, if the cycbuff is larger than 2GB and it doesn't have large file
304 support, reports a mysterious file not found error because it assumes
305 all errors from stat are the result of the cycbuff not being found.
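A sketch of how the stat failure could be reported more precisely, assuming a POSIX system where stat fails with EOVERFLOW when the file size does not fit in off_t. The function name and the message strings are hypothetical; this is not the current CNFS code.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: describe why stat() on a cycbuff path failed.
   Returns NULL if stat() succeeded. */
const char *
cycbuff_stat_problem(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) == 0)
        return NULL;
    switch (errno) {
    case ENOENT:
        return "cycbuff not found";
    case EOVERFLOW:
        return "cycbuff too large for this build (no large file support?)";
    default:
        return strerror(errno); /* anything else: report it as it is */
    }
}
```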
307 * Some servers reject some IHAVE, TAKETHIS, or CHECK commands with 500
308 syntax errors (particularly for long message IDs), and innfeed doesn't
309 handle this particularly well at the moment. It really should have an
310 error handler for this case. [Sven Paulus has a preliminary patch that
313 * Editing the active file by hand can currently munge it fairly badly even
314 if the server is throttled unless you reload active before restarting
315 the server. This could be avoidable for at least that particular case
316 by checking the mtime of active before and after the server was
319 * innreport silently discards news.notice entries about most of the errors
320 innfeed generates. It should ideally generate some summary, or at least
321 note that some error has occurred and the logs should be examined.
323 * INN's message ID parser should be more forgiving about surrounding
324 whitespace. Right now, it will reject messages with a trailing space in
325 the Message-ID header.
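One way to be more forgiving is to trim whitespace before the syntax check runs. The helper below is a sketch under that assumption; msgid_trim is a hypothetical name, and the real parser would still need to validate whatever remains.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: locate the message ID inside a header value,
   ignoring leading and trailing whitespace.  Returns a pointer to the
   first non-space character and stores the trimmed length in *len. */
const char *
msgid_trim(const char *value, size_t *len)
{
    const char *start = value;
    const char *end;

    assert(value != NULL && len != NULL);
    while (*start != '\0' && isspace((unsigned char) *start))
        start++;
    end = start + strlen(start);
    while (end > start && isspace((unsigned char) end[-1]))
        end--;
    *len = (size_t) (end - start);
    return start;
}
```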
327 * nnrpd doesn't check the message ID of a posted article for syntactic
328 validity before remailing it to the moderator, since normally it relies
329 on innd to check the message ID. The message ID checking code from
330 innd/art.c should be moved into lib so that nnrpd can use it as well.
332 * Currently, if the list of newsgroups on an Xref slave is out of sync
333 with the newsgroups on the master, receiving an article crossposted to
334 one of the groups that doesn't exist on the slave will cause the slave
335 to throttle. This isn't the best behavior; the server should either
336 optionally create the missing newsgroup or just ignore that crossposted
337 group (and modify Xref accordingly?).
339 * Handling of compressed batches needs to be thoroughly reviewed by
340 someone who understands how they're supposed to work. It's not clear
341 that _PATH_GZIP is being used correctly at the moment and that
342 compressed batch handling will work right now on systems that don't have
343 gzip installed (but that do have uncompress).
* innfeed's statistics don't always add up properly. The article
dispositions don't always sum to the offered count like they should, so
some article handling must not be recorded properly.
349 * innd's counting of article size doesn't always work properly, and it can
350 accept articles that are larger than its configured limit. It's not
351 clear exactly where this is happening.
353 * If a channel feed exits immediately, innd respawns it immediately,
354 causing thrashing of the system and a huge spew of errors in syslog. It
355 should mark the channel as dormant for some period of time before
356 respawning it, perhaps only if it's already died multiple times in a
359 * ctlinnd begin <site-name> was causing innd to core dump.
361 * Handling of innfeed's dropped batches needs looking at. There are three
362 places where articles can fall between the cracks: an innfeed.togo file
363 written by innd when the feed can't be spawned, a batch file named after
364 the feed name which can be created under similar circumstances, and the
365 dropped files written by innfeed itself. procbatch can clean these up,
366 but has to be run by hand.
368 * When using tradspool, groups are not immediately added to tradspool.map
369 when created, making innfeed unable to find the articles until after
370 some period of time. Part of the problem here is that tradspool only
371 updates tradspool.map on a lazy basis, when it sees an article in that
372 group, since there is no storage hook for creation of a new group.
374 * nntpget doesn't handle long lines in messages.
376 * WP feeds break if there are spaces in the Path header, and the inn.conf
377 parser doesn't check for this case and will allow people to configure
378 their server that way. (It's not clear that the latter is actually a
379 bug, given the new USEFOR attempt to allow folding of Path headers, but
380 the space needs to be removed for WP feeds.)
382 * Error handling in the history backend needs to be reviewed, since it
383 currently is always printing out errno regardless of whether it's
384 meaningful. The error handling needs to record errno if it's useful and
385 the reporting function should only print it out if it's useful for that
388 * innd returns 437 for articles that were accepted but filed in the junk
389 group. It should probably return the appropriate 2xx status code in
392 * Someone should go through the BUGS sections of all of the manpages and
393 fix those for which the current behavior is unacceptable.
Requested New Features
398 * Consider implementing the HEADERS command as discussed rather
399 extensively in news.software.nntp. [Greg Andruk has a preliminary
402 * There have been a few requests for the ability to programmatically set
403 the subject of the report generated by news.daily, with escapes that are
404 filled in by the various pieces of information that might be useful.
406 * A bulk cancel command using the MODE CANCEL interface. Possibly through
407 ctlinnd, although it may be a bit afield of what ctlinnd is currently
* Sven Paulus's patch for nnrpd volume reports should be integrated. See
<ftp://ftp.tin.org/pub/news/servers/inn/unofficial-patches/patch-inn-2.2.x-artstat+list+overstat>.
414 * Lots of people encrypt X-Trace in various ways. Should that be offered
415 as a standard option? The first data element should probably remain
416 unencrypted so that the O flag in newsfeeds doesn't break.
418 Should there also be an option not to generate X-Trace? And this whole
419 area may change if USEFOR ever standardizes poster trace information;
420 it's been proposed to put it in the path tail instead. The current
421 USEFOR trend as of January, 2001 appears to be towards an Injector-Info
422 header with this information, allowing a token or an injecting hostname.
423 For a token, one really wants it to be hierarchically structured for
424 spam filtering even if it's encrypted (in other words, to get a "group"
425 of clients, one could just match the first n bytes of the token instead
This can be done by formatting the rest of the header so that fields
are always a multiple of 8 bytes and applying a 64-bit block cipher in
ECB mode to it. But then we would be better off using binary fields,
as the timestamp is 9 bytes and an IP address

Combining the timestamp and PID into one block, adding an
authenticated user field and omitting the redundant formatted time
would give the following format:

X-Trace: g212.hadiko.de [395109AA000016FF] [AC14302A00000000] [...]
                         time   | pid       ip     |reserved  user
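To make the block layout concrete, here is a sketch that formats two 32-bit values as one bracketed 8-byte block, matching the hex fields in the example above. It shows only the plaintext layout; the block cipher step is left out, and the function name xtrace_block is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "[HHHHHHHHLLLLLLLL]" (two 32-bit values,
   16 hex digits, one 64-bit block) into buf.  Returns 0 on success or
   -1 if buf is too small. */
int
xtrace_block(char *buf, size_t size, uint32_t high, uint32_t low)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size, "[%08lX%08lX]",
                 (unsigned long) high, (unsigned long) low);
    return (n > 0 && (size_t) n < size) ? 0 : -1;
}
```

With the values from the example, a timestamp of 0x395109AA and a PID of 0x16FF produce the first block, and the IPv4 address 172.20.48.42 (0xAC14302A) with a zero reserved field produces the second.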
443 * ctlinnd flushlogs currently renames all of the log files. It would be
444 nice to support the method of log rotation that most other daemons
445 support, namely to move the logs aside and then tell innd to reopen its
446 log files. Ideally, that behavior would be triggered with a SIGHUP.
447 scanlogs would have to be modified to handle this.
449 The best way to support this seems to be to leave scanlogs as is by
450 default, but also add two additional modes. One would flush all the
451 logs and prepare for the syslog logs to be rotated, and the other would
452 do all the work needed after the logs have been rotated. That way, if
453 someone wanted to plug in a separate log rotation handler, they could do
454 so and just call scanlogs on either side of it. The reporting portions
455 of scanlogs should be in a separate program.
457 * Several people have Perl interfaces to pieces of INN that should ideally
458 be part of the INN source tree in some fashion. Greg Andruk has a bunch
459 of stuff that Russ has copies of, for example.
461 * Investigate using the new, stricter date parsing code in libinn for
462 nnrpd rather than the extremely lenient parsedate routine.
464 * There are various available patches for Cancel-Lock and an Internet
465 draft; support should be added to INN for both generation and
466 verification (definitely optional and not on by default at this point).
468 * It would be nice to be able to reload inn.conf (although difficult, due
469 to the amount of data that's generated from it and stashed in various
470 places). This will need to wait for the new configuration parsing
471 library and an inn.conf parser that uses it.
473 * remembertrash currently rejects and remembers articles with syntax
474 errors as well as things like unwanted newsgroups and unwanted
475 distributions, which means that if a peer sends you a bunch of mangled
476 articles, you'll then also reject the correct versions of the articles
477 from other peers. This should probably be rethought.
479 * Additional limits for readers.conf: Limit on concurrent parallel reader
480 streams, limit on KB/second download (preliminary support for this is
481 already in), and a limit on maximum posted articles per day (tied in
482 with the backoff stuff?). These should be per-IP or per-user, but
483 possibly also per-access group. (Consider pulling the -H, -T, -X, and
484 -i code out from innd and using it here.)
486 * timecaf should have more configurable parameters (at the least, how
487 frequently to switch to a new CAF file should be an option).
488 storage.conf should really be extended to allow method-specific
489 configuration for things like this (and to allow the cycbuff.conf file
490 to be merged into storage.conf).
492 * Allow generation of arbitrary additional information that could go in
493 overview by using embedded Perl or Python code. This might be a cleaner
494 way to do the keywords code, which really wants Perl's regex engine
495 ideally. It would also let one do something like doing MD5 hashes of
496 each article and putting that in the overview if you care a lot about
497 making sure that articles aren't corrupted.
499 * Allow some way of accepting articles regardless of the Date header, even
500 if it's far into the future. Some people are running into articles that
501 are dated years into the future for some reason that they still want to
504 * There was a request to make --program-suffix and the other name
505 transformation options to autoconf work. The standard GNU package does
506 this with really ugly sed commands in the Makefile rules; we could
507 probably do better, perhaps by substituting the autoconf results into
* INN currently uses hash tables to store the active file internally. It
would be worth trying ternary search trees to see if they're faster; the
data structure is simpler, performance may be comparable for hits and
significantly better for misses, sizing and resizing become non-issues,
and the space penalty isn't too bad. A generic implementation is already
available in libinn. (An even better place to use ternary search trees
may be the configuration parser.)
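For readers unfamiliar with the structure, here is a minimal standalone ternary search tree, as a sketch only. It is not the libinn implementation and its names (tst_node, tst_insert, tst_find) are hypothetical. It does show why misses are cheap: a lookup stops at the first character with no matching branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct tst_node {
    unsigned char c;                /* character stored at this node */
    struct tst_node *lo, *eq, *hi;  /* less than, next char, greater than */
    void *value;                    /* non-NULL if a key ends here */
};

/* Insert a non-empty key with a non-NULL value.  Returns 0 on success
   or -1 if memory allocation fails. */
int
tst_insert(struct tst_node **root, const char *key, void *value)
{
    struct tst_node **p = root;
    const unsigned char *s = (const unsigned char *) key;

    assert(value != NULL && *s != '\0');
    for (;;) {
        if (*p == NULL) {
            *p = calloc(1, sizeof(**p));
            if (*p == NULL)
                return -1;
            (*p)->c = *s;
        }
        if (*s < (*p)->c)
            p = &(*p)->lo;
        else if (*s > (*p)->c)
            p = &(*p)->hi;
        else if (s[1] == '\0') {
            (*p)->value = value;    /* last character: store the value */
            return 0;
        } else {
            p = &(*p)->eq;          /* matched: move to next character */
            s++;
        }
    }
}

/* Look up a key.  Returns its value, or NULL if it is not present. */
void *
tst_find(const struct tst_node *node, const char *key)
{
    const unsigned char *s = (const unsigned char *) key;

    if (*s == '\0')
        return NULL;
    while (node != NULL) {
        if (*s < node->c)
            node = node->lo;
        else if (*s > node->c)
            node = node->hi;
        else if (s[1] == '\0')
            return node->value;
        else {
            node = node->eq;
            s++;
        }
    }
    return NULL;
}
```

There is no table to size up front, and nodes are allocated only as keys are added, which is the property that makes resizing a non-issue.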
518 * Provide an innshellvars equivalent for Python.
520 * inncheck should check the syntax of all the various files that are
521 returned by LIST commands, since having those files present with the
522 wrong syntax could result in non-compliant responses from the server.
523 Possibly the server should also refuse to send malformatted lines to
526 * ctlinnd reload incoming.conf could return a count of the hosts that
527 failed, or even better a list of them. This would make pruning old
528 stuff out of incoming.conf much easier.
530 * nnrpd could use sendfile(2), if available, to send articles directly
531 to the socket (for those storage methods where to-wire conversion is
532 not needed). This would need to be added to the storage API.
* Somebody should look at keeping the "newsgroups" file more accurate
(e.g. newgroups for existing groups should change description, better
checkgroups handling, checking for duplicates).
538 * The by-domain statistics innreport generates for nnrpd count all local
539 connections (those with no "." in the hostname) in with the errors as
540 just "?". The host2dom function could be updated to group these as
541 something like "Local".
543 * news.daily could detect if expire segfaults and unpause the server.
545 * When using SSL, track the amount of data that's been transferred to the
546 client and periodically renegotiate the session key.
* When using SSL, use SSL_get_peer_certificate to get a verified client
certificate, if available, and use it to create an additional header
line when posting articles (X-Auth-Poster?). This header could use:
    X509_NAME_oneline(X509_get_subject_name(peer),...)
for the full distinguished name, or
    X509_NAME_get_text_by_NID(X509_get_subject_name(peer),
for the client's "common name" alone.
* When using SSL, use the server's key to generate an HMAC of the body of
the message (and most headers?), then include that digest in the
headers. This allows a news administrator to determine whether a
complaint about the content of a message is fraudulent because the
message was changed after transmission.
570 * All the old packages in unoff-contrib should be reviewed for integration
573 * It may be better for INN on SysV-derived systems to use poll rather than
574 select. The semantics are better, and on some systems (such as Solaris)
575 select is limited to 1024 file descriptors whereas poll can handle any
576 number. Unfortunately, the API is drastically different between the
577 two and poll isn't portable, so supporting both cleanly would require a
580 * Currently only innd and innfeed increase their file descriptor limits.
581 Other parts of INN, notably makehistory, may benefit from doing the same
582 thing if they can without root privileges.
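Raising the soft limit up to the existing hard limit needs no special privileges, so a sketch of what such programs could do at startup looks like this. The function name raise_fd_limit is hypothetical; this is not the code innd or innfeed currently use.

```c
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the soft limit on open file descriptors
   to the hard limit.  Returns 0 on success, -1 on failure. */
int
raise_fd_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    assert(rl.rlim_cur <= rl.rlim_max);
    if (rl.rlim_cur == rl.rlim_max)
        return 0;               /* already as high as we may go */
    rl.rlim_cur = rl.rlim_max;  /* only root may raise rlim_max itself */
    return setrlimit(RLIMIT_NOFILE, &rl) == 0 ? 0 : -1;
}
```

Some systems reject a soft limit equal to an unlimited hard limit, so a real implementation would probably want to fall back to a smaller value when setrlimit fails.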
584 * The Tcl filtering support code has undergone serious bitrot and needs
585 some work to fix it and make it work with modern versions of Tcl and the
586 current version of INN. It also lacks a lot of the functionality of the
587 Perl and Python filters, if anyone cares.
589 * Revisit support for aliased groups and what nnrpd does with them.
590 Should posts to the alias automatically be redirected to the real group?
591 Regardless, the error return should provide useful information about
592 where to post instead. Also, the new overview API, for at least some of
593 the overview methods, truncated the group status at one character and
594 lost the name of the group to which a group is aliased; that needs to be
597 * More details as to why a message ID is bad would be useful to return to
598 the user, particularly for rnews, inews, etc. innd also rejects message
599 IDs with trailing spaces, which can be hard to check.
601 * Support putting the active file and history file in different
602 directories without hand-editing a bunch of files.
604 * nnrpd's NNTP command parsing interacts poorly with AUTHINFO and
605 passwords containing spaces. The correct solution isn't clear; check
606 with the current NNTP RFC draft and how existing clients handle it?
608 * frontends/pullnews and contrib/backupfeed solve the same problem; the
609 best ideas of both should be unified into one script.
611 * actsyncd could stand a rewrite and cleaner handling of both
612 configuration and syncing against multiple sources which are canonical
613 for different sets of groups.
615 * send-nntp and nntpsend basically do the same thing; send-nntp could
616 probably be removed (possibly with some extra support in nntpsend for
617 doing simpler things).
* Look at turning header parsing into a library of some sort. Lots of INN
does this, but different parts of INN need subtly different things, so
the best API is unclear.
* INN's header handling needs to be checked against the current USEFOR
draft. This may want to wait until after we have a header parsing
library.
629 * The innd filter should be able to specify additional or replacement
630 groups into which an article should be filed, or even spool the article
631 to a local disk file rather than storing it. (See the stuff that the
632 nnrpd filter can already do.)
634 * Add authentication via SASL to nnrpd. This is a boatload of additional
635 issues, particularly if we want to add authentication methods like
636 Kerberos that require their own separate libraries (although we should
637 use Cyrus's SASL libraries, which will simplify a lot of that).
638 [Jeffrey Vinocur is working on a standard for this.]
640 * When articles expire out of a storage method with self-expire
641 functionality, the overview and history entries for those articles
642 should also be expired immediately. Otherwise, things like the GROUP
643 command don't give the correct results. This will likely require a
644 callback that can be passed to CNFS that is called to do the overview
645 and history cleanup for each article overwritten. It will also require
648 * Feed control, namely allowing your peers to set policy on what articles
649 you feed them (not just newsgroups but max article size and perhaps even
650 filter properties like "non-binary"). Every site does this a bit
651 differently. Some people have web interfaces, some people use GUP, some
652 people roll their own alternate things. It would really be nice to have
653 some good way of doing this as part of INN. It's worth considering an
654 NNTP extension for this purpose, although the first step is to build a
655 generic interface that an NNTP extension, a web page, etc. could all
656 use. (An alternate way of doing this would be to extend IHAVE to pass
657 the list of newsgroups as part of the command, although this doesn't
658 seem as generally useful.)
660 * Traffic classification as an extension of filtering. The filter should
661 be able to label traffic as binary (e.g.) without rejecting it, and
662 newsfeeds should be extended to allow feeding only non-binary articles
665 * External authenticators should also be able to do things like return a
666 list of groups that a person is allowed to read or post to. Currently,
667 maintaining a set of users and a set of groups, each of which some
668 subset of the users is allowed to access, is far too difficult. For a
669 good starting list of additional functionality that should be made
670 available, look at everything the Perl authentication hooks can do.
671 This should probably wait for the configuration file parsing rewrite.
673 * Allow nnrpd to spawn long-running helper processes. Not only would this
674 be useful for handling authentication (so that the auth hooks could work
675 without execing a program on every connection), but it may allow for
676 other architectures for handling requests (such as a pool of helpers
677 that deal only with overview requests). More than that, nnrpd should
678 *be* a long-running helper process that innd can feed open file
679 descriptors to. [Aidan Culley has ideas along these lines.]
681 * The tradspool storage method requires assigning a number to every
682 newsgroup (for use in a token). Currently this is maintained in a
683 separate tradspool.map file, but it would be much better to keep that
684 information in the active file where it can't drop out of sync. A code
685 assigned to each newsgroup would be useful for other things as well,
686 such as hashing the directories for the tradindexed overview. For use
687 for that purpose, though, the active file would have to be extended to
688 include removed groups, since they'd need to be kept in the active file
689 to reserve their numbers until the last articles expired.
691 * The locking of the active file leaves something to be desired; in
692 general, the locking in INN (for the active file, the history file,
693 spool updates, overview updates, and the like) needs a thorough
694 inspection and some cleanup. A good place to start would be tracing
695 through the pause and throttle code and writing up a clear description of
696 what gets locked where and what is safely restarted and what isn't.
697 Long term, there needs to be a library locking routine used by
698 *everything* that needs to write to the history file, active file, etc.
699 and that keeps track of the PID of the process locking things and is
700 accessible via ctlinnd.
702 * There is a fundamental problem with the current design of the
703 control.ctl file. It combines two things: A database of hierarchies,
704 their maintainers, and related information, and a list of which
705 hierarchies the local server should honor. These should be separated
706 out into the database (which could mostly be updated from a remote
707 source like ftp.isc.org and then combined with local additions) and a
708 configured list of hierarchies (or sub-hierarchies within hierarchies)
709 that control messages should be honored for. This should be reasonably
710 simple although correct handling of checkgroups could get a mite tricky.
712 * Possible NNTP extension: Compression of the protocol, using gzip,
713 bzip2, or some other technique. Particularly useful for long lists like
714 the active file information or the overview information, but possibly
715 useful in general for other things.
717 * Install wizards. Configuring INN is currently very complex even for an
718 experienced news admin, and there are several fairly standard
719 configurations that shouldn't be nearly that complicated to get running
720 out of the box. A little interactive Perl script asking some simple
721 questions could probably get a lot of cases right easily.
723 * One ideally wants to be able to easily convert between different
724 overview formats or storage methods, refiling articles in place. This
725 should be possible once we have a history API that allows changing the
726 storage location of an article in-place.
728 * Set up the infrastructure required so that INN can use alloca. This
729 would significantly decrease the number of calls to malloc needed and
730 would be a lot more convenient.
732 * A serious investigation into whether INN could use a garbage collector
733 is probably a good idea. The network buffers probably need to be
734 handled with dedicated code, but there are a lot of other incidental
735 allocations and deallocations that may be much more efficient and safer
736 using a garbage collector.
738 * Look at integrating asprintf and vasprintf. Russ already tried this
739 once and couldn't see a good way of doing it (particularly vasprintf)
740 without hooking deep into an sprintf implementation, because the simple
741 hack of calling vsnprintf first, allocating that much memory, and then
742 calling it again on the new buffer doesn't work for vasprintf (you can't
743 reprocess the arguments).
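The objection above holds for C89, where a va_list can't be walked twice.
Where C99's va_copy is available, the two-pass approach may be workable.
What follows is only a minimal sketch under two assumptions: that va_copy
exists, and that vsnprintf follows the C99 rule of returning the length it
would have needed (some older systems return -1 on truncation instead).
The names sketch_vasprintf and sketch_asprintf are hypothetical and are not
part of INN.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical name, not INN's API.  Stores a newly allocated string in
   *strp and returns its length, or returns -1 on failure. */
int
sketch_vasprintf(char **strp, const char *format, va_list args)
{
    va_list copy;
    int needed, written;

    assert(strp != NULL && format != NULL);

    /* First pass runs on a copy, so args itself is still unconsumed. */
    va_copy(copy, args);
    needed = vsnprintf(NULL, 0, format, copy);
    va_end(copy);
    if (needed < 0)
        return -1;

    *strp = malloc((size_t) needed + 1);
    if (*strp == NULL)
        return -1;

    /* Second pass consumes the original argument list. */
    written = vsnprintf(*strp, (size_t) needed + 1, format, args);
    if (written < 0 || written > needed) {
        free(*strp);
        *strp = NULL;
        return -1;
    }
    return written;
}

/* Hypothetical variadic wrapper around sketch_vasprintf. */
int
sketch_asprintf(char **strp, const char *format, ...)
{
    va_list args;
    int status;

    va_start(args, format);
    status = sketch_vasprintf(strp, format, args);
    va_end(args);
    return status;
}
```

A caller would do something like sketch_asprintf(&s, "%s-%d", "group", 42)
and free s afterwards. Platforms without va_copy or with a pre-C99
vsnprintf would still need the deeper hooks described above.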
745 * Support building in a different directory than the source tree. It may
746 be best to just support this via lndir rather than try to do it in
747 configure, but it would be ideal to add support for this to the autoconf
748 system. Unfortunately, the standard method requires letting configure
749 generate all of the makefiles, which would make running configure and
750 config.status take much longer than it does currently.
752 * Look at adding some kind of support for MODE CANCEL via network sockets
753 and fixing up the protocol so that it could possibly be standardized
754 (the easiest thing to do would probably be to change it into a CANCEL
755 command). If we want to get to the point where INN can accept and even
756 propagate such feeds from dedicated spam filters or the like, there must
757 also be some mechanism of negotiating policy in order to decide what
758 cancels the server wants to be fed.
760 * The "possibly signed" char data type is one of the inherent flaws of C.
761 Some other projects have successfully gotten completely away from this
762 by declaring all of their strings to be unsigned char, defining a macro
763 like U that casts strings to unsigned char for use with literal strings,
764 and always using unsigned char everywhere. Unfortunately, this also
765 requires wrapping all of the standard libc string functions, since
766 they're prototyped as taking char * rather than unsigned char *. The
767 benefits include cleaner and consistent handling of characters over 127,
768 better warnings from the compiler, consistent behavior across platforms
769 with different notions about the signedness of char, and the elimination
770 of warnings from the <ctype.h> macros on platforms like Solaris where
771 those macros can't handle signed characters. We should look at doing this.
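As a rough illustration of the unsigned char convention described above,
here is a small sketch. The U macro below is a hypothetical stand-in for
the kind of macro those projects use, and sketch_count_space is an invented
example function, not INN code. The point is only that once strings are
unsigned char, every value handed to a <ctype.h> function is in the range
the function is defined for, even for characters over 127.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for the U macro: view a string literal as
   unsigned char without changing its contents. */
#define U(s) ((const unsigned char *) (s))

/* Count whitespace characters.  Because s points to unsigned char, each
   value passed to isspace() is non-negative, which is what the function
   requires; a plain signed char over 127 would be negative here. */
size_t
sketch_count_space(const unsigned char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (isspace(*s))
            count++;
    return count;
}
```

A call such as sketch_count_space(U("caf\xe9 au lait")) is then safe on
platforms where char is signed. The cost mentioned above still applies:
passing such strings to strlen and friends needs casts or wrappers.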
774 * It would clean up a lot of code considerably if we could just use mmap
775 semantics regardless of whether the system has mmap. It may be possible
776 to emulate mmap on systems that don't have it by reading the entirety of
777 the file into memory and setting the flags that require things to call
778 mmap_flush and mmap_invalidate on a regular basis, but it's not clear
779 where to stash the file descriptor that corresponds to the mapped file.
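One possible answer to the file descriptor question, sketched here purely
as an assumption, is to hand callers a small handle that carries the
descriptor alongside the in-memory copy instead of a bare pointer. The
struct and the sketch_map_* functions below are hypothetical; they only
stand in for what mmap_flush and mmap_invalidate would have to do on a
system without mmap, and they ignore EINTR, partial-file growth, and
locking.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handle: the descriptor is stashed next to the buffer. */
struct sketch_map {
    char *base;    /* in-memory copy of the file */
    size_t length; /* size of the copy in bytes */
    int fd;        /* descriptor the copy was read from */
};

/* Re-read the file into the buffer (stand-in for mmap_invalidate). */
int
sketch_map_invalidate(struct sketch_map *map)
{
    size_t done = 0;
    ssize_t got;

    assert(map != NULL);
    if (lseek(map->fd, 0, SEEK_SET) < 0)
        return -1;
    while (done < map->length) {
        got = read(map->fd, map->base + done, map->length - done);
        if (got <= 0)
            return -1;
        done += (size_t) got;
    }
    return 0;
}

/* Write the buffer back to the file (stand-in for mmap_flush). */
int
sketch_map_flush(struct sketch_map *map)
{
    size_t done = 0;
    ssize_t put;

    assert(map != NULL);
    if (lseek(map->fd, 0, SEEK_SET) < 0)
        return -1;
    while (done < map->length) {
        put = write(map->fd, map->base + done, map->length - done);
        if (put <= 0)
            return -1;
        done += (size_t) put;
    }
    return 0;
}

/* Open path and read all of it into memory.  Returns NULL on failure. */
struct sketch_map *
sketch_map_open(const char *path)
{
    struct sketch_map *map;
    struct stat st;

    map = malloc(sizeof(*map));
    if (map == NULL)
        return NULL;
    map->base = NULL;
    map->fd = open(path, O_RDWR);
    if (map->fd < 0 || fstat(map->fd, &st) < 0)
        goto fail;
    map->length = (size_t) st.st_size;
    map->base = malloc(map->length + 1);
    if (map->base == NULL || sketch_map_invalidate(map) < 0)
        goto fail;
    return map;

fail:
    if (map->fd >= 0)
        close(map->fd);
    free(map->base);
    free(map);
    return NULL;
}

/* Release the buffer and the descriptor together. */
void
sketch_map_close(struct sketch_map *map)
{
    close(map->fd);
    free(map->base);
    free(map);
}
```

The obvious drawback is that existing callers expect a plain pointer, so
adopting a handle like this would mean touching every place that currently
calls mmap directly.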
781 * Figure out some Samba library that we can link against for the Samba
782 authenticator so that we can get all the Samba code back out of INN's
783 source tree; we don't want to maintain it.
785 * Consider replacing the awkward access: parameter in readers.conf with
786 separate commands (e.g. "allow_newnews: true") or otherwise cleaning up
787 the interaction between access: and read:/post:. Note that at least
788 allownewnews: can be treated as a setting for overriding inn.conf and
789 should be very easy to add.
791 * Add a localport: parameter (similar to localaddress:) to readers.conf
792 auth groups. With those two parameters (and ssl_required:) we
793 essentially eliminate the need to run multiple instances of nnrpd just to
794 use different configurations.
796 * Various things may break when trying to use data written while compiled
797 with large file support using a server that wasn't so compiled (and vice
798 versa). The main one is the history file, but tradindexed is also
799 affected and buffindexed has been reported to have problems with this
800 as well. Ideally, all of INN's data files should be as portable as possible.
804 Complete Code Reorganization
806 At some point, we should probably abandon and archive the current CVS
807 repository, reimport all of the current source files, and start with a
808 fresh repository with a better revision control system such as Subversion.
809 A better revision control system would let us rename and move things
810 around arbitrarily, something CVS doesn't handle at all well. Should this
811 ever be done, we should consider doing all of the following at the same time:
814 * Don't include any generated files in the CVS tree. Maintainers should
815 have autoconf and friends, pod2text and pod2man, and bison around anyway.
816 This would save a bunch of extra check-ins, remove the danger of the
817 generated files getting out of sync, and drastically reduce the
818 repository size in the case of configure.
820 * Don't include any of the generated man pages in the CVS tree, as an
821 additional case of the above. All of the documentation should be in POD
822 and we can generate the man pages as part of the snapshot process.
824 * storage should be reserved just for article storage; the overview
825 methods should be in a separate overview tree.
827 * The split between frontends and backends is highly non-intuitive. Some
828 better organization scheme should be arrived at. Perhaps something
829 related to incoming and outgoing, with programs like cnfsstat moved into
830 the storage directory with the other storage-related code?
832 * Add a separate utils directory for things like convdate, shlock,
833 shrinkfile, and the like. Some of the scripts may possibly want to go
834 into that directory too.
836 * The lib directory possibly should be split so that it contains only code
837 always compiled and part of INN, and the various replacements for
838 possibly missing system routines are in a separate directory (such as
839 replace). These should possibly be separate libraries; there are things
840 that currently link against libinn that only need the portability
843 * The doc directory really should be broken down further by type of
844 documentation or section or something; it's getting a bit unwieldy.
846 * Untabify and reformat all of the code according to a consistent coding
847 style which would then be enforced for all future check-ins.