This is version 1.0 of the INN feeder program `innfeed.' It is written in
ANSI C and tries to be POSIX.1 compliant. This software was originally
written and maintained by James Brister and is now part of INN. The
features of it are:

    1. Handles the STREAM extension to NNTP.
    2. Will open multiple connections to the remote host to feed in
       parallel.
    3. Will handle multiple remote hosts.
    4. Will tear down idle connections.
    5. Runs as a channel/funnel feed from INN, or by reading a funnel file.
    6. Will stop issuing CHECK commands and go straight to TAKETHIS if the
       remote responds affirmatively to enough CHECKs.
    7. It will go back to issuing CHECKs if enough TAKETHIS commands fail.


Changes for 0.10.1

    1. Config file inclusion now works via the syntax:

           $INCLUDE pathname

       Config files can be included up to a nesting depth of 10. Line
       numbers and file names are not properly reported yet on errors
       when includes are used, though.

    2. Signal handling is tidied up a bit. If your compiler doesn't
       support the ``volatile'' keyword, then see the comment in
       sysconfig.h.

    3. If you have a stdio library that has a limit on open files lower
       than the process limit for open plain files (all flavours of
       SunOS), then a new config file variable ``stdio-fdmax'' can be used
       to give that upper bound. When set, all new network connections
       will be limited to file descriptors over this value, leaving the
       lower file descriptors free for stdio. See innfeed.conf(5) for more
       details. Remember that the config file value overrides any
       compiled-in value.


Changes for 0.10

    1. A major change has been made to the config file. The new format is
       quite extensible and will let new data items be added in the future
       without changing the basic format. There's a new option ``-C'' (for
       ``check'') that will make innfeed read the config file, report on
       any errors, and then exit. This will let you verify things before
       locking them into a newsfeeds file entry.

       A program, ``innfeed-convcfg'', has been added that will read your
       old config off the command line (or stdin) and write a new version
       to stdout.

       The new config file structure permits non-peer-specific items to be
       declared (like the location of the status file, or whether to wrap
       the generated status file in HTML). This is part of the included
       sample:

           use-mmap: true
           news-spool: /var/news/spool/articles
           backlog-directory: /var/news/spool/innfeed
           pid-file: innfeed.pid
           status-file: innfeed.status
           gen-html: false
           log-file: innfeed.log
           backlog-factor: 1.10
           connection-stats: false
           max-reconnect-time: 3600

       so the only option you'll probably need now is the ``-c'' option to
       locate the config file, and as this is also compiled in, you may
       not even need that.

       See the innfeed.conf(5) man page for more details on config file
       structure.

    2. The backlog file handling is changed slightly:

         - The .output file is always kept open (until rotation time).

         - The .output file is allowed to grow for at least 30 seconds (or
           the value defined by the key backlog-rotate-period in the
           config file). This prevents thrashing of backlog files.

         - The hand-prepared file is checked for only every 600 seconds
           maximum (or the value defined by the key backlog-new), not
           every time the files are rotated.

         - The number of stat() calls on the three backlog files is
           reduced dramatically.

    3. Signal handling is changed so that signals are handled more
       synchronously with other activity. This should stop the frequent
       core dumps that occurred when running in funnel file mode and
       sending SIGTERM or SIGALRM.

    4. A bug related to zero-length articles was fixed. They will now be
       logged.

    5. More information is in the innfeed.status file, including the
       reasons returned by the remote when it is throttled.

    6. SIGEMT is now a trigger for closing and reopening all the backlog
       files. If you have scripts that need to fool with the backlogs,
       then have the scripts move the backlogs out of the way and then
       send the SIGEMT.


Changes for 0.9.3

    1. If your system supports mmap() *and* if you have your articles
       stored on disk in NNTP-ready format (rare), then you can have
       innfeed mmap article data to save on memory (thanks to Dave
       Lawrence). There is an important issue with this: if you try to
       have innfeed handle too many articles at once (by running many
       connections and/or high max-check values in innfeed.conf), then
       your system (not your process) may run out of free vnodes (global
       file descriptors), as a vnode is used as long as the file is
       mmaped. So be careful.

       If your articles are not in NNTP format then this will be noticed
       and the article will be pulled into memory for fixing up (and then
       immediately munmap'd). You can disable use of MMAP if you've built
       it in by using the '-M' flag. I tried mixing mmap'ing and articles
       not in NNTP format and it was a real performance loss. I'll be
       trying it differently later.

    2. If innfeed is asked to send an article to a host it knows nothing
       about, or which it cannot acquire the required lock for (which
       causes the "ME locked cannot setup peer ..." and "ME unconfigured
       peer" syslog messages), then innfeed will deposit the article
       information into a file matching the pattern innfeed-dropped.* in
       the backlog directory (TAPE_DIRECTORY in config.h). This file will
       not be processed in any manner -- it's up to you to decide what to
       do with it (wait for innfeed to exit before doing anything with it,
       or send innfeed a SIGHUP to get it to reread its config file, which
       will roll this file).

    4. The output backlog files will now be kept below a certain byte
       limit. This happens via the ``-e'' option. If, after writing to an
       output file, the new length is bigger than the given limit
       (multiplied by a fudge factor defined in config.h -- default of
       1.10) then the file will be shrunk down to this size (or slightly
       smaller to find the end of line boundary). The front of the file
       will be removed to do this. This means lost articles for the
       entries removed.

    3. A SIGHUP will make the config be reloaded.

    4. The .checkpoint files have been dropped in favour of scribbling the
       offset into the input file itself.

    5. When the process exits normally, a final syslog entry covering all
       of the peers over the life of the process is written. It looks
       like:

           Jan 12 15:51:53 data innfeed.tester[24189]: ME global seconds 2472 offered 43820 accepted 10506 refused 31168 rejected 1773 missing 39

    6. SIGALRM now rolls the input file, rather than the log file. This is
       useful in funnel file mode when you move the input file and tell
       innd to flush it, then send innfeed the signal.

    7. The locations of the pid file, config file and status file can now
       be relative, in which case they're relative to the backlog
       directory.

    8. stdin, stdout and stderr are initialized properly when innfeed is
       started by a process that has closed them.

    9. Various values in config.h have changed (paths to look more like
       values used in inn 1.5, and others to support point #7 above more
       easily).

    10. procbatch.pl can now 'require' innshellvars.pl that comes with
        1.5. The default is not to. You need to do a one-line tweak if you
        want it to. The defaults in procbatch.pl match the new defaults of
        innfeed.

    11. Core files that get generated on purpose will be written in
        CORE_DIRECTORY (as defined in config.h), if that is defined to a
        pathname. If CORE_DIRECTORY is defined to be NULL (the default
        now), then the core will be generated in the backlog directory (as
        possibly modified by the '-b' option).


Changes for 0.9.2

    1. Now includes David Lawrence's patches to handle funnel files.

    2. EAGAIN errors on reads and writes are caught and dealt with (of
       interest to Solaris `victims').

    3. It is now much faster at servicing the file descriptor attached to
       innd. This means it is faster at recognising it has been flushed
       and at dropping connections. This means fewer conflicts with new
       innfeeds starting before the old one has finished up.
       It is still a good net-citizen and it finishes the commands already
       started, so the fast response is only as fast as your slowest peer,
       but it no longer tries to send everything it had queued internally,
       and locks get released much quicker.

    4. Includes Michael Hucka's patch to make the innfeed.status output
       neater.

    5. Includes Andy Vasilyev's HTML-in-innfeed.status patch (but you have
       to enable it in config.h).

    6. Added a '-a' (top of news spool) and a '-p' (pid file path) option.


Changes for 0.9

    1. Format of innfeed.conf file changed slightly (for per-peer
       streaming specs).

    2. Including Greg Patten's innlog.pl (taken from
       ftp://loose.apana.org.au/pub/local/innlog/innlog.pl)

    3. Added Christophe Wolfhugel's patch to permit a per-peer restriction
       on using streaming.

    4. More robust handling of peers that return bad responses (no longer
       just calling abort).


Changes for 0.8.5

    1. Massive syslog messages cleanup courtesy of kre.

    2. The innlog.awk-patch has been dropped from the distribution until
       the new syslog messages are dealt with.

    3. State machine more robust in the face of unexpected responses from
       the remote. The connection gets torn down and the bad response is
       logged.

    4. The fixed timers (article reception timeout, read timeout, and
       flush timeout) are all adjusted by up to +/-10% so that things
       aren't quite so synchronised.

    5. The innfeed.status file has been expanded and reformatted to
       include more information.


Changes for 0.8.4

    1. The handing off of articles to connections has changed, in order to
       encourage connections that were opened due to activity spikes to
       close down sooner.

    2. The backlog files are no longer concatenated together at process
       startup. Instead, the .input file is simply used if it exists, and
       if not then the hand-dropped file is used first and the .output
       file second.

    3. The innfeed.status file is no longer updated by an innfeed that is
       in its death throes.

    4. Specifically catch the 480 response code from NNRPD when we try to
       give it an IHAVE.

    5. The connection reestablishment time gets properly increased when
       the connection fails to go through (up to and including the reading
       of the banner message).

    6. Bug fix that occasionally had articles sit in a connection and
       never get processed.

    7. Bug fix in the counter of number of sleeping connections.

    8. Bug fix in config file parsing.

    9. Procbatch.pl included.


Changes for version 0.8.1

    1. Various bug fixes.

    2. Core files generated by ASSERT are (possibly) put in a separate
       directory to ease debugging.


Changes for version 0.8

    1. The implicit state machine in the Connection objects has been made
       explicit.

    2. Various bug fixes.


Changes for version 0.7.1

    1. Pulled the source to inet_addr.c from the bind distribution.
       (Solaris and some others don't have it).


Changes for version 0.7

    1. The backlog file mechanism has been completely reworked. There are
       now only two backlog files: one for output and one for input. The
       output file becomes the input file when the input file is
       exhausted.

    2. Much less strenuous use of writev. Solaris and other SVR4 machines
       have an amazingly low value for the maximum number of iovecs that
       can be passed into writev.

    3. Proper connection cleanup (QUIT issued) at shutdown.

    4. A lock is taken out in the backlog directory for each peer. To feed
       the same peer from two different instances of innfeed (with a batch
       file for example), you must use another directory.

    5. If you create a file in the backlog directory with the same name as
       the peer, then that file will be used the next time backlog files
       are processed. Its format must be:

           pathname msgid

       where pathname is absolute, or relative to the top of the news
       spool.

    6. More command line options.

    7. Dynamic peer creation. If the proper command line option is used
       (-y) and innfeed is told to feed a peer that it doesn't have in its
       config file, then it will create a new binding to the new peer. The
       ip name must be the same as the peername, i.e. if innd tells
       innfeed about a peer fooBarBat, then gethostbyname("fooBarBat") had
       better work.

    8. Connections will be periodically torn down (1 hour is the default),
       even if they're active, so that non-innd peers don't have problems
       with their history files being kept open for too long.

    9. The input backlog files are checkpointed every 30 seconds so that a
       crash while processing a large backlog doesn't require starting
       over from the beginning.


Changes for version 0.6

    1. Logging of spooling of backlog only happens once per stats-logging
       period.


Bugs/Problems/Notes etc:

    1. There is no graceful handling of file descriptor exhaustion.

    2. If the following situation occurs:

         - articles on disk are NOT in NNTP-ready format.
         - innfeed was built with HAVE_MMAP defined.
         - memory usage is higher than expected

       try running innfeed with the '-M' flag (or recompiling with
       HAVE_MMAP undefined). Solaris, and possibly other SVR4 machines,
       waste a lot of swap space.

    3. In the stats logging, the 'offered' count may not equal the sum of
       the other fields. This is because the stats at that moment were
       generated while waiting for a response to a command to come back.
       Innfeed considers an article ``offered'' when it sends the command,
       not when it gets a response back. Perhaps this should change.

    4. If all the Connections for a peer are idle and a new backlog file
       is dropped in by hand, then it will not be picked up until the next
       time innfeed gets an article from innd for that peer. This will be
       fixed in a later version, but for now, if the peer is likely to be
       idle for a long time, then flush the process.

    5. Adding a backlog file by hand does not cause extra Connections to
       be automatically created; only the existing Connections will use
       the file. If the extra load requires new Connections to be built
       when innd delivers new articles for transmission, then they too
       will use the file, but this is a side effect and not a direct
       consequence.
       This means if you want to run in '-x' mode, then make sure your
       config file entry states the correct number of initial connections,
       as they're all the Connections that will be created.

    6. If '-x' is used and the config file has an entry for a peer that
       has no batch file to process, then innfeed will not exit after all
       batch files have been finished -- it will just sit there idle.

    7. If the remote is running inn and only has you in the nnrp.access
       file, then innfeed will end up talking to nnrpd. Innfeed will try
       every 30 seconds to reconnect to a server that will accept IHAVE
       commands, i.e. there is no exponential backoff of retry attempts.
       This is because the connection is considered good once the MODE
       STREAM command has been accepted or rejected (and nnrpd rejects
       it).


Futures:

    1. Innfeed will eventually take exploder commands.

    2. The config file will be revamped to allow for more global options
       etc and run-time configuration. Too much is compile-time dependent
       at the moment.

    3. The connection retry time will get more sophisticated to catch
       problems like the nnrpd issue mentioned above.

    4. Include the number of takethis/check/ihave commands issued in the
       log entries.

    5. Heaps more stuff requested that's buried in my mail folders.


Any compliments, complaints, requests, porting issues etc. should go to
inn-bugs@isc.org.

Many thanks to the following people for extra help (above and beyond the
call of duty) with patches, beta testing and/or suggestions:

    Christophe Wolfhugel
    Robert Elz
    Russell Vincent
    Paul Vixie
    Stephen Stuart
    John T. Stapleton
    Alan Barrett
    Lee McLoughlin
    Dan Ellis
    Katsuhiro Kondou
    Marc G. Fournier
    Steven Bauer
    Richard Perini
    Per Hedeland
    Clayton O'Neill
    Dave Pascoe
    Michael Handler
    Petr Lampa
    David Lawrence
    Don Lewis
    Landon Curt Noll

If I've forgotten anybody, please let me know.

Thanks also to the ISC for sponsoring this work.
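
For anyone wondering how the backlog byte limit from the 0.9.3 changes
(the ``-e'' option) plays out, here is a minimal sketch in Python. It is
not the innfeed code: the function name and arguments are made up for
illustration, it works on an in-memory byte string rather than a file, and
it assumes that "shrunk down to this size" means the limit itself. Nothing
is removed until the data passes limit times the fudge factor; after that
the front is dropped and the cut is moved forward to the next line start,
so the entries in front of the cut are lost.

```python
def shrink_backlog(data: bytes, limit: int, factor: float = 1.10) -> bytes:
    """Hypothetical helper: apply a byte limit to backlog contents.

    `limit` plays the role of the ``-e'' value and `factor` the fudge
    factor (1.10 by default in the text above). Both names are assumptions.
    """
    # Leave the data alone until it grows past limit * factor.
    if len(data) <= limit * factor:
        return data

    # Earliest offset that would leave at most `limit` bytes.
    cut = len(data) - limit

    # Move the cut forward to a line boundary: look for the first newline
    # at or after cut - 1, and start keeping data just after it.
    newline = data.find(b"\n", cut - 1)
    if newline == -1:
        # No line boundary left to cut at, so nothing whole survives.
        return b""
    return data[newline + 1:]


entries = b"a 1\nbb 2\nccc 3\n"          # 15 bytes, three entries
print(shrink_backlog(entries, 14))       # under 14 * 1.10, so unchanged
print(shrink_backlog(entries, 10))       # b'ccc 3\n'
```

With a limit of 10 the first two entries are dropped even though only five
bytes had to go, because the cut has to land on a line boundary.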