X-Git-Url: http://www.chiark.greenend.org.uk/ucgi/~ian/git?p=inn-innduct.git;a=blobdiff_plain;f=doc%2Fman%2Finnduct.8;h=a294372dd022f3a2e877282789dcd1f435cf3d54;hp=87c602ad0ffa2a165a2fec7d33e9a2ad736bc9c1;hb=c59255eee1ca87939343464651729776e833bb53;hpb=8b12e9db444d3a0f66ae0f2422d082cec186f9b9 diff --git a/doc/man/innduct.8 b/doc/man/innduct.8 index 87c602a..a294372 100644 --- a/doc/man/innduct.8 +++ b/doc/man/innduct.8 @@ -4,202 +4,540 @@ innduct \- quickly and reliably stream Usenet articles to remote site .SH SYNOPSIS .B innduct .RI [ options ] -[ -.I sitename -.I fqdn -] +.I site +.RI [ fqdn ] .SH DESCRIPTION -.I Innduct -is a +.B innduct +implements NNTP peer-to-peer news transmission including the streaming +extensions, for sending news articles to a remote site. -front-end that invokes -.IR innxmit (1) -to send Usenet articles to a remote NNTP site. -.PP -The sites to be fed may be specified by giving -.I sitename -.I fqdn -pairs on the command line. -If no such pairs are given, -.I innduct -defaults to the information given in the -.I innduct.ctl -config file. -.PP -The -.I sitename -should be the name of the site as specified in the -.IR newsfeeds (5) -file. -The -.I fqdn -should be the hostname or IP address of the remote site. -.PP -An -.I innxmit -is launched for sites with queued news. -All -.I innxmit -processes are spawned in the background and the script waits for -them all to finish before returning. -Output is sent to the file -.IR /innduct.log . -In order to keep from overwhelming the local system, -.I innduct -waits five seconds before spawning each child. -.PP -.I Innduct -expects that the batchfile for a site is named -.IR /sitename . -To prevent batchfile corruption, -.IR shlock (1) -is used to ``lock'' these files. -.PP -When -.I sitename +You need to run one instance of innduct for each peer site. 
innduct +manages its interaction with innd, including flushing the feed as +appropriate, etc., so that articles are transmitted quickly, and +manages the retransmission of its own backlog. innduct includes the +locking necessary to avoid multiple simultaneous invocations. + +By default, innduct reads the default feedfile corresponding to +the site +.I site +(that is, +.IR pathoutgoing / site ) +and feeds it via NNTP, streaming if possible, to the host +.IR fqdn . + +If .I fqdn -pairs are given on the command line, -any flags given on the command completely describe how -.I innxmit +is not specified, it defaults to +.IR site . + +innduct daemonises after argument parsing, and all logging (including +error messages) is sent to syslog (facility +.BR news ). + +The best way to run innduct is probably to periodically invoke innduct +for each feed (e.g. from cron), passing it the +.B \-q +option to arrange that it silently exits if an innduct is already +running for that site. +.SH INNDUCT VS INNFEED/NNTPSEND/INNXMIT +.TP +.B innfeed +does roughly the same thing as innduct. However, the way it receives +information from innd can result in articles being lost (not offered +to peers) if innfeed crashes for any reason. This is an inherent +defect in the innd channel feed protocol. innduct uses a file feed, +constantly "tailing" the feed file, and where implemented uses +.BR inotify (2) +to reduce the latency which would come from having to constantly poll +the feed file. innfeed is capable of feeding multiple peers from a +single innfeed instance, whereas each innduct process handles exactly +one peer. innduct is much smaller and simpler, at <4kloc to innfeed's +25kloc. innfeed needs a separate wrapper script or similar +infrastructure (of which there is an example in its manpage), whereas +innduct can be run directly and doesn't need help from shell scripts. +.TP +.B nntpsend +processes feed files in batch mode. 
That is, you have to periodically +invoke nntpsend, and when you do, the feed is flushed and articles +which arrived before the flush are sent to the peer. This introduces +a batching delay, and also means that the NNTP connection to the peer +needs to be remade at each batch. nntpsend (which uses innxmit) +cannot make use of multiple connections to a single peer site. +However, nntpsend can be left to find automatically which sites need +feeding by looking in +.IR pathoutgoing . +.TP +.B innxmit +is the actual NNTP feeder program used by nntpsend. +.SH GENERAL OPTIONS +.TP +.BR \-f | \-\-feedfile= \fIpath\fR +Specifies the +.I feedfile +to read, and indirectly specifies the paths to +be used for various ancillary files (see FILES, below). +If +.I path +ends in a +.B / +it is taken as a directory to use, and the actual feed file used is +.IR path / site . +If +.I path +does not start with a +.BR / , +it is taken to be relative to +.IR pathoutgoing +from inn.conf. +The default is +.IR site . +.TP +.BR \-q | \-\-quiet-multiple +Makes innduct silently exit (with status 0) if another innduct holds +the lock for the site. Without \fB-q\fR, this causes a fatal error to +be logged and a nonzero exit. +.TP +.BR \-\-no-daemon +Do not daemonise. innduct runs in the foreground and all messages +(including all debug messages) are written to stderr. A control +command line is also available on stdin/stdout. +.TP +.BI \-\-no-streaming +Do not try to use the streaming extensions to NNTP (for use eg if the +peer can't cope when we send MODE STREAM). +.TP +.BI \-\-no-filemon +Do not try to use the file change monitoring support to watch for +writes by innd to the feed file; poll it instead. (If file monitoring +is not compiled in, this option just downgrades the log message which +warns about this situation.) +.TP +.BR \-C | \-\-inndconf= \fIFILE\fR +Read +.I FILE +instead of the default +.BR inn.conf . 
+.TP +.BI \-\-port= PORT +Connect to port +.I PORT +at the remote site rather than to the NNTP port (119). +.TP +.BI \-\-chdir= pathrun +Change directory to +.IR pathrun +at startup. The default is +.I pathrun +from inn.conf. +.TP +.BR \-\-cli= \fICLI-DIR\fR / |\fICLI-PATH\fR| none +Listen for control command line connections on +.IB CLI-DIR / site +(if the value ends with a +.BR / ) +or +.I CLI-PATH +(if it doesn't). See CONTROLLING INNDUCT, below. +Note that there is a fairly short limit on the paths to AF_UNIX +sockets. If specified as +.IR CLI-DIR \fB/\fR, +the directory will be created with mode 700 if necessary. +The default is +.B innduct/ +which means to create that directory in +.I pathrun +and listen on +.RB \fIpathrun\fR /innduct/ \fIsite\fR. +.TP +.BI \-\-help +Just print a brief usage message and list of the options to stdout. +.SH TUNING OPTIONS +You should not normally need to adjust these. Time intervals may be +specified in seconds, or as a number followed by one of the following +units: +.BR "s m h d" , +.BR "sec min hour day" , +.BR "das hs ks Ms" . +.TP +.BI \-\-max-connections= max +Restricts the maximum number of simultaneous NNTP connections used +for each site to +.IR max . +The default is +.BR 10 . +There is no global limit on the number of connections. +.TP +.BI \-\-max-queue-per-conn= per-conn-max +Restricts the maximum number of outstanding articles queued on any +particular connection to +.IR per-conn-max . +(Non-streaming connections can only handle one article at a time.) +The default is +.BR 200 . +.TP +.BI \-\-max-queue-per-file= max +Restricts the maximum number of articles read into core from any one +input file to +.IR max . +The default is twice the value of per-conn-max. 
+.TP +.BI \-\-feedfile-flush-size= bytes +Specifies that innduct should flush the feed and start a new feedfile +when the existing feedfile size exceeds +.IR bytes ; +the effect is that innduct will try to avoid the various +batchfiles growing much beyond this size while the link to the peer is +working. The default is +.BR 100000 . +.TP +.BI \-\-period-interval= PERIOD-INTERVAL +Specifies wakeup interval and period granularity. +innduct wakes up every +.I PERIOD-INTERVAL +to do various housekeeping checks. Also, many of the timeout and +rescan intervals (those specified in this manual as +.IR PERIOD ) +are rounded up to the next multiple of +.IR PERIOD-INTERVAL . +.TP +.BI \-\-connection-timeout= TIME +How long to allow for a connection setup attempt before giving up. +The default is +.BR 200s . +.TP +.BI \-\-stuck-flush-timeout= TIME +How long to wait for innd to respond to a flush request before giving +up. The default is +.BR 100s . +.TP +.BI \-\-feedfile-poll= TIME +How often to poll the feedfile for new articles written by innd +if file monitoring +.RI ( inotify +or equivalent) is not available. (When file monitoring is available, +there is no need for periodic checks and we wake up immediately +whenever the feedfile changes.) +The default is +.BR 5s . +.TP +.BI \-\-no-check-proportion= PERCENT +If the moving average of the proportion of articles being accepted +(rather than declined) by the peer exceeds this value, innduct uses +"no check mode" - ie it just sends the peer the articles with TAKETHIS +rather than checking first with CHECK whether the article is wanted. +This only affects streaming connections. The default is +.B 95 +(ie, 95%). +.TP +.BI \-\-no-check-response-time= ARTICLES +The moving average mentioned above is an alpha-smoothed value with a +half-life of +.IR ARTICLES . +The default is +.BR 100 . +.TP +.BI \-\-reconnect-interval= RECONNECT-PERIOD +Limits initiation of new connections to one each +.IR RECONNECT-PERIOD . 
+This applies to reconnections if the peer has been down, and also to +ramping up the number of connections we are using after startup or in +response to an article flood. The default is +.BR 1000s . +.TP +.BI \-\-flush-retry-interval= PERIOD +If our attempt to flush the feed failed (usually this will be because +innd is not running), try again after +.IR PERIOD . +The default is +.BR 1000s . +.TP +.BI \-\-earliest-deferred-retry= PERIOD +When the peer responds to our offer of an article with a 431 or 436 +NNTP response code, indicating that the article has already been +offered to it by another of its peers, and that we should try again, +we wait at least +.I PERIOD +before offering the article again. The default is +.BR 50s . +.TP +.BI \-\-backlog-rescan-interval= BACKLOG-SCAN-PERIOD +We scan the directory containing +.I feedfile +for backlog files at least every +.IR BACKLOG-SCAN-PERIOD , +in case the administrator has manually dropped in a file there for +processing. +The default is +.TP +.BI \-\-max-flush-interval= PERIOD +We flush the feed and start a new feedfile at least every +.IR PERIOD +even if the current instance of the feedfile has not reached the size +threshold. +The default is +.BR 100000s . +.TP +.BI \-\-flush-finish-timeout= FLUSH-FINISH-PERIOD +If we flushed +.IR FLUSH-FINISH-PERIOD +ago, and are still trying to finish processing articles that were +written to the old feed file, we forcibly and violently make sure that +we do by abandoning and deferring all the work (which includes +unceremoniously dropping any connections on which we've sent some of +those articles but not yet had replies, as they're probably stuck +somehow). +The default is +.BR 2000s . 
+.TP +.BI \-\-idle-timeout= PERIOD +Connections which have had no activity for +.IR PERIOD +will be closed. This includes connections where we have sent commands +or articles but have not yet had the responses, so this same value +doubles as the timeout after which we conclude that the peer is +unresponsive or the connection has become broken. +The default is +.BR 1000s . +.TP +.BI \-\-max-bad-input-data-ratio= PERCENT +We tolerate up to this proportion of badly-formatted lines in the +feedfile and other input files. Every badly-formatted line is logged, +but if there are too many we conclude that the corruption to our +on-disk data is too severe, and crash; to successfully restart, +administrator intervention will be required. This avoids flooding the +logs with warnings and also arranges to abort earlyish if an attempt +is made to process a file in the wrong format. +The default is +.BR 1 +(ie, 1%). +.TP +.BI \-\-max-bad-input-data-init= LINES +Additionally, we tolerate this number of additional badly-formatted +lines, so that if the badly-formatted lines are few but at the start +of the file, we don't crash immediately. +The default is +.BR 30 +(which would suffice to ignore one whole corrupt 4096-byte disk block +filled with random data, or one corrupt 1024-byte disk block filled +with an inappropriate text file with a mean line length of at least +35). +.SH CONTROLLING INNDUCT +If you tell innd to drop the feed, innduct will (when it notices, +which will normally be the next time it decides to flush) finish up the +articles it has in hand now, and then exit. It is harmless to cause +innd to flush the feed (but innduct won't notice and flushing won't +start a new feedfile; you have to leave that to innduct). +.LP +If you want to stop innduct you can send it SIGTERM or SIGINT, or the +.B stop +control command, in which case it will report statistics so far and +quickly exit. 
If innduct receives SIGKILL nothing will be broken or +corrupted; you just won't see some of the article stats. +.LP +innduct listens on an AF_UNIX socket, and provides a command-line +interface which can be used to trigger various events and for +debugging. By default the socket is +.IR pathrun \fB/innduct/\fR site ; +when connected, innduct reads and writes lines (with unix line endings). +The CLI can most easily be accessed with a program like +.I netcat-openbsd +(eg +.B nc.openbsd \-U /var/run/news/innduct/ +.IR site ) +or +.IR socat . +The prompt is +.IR site \fB|\fR. +.LP +The following control commands are supported: +.TP +.B h +Print a list of all the commands understood. This list includes +undocumented commands which mess with innduct's internal state and +should only be used by a developer in conjunction with the innduct +source code. +.TP +.B flush +Start a new feed file and trigger a flush of the feed. (Or, cause +the +.I FLUSH-FINISH-PERIOD +to expire early, forcibly completing a previously started flush.) +.TP +.B stop +Log statistics and exit. (Same effect as SIGTERM or SIGINT.) +.TP +.BR "dump q" | a +Writes information about innduct's state to a plain text file +.IR feedfile \fB_dump\fR. +This overwrites any previous dump. +.B "dump q" +is a summary including general state and a list of connections; +.B "dump a" +also includes information about each article innduct is dealing with. +.TP +.B next blscan +Requests that innduct rescan for new backlog files at the next +.I PERIOD +poll. Normally innduct assumes that any backlog files dropped in by +the administrator are not urgent and may not get around to noticing +them for +.IR BACKLOG-SCAN-PERIOD . +.TP +.B next conn +Resets the connection startup delay counter so that innduct may +consider making a new connection to the peer right away, regardless +of the setting of +.IR RECONNECT-PERIOD . 
+A connection attempt will still only be made if innduct feels that it +needs one, and innduct may wait up to +.I PERIOD +before actually starting the attempt. +.SH EXIT STATUS +.TP +.B 0 +An instance of innduct is already running for this +.I feedfile +and +.B \-q +was specified. +.TP +.B 4 +The feed has been dropped by innd, and we (or previous innducts) have +successfully offered all the old articles to the peer site. Our work +is done. +.TP +.B 8 +innduct was invoked with bad options or command line arguments. The +error message will be printed to stderr, and also (if any options or +arguments were passed at all) to syslog with severity +.BR crit . +.TP +.B 12 +Things are going wrong, hopefully just a shortage of memory or system file +table entries, disk IO problems, a full disk, etc. The specifics of the +error will be logged to syslog with severity +.B err +(if syslog is working!) +.TP +.B 16 +Things are going badly wrong in an unexpected way: system calls which +are not expected to fail are doing so, or the protocol for +communicating with innd is being violated, or some such. Details will +be logged with severity +.B crit +(if syslog is working!) +.TP +.BR 24 - 27 +These exit statuses are used by children forked by innduct to +communicate to the parent. You should not see them. If you do, it is +a bug. +.SH FILES +innduct dances a somewhat complicated dance with innd to make sure +that everything goes smoothly and that there are no races. (See the +two ascii-art diagrams in innduct.c for details of the protocol.) Do +not mess with the feedfile and other associated files, other than as +explained here: +.IX Header "FILES" +.IP \fIpathrun\fR +.IX Item "default directory" +Default current working directory for innduct, and also default +parent directory for the command line socket. +.IP \fIpathoutgoing\fR/\fIsite\fR +.IX Item "default feedfile" +Default +.IR feedfile . 
+.IP \fIfeedfile\fR +.IX Item feedfile +Main feed file as specified in +.IR newsfeeds (5). +This and other batchfiles used by innduct contain lines each of which +is of the form +\& \fItoken\fR \fImessageid\fR +where \fItoken\fR is the inn storage API token. Such lines can be +written by \fBTf,Wnm\fR in a \fInewsfeeds\fR(5) entry. During +processing, innduct overwrites lines in the batch files which +correspond to articles it has processed: the line is replaced with +one containing only spaces. Only innd should create this file, and +only innduct should remove it. +.IP \fIfeedfile\fR_lock +.IX Item "lock file" +Lockfile, preventing multiple innduct invocations for the same +feed. A process holds this lock after it has opened the lockfile, +made an fcntl F_SETLK call, and then checked with stat and fstat that +the file it now has open and has locked still has the name +\fIfeedfile\fR_lock. (Only) the lockholder may delete the lockfile. +For your convenience, after the lockfile is locked, +.IR innduct 's +pid, the +.IR site , +.IR feedfile +and +.IR fqdn +are all written to the lockfile. NB that stale lockfiles may contain +stale data so this information should not be relied on other than for +troubleshooting. +.IP \fIfeedfile\fR_flushing +.IX Item "flushing file" +Batch file: the main feedfile is renamed to this filename by innduct +before it asks innd to flush the feed. Only innduct should create or +remove this file. +.IP \fIfeedfile\fR_defer +.IX Item "defer file" +Batch file containing details of articles whose transmission has +recently been deferred at the request of the recipient site. Created, +written, read and removed by innduct. +.IP \fIfeedfile\fR_backlog.\fItime_t\fR.\fIinum\fR +.IX Item "backlog file" +Batch file containing details of articles whose transmission has less +recently been deferred at the request of the recipient site. Created +by innduct, and will also be read and removed by innduct. 
However you +(the administrator) may also safely remove backlog files. +.IP \fIfeedfile\fR_backlog\fIsomething\fR +.IX Item "manual backlog file" +Batch file manually provided by the administrator. The file should be +complete and ready to process at the time it is renamed or hardlinked +to this name. innduct will then automatically find and read and +process it and eventually remove it. The administrator may also +safely remove backlog files. \fIsomething\fR may not contain \fB#\fR +\fB~\fR or \fB/\fR. Be sure to have finished writing the file before +you rename it to match the pattern \fIfeedfile\fR\fB_backlog\fR*, as +otherwise innduct may find and process the file and read it to EOF +before you have finished creating it. +.IP \fIpathrun\fR\fB/innduct/\fB\fIsite\fR +.IX Item "control command line socket" +Default AF_UNIX listening socket for the control command line. See +CONTROLLING INNDUCT, above. +.IP \fIfeedfile\fR_dump +.IX Item "debug dump file" +On request via a control connection innduct dumps a summary of its +state to this text file. This is mostly useful for debugging. +.IP /etc/news/inn.conf +.IX Item inn.conf +Used for +.IR pathoutgoing +(to compute default +.IR feedfile +and associated paths), +.IR pathrun +(to compute default +.IR CLI-DIR and -.I shrinkfile -operate. -When no such pairs are given on the command line, then -the information found in -.I innduct.ctl -becomes the default flags for that site. -Any flags given on the command line override the default flags -for the site. -.SH OPTIONS -.TP -.B "\-d \-D" -The ``\-d'' flag causes -.I innduct -to send output to stdout rather than the log file -.IR /innduct.log . -The ``\-D'' flag does the same -and it passes ``\-d'' to all -.I innxmit -invocations, which in turn causes -.I innxmit -to go into debug mode. -.TP -.B -n -If the ``\-n'' flag is used, then -.I innduct -does not use -.IR shlock (1) -and does not lock batch files. 
-.TP -.B \-s size -If the ``\-s'' flag is used, then -.IR shrinkfile (1) -will be invoked to perform a head truncation on the batchfile and the flag -will be passed to it. -.TP -.B \-w delay -If the ``\-w'' flag is used, then -.I innduct -waits for -.I delay -seconds after flushing the site before launching -.IR innxmit . -.TP -.B "\-a \-c \-l \-N \-P \-p \-r \-S \-T \-t" -The ``\fB\-a\fP'', ``\fB\-c\fP'', ``\fB\-l\fP'', ``\fB\-P\fP'', ``\fB\-p\fP'', -``\fB\-r\fP'', \``\fB\-S\fP'', ``\fB\-T\fP'' and ``\fB\-t\fP'' -flags are passed on to the child -.I innxmit -program. The ``\-N'' flag is passed as ``\fB\-s\fP'' flag to the child -.I innxmit -program. -See -.IR innxmit (8) -for more details. -Note that if the ``\-p'' flag is used then no connection is made and -no articles are fed to the remote site. -It is useful to have -.IR cron (8) -invoke -.I innduct -with this flag in case a site cannot be reached for an extended period of time. -.SH EXAMPLES -With the following -.IR innduct.ctl (5) -control file: -.PP -.RS -.nf -nsavax:erehwon.nsavax.gov::-S -t60 -group70:group70.org:: -walldrug:walldrug.com:4m-1m:-T1800 -t300 -kremvax:kremvax.cis:2m: -.fi -.RE -.PP -The command: -.PP -.RS -innduct -.PP -.RE -will result in the following: -.PP -.RS -.nf -Sitename Truncation Innxmit flags -nsavax (none) \-a \-S \-t60 -group70 (none) \-a \-t180 -walldrug 1m if >4m \-a \-T1800 \-t300 -kremvax 2m \-a \-t180 -.fi -.RE -.PP -The command: -.PP -.RS -innduct \-d \-T1200 -.RE -.PP -will result in the following: -.PP -.RS -.nf -Sitename Truncation Innxmit flags -nsavax (none) \-a \-d \-S \-T1200 \-t60 -group70 (none) \-a \-d \-T1200 \-t180 -walldrug 1m if >4m \-a \-d \-T1200 \-t300 -kremvax 2m \-a \-d \-T1200 \-t180 -.fi -.RE -.PP -The command: -.PP -.RS -innduct \-s 5m \-T1200 nsavax erehwon.nsavax.gov group70 group70.org -.PP -.RE -will result in the following: -.PP -.RS -.nf -Sitename Truncation Innxmit flags -nsavax 5m \-a \-T1200 \-t180 -group70 5m \-a \-T1200 \-t180 -.fi -.RE 
-.PP -Remember that ``\-a'' is always given, and ``\-t'' defaults to 180. +.IR CLI-PATH ), +for finding how to communicate with innd, and also for +.IR sourceaddress +and/or +.IR sourceaddress6 . .SH HISTORY -Written by Landon Curt Noll -and Rich $alz for InterNetNews. -.de R$ -This is revision \\$3, dated \\$4. -.. -.R$ $Id: innduct.8 5909 2002-12-03 05:17:18Z vinocur $ +Written by Ian Jackson .SH "SEE ALSO" inn.conf(5), -innxmit(1), -newsfeeds(5), -innduct.ctl(5), -shrinkfile(1). +newsfeeds(5)