
Readme for analog3.90beta1

Form interface and CGI program

The form interface provides an HTML front end to analog. That means that users can select options from a web page, instead of having to create a configuration file.

The form interface, including the forms in the various languages, has been completely rewritten in this version. Please let me know whether it works!

Please don't try to set up the form until analog has been set up and is running properly on its own; the form just adds another level of complexity to troubleshoot. And unlike analog itself, the form interface will not run "out of the box". You have to read this section to find out how to set it up.

(Actually, analog itself can run as a CGI program if you include the configuration command

CGI ON

to add the correct HTTP headers to the output. You can't choose any options that way, though.)

The form interface is suitable for ordinary users to use, but it needs to be set up by a system administrator or other expert. In order to set it up, you have to be running a web server. You need to know what CGI programs are, where they live on your server, and how to set up their permissions properly. You also need to know how to write HTML forms. I shall assume this level of background knowledge for the rest of this section. You also have to be running Perl 5.001 or later: see Technical details below for other system requirements.

Warning: CGI programs can contain security loopholes which allow an unscrupulous user to harm your system. (If you don't know about this, you shouldn't be running CGI programs at all. Read and understand the World Wide Web Security FAQ and the CGI Security FAQ first.) I have tried to make this form interface safe, but I cannot guarantee it. Even the most carefully-designed CGI programs can accidentally have serious security bugs. And I take no responsibility if anything goes wrong: you use it at your own risk. (See the licence.) Furthermore, you should be aware that unless you take special measures like password protection, setting up the form interface implies making analog executable, and your logfiles analysable, by anyone on the internet.

The form interface consists of two parts: a form (called anlgform.html) to choose the options, and a CGI program (called anlgform.pl) to interpret them and pass them to the analog program. Both anlgform.html and anlgform.pl must be configured for your system before they will work at all. There are instructions at the top of both files explaining how to do this.

The form which is distributed with the program should only be regarded as an example form. You can find forms in languages other than English in the lang directory. Or you can write your own if you prefer. In fact you don't actually need the form at all: if you just want to create a link to the CGI program, with the arguments passed after a question mark in the URL in the usual way, then that's fine.
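
As a rough illustration of such a link, the following Python sketch URL-encodes a few options and appends them after the question mark. The host name and script path are hypothetical; the option names are borrowed from examples elsewhere in this section.

```python
from urllib.parse import urlencode

# Hypothetical location of the CGI program on your server.
CGI_URL = "http://www.example.com/cgi-bin/anlgform.pl"

def build_link(options):
    """Return a URL that passes (name, value) pairs after a question mark."""
    return CGI_URL + "?" + urlencode(options)

link = build_link([("LOGFILE", "/var/log/apache/fred"),
                   ("REQINCLUDE", "pages")])
print(link)
```

The resulting string can be used as the href of an ordinary HTML link.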


Almost every analog configuration command can be specified on the form, just by including a form element with that name on the form. So, for example, if you wanted to add a field for users to choose a logfile, you could write

Logfile name: <input type=text name="LOGFILE">

or maybe something like

<select name=LOGFILE size=1>
  <option value="/var/log/apache/fred"> Fred's logfile
  <option value="/var/log/apache/jane"> Jane's logfile
</select>

There are a few commands which you can't specify on the form for security or performance reasons. The full list is *LOGFORMAT, LANGFILE, HEADERFILE, FOOTERFILE, UNCOMPRESS, OUTFILE, CACHEOUTFILE, ERRFILE, DNS and CGI. The reasons for these exclusions are given at the top of anlgform.pl.

In fact the person setting up the CGI program can add more commands to the forbidden list at the top of anlgform.pl. For example, it is theoretically possible (though rather unlikely) that another file on your system could conform sufficiently closely to one of the predefined log formats that analog could be persuaded to analyse it and so reveal some of its contents. If you're worried about this, or if you simply want to force one particular logfile to be analysed from the form, you can add the LOGFILE command to the list of forbidden commands. While we're being paranoid, you could add DOMAINSFILE to the list for the same reasons.
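
The idea of a forbidden list can be sketched in a few lines of Python. This is not the code in anlgform.pl, only an illustration of the filtering described above. It assumes that "*LOGFORMAT" means "any command whose name ends in LOGFORMAT", and the function names are invented for the example.

```python
# Commands this section lists as excluded by default. "*LOGFORMAT" is read
# here as "any command whose name ends in LOGFORMAT" (an assumption).
FORBIDDEN = {"LANGFILE", "HEADERFILE", "FOOTERFILE", "UNCOMPRESS",
             "OUTFILE", "CACHEOUTFILE", "ERRFILE", "DNS", "CGI"}
FORBIDDEN_SUFFIXES = ("LOGFORMAT",)

# Site-specific additions of the kind suggested in the text above.
FORBIDDEN |= {"LOGFILE", "DOMAINSFILE"}

def is_forbidden(name):
    """True if a form field name must not be passed on to analog."""
    upper = name.upper()  # command names are case-insensitive
    return upper in FORBIDDEN or upper.endswith(FORBIDDEN_SUFFIXES)

def filter_commands(pairs):
    """Drop forbidden (name, value) pairs, keeping the rest in order."""
    return [(n, v) for n, v in pairs if not is_forbidden(n)]
```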


Some commands are most conveniently specified in two halves. First, there are commands which take two arguments (for example ALIASes). You can cope with these by sending two commands from the form, called COMMAND1 and COMMAND2. For example,

Alias this file: <input type=text name="FILEALIAS1">
To this one: <input type=text name="FILEALIAS2">

You can specify only one such pair for each command this way, so there's no way to specify several ALIASes of the same kind, for example.
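
The pairing can be pictured with the following Python sketch. It is an illustration under simplifying assumptions, not the logic of anlgform.pl: the function name is invented, and quoting of arguments that contain spaces is ignored.

```python
def join_pairs(fields):
    """Combine NAME1/NAME2 form fields into one two-argument command each."""
    commands = []
    for name, first in fields.items():
        if name.endswith("1"):
            base = name[:-1]
            second = fields.get(base + "2")
            # Emit the command only when both halves were filled in.
            if first and second:
                commands.append(f"{base} {first} {second}")
    return commands

print(join_pairs({"FILEALIAS1": "/index.html", "FILEALIAS2": "/"}))
```

Because each field name occurs once in the mapping, the sketch also shows why only one pair per command can be sent this way.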

Then there are FLOOR commands. To avoid users of the form having to know the syntax of these commands, you can if you want specify them in two halves, FLOORA and FLOORB, and they will be stuck together. For example, the form distributed with the program specifies

<br>Include all domains with at least
<input type=TEXT name="DOMFLOORA" maxlength=6 size=6>
<select name="DOMFLOORB">
  <option value=r>requests
  <option value=p>requests for pages
  <option value=b selected>bytes
</select>

If DOMFLOORA contains 5% and DOMFLOORB contains r, then DOMFLOOR 5%r will be sent to the program. (Or DOMFLOORA=5 and DOMFLOORB=%r would work too, if you chose to present the form that way.)
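
A minimal Python sketch of this "sticking together" follows. The function name is invented, and skipping the command when either half is empty is an assumption rather than documented behaviour.

```python
def join_floor(fields, base):
    """Return e.g. 'DOMFLOOR 5%r' from DOMFLOORA='5%' and DOMFLOORB='r'."""
    a = fields.get(base + "A", "").strip()
    b = fields.get(base + "B", "").strip()
    if not a or not b:
        return None  # assumption: an incomplete pair produces no command
    return f"{base} {a}{b}"

# Both ways of splitting the value give the same command.
print(join_floor({"DOMFLOORA": "5%", "DOMFLOORB": "r"}, "DOMFLOOR"))
print(join_floor({"DOMFLOORA": "5", "DOMFLOORB": "%r"}, "DOMFLOOR"))
```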

There are a couple of extra non-analog commands which can be sent from the form. First, if the option qv=1 is set, then analog is not run, but a list of the configuration commands which would have been sent to analog is printed instead. This is useful for checking that the CGI program is working properly. It can also allow users to produce a configuration file from form settings.

Secondly, you can specify other configuration files to be included at specific times. When analog is called by the CGI program, it first processes the default configuration file as usual. Then it processes any configuration file specified by an option with name cg. Then it processes all the other commands which the CGI program specifies. After that, it processes any configuration file specified by an option with name cm. Finally, it processes the mandatory configuration file as usual. (You may therefore want two copies of analog, one for form use and one for non-form use, with different configuration files compiled in.) Note that the commands in the default and mandatory configuration files will contribute to the configuration: some of them may even override options specified on the form. For example, if the default configuration file contains an INCLUDE command, this may cause INCLUDE and EXCLUDE commands specified on the form to behave unexpectedly.


anlgform.pl usually sends the commands to analog in the order in which it received them, which should be the same as the order in which they occurred in the form. But there are some exceptions. First, all commands of the same name are grouped together. So an interleaved sequence of INCLUDEs and EXCLUDEs won't work, for example. Secondly, even though the names of commands are case-insensitive, commands of the same name but in different cases may come in the wrong order. Keep them in the same case! Thirdly, LOGTIMEOFFSET is sent first (and thus applies to any logfiles specified on the form).
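
The effect of these rules can be sketched in Python as follows. This is only a model of the behaviour described above, not the script's actual code: it assumes that groups appear in the order in which each name was first seen and that names are matched with their exact case.

```python
def order_commands(pairs):
    """Group (name, value) pairs by name, then put LOGTIMEOFFSET first."""
    groups = {}
    for name, value in pairs:
        # Exact-case match: 'FILEINCLUDE' and 'fileinclude' form separate groups.
        groups.setdefault(name, []).append(value)
    ordered = [(n, v) for n, values in groups.items() for v in values]
    # Stable sort: LOGTIMEOFFSET entries move to the front, the rest keep order.
    ordered.sort(key=lambda pair: pair[0] != "LOGTIMEOFFSET")
    return ordered

submitted = [("FILEINCLUDE", "a"), ("FILEEXCLUDE", "b"),
             ("FILEINCLUDE", "c"), ("LOGTIMEOFFSET", "60")]
print(order_commands(submitted))
```

In this example the interleaved FILEINCLUDE, FILEEXCLUDE, FILEINCLUDE sequence comes out as two FILEINCLUDEs followed by the FILEEXCLUDE, which is why interleaving does not work.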

There are a couple of commands which the form always sets. These may override what you have set elsewhere. First, it sets either DNS READ (if a DNSFILE is set on the form) or DNS NONE (otherwise). You can override this behaviour in the mandatory configuration file, but you are likely to run into timeout problems if you do. Secondly, it always sets WARNINGS FL, so that the less important warnings don't fill up your server's error log. You can override this by sending an explicit WARNINGS command at the beginning of the form.
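
In outline, the two commands that are always set could be chosen like this. It is a Python sketch with an invented function name, and it does not model where in the command sequence the real script places them.

```python
def forced_commands(fields):
    """Return the commands that this section says the form always sets."""
    # DNS READ if a DNSFILE was given on the form, DNS NONE otherwise.
    dns = "DNS READ" if fields.get("DNSFILE") else "DNS NONE"
    return [dns, "WARNINGS FL"]

print(forced_commands({"DNSFILE": "/some/dnsfile"}))
print(forced_commands({}))
```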


Troubleshooting

There are lots of reasons why the form interface may not work, and I can't diagnose them very easily. Here is what to do if you are having problems.

First, you can run anlgform.pl from the command line. This is good enough to debug most problems. You can specify options in pairs like this:

anlgform.pl qv=1 LOGFILE=/some/log REQINCLUDE=pages

If you include qv=1 in the argument list as above, you will see what anlgform.pl is trying to send to analog. If you don't include qv=1, anlgform.pl will try to run analog.
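
To make the NAME=value convention concrete, here is a small Python sketch of how such an argument list breaks down into pairs. The function names are invented, and the real parsing is done by anlgform.pl itself.

```python
def parse_args(argv):
    """Split NAME=value arguments into (name, value) pairs."""
    pairs = []
    for arg in argv:
        name, sep, value = arg.partition("=")
        if sep:  # skip anything that is not of the form NAME=value
            pairs.append((name, value))
    return pairs

def is_dry_run(pairs):
    """True if qv=1 was given, i.e. commands are listed and analog is not run."""
    return ("qv", "1") in pairs

args = parse_args(["qv=1", "LOGFILE=/some/log", "REQINCLUDE=pages"])
print(args, is_dry_run(args))
```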

If it still doesn't work, check the following points:

  1. Have you edited anlgform.pl and anlgform.html as instructed at the top of those files?
  2. Do other CGI programs work on your server? Is anlgform.pl in the right place to be recognised as a CGI program by the server?
  3. Look in the server's error log for clues.
  4. Are all relevant files (analog itself, logfiles, configuration files, auxiliary files such as domain files...) executable/readable by your web server?
  5. If some form options don't seem to take effect, then check whether they are being overridden by a command in a configuration file.
  6. If there is a long wait and then no data is returned, the server is probably timing out the request before analog has finished. The remedy is to increase the timeout interval.
  7. As explained above, the form always sets DNS READ or DNS NONE, and WARNINGS FL, overriding your default configuration file.

Technical details

You need to be running Perl 5.001 or later. You can get the latest version of Perl free from www.perl.org. You also need the module CGI.pm and, on Windows, Win32::Console from libwin32, but these should normally be distributed with Perl anyway.

On Windows, you have to associate the .pl extension with the Perl executable so that Perl scripts are executed by Perl.

anlgform.pl will understand the GET or POST methods of form submission. The HTML spec says that GET should be used when, as in this case, running the program has no side effects. However, section 15.1.3 of the HTTP spec says that POST should be used if some of the options being passed might be confidential. Also, very long URLs, formed by specifying lots of options, can cause trouble to some older servers. So anlgform.html uses the POST method by default. However, the GET method will also work. For example, you could make a normal link to anlgform.pl with options specified after a question mark in the usual GET way.


Stephen Turner
Need help with analog? Subscribe to the analog-help mailing list
