
Readme for analog 4.03

Frequently asked questions

This list is divided into six sections:
  A. Getting Started
  B. Basic Configuration
  C. Understanding the Output
  D. Advanced Usage
  E. Form Interface
  F. Design Decisions

A. Getting Started

Most questions in this category are answered in the section entitled Starting to use analog. If you can't get analog running you should look there.
  1. Analog doesn't have a setup.exe.
    No, and it doesn't need one. It's already ready to run! See Starting to use analog under Windows.
  2. Analog just flashes up a DOS window and then quits.
    This is the correct behaviour. It should have created a report called Report.html. See Starting to use analog under Windows.
  3. When I try and compile analog, it gives me an error (e.g. on SunOS 5).
    Maybe you need to edit the Makefile. There are some platform-specific notes in the section Starting to use analog on other platforms, and in the Makefile itself.
  4. Analog didn't write the logfile when I ran it.
    Analog doesn't write the logfiles. Your web server writes the logfiles, and analog just reads them. See Starting to use analog.
  5. Analog is looking for files like /usr/local/etc/httpd/analog/analog.cfg which don't exist.
    You have to set the location of these files in anlghead.h before compiling.
  6. Analog won't read extended logfiles generated by IIS.
    This server writes the date only at the top of the logfile, not on every line. But it doesn't write a new date if the date changes during the logfile, so analog can't tell which date later entries in the log occurred on. More details, and what to do about it, are in the section on Choosing a logfile.
  7. What does "Logfile with ambiguous dates" mean?
    See the section on Errors and warnings.
  8. What does this error message mean?
    Again, see the section on Errors and warnings.
  9. I tried to run analog from my browser, but it didn't work.
    Analog should not be run as a CGI program, or even put in the folder with your CGI programs, for security reasons. You should use the form interface (the anlgform.pl program) instead.
  10. Is analog Year 2000 compatible?
    Yes (and so are all previous versions). It interprets two-digit years in input as lying between 1970 and 2069 inclusive.
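
The two-digit-year rule in question 10 can be illustrated with a short Python sketch. This is not analog's own code (analog is compiled from C source); the function name expand_two_digit_year is made up for the example, and it simply encodes the 1970 to 2069 window described in that answer.

```python
def expand_two_digit_year(yy: int) -> int:
    """Map a two-digit year onto the window 1970..2069 (illustrative only)."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year between 0 and 99")
    # 70..99 are taken as 1970..1999; 00..69 as 2000..2069.
    return 1900 + yy if yy >= 70 else 2000 + yy


if __name__ == "__main__":
    for yy in (70, 99, 0, 69):
        print(f"{yy:02d} -> {expand_two_digit_year(yy)}")
```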

B. Basic Configuration

Analog has lots of configuration commands, all of which are in the section on Customising analog. Here are some of the most common questions. If your question isn't answered here, you could also try looking in the index.
  1. I want to make several different statistics pages. Do I have to install several copies of analog?
    No. Just install it once, and run it with different configuration files.
  2. My analog.cfg included lots of CONFIGFILE commands, but only one report was produced.
    Analog can only produce one report per run. To produce several reports, you have to run it several times.
  3. Why does the Daily Report only show the last six weeks?
    This is controlled by the FULLDAYROWS command.
  4. Why do the time reports all list 0 requests?
    They probably only list 0 requests for pages. Maybe you need to use PAGEINCLUDE to count more files as pages.
  5. How do I get the Request Report to list files with fewer than 20 requests?
    Use the REQFLOOR command.
  6. How do I ignore accesses from my site?
    Use the HOSTEXCLUDE command.
  7. How do I ignore internal referrers in the Referrer Report?
    Use the REFREPEXCLUDE command.
  8. How do I get information on just my pages, not everybody's?
    Use the FILEINCLUDE command.
  9. I used the command "DIREXCLUDE /mydir/", but files in that directory were still listed.
    DIREXCLUDE only affects the Directory Report, not the other reports. You want "FILEEXCLUDE /mydir/*" instead.
  10. I used the command "FILEEXCLUDE /cgi-bin/script.pl", but that file was still listed in the Request Report.
    If the file has search arguments, you have to be a bit careful with FILEEXCLUDE. This is described in the section about search arguments.
  11. Does the order of the commands matter in the configuration file?
    Only occasionally. If you have two of one command, the later one will generally override the earlier one. Apart from that, commands can come in any order, except that LOGFORMAT and LOGTIMEOFFSET commands must come before the LOGFILE to which they refer.
  12. Why are my browser and referrer reports empty?
    Maybe your logfile doesn't contain any browser or referrer information.
  13. Why isn't the Referrer Report sorted properly?
    It is sorted properly. But search arguments are also listed under the file they belong to, and this interrupts the ordering. If you set the REFARGSFLOOR high enough you won't see the search arguments. Or you can include the N column to make the ordering more obvious.
  14. Why can't I have P in the REQCOLS or REQSORTBY?
    The number of page requests doesn't make sense in the Request Report because it's either the same as the number of requests (if the file is a page) or zero (if it isn't). If you want to list only pages in this report, use REQINCLUDE pages instead.
  15. I want to list (or not to list) referrers with their search arguments in the Referrer Report.
    To see the search arguments you may need to set the REFARGSFLOOR lower. To avoid seeing them, you could set the REFARGSFLOOR higher, or alternatively use the REFARGSEXCLUDE command to ignore them either for all files or just for particular files.
  16. Can I find out which files each referrer pointed to?
    or Can I find out which files each host has read?
    or Can I find out which hosts have read each file?
    or Can I find out the number of hosts visiting on each day?
    or lots of similar questions.
    There are lots of questions like this. They all want analog to cross-reference two sorts of item (e.g. files and referrers in the first example above, or hosts and dates in the last). Granted, these would be useful. But it is fundamental to analog's speed and minimal memory requirement that it only records statistics for each type of item individually, and doesn't record enough information to cross-reference them afterwards.
    What you can do is to restrict the analysis to just requests from certain referrers (for example) with the REFINCLUDE command, or to a particular time period with FROM and TO. This is often good enough.
  17. Can I use %d, %m etc. in the LOGFILE, like I can in the OUTFILE?
    No. This is rarely useful, because you can only get at one logfile that way. If you're on Unix, you can embed the date in the logfile name using the date command: for example,
    analog access.`date +%Y%m%d`.log
  18. I get the message "logfiles overlap" even though the two logfiles contain completely separate requests.
    This message is based only on the dates of the files, not the contents. If you're sure there is no problem, you can turn it off with the command WARNINGS -L.
  19. Can I get data on individual visitors, or visits, to my site?
    No, it's not technically possible, and don't believe any program which tells you it is. See the section on How the web works for details.
  20. Can I change the background colour of my output?
    Yes. The correct way to do this is to write a style sheet, and then use the STYLESHEET command.
  21. Can I change the way dates are formatted in the output?
    or Can I change some of the phrases in the output?
    Yes, by editing the language file.
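
For question 17, the same date-stamped logfile name can also be built outside the shell, for instance in a wrapper script. The sketch below is Python; the access.YYYYMMDD.log pattern is taken from the example in that answer, while the function name and the idea of a wrapper script are assumptions made for illustration. It only prints a command line and does not run analog.

```python
import datetime


def dated_logfile(day: datetime.date, prefix: str = "access", suffix: str = "log") -> str:
    """Return a name such as access.19991231.log, like `date +%Y%m%d` gives."""
    return f"{prefix}.{day.strftime('%Y%m%d')}.{suffix}"


if __name__ == "__main__":
    today = datetime.date.today()
    yesterday = today - datetime.timedelta(days=1)
    # A nightly run might want yesterday's log rather than today's partial one.
    print("analog", dated_logfile(yesterday))
```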

C. Understanding the Output

Most of the questions in this category are answered in the section on What the results mean, which I really recommend you read if you want to understand what analog is telling you.
  1. How do I find out the number of hits from your data?
    I don't use the word hits, because people use it in different ways, so it's misleading. I use requests for the number of transfers of any type of file (text, graphics, ...), and page requests for the number of transfers of HTML pages. See the section on Analog's definitions for more information.
  2. Why are there so many referrers from my own site?
    These come from all the internal links on your site, and all the graphics on your pages. See the section on How the web works for more information. If you don't want to see them, you can use REFREPEXCLUDE to exclude them.
  3. Why doesn't analog agree with the counter on my page?
    There are lots of possible reasons. Do they both start from the same date? Are you just looking at requests for that one page with analog, not for all your other pages and graphics? Also, analog will record all requests to that page; if the counter is a graphic, it will only measure requests from people on graphical browsers that reached that place on the page.
  4. Why do I only get "unresolved numerical addresses" in the domain report?
    Your server only records the numerical IP address of the hosts that contact you, not their names. Read the section about DNS lookups, or turn DNS resolution on in your server.
  5. Why are my click-thru's (or CGI scripts) not listed in the Request Report?
    If they cause a redirection to another page, they will be listed in the Redirection Report, rather than the Request Report.
  6. Why are directories listed in the Request Report?
    They are not directories, they are pages with the same name as the directory. For example, I have both a directory called /analog/ and a page called /analog/ (which happens to be the same as /analog/index.html).
  7. When someone reads one of my pdf files, it scores dozens of hits.
    PDF files are often downloaded and read one page at a time, and each page will then count as a separate request. Although this is not ideal, it's much less clear what to do about it. Analog has no way of knowing how many pages constituted a single download in the reader's mind. As usual, we can only reliably report how many requests there were at the server, not guess what users did with the file later.
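
The distinction in question 1 between requests and page requests can be made concrete with a small Python sketch. It is not how analog is implemented; the rule used here for what counts as a page (names ending in .html, .htm or /) is an assumption for the example, and in analog the PAGEINCLUDE command can count more files as pages.

```python
PAGE_SUFFIXES = (".html", ".htm", "/")  # assumed page rule for this example


def count_requests(filenames):
    """Return (requests, page_requests) for a list of requested file names."""
    requests = len(filenames)
    page_requests = sum(1 for name in filenames if name.lower().endswith(PAGE_SUFFIXES))
    return requests, page_requests


if __name__ == "__main__":
    # One page view that pulls in two graphics is three requests but one page request.
    sample = ["/index.html", "/logo.gif", "/photo.jpg"]
    print(count_requests(sample))
```

This is also one reason a counter on a single page rarely matches the totals: the totals include every graphic and every other page.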

D. Advanced Usage

  1. How can I do such-and-such with a command line option?
    Use the +C option to put any configuration command on the command line.
  2. I want a list of all command line arguments.
    There is a list in the index.
  3. Can analog read FTP logfiles?
    Yes. If you are using the xferlog format, then there is a configuration file to help you in the examples directory. Otherwise you will have to write your own LOGFORMAT. (You probably won't be able to read anything other than the lines corresponding to file transfers.)
  4. How can I run analog automatically every day?
    This depends on your particular machine. On Unix, you need to run analog as a cron job (see "man cron"). This is my cron command to run it at 1:50am every day:
    50 1 * * * $HOME/bin/analog
    On Windows NT you can do the same with the at command, but only an administrator can run at. On Windows 98, it should be possible with the Task Scheduler, although I haven't tried it. On Windows 95 it's not possible as far as I know.
    On Mac, there are programs called Cron or CronoTask to do this.
  5. I'm setting up IIS. Which logfile format should I use?
    The W3C format is probably best. You can turn fields on and off in this format, and it can log every available field, which the other formats cannot. However, it is important to turn the date field on (it's off by default), not just to log the date once at the top: see the section on problems with logfile formats for why.
  6. I host lots of virtual domains. How should I set up analog?
    There's a file in the examples directory which discusses this issue.
  7. Can I make multiple reports with one pass through the logfile?
    Not at the moment. I want to do this in a future version, but it will require some considerable work.
  8. I ran out of memory when trying to run analog. What can I do?
    See the section on Coping with low memory.
  9. You're processing 20,000,000 requests in under 10 minutes. Why is mine much slower?
    or Analog appears to stall.
    If you have DNS lookups on, they are very slow. Otherwise, it probably depends on the speed of your computer and disks, and what other programs are running at the same time. You can use the PROGRESSFREQ command to see if it's really stalled or whether it's just being slow. If you are running out of memory, you might find analog's LOWMEM commands helpful.
  10. How do I make a link on my page that runs analog?
    Link to the anlgform program, with the desired options. But be careful about the load on your server.
  11. Do I have to save all my old logfiles?
    or Can analog make statistics from an old report instead of reading the whole logfile again?
    These questions are answered in the section about Cache files.
  12. Can analog write to a database or spreadsheet?
    Use the computer-readable output style, which can export to CSV. Or if what you really want to do is to run analog again without re-reading the logfiles, read the section about Cache files.
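
For question 1, a wrapper script can assemble the +C options programmatically. The Python sketch below only builds and prints a command line; it does not run analog. It assumes that each configuration command is passed as a single argument attached directly to +C, and the two commands shown (WARNINGS -L and REQINCLUDE pages) are simply ones mentioned elsewhere in this FAQ.

```python
import shlex


def analog_command(config_commands, executable="analog"):
    """Build an argument list with one +C option per configuration command."""
    return [executable] + ["+C" + command for command in config_commands]


if __name__ == "__main__":
    argv = analog_command(["WARNINGS -L", "REQINCLUDE pages"])
    # shlex.join quotes each argument so the line can be pasted into a Unix shell.
    print(shlex.join(argv))
```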

E. Form Interface

There is also a section on troubleshooting in the documentation about the form interface.
  1. I couldn't make the form run.
    Have you made analog work without the form? Have you run anlgform.pl from the command line as explained in the section on troubleshooting?
  2. How can I specify different logfiles from the form interface?
    Just add a new field to the form with name=LOGFILE.
  3. I specified LOGFILE=/var/log/apache/* from the form but it didn't work.
    On the form, you can't use wildcards in the LOGFILE name for security reasons.
  4. My browser showed me anlgform.pl, rather than running it.
    You have to tell the server to execute the CGI program, not just send it out like it would for a normal file. Often this is done by putting it in a special /cgi-bin/ directory.
  5. Why does the form interface give "Document Returned no Data"?
    If the message only appears after a delay, then probably the server is giving up before the analog process has finished running. Increase the timeout interval on the server.
  6. The images don't appear when running analog from the form interface.
    You probably need to set the IMAGEDIR. If the images are in your /cgi-bin/ directory, the server will normally try to execute them instead of just sending them out.
  7. Why do I get some reports that weren't requested on the form?
    If a report is neither included nor excluded on the form, the system default will be used. This will depend on your configuration files and on compile-time settings.
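
The rule in question 7 amounts to a three-way choice per report: included, excluded, or left to the default. A Python sketch of that rule follows. The report names and the dictionaries are hypothetical; the real defaults come from your configuration files and compile-time settings, as that answer says.

```python
def resolve_reports(form_choices, defaults):
    """Decide which reports are produced.

    form_choices maps a report name to True (included) or False (excluded);
    a report missing from form_choices falls back to its entry in defaults.
    """
    return {name: form_choices.get(name, default) for name, default in defaults.items()}


if __name__ == "__main__":
    defaults = {"Request Report": True, "Host Report": True, "Referrer Report": False}
    form = {"Host Report": False}  # only one report was set on the form
    print(resolve_reports(form, defaults))
```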

F. Design Decisions

or "Why didn't you do it this way?"
  1. Why doesn't the HEADERFILE replace the whole <head> of the output file?
    Because you almost never get valid HTML that way. Use a style sheet instead.
  2. Why not use HTML tables?
    Most non-graphical browsers don't do a good job with tables. Also tables aren't available in HTML 2.0, which is the sort of HTML analog writes.
  3. Why are you still using HTML 2.0?
    Unfortunately my bar charts aren't valid in HTML 4.0.
  4. It would be better if you used png's instead of gif's.
    I'm aware of the issues. But png support isn't good enough even in new browsers; and I have always made a point of designing analog to work even on old browsers.
  5. Why not just do DNS resolution of the hosts that actually make it into the Host Report?
    There is one theoretical and one practical problem. Theoretically, the problem is that which hosts do make it into the Host Report can change when the DNS lookups have been done. And practically, this wouldn't help identify the busiest countries or organisations, which is usually what you really want to know. However, there is a Perl script on the helper applications page to do this.
  6. Couldn't you do the DNS lookups faster with threads?
    The problem is, the standard commands for DNS lookups are not thread-safe on many platforms, so it would involve a lot of platform-specific code. Again, there are programs for specific platforms on the helper applications page.
  7. Why doesn't analog analyse the error_log?
    This is answered in detail on the What's new? page. But in summary, it's too difficult because each server has a different format for its error log. The various failure reports are good enough for most purposes.
  8. My server lists local names in the logfile. Can you put a common suffix on them automatically?
    This wouldn't be a good idea by default, because things like "unknown" would get the suffix. You can always add them using HOSTALIAS. On operating systems with regular expressions, there is an example to accomplish this in the section about aliases.
  9. Can you extrapolate from the current month's partial data to produce a prediction for the whole month, based on the rate so far?
    No. There are too many problems in trying to produce anything sensible, especially near the beginning of the month. Different days of the week and different times of day cause lots of problems. I would prefer to produce raw accurate data than suspect derived data.
  10. Can you extend the Domain Report to say which US states people visited from?
    No. Some programs pretend to do this, but you can actually only tell which state the computer the person was using is in. For ISPs or other large organisations, that may be quite different from where the user was.
  11. Why not use language codes instead of country codes for the names of the language files?
    People are more familiar with the country codes. And not all of my languages have language codes anyway.
  12. Why don't you sell analog?
    I didn't write analog for the money, and I'm happy just to see people use it. Also, because it's open source, lots of people send me ideas and code to include in future versions. How do you think I got all those languages? (Of course, if you want to send me money, or gifts in kind, or even just postcards...).
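
The objection in question 9 can be seen with a little arithmetic. The Python sketch below scales the requests so far by the fraction of the month elapsed. The traffic figures are invented for illustration, and this naive projection is exactly the kind of derived number that answer calls suspect; analog does not compute it.

```python
def naive_month_projection(requests_so_far, days_elapsed, days_in_month):
    """Scale the running total linearly to the full month."""
    if days_elapsed <= 0:
        raise ValueError("need at least one day of data")
    return requests_so_far * days_in_month / days_elapsed


if __name__ == "__main__":
    # Invented traffic: a quiet weekend followed by busier weekdays.
    daily = [100, 100, 500, 500, 500]
    total = 0
    for day, count in enumerate(daily, start=1):
        total += count
        print(day, naive_month_projection(total, day, 31))
```

With these figures the projection for a 31-day month is 3100 requests after two days and 10540 after five, although nothing unusual happened in between.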

If there's still something you can't figure out, see the next section for how to get help with analog.
Stephen Turner
Need help with analog? Subscribe to the analog-help mailing list
