Analog 4.15:
Frequently asked questions
This list is divided into six sections:
- Getting Started
- Basic Configuration
- Understanding the Output
- Advanced Usage
- Form Interface
- Design Decisions
- Getting Started
See also Starting to use
analog.
- Analog doesn't have a setup.exe.
- Analog just flashes up a DOS window and then
quits.
- When I try and edit analog.cfg,
Windows asks me which program I want to use to open that file.
- When I try and compile analog, it gives me an
error (e.g. on SunOS 5).
- Analog didn't write the logfile when I ran
it.
- Analog is looking for files like
/usr/local/etc/httpd/analog/analog.cfg which don't
exist.
- Analog won't read extended logfiles generated by
IIS.
- What does "Logfile with ambiguous
dates" mean?
- What does this error message mean?
- I tried to run analog from my browser, but it
didn't work.
- Is analog Year 2000 compatible?
- Basic Configuration
- I want to make several different statistics
pages. Do I have to install several copies of analog?
- My analog.cfg included lots of
CONFIGFILE commands, but only one report was
produced.
- Why doesn't the Daily Report only show the last
six weeks?
- Why do the time reports all list 0 requests?
- How do I get the Request Report to list files
with fewer than 20 requests?
- How do I ignore accesses from my site?
- How do I ignore internal referrers in the
Referrer Report?
- How do I get information on just my pages, not
everybody's?
- How do I list subdirectories not just top-level
directories in the Directory Report?
- How do I list minor browser versions in the
Browser Summary?
- I used the command "DIREXCLUDE
/mydir/", but files in that directory were still
listed.
- I used the command
"FILEEXCLUDE /cgi-bin/script.pl", but that
file was still listed in the Request Report.
- I used the command "IMAGEDIR
C:\analog\images\", but I only got broken images.
- I want a configuration file with all of the
possible configuration commands in it.
- Does the order of the commands matter in the
configuration file?
- Why are my browser and referrer reports
empty?
- Why isn't the Referrer Report sorted
properly?
- I want to list (or not to list) referrers
with their search arguments in the Referrer Report.
- Why are my click-thru's (or CGI scripts)
not listed in the Request Report?
- I can't find /script.pl?q=1 in the
Request Report.
- Why can't I have P in the
REQCOLS, REQSORTBY or
REQFLOOR?
- Can I find out which files each referrer pointed
to?
or Can I find out which files each
host has read?
or Can I find out which hosts have
read each file?
or Can I find out the number of hosts
visiting on each day?
or lots of similar questions.
- Can I use %d, %m etc. in
the LOGFILE, like I can in the
OUTFILE?
- Can SETTINGS ON produce a
configuration file instead of an English list of settings?
- I get the message "logfiles overlap"
even though the two logfiles contain completely separate
requests.
- Can I get data on individual visitors, or
visits, to my site?
- Can I change the background colour of my
output?
- Can I change the way dates are formatted in the
output?
or Can I change some of the phrases
in the output?
- Understanding the Output
See also What the results
mean.
- How do I find out the number of hits from your
data?
- Why are there so many referrers from my own
site?
- The report covers exactly a week, but the
figures for the last seven days don't agree with the totals.
- I only have 240 requests in total. Why does
analog think there are 840 requests per week?
- Why doesn't analog agree with the counter on my
page?
- Why doesn't analog agree with grepping the
logfile?
- Why do I only get "unresolved numerical
addresses" in the Domain Report and Organisation Report?
- Why are directories listed in the Request
Report?
- When someone reads one of my PDF files, it
scores dozens of hits.
- The Organisation Report doesn't identify
organisations correctly.
- "Organization" isn't spelled
correctly.
- Advanced Usage
- How can I do such-and-such with a command line
option?
- I want a list of all command line arguments.
- How do I list all numerical subdomains to depth
2 in the Domain Report?
- I want to be able to count requests with status
code 301 and 302 as successes, so that they appear in the Request
Report.
- I want to report on a field analog doesn't know
about.
- Can analog analyse FTP logfiles?
- Can analog analyse other logfiles, such as mail
logs, or the syslog?
- How can I run analog automatically every day?
- When I run analog in a batch job, it doesn't
work.
- I'm setting up IIS. Which logfile format should
I use?
- I host lots of virtual domains. How should I set
up analog?
- Can I make several reports with just one run of
analog?
- I ran out of memory when trying to run
analog. What can I do?
- You're processing 20,000,000 requests in under
10 minutes. Why is mine much slower?
or Analog appears to stall.
- How do I make a link on my page that runs
analog?
- Do I have to save all my old logfiles?
or Can analog make statistics from an
old report instead of reading the whole logfile again?
- Can analog write to a database or
spreadsheet?
- Form Interface
See also Form
troubleshooting.
- I couldn't make the form run.
- How can I specify different logfiles from the
form interface?
- I specified
LOGFILE=C:\inetpub\wwwroot\w3svc1\*.log from the form
but it said "Unsafe characters in
LOGFILE".
- My browser showed me anlgform.pl, rather than
running it.
- Why does the form interface give "Document
Returned no Data"?
- The images don't appear when running analog from
the form interface.
- Why do I get some reports that weren't requested
on the form?
- How do I make a link to anlgform.pl
without using anlgform.html?
- Is there a form interface not using Perl
(e.g. ASP or .exe)?
- Design Decisions
- Why doesn't the HEADERFILE replace
the whole <head> of the output file?
- Why not use HTML tables?
- Why are you still using HTML 2.0?
- It would be better if you used png's instead of
gif's.
- Why not just do DNS resolution of the hosts that
actually make it into the Host Report?
- Couldn't you do the DNS lookups faster with
threads?
- Why doesn't analog analyse the error_log?
- My server lists local names in the logfile. Can
you put a common suffix on them automatically?
- Can you extrapolate from the current month's
partial data to produce a prediction for the whole month, based on
the rate so far?
- Can you extend the Domain Report to say which US
states people visited from?
- Why not use language codes instead of country
codes for the names of the language files?
- Why doesn't analog produce statistics on
"visits"?
- Why don't you sell analog?
Getting Started
Most questions in this category are answered in the section entitled
Starting to use analog. If you can't get
analog running you should look there.
- Analog doesn't have a setup.exe.
No, and it doesn't need one. It's already ready to run! See
Starting to use analog under
Windows.
- Analog just flashes up a DOS window and then
quits.
This is the correct behaviour. It should have created a report
called Report.html. See Starting
to use analog under Windows.
- When I try and edit analog.cfg,
Windows asks me which program I want to use to open that file.
Use Notepad, or any other plain text editor.
- When I try and compile analog, it gives me an
error (e.g. on SunOS 5).
Maybe you need to edit the Makefile. There are some
platform-specific notes in the section
Starting to use analog on other
platforms, and in the Makefile itself.
- Analog didn't write the logfile when I ran
it.
Analog doesn't write the logfiles. Your web server writes the
logfiles, and analog just reads them. See
Starting to use analog.
- Analog is looking for files like
/usr/local/etc/httpd/analog/analog.cfg which don't exist.
You have to set the location of these files in anlghead.h
before compiling.
- Analog won't read extended logfiles generated by
IIS.
This server writes the date only at the top of the logfile, not on
every line. But it doesn't write a new date if the date changes during
the logfile, so analog can't tell which date later entries in the log
occurred on. More details, and what to do about it, are in the section
on Choosing a logfile.
- What does "Logfile with ambiguous
dates" mean?
See the section on Errors and
warnings.
- What does this error message mean?
Again, see the section on Errors and
warnings.
- I tried to run analog from my browser, but it
didn't work.
Analog should not be run as a CGI program, or even put in the folder
with your CGI programs, for security reasons. You should use the special
CGI program instead.
- Is analog Year 2000 compatible?
Yes (and so are all previous versions). It interprets two-digit years
in input as lying between 1970 and 2069 inclusive.
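The rule above can be sketched in a few lines. This is only an illustration of the stated 1970 to 2069 window, not analog's own source code, and the function name expand_two_digit_year is hypothetical.

```python
def expand_two_digit_year(yy: int) -> int:
    """Map a two-digit year onto the window 1970-2069 inclusive."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year (0-99)")
    # 70-99 are read as 1970-1999; 00-69 are read as 2000-2069.
    return 1900 + yy if yy >= 70 else 2000 + yy


for yy in (70, 99, 0, 1, 69):
    print(f"{yy:02d} -> {expand_two_digit_year(yy)}")
```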
Basic Configuration
Analog has lots of configuration commands, all of which are described in the section on
Customising analog. Here are some of
the most common questions. If your question isn't answered here, you could
also try looking in the index.
- I want to make several different statistics
pages. Do I have to install several copies of analog?
No. Just install it once, and run it with different
configuration files. (You do have
to run it once per output page though.)
- My analog.cfg included lots of
CONFIGFILE commands, but only one report was produced.
Analog can only produce one report per run. To produce several
reports, you have to run it several times.
- Why doesn't the Daily Report only show the last
six weeks?
This is controlled by the
FULLDAYROWS command.
- Why do the time reports all list 0 requests?
They probably only list 0 requests for pages. Maybe you need to use
PAGEINCLUDE to count
more files as pages.
- How do I get the Request Report to list files
with fewer than 20 requests?
Use the REQFLOOR
command, e.g., REQFLOOR 10r to list down to 10
requests. Also, if you want to list all the files, not just pages, you
may need to use the command REQINCLUDE * as well.
- How do I ignore accesses from my site?
Use the HOSTEXCLUDE command.
- How do I ignore internal referrers in the
Referrer Report?
Use the REFREPEXCLUDE command.
- How do I get information on just my pages, not
everybody's?
Use the FILEINCLUDE command.
- How do I list subdirectories not just top-level
directories in the Directory Report?
Use SUBDIR */*
- How do I list minor browser versions in the
Browser Summary?
Use SUBBROW */*.*
- I used the command "DIREXCLUDE
/mydir/", but files in that directory were still listed.
DIREXCLUDE only affects the Directory Report, not the
other reports. You want "FILEEXCLUDE /mydir/*"
instead.
- I used the command
"FILEEXCLUDE /cgi-bin/script.pl", but that
file was still listed in the Request Report.
If the file has search arguments, you have to be a bit careful with
FILEEXCLUDE. This is described in the section about
search arguments.
- I used the command "IMAGEDIR
C:\analog\images\", but I only got broken images.
The argument to the IMAGEDIR command has to be a URL, not a
directory on your disk. (It's just inserted into the <img> tags in
the HTML output: have a look at the output and you'll see.) This also
means that the images have to be put in the part of your filespace that
holds your web files, so that the server can send them out.
- I want a configuration file with all of the
possible configuration commands in it.
One is already distributed with the program, in the
examples folder.
- Does the order of the commands matter in the
configuration file?
Only occasionally. If you have two of one command, the later one will
generally override the earlier one. Apart from that, commands can come
in any order, except that LOGFORMAT
and LOGTIMEOFFSET
commands must come before the LOGFILE to which they refer.
- Why are my browser and referrer reports
empty?
Maybe your logfile doesn't contain any browser and referrer
information?
- Why isn't the Referrer Report sorted
properly?
It is sorted properly. But search arguments
are also listed under the file they belong to, and this interrupts the
ordering. If you set the
REFARGSFLOOR high
enough you won't see the search arguments. Or you can include the
N column to make the
ordering more obvious.
- I want to list (or not to list) referrers
with their search arguments in the Referrer Report.
To see the search arguments you may need to set the
REFARGSFLOOR lower. To
avoid seeing them, you could set the REFARGSFLOOR higher, or
alternatively use the
REFARGSEXCLUDE command
to ignore them either for all files or just for particular files.
- Why are my click-thru's (or CGI scripts)
not listed in the Request Report?
If they cause a redirection to another page, they will be listed in
the Redirection Report, rather than the Request Report.
- I can't find /script.pl?q=1 in the
Request Report.
If it causes a redirection, it will be in the Redirection Report not
the Request Report. But also, you may need to set the
REQARGSFLOOR or
REDIRARGSFLOOR lower to
actually see it.
- Why can't I have P in the
REQCOLS, REQSORTBY or REQFLOOR?
The number of page requests doesn't make sense in the Request Report
because it's either the same as the number of requests (if the file is a
page) or zero (if it isn't). If you want to list only pages in this
report, use REQINCLUDE pages instead.
- Can I find out which files each referrer pointed
to?
or Can I find out which files each host has read?
or Can I find out which hosts have read each file?
or Can I find out the number of hosts visiting on each
day?
or lots of similar questions.
There are lots of questions like this. They all want analog to
cross-reference two sorts of item (e.g. files and referrers in the first
example above, or hosts and dates in the last). Granted, these would be
useful. But it is fundamental to analog's speed and minimal memory
requirement that it only records statistics for each type of item
individually, and doesn't record enough information to cross-reference
them afterwards.
What you can do is to restrict the analysis to just requests from
certain referrers (for example) with the
REFINCLUDE command, or to a
particular time period with
FROM and TO.
This is often good enough.
- Can I use %d, %m etc. in
the LOGFILE, like I can in the
OUTFILE?
No. This is rarely useful, because you can only get
at one logfile that way. If you're on Unix, you can embed the date in
the logfile name using the date command: for example,
analog access.`date +%Y%m%d`.log
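On systems without backquote substitution, the same logfile name can be built by a small wrapper script. The following is a minimal sketch in Python, assuming the logfiles follow the access.YYYYMMDD.log naming used in the example above; the helper name dated_logfile is hypothetical.

```python
from datetime import date


def dated_logfile(day: date, pattern: str = "access.%Y%m%d.log") -> str:
    """Build the logfile name for one day, like `date +%Y%m%d` does."""
    return day.strftime(pattern)


# Name of the logfile for 1 February 2001.
print(dated_logfile(date(2001, 2, 1)))

# Argument list for running analog on today's logfile, e.g. via
# subprocess.run(command); analog itself is not invoked here.
command = ["analog", dated_logfile(date.today())]
print(command)
```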
- Can SETTINGS ON produce a
configuration file instead of an English list of settings?
No. But it does tell you which configuration files it read, so you
can just get the commands out of them. Or if you want a list of all
configuration commands, there is one in the examples
directory.
- I get the message "logfiles overlap"
even though the two logfiles contain completely separate requests.
This message is based only on the dates of the files, not the
contents. If you're sure there is no problem, you can turn it off with
the command WARNINGS -L.
- Can I get data on individual visitors, or
visits, to my site?
No, it's not technically possible, and don't believe any program
which tells you it is. See the section on
How the web works for details.
- Can I change the background colour of my
output?
Yes. The correct way to do this is to write a style sheet, and then
use the STYLESHEET
command.
- Can I change the way dates are formatted in the
output?
or Can I change some of the phrases in the output?
Yes, by editing the language
file.
Understanding the Output
Most of the questions in this category are answered in the section on
What the results mean, which I really
recommend you read if you want to understand what analog is telling you.
- How do I find out the number of hits from your
data?
I don't use the word hits, because people use it in
different ways, so it's misleading. I use requests for the
number of transfers of any type of file (text, graphics, ...), and
page requests for the number of transfers of HTML pages. See the
section on Analog's definitions
for more information.
- Why are there so many referrers from my own
site?
These come from all the internal links on your site, and all the
graphics on your pages. See the section on
How the web works for more
information. If you don't want to see them, you can use
REFREPEXCLUDE to
exclude them.
- The report covers exactly a week, but the
figures for the last seven days don't agree with the totals.
The figures in parentheses are for the seven days before the time
the program was run, unless there is a TO command. They
are never for the seven days before the end of the logfile.
(Although if you know that the logfile only contains entries up to a
certain time, you may want to include a TO command for that
time to get the last seven days' data right.)
- I only have 240 requests in total. Why does
analog think there are 840 requests per week?
If you have 240 requests in two days, that's a rate of 840 requests
per week. Just like if you drove 28 miles in 20 minutes, you'd have
driven at 84 miles per hour.
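The arithmetic is just a rate scaled up to seven days. A minimal sketch, assuming the logfile covers exactly two days (how analog measures the period covered is not shown here; the function name is hypothetical):

```python
def requests_per_week(total_requests: int, days_covered: float) -> float:
    """Scale a request count over some period to a weekly rate."""
    return total_requests / days_covered * 7


# 240 requests in 2 days is 120 per day, i.e. 840 per week.
print(requests_per_week(240, 2))
```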
- Why doesn't analog agree with the counter on my
page?
There are lots of possible reasons. Do they both start from
the same date? Are you just looking at requests for that one page with
analog, not for all your other pages and graphics? Also, analog will
record all requests to that page; if your counter is a graphic, it will
only measure requests from people on graphical browsers who reached
that place on the page.
- Why doesn't analog agree with grepping the
logfile?
Have you understood what analog includes in
its counts? In particular, most reports only list "successful"
requests (HTTP status codes 200-209 & 304). A naïve grep would
count failures too.
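To see the size of the difference, you can count both ways yourself. The sketch below is not analog's parser: it assumes logfile lines in the common log format, uses made-up sample lines, and applies only the status-code rule quoted above (200-209 and 304 count as successes).

```python
import re

# Four made-up lines in common log format (an assumption for illustration).
SAMPLE_LOG = """\
host1 - - [01/Feb/2001:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 1043
host2 - - [01/Feb/2001:10:00:05 +0000] "GET /missing.html HTTP/1.0" 404 210
host1 - - [01/Feb/2001:10:00:09 +0000] "GET /index.html HTTP/1.0" 304 0
host3 - - [01/Feb/2001:10:01:00 +0000] "GET /dir HTTP/1.0" 301 0
"""

# The status code is the three-digit field after the quoted request.
STATUS_RE = re.compile(r'" (\d{3}) ')


def is_success(status: int) -> bool:
    """Status codes counted as successful: 200-209 and 304."""
    return 200 <= status <= 209 or status == 304


def count_requests(log_text: str) -> tuple[int, int]:
    """Return (all lines with a status code, successful requests only)."""
    total = successes = 0
    for line in log_text.splitlines():
        match = STATUS_RE.search(line)
        if match is None:
            continue  # skip lines that don't look like log entries
        total += 1
        if is_success(int(match.group(1))):
            successes += 1
    return total, successes


total, successes = count_requests(SAMPLE_LOG)
print(f"a naive count sees {total} lines, but only {successes} are successes")
```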
- Why do I only get "unresolved numerical
addresses" in the Domain Report and Organisation Report?
Your server only records the numerical IP address of the hosts that
contact you, not their names. Read the section about
DNS lookups, or turn DNS resolution
on in your server.
- Why are directories listed in the Request
Report?
They are not directories, they are pages with the same name as
the directory. For example, I have both a directory called
/analog/ and a page called /analog/ (which happens
to be the same as /analog/index.html).
- When someone reads one of my PDF files, it
scores dozens of hits.
PDF files are often downloaded and read one page at a time, and each
page will then count as a separate request. Although this is not ideal,
it's much less clear what to do about it. Analog has no way of knowing
how many pages constituted a single download in the reader's mind. As
usual, we can only reliably report how many requests there were at the
server, not guess what users did with the file later.
- The Organisation Report doesn't identify
organisations correctly.
The rules I use are described in the section on
The domains file.
I admit they aren't perfect, but this is because in domains in which
organisations aren't all at the same level in the domain hierarchy,
there is no way to identify them perfectly without long lists.
- "Organization" isn't spelled
correctly.
Yes it is. If you want American spellings, you have to specify
LANGUAGE US-ENGLISH
in your configuration file.
Advanced Usage
- How can I do such-and-such with a command line
option?
Use the +C option to put
any configuration command on the command line.
- I want a list of all command line arguments.
There is a list in the index.
- How do I list all numerical subdomains to depth
2 in the Domain Report?
SUBDOMAIN *.* deliberately only lists the top-level
numerical subdomains to avoid cluttering the output.
SUBDOMAIN *.*.* will work but will list everything else
to depth 3. So the best solution is
SUBDOMAIN 1*.*,2*.*,3*.*,...
- I want to be able to count requests with status
code 301 and 302 as successes, so that they appear in the Request
Report.
No, you really don't, because that would lead to double counting
when a request for /dir (code 301) is redirected to
/dir/ (code 200). For CGI scripts etc. look in the
Redirection Report instead of the Request Report.
- I want to report on a field analog doesn't know
about.
Use the following kludge. Write a
LOGFORMAT to declare the field to
be a virtual host or a user (whichever you aren't already using). Then
edit your language file so that the right text is output.
- Can analog analyse FTP logfiles?
Yes. If you are using the xferlog format, then there is a
configuration file to help you in the examples
directory. Otherwise you will have to write your own
LOGFORMAT. (You probably won't be
able to read anything other than the lines corresponding to file
transfers.)
- Can analog analyse other logfiles, such as mail
logs, or the syslog?
Yes and no. For mail logs, there is a program on the
helper applications page to help you. For
other logs, you can get some results out by writing your own
LOGFORMAT. But analog does make
some assumptions about the sort of information it expects on a logfile
line, and the further these assumptions are from being met, the harder
it will be!
- How can I run analog automatically every day?
This depends on your particular machine. On Unix, you need to run
analog as a cron job (see "man cron"). This is my cron command
to run it at 1:50am every day:
50 1 * * * $HOME/bin/analog
On Windows NT you can do the same with the at command. (It's
probably easiest to put it in a batch job; also only an
administrator can run at.) On Windows 98, it should be possible with the
Task Scheduler, although I haven't tried it. On Windows 95 it's not
possible as far as I know.
On Mac, there are programs called
Cron or
CronoTask
to do this.
- When I run analog in a batch job, it doesn't
work.
Most likely it can't find its configuration files etc. because you
are in the wrong directory. On Windows, the first command in your batch
file should be a cd command to change to the analog
directory.
- I'm setting up IIS. Which logfile format should
I use?
The W3C format is probably best. You can turn fields on and off in
this format. And it contains all the possible fields which can be
logged, which the other formats do not. However, it is important to turn
the date field on (it's off by default), not just to log the date once
at the top: see the section on problems
with logfile formats for why.
- I host lots of virtual domains. How should I set
up analog?
There's a file in the examples directory which discusses
this issue.
- Can I make several reports with just one run of
analog?
Not at the moment. I want to do this in a future version, but it will
require some considerable work. However, depending on which options
you want to vary, you may be able to avoid having to read the logfile
several times by using cache files. (This is
likely to be faster, but more complicated.)
- I ran out of memory when trying to run
analog. What can I do?
See the section on Coping with low memory.
- You're processing 20,000,000 requests in under
10 minutes. Why is mine much slower?
or Analog appears to stall.
If you have DNS lookups on, they are very
slow. Otherwise, it probably depends on the speed of your computer and
disks, and what other programs are running at the same time. You can
use the PROGRESSFREQ
command to see
if it's really stalled or whether it's just being slow. If you are
running out of memory, you might find analog's
LOWMEM commands helpful.
- How do I make a link on my page that runs
analog?
Link to the anlgform program, with the
desired options. But be careful about the load on your server.
- Do I have to save all my old logfiles?
or Can analog make statistics from an old report instead
of reading the whole logfile again?
These questions are answered in the section about
Cache files.
- Can analog write to a database or
spreadsheet?
Use the computer-readable output style,
which can export to CSV. Or if what you really want to do is to run
analog again without re-reading the logfiles, read the section about
Cache files.
Form Interface
There is also a section on troubleshooting in
the documentation about the form interface.
- I couldn't make the form run.
Have you made analog work without the form? Have you run
anlgform.pl from the command line as explained in the section
on troubleshooting?
- How can I specify different logfiles from the
form interface?
Just add a new field to the form with name=LOGFILE
- I specified
LOGFILE=C:\inetpub\wwwroot\w3svc1\*.log from the form but it
said "Unsafe characters in LOGFILE".
On the form, you can't use wildcards in the LOGFILE
name for security reasons.
- My browser showed me anlgform.pl, rather than
running it.
You have to tell the server to execute the CGI program, not just
send it out like it would for a normal file. Often this is done by
putting it in a special /cgi-bin/ directory.
- Why does the form interface give "Document
Returned no Data"?
If the message only appears after a while, then the server is probably
giving up before the analog process has finished running. Increase the
timeout interval on the server.
- The images don't appear when running analog from
the form interface.
You probably need to set the
IMAGEDIR. If the images
are in your /cgi-bin/ directory, the server will normally try
to execute them instead of just sending them out.
- Why do I get some reports that weren't requested
on the form?
If a report is neither included nor excluded on the form, the
system default will be used. This will depend on your configuration files
and on compile-time settings.
- How do I make a link to anlgform.pl
without using anlgform.html?
anlgform.pl accepts the GET or POST
methods of form submission. So you can make a link with the arguments
passed after a question mark in the usual GET way.
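As an illustration, such a link can be built by URL-encoding the form fields. This is a sketch only: the script location /cgi-bin/anlgform.pl and the logfile path are hypothetical, and LOGFILE is used as the field name only because it is the one mentioned above. Check anlgform.html for the field names your copy actually uses.

```python
from urllib.parse import urlencode


def anlgform_link(script_url: str, fields: dict[str, str]) -> str:
    """Append form fields to the script URL in the usual GET way."""
    return script_url + "?" + urlencode(fields)


# Hypothetical script location and logfile path.
link = anlgform_link("/cgi-bin/anlgform.pl", {"LOGFILE": "/var/log/access.log"})
print(link)
```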
- Is there a form interface not using Perl
(e.g. ASP or .exe)?
There is a Windows executable version of the Perl script on the
analog helpers page. At the time of writing,
I don't know of any ASP version of the anlgform program, but if someone
writes one, I'll put it on the analog helpers
page too.
Warning: Potential authors must understand CGI
security issues in general, and the extra
issues about what the analog form interface must disallow, or they
will open security holes on their system.
Design Decisions, or "Why didn't you do it this way?"
- Why doesn't the HEADERFILE replace
the whole <head> of the output file?
Because you almost never get valid HTML that way. Use a
style sheet instead.
- Why not use HTML tables?
Most non-graphical browsers don't do a good job with tables. Also
tables aren't available in HTML 2.0, which is the sort of HTML
analog writes.
- Why are you still using HTML 2.0?
It seems to be impossible to make my bar charts in HTML 4.0.
- It would be better if you used png's instead of
gif's.
I'm aware of the issues. But png support isn't good enough even in
new browsers; and I have always made a point of designing analog to work
even on old browsers.
- Why not just do DNS resolution of the hosts that
actually make it into the Host Report?
There is one theoretical and one practical problem. Theoretically,
the problem is that which hosts do make it into the Host Report can
change when the DNS lookups have been done. And practically, this
wouldn't help identify the busiest countries or organisations, which is
usually what you really want to know. However, there is a Perl script on
the helper applications page to do this.
- Couldn't you do the DNS lookups faster with
threads?
The problem is, the standard commands for DNS lookups are not
thread-safe on many platforms, so it would involve a lot of
platform-specific code. Again, there are programs for specific platforms
on the helper applications page.
- Why doesn't analog analyse the error_log?
The error log is intended for humans rather than computers to read.
So there is no consistent format: even different versions of the same
server have different formats. And there is not much need to analyse it
because analog's various failure reports are good enough for almost all
purposes.
- My server lists local names in the logfile. Can
you put a common suffix on them automatically?
This wouldn't be a good idea by default, because things like
"unknown" would get the suffix. You can always add them using
HOSTALIAS. (There is
an example to accomplish this using regular expressions in the
section about aliases.)
- Can you extrapolate from the current month's
partial data to produce a prediction for the whole month, based on the
rate so far?
No. There are too many problems in trying to produce anything
sensible, especially near the beginning of the month. Different days of
the week and different times of day cause lots of problems. I would
prefer to produce raw accurate data than suspect derived data.
- Can you extend the Domain Report to say which US
states people visited from?
No. Some programs pretend to do this, but you can actually only tell
which state the computer the person was using is in. For ISPs or other
large organisations, that may be quite different from where the user was.
- Why not use language codes instead of country
codes for the names of the language files?
People are more familiar with the country codes. And not all of my
languages have language codes anyway.
- Why doesn't analog produce statistics on
"visits"?
See How the Web Works.
- Why don't you sell analog?
I didn't write analog for the money, and I'm happy just to see
people use it. Also, by making it open source, lots of people send me
ideas and code to include in future versions. How do you think I got all
those languages? (Of course, if you want to send me money, or gifts in
kind, or even just postcards...).
Go to the analog home page.
Stephen Turner
01 February 2001
Need help with analog? Use the analog-help
mailing list.