
Readme for analog 4.02

Inclusions and exclusions

After aliasing each item, analog decides whether that item is wanted or not. The whole line is only counted if all the items are wanted. Whether an item is wanted or not is determined by INCLUDE and EXCLUDE commands specified by the user. These commands can be used to exclude requests from your local users, for example, or to analyse only files in a subdirectory. For example
HOSTEXCLUDE mycomputer.myisp.com
would exclude all requests by that computer from the statistics.

The rule for determining whether an item is included or excluded is as follows. All the INCLUDE and EXCLUDE commands for that item are considered one by one in order, and the item is included or excluded according to the last command it matched. Items which don't match any of the INCLUDE or EXCLUDE commands are included if the first command was an exclusion, and excluded if the first command was an inclusion. For example, the configuration

FILEINCLUDE /~sret1/*
FILEEXCLUDE /~sret1/backgammon/*,/~sret1/analog/*
FILEINCLUDE /~sret1/backgammon/*.gif
would instruct the program to examine only my files, excluding my backgammon and analog files, but including gifs in my backgammon directory. On the other hand,
FILEEXCLUDE /~sret1/*/img/*
would analyse all files, except for images in my various directories. Note that inclusions and exclusions can contain any number of wildcards.
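
The rule above can be sketched in a few lines of Python. This is only an illustration of the last-match rule as described here, not analog's own code: the function names are made up for the example, and it assumes that * is the only wildcard and that matching is case sensitive.

```python
import re

def wildcard_to_regex(pattern):
    # Assumption: '*' matches any run of characters (including '/');
    # every other character is taken literally.
    return re.compile(".*".join(re.escape(p) for p in pattern.split("*")), re.DOTALL)

def is_wanted(item, commands):
    """Decide whether an item is wanted.

    commands is a list of ("INCLUDE" | "EXCLUDE", "pat1,pat2,...") pairs
    in the order they appear in the configuration.
    """
    if not commands:
        return True
    # An item matching nothing gets the opposite of the first command.
    wanted = commands[0][0] == "EXCLUDE"
    for action, patterns in commands:
        for pattern in patterns.split(","):
            if wildcard_to_regex(pattern).fullmatch(item):
                # Later matches override earlier ones.
                wanted = action == "INCLUDE"
    return wanted

my_files = [
    ("INCLUDE", "/~sret1/*"),
    ("EXCLUDE", "/~sret1/backgammon/*,/~sret1/analog/*"),
    ("INCLUDE", "/~sret1/backgammon/*.gif"),
]
no_images = [("EXCLUDE", "/~sret1/*/img/*")]
```

With these lists, is_wanted("/~sret1/backgammon/board.gif", my_files) is true because the last matching command is an inclusion, while a file outside /~sret1/ is unwanted because the first command in my_files was an inclusion.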

The full list of these commands is HOSTINCLUDE and HOSTEXCLUDE; FILEINCLUDE and FILEEXCLUDE; BROWINCLUDE and BROWEXCLUDE; REFINCLUDE and REFEXCLUDE; USERINCLUDE and USEREXCLUDE; and VHOSTINCLUDE and VHOSTEXCLUDE.

Because the inclusions and exclusions take place after the aliasing, the name you must use is the aliased name. (In the absence of OUTPUTALIAS commands, this is the name of the item in the output.)

Sometimes a line doesn't contain a particular sort of item, either because there is no field reserved for it on the line, or because the browser didn't send it for that request. You can include or exclude these lines by making a special blank entry in the INCLUDE or EXCLUDE command. For example,

USERINCLUDE jim
USERINCLUDE ""
would include lines from user jim and lines without any user specified.

The behaviour of FILEINCLUDE and REFINCLUDE can be slightly unintuitive if the file has search arguments.

On suitable operating systems, you can use regular expressions for the inclusions and exclusions by prefixing the expression with "REGEXP:" or "REGEXPI:". I've already described this at length in the context of aliases, so you can look there for all the details.

If you get confused with all the inclusions and exclusions, remember that you can always run analog -settings to see what the options you have specified represent.


There is also one other pair of commands which belongs in this category, namely the FROM and TO commands. These specify a time period to restrict the analysis to. The simplest usage of these commands is FROM yyMMdd or FROM yyMMdd:hhmm, where yy represents the last two digits of the year (analog assumes that the year is between 1970 and 2069), MM represents the month, dd is the date, hh the hour, and mm the minute. So, for example, to analyse only requests from July 1999 to June 2000 I would use the configuration
FROM 990701
TO   000630
Alternatively, each of the components can be preceded by + or - to represent time relative to the time at which the program was invoked. In this case, the date can have more than 2 digits. This allows constructions like
FROM -01-00+01   # from tomorrow last year
TO -00-0131  # to the end of last month (OK even if last month
             # didn't have 31 days)
FROM -00-00-112
TO   -00-00-01  # statistics for the last 16 weeks
FROM -00-00-00:-06+01  # statistics for the last 6 hours
There are command line abbreviations +F and +T for the FROM and TO commands; for example, +T-00-00-01:1800 looks at statistics until 6pm yesterday. -F and -T turn off the from and to, as do FROM OFF and TO OFF.
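
The absolute form of these dates can be sketched in Python as follows. This is an illustration only, not analog's parser: the function name parse_absolute is hypothetical, the relative forms with + and - are deliberately left out, and defaulting a missing time to 00:00 is an assumption of the sketch rather than a statement about how analog treats a TO without a time.

```python
import datetime
import re

# yyMMdd, optionally followed by :hhmm
_ABSOLUTE = re.compile(r"(\d{2})(\d{2})(\d{2})(?::(\d{2})(\d{2}))?")

def parse_absolute(value):
    """Parse 'yyMMdd' or 'yyMMdd:hhmm' into a datetime.

    Relative values such as '-00-00-01' are rejected with ValueError.
    """
    match = _ABSOLUTE.fullmatch(value)
    if match is None:
        raise ValueError(f"not an absolute FROM/TO value: {value!r}")
    yy, month, day = (int(g) for g in match.group(1, 2, 3))
    # Two-digit years are read as 1970 to 2069: 70-99 -> 19yy, 00-69 -> 20yy.
    year = 1900 + yy if yy >= 70 else 2000 + yy
    # Assumption of this sketch: a missing time means 00:00.
    hour = int(match.group(4) or 0)
    minute = int(match.group(5) or 0)
    return datetime.datetime(year, month, day, hour, minute)
```

So parse_absolute("990701") gives 1 July 1999 and parse_absolute("000630") gives 30 June 2000, matching the example above.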

There are also INCLUDE and EXCLUDE commands for most of the reports. These exclude individual lines from particular reports. So, for example, the command
REFREPEXCLUDE http://your.site.com/*
would exclude your internal referrers from the Referrer Report. However, it would not exclude them from the Failed Referrer Report, the Referring Site Report, etc. (you need to use FAILREFEXCLUDE, REFSITEEXCLUDE etc. for that); nor would it prevent other analysis of logfile lines with those referrers, as REFEXCLUDE would. Also REFREPEXCLUDE would include the referrers in the "not listed" line at the bottom of the report.
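
The difference between the two kinds of exclusion can be sketched with some made-up data. The referrers below are hypothetical, and the Python is only a model of the behaviour described above, not of how analog computes its reports.

```python
from collections import Counter

# Hypothetical referrers, one per logfile line.
referrers = [
    "http://your.site.com/index.html",
    "http://your.site.com/about.html",
    "http://elsewhere.example/links.html",
]

def is_internal(ref):
    # Stands in for the pattern http://your.site.com/*
    return ref.startswith("http://your.site.com/")

# REFEXCLUDE-style: matching logfile lines are dropped before any analysis.
lines_after_refexclude = [r for r in referrers if not is_internal(r)]

# REFREPEXCLUDE-style: every line is still analysed; only the rows of
# this one report are filtered, and the hidden rows are counted in the
# "not listed" line.
counts = Counter(referrers)
report_rows = {r: n for r, n in counts.items() if not is_internal(r)}
not_listed = sum(n for r, n in counts.items() if is_internal(r))
```

In the first case only one line is left for every report; in the second all three lines are still analysed, but the Referrer Report shows one row and a "not listed" count of 2.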

The full list of these commands is REQINCLUDE and REQEXCLUDE; REDIRINCLUDE and REDIREXCLUDE; FAILINCLUDE and FAILEXCLUDE; TYPEINCLUDE and TYPEEXCLUDE; DIRINCLUDE and DIREXCLUDE; HOSTREPINCLUDE and HOSTREPEXCLUDE; DOMINCLUDE and DOMEXCLUDE; ORGINCLUDE and ORGEXCLUDE; REFREPINCLUDE and REFREPEXCLUDE; REFSITEINCLUDE and REFSITEEXCLUDE; SEARCHQUERYINCLUDE and SEARCHQUERYEXCLUDE; SEARCHWORDINCLUDE and SEARCHWORDEXCLUDE; REDIRREFINCLUDE and REDIRREFEXCLUDE; FAILREFINCLUDE and FAILREFEXCLUDE; BROWSUMINCLUDE and BROWSUMEXCLUDE; FULLBROWINCLUDE and FULLBROWEXCLUDE; OSINCLUDE and OSEXCLUDE; VHOSTREPINCLUDE and VHOSTREPEXCLUDE; USERREPINCLUDE and USERREPEXCLUDE; and FAILUSERINCLUDE and FAILUSEREXCLUDE. The inclusion or exclusion applies to the unaliased name, if you are doing any output aliases.

You can also use the symbolic word pages in suitable INCLUDE and EXCLUDE commands; one very common command is

REQINCLUDE pages
to include only pages in the request report.

Analog determines which files should count as pages (and thus which requests count as page requests) using another INCLUDE/EXCLUDE pair, called PAGEINCLUDE and PAGEEXCLUDE. By default, *.html, *.htm and directories (*/) count as pages. But you can change the list by commands like
PAGEINCLUDE *.ps,*.ps.gz
PAGEEXCLUDE /sret1.html
I.e., PostScript and gzipped PostScript are pages, but /sret1.html isn't. (If the file has search arguments, the PAGEINCLUDE and PAGEEXCLUDE are reckoned just on the part of the filename before the question mark.)
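
The page test can be sketched in Python like this. It is an illustration under stated assumptions, not analog's implementation: the names is_page and DEFAULT_PAGE_RULES are invented for the example, the defaults are modelled as if they were an initial PAGEINCLUDE to which the user's commands are added, and * is taken to be the only wildcard.

```python
import re

def _matches(pattern, name):
    # Assumption: '*' matches any run of characters; the rest is literal.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name, re.DOTALL) is not None

# Modelled as an initial inclusion (an assumption of this sketch).
DEFAULT_PAGE_RULES = [("INCLUDE", "*.html,*.htm,*/")]

def is_page(filename, user_rules=()):
    # Only the part before the question mark is considered.
    name = filename.split("?", 1)[0]
    page = False
    for action, patterns in list(DEFAULT_PAGE_RULES) + list(user_rules):
        if any(_matches(p, name) for p in patterns.split(",")):
            # The last matching command decides.
            page = action == "INCLUDE"
    return page

rules = [("INCLUDE", "*.ps,*.ps.gz"), ("EXCLUDE", "/sret1.html")]
```

With these rules, is_page("/paper.ps.gz", rules) is true, is_page("/sret1.html", rules) is false, and is_page("/index.html?x=1", rules) is true because the search arguments are ignored.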

There is one more set of INCLUDE and EXCLUDE commands which I'll describe now. In the Request Report and the three referrer reports (Referrer Report, Redirected Referrer Report and Failed Referrer Report), analog can link to the files which it's listing. There are commands LINKINCLUDE and LINKEXCLUDE for the Request Report, and REFLINKINCLUDE and REFLINKEXCLUDE for the referrer reports, to specify exactly which files are linked to. So, for example, REFLINKINCLUDE pages would link to pages in the three referrer reports.

There is one final set of INCLUDE and EXCLUDE commands to include or exclude the search arguments at the end of URLs. But there are some slightly complicated issues surrounding those, so they deserve a new section.

Stephen Turner
