AWStats

Web Log Analyzer


Introduction

Advanced Web Statistics - AWStats is an "old-school" log file analyzer. It was written by Laurent Destailleur in 2000. As it is written in Perl, AWStats can be run on any system with a Perl interpreter. I use Strawberry Perl but there are also other distributions available such as ActiveState Perl.

In order for Perl programs to work it is easiest if Perl is available via the PATH settings on Windows. Tilburg Science Hub has a simple explanation on how to do this.


Installation

AWStats is a Perl script program. It needs a Perl interpreter and both can be placed anywhere you like. It can be run directly from the command line which makes it particularly easy to add to Windows Task Scheduler, but in order to be used, you need to know the locations of the Perl interpreter, the AWStats script and its ancillary files.

AWStats can be run as a CGI program from the browser, but I don't want to allow that, so I did not have to make any changes to Apache's configuration files.

Organization

I copied all the AWStats folders and files to my main Apache folder. In the main AWStats folder, I created a new one named data to keep AWStats' processed history files.

I then created a new folder named awslogs in my utils folder on the website, i.e. somewhere that is accessible from the website, but it could be anywhere if you want to keep the statistics private. Because of the number of files that AWStats produces, I made a folder for each year and inside that, one for each month of that year. The only folder that needs to be copied there from the AWStats directory is the icon one from AWStats' wwwroot folder.

AWStats file locations
Item | Location
Main program | C:\Apache24\awstats
Executable | C:\Apache24\awstats\wwwroot\cgi-bin
Config file | C:\Apache24\awstats\wwwroot\cgi-bin
Log files | C:\Apache24\logs
AWStats log files | C:\Apache24\awstats\data\brisray
Main output | C:\Apache24\htdocs\brisray\utils\awslogs (https://brisray.com/utils/awslogs/)

[Image: The newly created AWStats folders, both the program (left) and partially completed output folders (right)]

Configuration

In awstats\wwwroot\cgi-bin is a file named awstats.model.conf, which is an example configuration file for AWStats. I made a copy of this file and named it awstats.brisray.conf. This is going to be my personal configuration file for AWStats. There were just a few changes that had to be made:

The LogFile line was commented out as there are lots of old Apache files to be processed to start with and these can be specified on the command line.

The SiteDomain line was changed to the name of my site - SiteDomain="brisray.com"

The DirIcons line was changed to the relative path to where the output HTML files are - DirIcons="../icon"

The DirData line was changed to where I want the processed data files to be kept - DirData="C:/Apache24/awstats/data/brisray"

The line TrapInfosForHTTPErrorCodes was changed because AWStats reports all these error codes but does not produce reports for all of them. The line is now - TrapInfosForHTTPErrorCodes="206 400 401 403 404 405 500"

The line NotPageList was changed because I use webp images on the site and that extension was not included in the original file list. The line is now:

NotPageList="css js class gif jpg jpeg png bmp webp ico rss xml swf eot woff woff2"
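The edits listed above could also be applied to a copy of awstats.model.conf by a script rather than by hand. What follows is only a minimal Python sketch under stated assumptions: the helper name customise_config is hypothetical, each directive is assumed to sit on its own Name=value line, and directives missing from the model are simply appended at the end.

```python
import re

# Matches an active (uncommented) directive line such as: SiteDomain=""
DIRECTIVE = re.compile(r"^\s*(\w+)\s*=")


def customise_config(model_text, overrides, comment_out=()):
    """Return a copy of the model config text with selected directives changed.

    overrides   -- mapping of directive name to its new value
    comment_out -- directive names to disable by prefixing the line with '#'
    (hypothetical helper for illustration, not part of AWStats)
    """
    pending = dict(overrides)
    result = []
    for line in model_text.splitlines():
        match = DIRECTIVE.match(line)
        name = match.group(1) if match else None
        if name in comment_out:
            result.append("#" + line)
        elif name in pending:
            result.append(f'{name}="{pending.pop(name)}"')
        else:
            result.append(line)
    # Any directive that was not present in the model is appended.
    for name, value in pending.items():
        result.append(f'{name}="{value}"')
    return "\n".join(result) + "\n"


# The values below are the ones described in the text above.
BRISRAY_OVERRIDES = {
    "SiteDomain": "brisray.com",
    "DirIcons": "../icon",
    "DirData": "C:/Apache24/awstats/data/brisray",
    "TrapInfosForHTTPErrorCodes": "206 400 401 403 404 405 500",
}
```

Reading awstats.model.conf, passing its text through customise_config with comment_out=("LogFile",), and writing the result to awstats.brisray.conf would reproduce the manual steps, provided the assumptions above hold for the model file in use.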


Some Problems

Of course, some of these problems were due to my misunderstanding the documentation. I initially had problems making AWStats aware of the image folders it needs and with the dropdowns used to get to the various month/year statistics. I had a lot of trouble getting AWStats to work at all. It wasn't just my misunderstanding of the documentation. After a lot of faffing around, I realized that some of my older log files dating from 2011 were corrupt in some way and AWStats had a lot of trouble creating reports from them. In fact, AWStats did very well to get anything at all from these old files. Because of the amount of information that AWStats extracts from the log files, it is fussier than some other log file analyzers about the state of the logs.

Debug

To help understand what was going on and why AWStats was not working correctly, I enabled debugging. This is a two-step process. First, the configuration file line has to be changed from

DebugMessages=0

to

DebugMessages=1

Then, on the command line, specify the debug level (1 - 5) and optionally write the debug output to a file using the > redirection character. At level 5, the debug file is going to be quite large. Remember that the configuration file should be named awstats.site.conf, so my site configuration file is named awstats.brisray.conf. For example:

awstats.pl -config=brisray -debug=5 > C:\Apache24\awstats\debug.txt

From this I learned that AWStats was reading the log file and writing its history file to ./awstats012009.brisray.txt, which is in the same folder I ran awstats.pl from. As far as I could tell from the debug files, it was not even trying to write the HTML files to display on the website.

Silly me! It turned out I missed the "=" sign between one of the directives and its value and that I managed to mess up the output file name on the command line.

The value of the DebugMessages directive was returned to 0 after I had solved the problems.
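Since one of the causes was a directive with its "=" sign missing, a quick check of the configuration file before running AWStats might have saved some time. The following is a rough Python sketch, not an AWStats feature: the function name is hypothetical, and it assumes that every line that is not blank, not a comment and not an Include statement should contain an "=" sign.

```python
def find_lines_missing_equals(config_text):
    """Return (line_number, text) pairs for directive lines that lack '='.

    Hypothetical helper: blank lines, '#' comments and lines starting with
    'Include ' are skipped on the assumption that they are not directives.
    """
    problems = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.lower().startswith("include "):
            continue
        if "=" not in stripped:
            problems.append((number, stripped))
    return problems


sample = 'SiteDomain="brisray.com"\nDirIcons "../icon"\nDebugMessages=0\n'
for number, text in find_lines_missing_equals(sample):
    print(f"line {number}: no '=' found in: {text}")
```

This only catches the one mistake described above. It does not validate directive names or values.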

One Config File - Multiple Virtual Hosts

I run three websites as vhosts on the server: brisray.com, hmsgambia.org and ihor4x4.com. I have already arranged in Apache's vhost configuration file to split the log files out for each domain. There are only two lines in awstats.brisray.conf that are actually specific to the site, so I had thought of creating a new file called awstats.mysites.conf, removing the site-specific lines and using the command line to supply them instead.

Unfortunately, that does not work. AWStats expects the directives to be in the configuration files and not on the command line. When I tried it I got the error message "Error: SiteDomain parameter not defined in your config/domain file. You must edit it for using this version of AWStats."
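Because the directives have to live in a configuration file, one alternative is to generate a separate file for each vhost from a single shared text. The Python sketch below is only an illustration: the placeholders __DOMAIN__ and __NAME__, the function name, and the guess that the two site-specific lines are SiteDomain and DirData are all assumptions, not something AWStats defines.

```python
SITES = ["brisray.com", "hmsgambia.org", "ihor4x4.com"]

# Shared settings with hypothetical placeholders for the site-specific values.
SHARED = (
    'SiteDomain="__DOMAIN__"\n'
    'DirData="C:/Apache24/awstats/data/__NAME__"\n'
    'DirIcons="../icon"\n'
)


def build_site_configs(shared_text, sites):
    """Return a dict mapping awstats.<name>.conf to the filled-in config text."""
    configs = {}
    for domain in sites:
        name = domain.split(".")[0]  # e.g. "brisray" from "brisray.com"
        text = shared_text.replace("__DOMAIN__", domain).replace("__NAME__", name)
        configs[f"awstats.{name}.conf"] = text
    return configs


for filename in build_site_configs(SHARED, SITES):
    print(filename)
```

Each generated text would then be saved next to awstats.pl and selected with -config=brisray, -config=hmsgambia and so on.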

Multiple Log Files

AWStats is designed to read Apache's log files before they are rotated or split. I have several years of log files I want AWStats to process which have already been split by year and month, but there appears to be no way for AWStats to work on multiple log files sequentially. Instead, AWStats insists on the split files being concatenated into a large one and comes with a utility named logresolvemerge.pl to do this.

It may be easier and quicker if I write a batch or PowerShell script so AWStats can process the log files one at a time.
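Whatever script is used, the files need to be handed to AWStats oldest first. Here is a small Python sketch of just that ordering step. It assumes the log files are named in the form site-access-yyyy-mm.log, and the function name is made up for this example.

```python
import re

# Expected name shape: <site>-access-<yyyy>-<mm>.log
LOG_NAME = re.compile(
    r"^(?P<site>[\w-]+)-access-(?P<year>\d{4})-(?P<month>\d{2})\.log$"
)


def logs_in_order(filenames, site="brisray"):
    """Return the monthly log files for one site, oldest first.

    Files for other sites, or with names that do not match the expected
    pattern, are ignored.
    """
    dated = []
    for filename in filenames:
        match = LOG_NAME.match(filename)
        if match and match.group("site") == site:
            year = int(match.group("year"))
            month = int(match.group("month"))
            dated.append((year, month, filename))
    return [filename for _, _, filename in sorted(dated)]


names = ["brisray-access-2023-10.log", "error.log", "brisray-access-2011-03.log"]
for name in logs_in_order(names):
    print(name)
```

On the server the list of names could come from os.listdir on the logs folder, with each result then passed to awstats.pl through the -LogFile option.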

Dropdowns

AWStats is capable of making dropdowns on the pages it produces to navigate between the month and year reports. This only appears to work properly if the reports are produced as a CGI and not as static pages built with commands of the kind shown in the documentation, such as:

perl awstats.pl -config=mysite -output=downloads -staticlinks > awstats.mysite.downloads.html

All Reports

In the same section of documentation as above it says that "you can use the awstats_buildstaticpages tool to build all these pages in one command". It appears that this only works if it is run as a CGI, as I got the message it couldn't find the configuration in the Apache configuration files.

Downloads Report

There appears to be an omission in the list of reports in the Building and reading reports documentation. The command for making the full list of downloads is missing. It should of course be:

perl awstats.pl -config=mysite -output=downloads -staticlinks > awstats.mysite.downloads.html

Error Code Reports

I use the directive TrapInfosForHTTPErrorCodes="206 400 401 403 404 405 500" in the AWStats configuration file. To produce the full listing for each of these codes, the same style of command line as the one given for the 404 error codes in Building and reading reports can be used. For example:

perl awstats.pl -config=mysite -output=errors206 -staticlinks > awstats.mysite.errors206.html
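The report names follow directly from the codes in the directive, so the commands can be generated rather than typed out. This is a short Python sketch with hypothetical function names. It only builds the command text in the same style as the example above and does not run anything.

```python
def error_report_names(trap_codes):
    """Turn the TrapInfosForHTTPErrorCodes value into -output report names."""
    return [f"errors{code}" for code in trap_codes.split()]


def static_report_command(config, report):
    """Build a command line in the same style as the example above."""
    return (
        f"perl awstats.pl -config={config} -output={report} "
        f"-staticlinks > awstats.{config}.{report}.html"
    )


for report in error_report_names("206 400 401 403 404 405 500"):
    print(static_report_command("mysite", report))
```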

Keywords Reports

While processing my old logs, AWStats would sometimes hang. It was nearly always while producing the keywords reports. Most of the log files processed properly, so I do not think it has anything to do with AWStats or the command line I used. My best guess is that someone used odd characters while searching for the site and it is these that cause AWStats to choke.

Flush history file on disk

At the top of some of AWStats' results pages you may see something like "Flush history file on disk (unique url reach flush limit of 5000)". This is an informational notice from AWStats to tell you that an in-memory data buffer is full and that it will be temporarily saved to a disk file.


Command Line

I run a script that splits my Apache log files into monthly files named in the form brisray-access-yyyy-mm.log, such as brisray-access-2023-10.log.

The command line I used for producing the reports for the year is

perl C:/Apache24/awstats/wwwroot/cgi-bin/awstats.pl -config=brisray -LogFile=C:/Apache24/logs/brisray-access-2023-01.log -month=All -year=2023 -output -staticlinks -update > C:/Apache24/htdocs/brisray/utils/awslogs/2023/index.htm

Unfortunately, this means that the command has to be run for each month of the year.

The command line I used for producing the reports for the month is

perl C:/Apache24/awstats/wwwroot/cgi-bin/awstats.pl -config=brisray -LogFile=C:/Apache24/logs/brisray-access-2023-10.log -month=10 -year=2023 -output -staticlinks -update > C:/Apache24/htdocs/brisray/utils/awslogs/2023/2023-10/index.htm

Not only that, but there are another 26 commands listed in Building and reading reports that have to be run to create the full reports. This is why I suggest that a new folder is created for each year and inside that, one for each month of that year.

I have seven years' worth of old Apache log files I want AWStats to process. I need to write a batch file or PowerShell script that uses variables for the month and year to process the files and also to create the new folders.

What I came up with was a batch file to process a year's worth of old logs at a time.

Here's the batch file:

rem Build the AWStats static reports for every month of one year
cls
set year=2017

rem Outer loop: one call per month
for %%a in (01 02 03 04 05 06 07 08 09 10 11 12) do (call :monthly %%a)

goto :eof

:monthly
set month=%1
rem Create the output folder for this month inside the year folder
mkdir C:\Apache24\htdocs\brisray\utils\awslogs\%year%\%year%-%month%

rem Main page for the year
perl C:\Apache24\awstats\wwwroot\cgi-bin\awstats.pl -config=brisray -LogFile=C:\Apache24\logs\brisray-access-%year%-%month%.log -month=All -year=%year% -output -staticlinks -update > C:\Apache24\htdocs\brisray\utils\awslogs\%year%\index.htm

rem Main page for this month
perl C:\Apache24\awstats\wwwroot\cgi-bin\awstats.pl -config=brisray -LogFile=C:\Apache24\logs\brisray-access-%year%-%month%.log -month=%month% -year=%year% -output -staticlinks -update > C:\Apache24\htdocs\brisray\utils\awslogs\%year%\%year%-%month%\index.htm

rem Inner loop: one call per detailed report
for %%b in (alldomains allhosts unknownip allrobots lastrobots downloads urldetail urlentry urlexit osdetail unknownos browserdetail unknownbrowser refererse refererpages keyphrases keywords errors206 errors400 errors401 errors403 errors404 errors405 errors500) do (call :makereports %%b)

goto :eof

:makereports
set report=%1

rem Detailed report for the year
perl C:\Apache24\awstats\wwwroot\cgi-bin\awstats.pl -config=brisray -LogFile=C:\Apache24\logs\brisray-access-%year%-%month%.log -month=All -year=%year% -output=%report% -staticlinks -update > C:\Apache24\htdocs\brisray\utils\awslogs\%year%\awstats.brisray.%report%.html

rem Detailed report for this month
perl C:\Apache24\awstats\wwwroot\cgi-bin\awstats.pl -config=brisray -LogFile=C:\Apache24\logs\brisray-access-%year%-%month%.log -month=%month% -year=%year% -output=%report% -staticlinks -update > C:\Apache24\htdocs\brisray\utils\awslogs\%year%\%year%-%month%\awstats.brisray.%report%.html

goto :eof

The batch file is just a nested loop. The outer loop steps through the months and writes the main index.htm files. It then calls the inner loop, which steps through AWStats' available reports and writes them.
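For comparison, the same nested loop can be written in Python. This is a sketch of an alternative under the same assumptions as the batch file (same paths, same options, Windows only when actually run), not a replacement I have tested against the real logs. The function names are invented for this example.

```python
import subprocess
from pathlib import Path, PureWindowsPath

AWSTATS = r"C:\Apache24\awstats\wwwroot\cgi-bin\awstats.pl"
LOG_DIR = PureWindowsPath(r"C:\Apache24\logs")
OUT_DIR = PureWindowsPath(r"C:\Apache24\htdocs\brisray\utils\awslogs")
REPORTS = [
    "alldomains", "allhosts", "unknownip", "allrobots", "lastrobots",
    "downloads", "urldetail", "urlentry", "urlexit", "osdetail", "unknownos",
    "browserdetail", "unknownbrowser", "refererse", "refererpages",
    "keyphrases", "keywords", "errors206", "errors400", "errors401",
    "errors403", "errors404", "errors405", "errors500",
]


def commands_for_year(year):
    """Return (argument list, output page) pairs for one year of monthly logs."""
    jobs = []
    for month in range(1, 13):  # outer loop: months
        mm = f"{month:02d}"
        log = LOG_DIR / f"brisray-access-{year}-{mm}.log"
        year_dir = OUT_DIR / str(year)
        month_dir = year_dir / f"{year}-{mm}"
        base = ["perl", AWSTATS, "-config=brisray", f"-LogFile={log}"]
        for report in [None] + REPORTS:  # inner loop: main page, then reports
            output = "-output" if report is None else f"-output={report}"
            page = "index.htm" if report is None else f"awstats.brisray.{report}.html"
            tail = [f"-year={year}", output, "-staticlinks", "-update"]
            jobs.append((base + ["-month=All"] + tail, year_dir / page))
            jobs.append((base + [f"-month={mm}"] + tail, month_dir / page))
    return jobs


def run_jobs(jobs):
    """Run each command, sending its standard output to the target page."""
    for args, page in jobs:
        target = Path(str(page))
        target.parent.mkdir(parents=True, exist_ok=True)
        with open(target, "wb") as handle:
            subprocess.run(args, stdout=handle, check=True)
```

Calling run_jobs(commands_for_year(2017)) would do the work of the batch file for 2017. Building the list first makes it easy to print and check the commands before anything is run.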

Index File

As I cannot get AWStats' dropdown menus to work, I will have to create an index page, much like I did for the Webalizer statistics.
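A bare-bones version of such an index page could be generated from the folder layout described earlier. The Python sketch below is only a starting point: the function name and page title are assumptions, and it produces a plain nested list of links rather than anything styled like AWStats.

```python
from html import escape


def build_index(months_by_year, title="AWStats Statistics"):
    """Return a simple HTML page linking to each year and month report.

    months_by_year maps a year (int) to the month numbers that have reports.
    Links assume the layout <year>/index.htm and <year>/<year>-<mm>/index.htm.
    """
    lines = [
        "<!DOCTYPE html>",
        "<html>",
        f"<head><title>{escape(title)}</title></head>",
        "<body>",
        f"<h1>{escape(title)}</h1>",
        "<ul>",
    ]
    for year in sorted(months_by_year):
        lines.append(f'<li><a href="{year}/index.htm">{year}</a><ul>')
        for month in sorted(months_by_year[year]):
            folder = f"{year}-{month:02d}"
            lines.append(f'<li><a href="{year}/{folder}/index.htm">{folder}</a></li>')
        lines.append("</ul></li>")
    lines += ["</ul>", "</body>", "</html>"]
    return "\n".join(lines)


print(build_index({2023: [1, 10]}))
```

The returned text would be saved as the index page in the awslogs folder, next to the year folders.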


Newer File Types

AWStats was written in 2000, and I use v7.9 which was published in January 2023. It seems the list of multimedia file types AWStats uses has not changed much since it was first published, so I added webp, webm and svg to the list of file types to be counted as hits but not page views. To do this I added them to the NotPageList directive in the configuration file:

NotPageList="css js class gif jpg jpeg png bmp webp ico rss xml swf eot woff woff2 webm svg"


Live Output

I have decided to make the statistics obtained from my web logs public. The logs are subject to referer spam, so if the referring site in these analytics pages looks suspicious, it probably is.

AWStats Statistics (2011 - present)

The index page the above link goes to was not made by AWStats. I took the information from the various annual reports it produces along with screenshots of them and created the page in the same style used by the program.

Unless they have been changed, the AWStats statistics pages contain unique phrases such as "Advanced Web Statistics", "(build", and "Created by awstats". A search for those in Google or Bing brings up the statistics pages from other sites. Interestingly, many of these are in PDF format, which is an experimental feature of AWStats.


Sources and Resources

Analyze site traffic with AWStats - A guide to the installation of AWStats and some of the plugins available for it.
ActiveState Perl
AWStats - Official web site
AWStats (GitHub)
AWStats (Wikipedia)
AWstats Configuration Example (Jackie Chen)
AWStats Discussion - An active forum
AWStats Review (Pat Research)
Comparison table of the features offered by Analog, AWStats, and Webalizer
How to Configure AWStats for Windows and IIS (Damir Arh)
Set up AWStats (Smartlab Software)
Strawberry Perl
Unique URL reach flush limit of 5000 (Source Forge)