Web Log Files

Spikes, Trends, and Patterns

Introduction

Part of a log file

Interpreting web log files from the raw text is impossible which is why they need to be visualized. Using the log file visualizers and analyzers it is relatively easy to spot spikes, trends, and patterns in the data. Once you know about unusual activity then it can be investigated. Is it a natural trend as the website becomes more popular? Is it bot activity? If it's a bad bot, what is it doing?

In the following example, the graphs are from Analog and they have been rotated to lie horizontally.

An Example

Looking through the Analog Monthly Report, I saw the server, since around Novermber 2022, was getting a lot more traffic than it had previously.

Analog's Monthly Report

Analog's Monthly Report

It would be nice to think this was all human traffic, but I doubted it because the extra traffic was too abrupt.

Analog's Weekly Report

Analog's Weekly Report

Analog's weekly report showed that every two weeks, starting from October 23, 2022, there were huge spikes in the traffic to the site. Whatever was causing them was not human. So, what bot was visiting the site and what was it doing?

Analog's Daily Report

Analog's Daily Report

The daily report confirms that the spikes occured every two week, and the dates show it was always on a Friday.

Analog's Daily Summary

Analog's Daily Summary

The daily summary confirms the extra traffic was seen on Friday's and show that around 7x the normal amount of pages and files were being requested from the server.

Analog's Hour of the Week Summary

Analog's Hour of the Week Summary

The hour of the week summary shows the server experiences heavier traffic between 2am and 7pm on Fridays, and the earlier charts show this occurs on alternate weeks. Time to look at the logs for these dates and times or better still, query the logs using Log Parser to see what is going on. I used Log Parser to query which user agent was making the most requests on the affected days:

Here's a complete command line I used (all on one line):

logparser -i:ncsa "SELECT top 10 User-Agent, count (User-Agent) as requestcount FROM 'C:\Users\brisr\Documents\logtest\brisray-access-2023-03.log' where to_string(to_date(DateTime),'dd/MMM/yyyy') = '17/Mar/2023' GROUP BY User-Agent ORDER BY count(User-Agent) DESC"

Here's the output of the above command at prompt:

Output of the useragent command

Output of the above command
As I suspected, a single user agent is making many, many more times requests than is normal

I used the same query in Log Parser Studio:

Output of the useragent command

The same query as above in Log Parser Studio

The user agent Mozilla/5.0 (compatible; Detectify) +https://detectify.com/bot/b2c1358a-3268-49ed-ac13-e7d47627c4cb made 247,616 requests in a single day. Detectify is a very good, highly rated security scanner. Some years ago they offered free scans for non-commercial sites which I signed up for. I didn't realize what an impact it might have.

This page created April 28, 2023; last modified May 2, 2023