Wusage: Silly Name, Serious Stats!

Wusage 8.0 Manual

Analyzing Many Log Files

Popular sites have a problem: log files become very large very quickly! It is necessary to set them aside in some way when they become too large to be kept in uncompressed form or kept on the hard drive at all.

Most sites solve this problem by periodically compressing and setting aside old log files. Wusage can analyze log files compressed with the gzip and zip utilities automatically.

"My server generates a new log file for each day, so I have dozens of log files already and more on the way. How can I analyze these logs with wusage?"

If you have many log files, just specify the location of the log directory rather than the location of a single file when you set the Log Files and Directories (logfiles) option. Beginning in Wusage 8.0, you can also use wildcards, such as the * wildcard, in the last component of the log file path.

"What if my log files are in several directories, or on several computers?"

With Wusage 7.0 and above, you can add more than one entry to the Log Files and Directories (logfiles) option. Each entry can be a local file path, an HTTP: URL, or an FTP: URL. Each entry can refer to a single log file or to an entire directory of log files.

"Should I keep my log files in the same place forever?"

No, this will slow the program down. Although old log data will be correctly ignored, the program must still check each file for new information. Users of ISP mode can automatically rotate their log files out for archiving and eventual deletion; for more information, see Configuring the Program in ISP Mode. Users of simple mode and advanced mode can solve this problem by moving log files after they are analyzed. If you wish to keep your archived log data compressed to save disk space, we recommend the well-known free program gzip.

Wusage can analyze gzip- or zip-compressed log files directly without the need to uncompress them first. Please note that zip-compressed archives containing more than one file may require that the Sort Logs (sortlogs) option be turned on, if the log files were not added to the archive in chronological order. If this is the case, and the sortlogs option was not turned on, a warning will appear during the analysis.
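If you would like a rough idea of whether a multi-file zip archive was built in chronological order before you run an analysis, the following is a minimal Python sketch. It is not part of Wusage. It assumes that the modification time stored for each archive member reflects the age of the log data inside it, which may not be true for your archives, and the function name members_in_time_order is hypothetical.

```python
import zipfile


def members_in_time_order(archive_path):
    """Return True if the archive's files appear oldest first.

    Members are examined in the order they were added to the archive.
    The stored modification time is used as a stand-in for the age of
    the log data, which is only an assumption.
    """
    with zipfile.ZipFile(archive_path) as archive:
        stamps = [
            info.date_time            # (year, month, day, hour, minute, second)
            for info in archive.infolist()
            if not info.is_dir()      # skip directory entries
        ]
    return stamps == sorted(stamps)
```

If the function returns False, the archive is a likely candidate for turning on the sortlogs option. Wusage's own warning during analysis remains the authoritative signal.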

"I have many mirror sites, so I have a collection of log files that all contain entries from the same period in time. Can wusage cope with this?"

Yes! Just use the Log Files and Directories (logfiles) option to tell Wusage where to find the log files you have collected. You can specify more than one location, and HTTP and FTP URLs are accepted. If you have more than 20 log files that cover the same period in time, for instance 50 log files from 50 mirrors of the same site, Wusage may slow down considerably. This does not occur with non-overlapping log files, even if there are hundreds of them. If your needs cannot be met with a limit of 20 overlapping log files, please see the Contacting Boutell.Com, Inc. section for more information.
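If you are unsure whether your collection exceeds 20 overlapping log files, you can estimate it once you know the first and last entry time of each file. The following is a minimal Python sketch, not part of Wusage. The function name max_overlap is hypothetical, and it assumes you have already gathered a (start, end) pair for each log file; any comparable values, such as dates or Unix timestamps, will work.

```python
def max_overlap(ranges):
    """Return the largest number of (start, end) ranges covering one moment."""
    events = []
    for start, end in ranges:
        events.append((start, 1))    # a log file begins
        events.append((end, -1))     # a log file ends
    # Sort by time. At equal times, ends (-1) sort before starts (+1), so
    # files that merely touch end to end are not counted as overlapping.
    events.sort(key=lambda event: (event[0], event[1]))

    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak
```

For example, 50 mirror logs that all cover the same month produce a result of 50, while hundreds of daily logs that follow one another produce a result of 1.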

"This is all well and good, but I already have several really old compressed log files that were not compressed with gzip or zip. How do I analyze them with wusage?"

The Unix cat and zcat commands are extremely useful for reconstructing a single log file from several older ones. To allow the DNS and document structure analysis features of Wusage 8.0 to work well, we recommend reconstructing a single log file first, then instructing Wusage to analyze that file.

Important note: when reconstructing log files, feed the log files to cat in ascending chronological order, so that an older log file always precedes a newer one. If this is not possible, be sure to set the Sort Log Files (sortlogs) option first.
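When file names do not make the chronological order obvious, the ordering step can be automated. The following is a minimal Python sketch, not part of Wusage. It assumes the old logs have already been uncompressed to plain text (for example with zcat) and that entries carry a Common Log Format timestamp such as [10/Oct/2000:13:55:36 -0700]. The function names are hypothetical. It orders whole files by their first entry; it does not reorder individual entries within or across files.

```python
import re
from datetime import datetime

# Bracketed timestamp of a Common Log Format entry (an assumed format).
CLF_TIME = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def first_timestamp(path):
    """Return the timestamp of the first parsable entry in a log file."""
    with open(path, "r", encoding="latin-1") as log:
        for line in log:
            match = CLF_TIME.search(line)
            if match:
                return datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
    raise ValueError("no timestamp found in " + path)


def concatenate_oldest_first(paths, combined_path):
    """Write the given log files to combined_path, oldest file first.

    Returns the paths in the order they were written.
    """
    ordered = sorted(paths, key=first_timestamp)
    with open(combined_path, "wb") as combined:
        for path in ordered:
            last = b"\n"
            with open(path, "rb") as log:
                for line in log:
                    combined.write(line)
                    last = line
            # Keep entries from adjacent files from running together.
            if not last.endswith(b"\n"):
                combined.write(b"\n")
    return ordered
```

The combined file can then be given to Wusage as a single log file. If the logs overlap in time rather than following one another, ordering whole files is not enough, and the sortlogs option is still the safer choice.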

"I ran wusage on an old log file, and now I have zero accesses for the last two months! What happened?"

Wusage normally creates reports through the most recent complete day or week, and will not generate those reports again. You can override this behavior using the -b and -e command line options, which are used to force wusage to start re-generating reports at an earlier date, or to stop well before the present date. If you inadvertently produce empty reports for the most recent several weeks or months, just use the -b option to specify a date from which wusage should start re-generating those reports, and specify the more recent logfile as well using the -l option. To protect your reports, Wusage will not attempt to regenerate reports for time periods that are not entirely within the date range specified when -b and -e are in use.


Copyright 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, Boutell.Com, Inc.
wusage@boutell.com
Boutell.Com, Inc - PO Box 16716, Seattle WA, 98116, USA
Phone/Fax +1 206 658-8176