Monday, November 27, 2006

HOWTO use microsoft's logparser to analyze IIS logs with example sql/code

Logparser can be your good friend if you have a large set of data (text form or otherwise) and you would like to summarize it. It can be used to analyze Microsoft’s Internet Information Server (IIS) logfiles, text based logfiles, XML files, Eventviewer data, Registry, Active Directory Objects, CSVs and more (see all the input formats at the end of blog entry).

The below is my documenting a howto use logparser with a number of examples. Most of the examples of IIS log parsing were not developed by me, rather there is a MS team that can be employed to do an IIS health check, these were the logparser SQLs they used.

Logparser to start

I recommend become familiar with:
logparser -h
In all truth all my needs have been answered in the command-line help. I may have googling for a solution, but the problem was solvable with careful reading.

Logparser and IIS logs.

Logpaser automatically reads the IIS header. In fact, I highly suspect that the reason for the tool’s existence began with the need to analyze IIS logs - the history and lore, I have not taken that much time to learn. I'll let you correct me?

Queries (examples):
updated March 2007 to add reverse DNS lookup, Referer URLs (sic), and Referer Summary (sic).
• Merge Multiple Log files
To consolidate log files into a single file.
logparser -o:IIS "select * into merged.log from ex*.log"
• A count of the Total Requests
logparser "select count(*) into IISLOG_TOTAL_REQ.csv from ex061023.log"
• How many unique clients
logparser "select count(distinct c-ip) into IISLOG_DISTINCT_CLIENTS.csv from ex061023.log"
• Top 20 URLs Hit
logparser "SELECT TOP 20 cs-uri-stem, COUNT(*) AS Hits INTO Analysis.csv from ex061023.log group by cs-uri-stem order by Hits DESC"
• Top 20 ASP pages Hit
logparser "SELECT TOP 20 cs-uri-stem, COUNT(*) AS Hits INTO Analysis.csv from ex061023.log where cs-uri-stem like '%%.asp' group by cs-uri-stem order by Hits DESC"
• Hit Frequency (how many hits per hour)
logparser "SELECT TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)), COUNT(*) AS Hit_Frequency INTO IISLOG_ANALYSIS_HIT_FREQ.CSV FROM ex061023.log GROUP BY TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)) ORDER BY TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)) ASC"
• Bytes per Extension
What is the percentage of the bytes served per extension-type?
logparser "SELECT EXTRACT_EXTENSION(cs-uri-stem) AS Extension, MUL(PROPSUM(sc-bytes),100.0) AS PercentOfTotalBytes INTO IISLOG_ANALYSIS_BYTES_PER_EXT.CSV FROM ex061023.log GROUP BY Extension ORDER BY PercentOfTotalBytes DESC"
• Top 20 Clients Hitting this server
logparser "SELECT top 20 c-ip AS Client_IP,count(c-ip) AS PageCount from ex061023.log to IISLOG_ANALYSIS_TOP20_CLIENT_IP.CSV GROUP BY c-ip ORDER BY count(c-ip) DESC"
• REVERSEDNS of Top 20 Clients Hitting this server (reversedns(...) is a long running function for obvious reasons)
logparser "SELECT top 20 c-ip AS Client_IP, REVERSEDNS(c-ip),count(c-ip) AS PageCount from ex061023.log to IISLOG_ANALYSIS_TOP20_CLIENT_IP_WITH_DNS.CSV GROUP BY c-ip ORDER BY count(c-ip) DESC"
• Referrer Host Names directing traffic to this server with count of pages referred (summary)
logparser "SELECT ReferringHost, count(*) AS TotalReferrals, Min(cs(Referer)) AS ExampleRefererURL USING CASE EXTRACT_TOKEN(cs(Referer),2, '/') WHEN null THEN 'NoReferer' ELSE EXTRACT_TOKEN(cs(Referer),2, '/') END as ReferringHost into IISLOG_ANALYSIS_REFERER_HOSTS.CSV FROM ex061023.log group by ReferringHost order by count(*) DESC"
• Referrer URLs directing traffic to this server (full report)
logparser "SELECT EXTRACT_TOKEN(cs(Referer),2, '/') as RefererHostName, cs(Referer) AS RefererURL, count(cs(Referer)) AS TotalReferrals into IISLOG_ANALYSIS_REFERERURLs.CSV FROM ex061023.log group by cs(Referer) order by count(cs(Referer)) DESC"
• Unique Clients per Hour
This is two separate SQLs.
1. logparser -o:CSV "Select TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)) as Times, c-ip as ClientIP into IISLOG_ANALYSIS_DIST_CLIENT_IP.LOG from ex061023.log group by Times, ClientIP"
2. logparser -i:CSV "Select Times, count(*) as Count from IISLOG_ANALYSIS_DIST_CLIENT_IP.LOG to IISLOG_ANALYSIS_HOURLY_UNIQUE_CIP.CSV group by Times order by Times ASC"
• IIS Errors and URL Stem (Error code > 400)
logparser "SELECT cs-uri-stem, sc-status,sc-win32-status,COUNT(cs-uri-stem) from ex061023.log to IISLOG_ANALYSIS_ERROR_COUNT.CSV where sc-status>=400 GROUP BY cs-uri-stem,sc-status,sc-win32-status ORDER BY COUNT(cs-uri-stem) DESC"
• IIS Errors by hour (Error code > 500)
Can answer if the errors are load related
logparser "SELECT TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)), COUNT(*) AS Error_Frequency FROM ex061023.log TO IISLOG_ANALYSIS_ERROR_FREQ.CSV WHERE sc-status >= 500 GROUP BY TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)) ORDER BY TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)) ASC"
• Status Code distribution
logparser "SELECT sc-status, COUNT(*) AS Times from ex061023.log to IISLOG_ANALYSIS_STATUS_CODE.CSV GROUP BY sc-status ORDER BY Times DESC"
• Top 20 Longest time-taken (on average) pages
logparser "SELECT top 20 cs-uri-stem,count(cs-uri-stem) As Count,avg(sc-bytes) as sc-bytes,max(time-taken) as Max,min(time-taken) as Min,avg(time-taken) as Avg from ex061023.log to IISLOG_ANALYSIS_TOP20_AVG_LONGEST.CSV GROUP BY cs-uri-stem ORDER BY avg(time-taken) DESC"
• Top 50 longest requests
logparser "SELECT top 50 TO_LOWERCASE(cs-uri-stem),time,sc-bytes,time-taken INTO IISLOG_ANALYSIS_TOP50_LONGEST.CSV FROM ex061023.log ORDER BY time-taken DESC"
• Average Response time by Hour
logparser "SELECT TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)), avg(time-taken) INTO IISLOG_ANALYSIS_AVG_RESP_TIME.CSV FROM ex061023.log WHERE cs-uri-stem like '%%.asp' GROUP BY TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)) ORDER BY TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)) ASC"
• Percentage Processing time by extension
logparser "SELECT EXTRACT_EXTENSION(cs-uri-stem) AS Extension, MUL(PROPSUM(time-taken),100.0) AS Processing_Time INTO IISLOG_ANALYSIS_PROCTIME_PER_EXT.CSV FROM ex061023.log GROUP BY Extension ORDER BY Processing_Time DESC"

As an added bonus, I’ve created a small cmd (windows) shell script that runs thru all (but the first) of these queries below against a log file. It is located at the following link
download it
Note it requires logparser on the path and has a commandline invocation of:
logparseranalysis.cmd ex061023.log

Logparser and creating separate SQL files (the file: argument)

You may have noticed that these SQLs can get a long, as is the way with SQL. Logparser provides the means to create a text file with these long sqls in it. Additionally the ability to pass arguments is of course a given. Next, an example is in order. To use the commandline below you will need to create a little text file (extension sql) with the contents of the below.

Command Line:
logparser file:iis.sql?logfile=ex061113.log

Text file: iis.sql
-- Start of SQL file --
SELECT
c-ip AS ClientIP,
cs-host AS HostName,
cs-uri-stem AS URIStem,
sc-status AS Status,
cs(User-Agent) AS UserAgent,
count (*) as Requests
INTO output.csv
FROM %logfile%
where time > to_timestamp('18:20:00', 'hh:mm:ss') and time < to_timestamp('18:45:00', 'hh:mm:ss') GROUP BY c-ip, cs-uri-stem, cs-host, cs(User-Agent), sc-status ORDER BY Requests DESC
-- End of SQL file --


Logparser and the files without headers

Don’t have a header in your csv file? With a little work we can define a logparser SQL that will map the empty fields to names with meaning. The automatic header row parsing will need to be turned off.
Command Line:
logparser -i:csv -headerRow:OFF file:dslog.sql?logfile=logwoutheader.log+outputfile=out.csv
Text file: log.sql
-- Start of SQL file --
select To_TimeStamp(MyDate, MyTime) as DateTime, field3 as MachineNane, field4 as PID, field5 as TID, To_Int(field6) as ErrorLevel, field7 as RegExp, field8 as Line, field9 as SID, field12 as Message using TO_TIMESTAMP(field1,'MM/dd/yyyy') as MyDate, TO_TIMESTAMP(field2, 'hh:mm:ss.lx') as MyTime into %OUTPUTFILE% from %LOGFILE% where ErrorLevel >= 35
-- End of SQL file --


Logparser and the eventviewer
Although already covered in a previous article, logparser can also connect to eventviewer and analyze those logs. It can even do this on remote machines. The below SQL is an example on how to detect locked out accounts.

Command Line:
logparser file:lockedoutaccounts.sql?DOMAINCONTROLER=HQDC01C

Text file: lockedoutaccounts.sql
-- Start of SQL file --
SELECT timegenerated AS TimeLockedout,
extract_token(strings, 0, '|') As UserName ,
extract_token(strings, 1, '|') AS OriginatingMachine,
EventID,
SourceName,
Message,
CASE EventID
WHEN 529 THEN 'Invalid userid/password'
WHEN 531 Then 'Account disabled out'
WHEN 539 Then 'Account locked out'
WHEN 530 Then 'Outside of logon time'
WHEN 532 THEN 'Account Expired'
WHEN 535 THEN 'Password Expired'
WHEN 533 THEN 'User not from allowed system'
WHEN 644 THEN 'Account Auto Locked'
WHEN 540 THEN 'Successful logon'
ELSE 'Not specified' END AS EventDesc,
strings
INTO lockedact.csv
FROM \\%DOMAINCONTROLER%\Security
WHERE EventID=644
-- End of SQL file --


Reference Material:
http://www.logparser.com/ the unofficial logparser site. It hosts a great knowledge base and an active forum.

How logparser 2.2 Works

To download: Download Logparser

Logparser Blog Entry

Logparser 2.2 Input formats:
• IISW3C: This is the IIS W3C Extended log file format.
• IIS: This is the IIS log file format.
• IISMSID: This is the log format for files generated by IIS when the MSIDFILT filter or the CLOGFILT filter is installed.
• NCSA: This is the IIS NCSA Common log file format.
• ODBC: This is the IIS ODBC format, which sends log files to an ODBC-compliant database.
• BIN: This is the IIS binary log file format.
• URLSCAN: This is the format for URLScan logs.
• HTTPERR: This is the IIS 6.0 HTTP error log file format.
• EVT: This is the Microsoft Windows Event Messages format.
• TEXTWORD: This is a generic text file, where the TEXT value is any separate word.
• TEXTLINE: This is a generic text file, where the TEXT value is any separate line.
• CSV: This is a comma-separated list of values.
• W3C: This is a generic W3C log file, such as a log generated by Windows Media Services or Personal Firewall.
• FS: This provides information about file and directory properties.
• XML: Reads XML files (requires the Microsoft® XML Parser (MSXML)) •
• TSV: Reads tab- and space- separated values text files
• •ADS: Reads information from Active Directory objects
• REG: Reads information from the Windows Registry
• NETMON: Makes it possible to parse NetMon .cap capture files
• ETW: Reads Event Tracing for Windows log files and live sessions
Logparser 2.2 Output formats:
• W3C: This format sends results to a text file that contains headers and values that are separated by spaces.
• IIS: This format sends results to a text file with values separated by commas and spaces.
• SQL: This format sends results to a SQL table.
• CSV: This format sends results to a text file. Values are separated by commas and optional tab spaces.
• XML: This format sends results to an XML-formatted text file.
• Template: This format sends results to a text file formatted according to a user-specified template.
• Native: This format is intended for viewing results on screen.
• CHART: Creates chart image files (requires Microsoft Office 2000 or later)
• TSV: Writes tab- and space- separated values text files
• SYSLOG: Sends information to a SYSLOG server or to a SYSLOG-formatted text file


Keywords: IIS log file analysis, IIS 6.0, IIS 5.0, IIS, logparser, logparser examples, logparser samples, logparser input formats, logparser output formats, logparser examples, howto use logparser, example sqls for logparser, how to use logparser, Analyzing IIS logs with logparser, logparser and files without headers, logparser eventviewer example, using logparser to analyze IIS logfiles, logparser sample code.

21 comments:

Anonymous said...

Team PESIT said...
really informative site

www.askken.blogspot.com

Anonymous said...

Great summary of features and usage!

Paul Cooley said...

Happy to help. In fact this is what I wanted for myself, when I first got a hold of logparser.

Anonymous said...

This site is great! I use it all the time. You truely have great LOGPARSER skills! ;)

-O

Paul Cooley said...

I am glad it helped you out. Happy SQLing!

Anonymous said...

Hi,

Do you know by chance the unsupported registry keys to overcome the 128kb row size limit error in logparser? I tried with regmon, but I failed to find the key :(

Thanks,
Cheers
Ferenc

Anonymous said...

After using this page quite a few times I guess I'll chime in to say that it's great! thanks :)

Paul Cooley said...

Glad it helps!

Anonymous said...

Great Article, provided very useful information.

StillRabbit said...

so cool~~!! that it!!! thank you~

Unknown said...

阿勞斯;其他入酒店經紀六日晚間為酒店經紀他入榜的還酒店經紀,酒店經紀,酒店經紀,酒店經紀,酒店經紀,酒店經紀,酒店經紀,酒店經紀,酒店經紀,酒店經紀,酒店兼差,酒店兼差,酒店兼差,酒店兼差,酒店兼差,酒店兼差,酒店兼差,酒店兼差,酒店兼差本和台灣一直維持著一酒店兼差,年齡最大的則是酒店兼差,酒店兼差,酒店兼差,酒店兼差,酒店兼差,酒店打工,酒店打工,酒店打工,酒店打工,酒店打工,酒店打工,酒店打工,酒店打工,酒店打工,酒店打工,酒店打工,酒店打工,酒店打工,酒店打工法國總統的賀雅爾暫酒店上班模出身的酒店上班,酒店上班,酒店上班,酒店上班,酒店上班,酒店上班,酒店上班發表您的看酒店上班態度,老實說轉變的太大了前些酒店上班,酒店上班可能會帶點政治色彩酒店上班,酒店上班,酒店上班,暑假打工,暑假打工,暑假打工,暑假打工,暑假打工女模的義大利平暑假打工,暑假打工昂豔冠群芳暑假打工,寒假打工,寒假打工,美容整形,整形,雷射美容,臉部整形,雷射整形,整形外科,微整形,醫學美容,臉部整形,雷射整形,整形外科,微整形,醫學美容,臉部整形,微整形,美形,身體整形月收增加 4-8萬,又美容整形,美容,雷射美容,美形,身體整形,美容整形,美容,整形手術,美形,身體整形,整形手術,整形論壇,牙齒整形,整形診所,整形,整形論壇,牙齒整形,整形診所,整形,雷射美容班牙政壇入選者整形論壇,整形診所,
大方室內設計公司提供專業的室內設計服務及空間設計服務,若需要請來電洽詢。

8:43 AM
Sightline Paymentsbingo games

Cialis kaufen said...

Never thought about using logparser before. Thanks for sharing that idea. Worked wonderfully.

Quincy said...

您好 與您一起分享有關於黃金回收的
黃金回收價格
如有任何黃金回收請找台北專業黃金回收宏鑫貴金屬
謝謝

Unknown said...

Introduction

This tool migrates a blog from online line Blog systems. It uses the public API of LiveJournal, Blogger, WordPress, and Spaces to extract data and create equivalent entries in a different account.
Lottery Resultsvilla

Unknown said...

to communicate with Gentoo Linux (2.6.17-gentoo-r8 kernel) in my home network using NUT (Network UPS Tools). Note this is much easier in Dec 2007 - see new howto here

Sightline Payments Kirk Sanfordผลบอล

Anonymous said...

Secret Serviceslots online
Fast delivery, wholesale from China

We use DHL, EMS Speedpost, UPS, etc ... from China to the U.S., Europe and Australia the major destinations, only 2 to 5 days.

Unknown said...

you can only hope that you are not amongst this minority.

poker multibancocasino spel online

Evan said...

It cannot have effect in reality, that is what I consider.
how to trademark a name | austin web designers | chicago workers compensation lawyer | downtown montreal hotels | play salon games

Unknown said...

Now that we are aware of the number of calories in rice paper, let's learn a little about its nutritional value. Rice paper contains fat in a very small quantity along with some protein. In addition to this
lottery uksalvia legal

Ronald Vaughn said...

Thank you a good deal for the topic, i 've got slightly distinct standpoint at buy Levitra
Cosme

Aditi Kalal said...
This comment has been removed by the author.