Storing I/O Stats at
Meaningful Intervals
Analyzing I/O statistics
gathered during peak usage
is much more informative
than trying to work with
numbers that included both
peak and off-peak times of
day. The solution presented
in this section stores the
raw information provided by
fn_virtualfilestats in a way
that provides information in
a user selected time slice.
To accomplish this task, I
have created a table,
dbaFilestats, and a stored
procedure
usp_Perf_FilestatGathering
that I run every day as a
scheduled job.
DbaFilestats is a table
that I keep in my DBA
database for gathering the
statistics. Here is the
CREATE TABLE script:
CREATE TABLE dbo.dbaFilestats (
ID int IDENTITY (1, 1) NOT NULL ,
dbid int NOT NULL ,
fileid int NOT NULL ,
Reads bigint NULL ,
Reads_FromStart bigint NULL ,
Writes bigint NULL ,
Writes_FromStart bigint NULL ,
BytesRead bigint NULL ,
BytesRead_FromStart bigint NULL ,
BytesWritten bigint NULL ,
BytesWritten_FromStart bigint NULL ,
IostallMS bigint NULL ,
IostallMS_FromStart bigint NULL ,
RecordedDT datetime NULL ,
timestamp bigint NULL ,
IntervalMSec bigint NULL ,
FirstMeasure_FromStart bit NULL
)
I've chosen to keep both
the interval measures, such
as Reads and BytesWritten,
and the measures from the
instance startup (%_FromStart).
It is possible to do away
with the _FromStart numbers
by keeping them in a
temporary table as the
stored procedure runs, but I
have found that every so
often, I want to see them.
The stored procedure to
perform the information
gathering is
dba_Perf_FilestatGathering.
It takes four parameters,
which are listed in Table 3.
Table 3 Parameters to
dba_Perf_FilestatGathering
|
Parameter |
Data Type |
Description |
|
@Duration |
Datetime |
The amount of time
to collect data. |
|
@IntervalSec |
Int |
The interval in
seconds between data
gathering. |
|
@WhichDB |
Int |
DB_ID of the
database to gather
for. -1 for all
databases. |
|
@WhichFile |
Int |
File_ID to gather
information for. -1
for all files. |
To create the procedure
on your system, you will
need a UDF,
udf_DT_AddTime,
which you will find at this
URL:
http://www.novicksoftware.com/UDFofWeek/Vol1/T-SQL-UDF-Volume-1-Number-38-udf_DT_Addtime.htm.
Here is the CREATE PROC
script:
CREATE PROC dbo.dba_Perf_FilestatGathering
@Duration datetime = '01:00:00' -- Duration of data collection
-- long time gathers forever
, @IntervalSec int = 600 -- Approx sec in the gathering interval
, @WhichDB int = -1 -- DB_ID to monitor, -1 for all
, @WhichFile int = -1 -- File_ID of file to monitor, -1 for all
/*
* Procedure to monitor fn_virtualfilestats and record their value at
* intervals in the dbaFilestats table.
*
* Example:
-- TRUNCATE TABLE DBAFILESTATS
DECLARE @RC int, @StartTime datetime, @thisDBID int
SELECT @StartTime = getdate(), @thisDBID = db_id()
EXEC @RC = dba_Perf_FilestatGathering '00:10:21', 60, @ThisDBID , 1
select * from dbafilestats where RecordedDT > @StartTime
and dbid=@thisDBID and fileid = 1 Order by ID
********************************************************************/
AS
DECLARE @TerminateGatheringDT datetime -- when to stop gathering
, @WaitFor_Interval datetime
, @LastRecordingDT datetime
, @RecordingDT datetime
, @myDBID int
, @myError int -- Local copy of @@ERROR
, @myRowCount int -- Local copy of @@RowCount
, @msgText nvarchar(4000) -- for error messages
SELECT @WaitFor_Interval = DATEADD (s, @IntervalSec
, CONVERT (datetime, '00:00:00', 108)
)
, @TerminateGatheringDT =
dbo.udf_DT_AddTime (getdate(), @Duration)
, @myDBID = db_id()
WHILE getdate() <= @TerminateGatheringDT BEGIN
SELECT @LastRecordingDT = @RecordingDT
SELECT @RecordingDT = getdate() --
-- Get Stats for All files
INSERT INTO dbo.dbaFilestats (
dbid
, fileid
, Reads
, Reads_FromStart
, Writes
, Writes_FromStart
, BytesRead
, BytesRead_FromStart
, BytesWritten
, Byteswritten_FromStart
, IOStallMS
, iostallMs_FromStart
, RecordedDT
, [timestamp]
, IntervalMSec
, FirstMeasure_FromSTart
)
SELECT vfs.dbid -- Database ID
, vfs.fileid -- File ID
, vfs.numberreads - dbaf.reads_fromstart as Reads
, vfs.numberreads as Reads_FromStart
, vfs.numberwrites - dbaf.writes_fromstart
as Writes
, vfs.numberwrites as Writes_FromStart
, vfs.bytesread - dbaf.bytesRead_FromStart
as BytesRead
, vfs.bytesread as BytesRead_FromStart
, vfs.bytesWritten - dbaf.BytesWritten_FromStart
as BytesWritten
, vfs.byteswritten as BytesWritten_FromStart
, vfs.IOStallMS - dbaf.IOStallMS_FromStart
as IOStallMS
, vfs.iostallms as IOStallMS_FromStart
, @RecordingDT
, vfs.[timestamp]
, CASE WHEN @LastRecordingDT IS NULL
THEN NULL
ELSE DATEDIFF(ms, dbaf.RecordedDT, @RecordingDT)
END as IntervalMSec
, CASE WHEN @LastRecordingDT IS NULL THEN 1 ELSE 0 END
as FirstMeasure_FromStart
FROM ::fn_virtualfilestats(@WhichDB , @WhichFile) vfs
LEFT OUTER JOIN dbaFilestats dbaf
ON vfs.dbid = dbaf.dbid and vfs.fileid = dbaf.fileid
WHERE (@LastRecordingDT IS NULL
OR dbaf.RecordedDT = @LastRecordingDT)
SELECT @myError = @@Error, @myRowCount = @@RowCount
IF @myError != 0 BEGIN
SET @msgText = 'Error ' + CONVERT(varchar(9), @myError)
+ ' in usp_FileStatGathering during INSERT'
RAISERROR (@msgText, 16, 1) WITH LOG
RETURN @myError
END
WAITFOR DELAY @WaitFor_Interval -- delay
END
I usually run for 10
minutes and 21 seconds
('00:10:21'), gathering
information every 60 seconds
for all databases (-1) and
all files (-1). The reason
for adding the 21 seconds to
the 10 minutes is that I
want 10 measurements and the
21 seconds allows for the
time that it takes to make
and store the measurements.
Let's take a look at some
of the raw data from a run
against my database 16,
which had a fairly heavy
load at the time:
SELECT Top 5 [ID], dbid, [fileid] as [File]
, Reads, Writes, BytesRead, BytesWritten, IOStallMS
, IntervalMSec as [MSec]
FROM dbaFilestats
WHERE FirstMeasure_FromStart = 0
ORDER BY [ID]
Go
(Results)
ID dbid File Reads Writes BytesRead BytesWritten IOStallMS MSec
---- ---- ----- ------ ------ --------- ------------ --------- ------
2 16 1 1243 564 35774464 25452544 74530 72846
3 16 1 1911 847 75489280 48136192 62485 64186
4 16 1 244 1356 37617664 53551104 3890 62263
5 16 1 1393 1315 64258048 38379520 40297 63500
6 16 1 152 1485 13426688 66273280 4733 62703
As you can see from the
MSec column, it took longer
than 60 seconds to get
through each of the
measurements. The actual
time will depend on the load
on the system and the number
of databases.
Looking at the raw
numbers from dbaFilestats is
helpful. However, I find the
most useful use of the
information is to turn most
of the measurements into
rates. This query does the
job:
SELECT Top 5
CAST(Reads/(IntervalMSec/1000.0) as Numeric (18,2))
as [Reads/Sec]
, CAST(Writes/(IntervalMsec/1000.0) as Numeric (18,2))
as [Writes/Sec]
, (Reads+Writes) as [I/O Count]
, CAST((Reads+Writes)/(IntervalMsec/1000.0) as Numeric(18,2))
as [I/O /sec]
, Iostallms/CASE WHEN (reads+writes)>0
THEN reads+writes
ELSE 1 END AS [IoStallMS/Op]
FROM dbaFilestats
WHERE FirstMeasure_FromStart = 0
AND dbid = 16 and fileid=1
ORDER BY [id]
GO
(Results)
Reads/Sec Writes/Sec I/O Count IO /sec IoStallMS/Op
----------- ------------ --------- ---------- ------------
17.06 7.74 1807 24.81 41
29.77 13.20 2758 42.97 22
3.92 21.78 1600 25.70 2
21.94 20.71 2708 42.65 14
2.42 23.68 1637 26.11 2
These rates indicate that
there may be a problem with
the disk(s) that database 16
is using. I know that
because of the high
IoStallMS/Op measurement (IoStall
Milliseconds per I/O
operation). Anything over
twenty indicates that there
is some type of problem.
That is in spite of the fact
that the I/O rates are
moderate, in the range of
25-43 per second. A modern
server disk should be able
to achieve 85 to 100
operations per second. The
situation warrants further
investigation.
Conclusion
Input/Output is at the
heart of what SQL Server
does. This article has shown
several methods for getting
I/O statistics from SQL
Server as an aid in
diagnosing performance
issues. I find that the most
useful technique is to
gather statistics during
peak usage at one-minute
intervals. The key
indicators are the I/O
operations per second and
the IoStallMS/Op. You might
want to investigate other
ratios for your server.