Wednesday, September 3, 2014

WRITELOG waittype - Implicit vs. Explicit Transaction

WRITELOG waittype - Implicit vs. Explicit Transaction: disk or coding issue?

In a OLTP system where concurrency is high observing WRITELOG waittype is normal for a short period of time during peak usage. Generally, a high value of this waittype indicates slow response of the application. So, observing this waittype for a longer period of time and repeatedly for DML operations such as MERGE, INSERT, UPDATE and DELETE, obviously this indicates two specific issues:

1.Disk I/O sub-system
2.Transaction handing

If the underlying disk sub-system is the best in the industry and configured correctly, investigation shows nothing suspicious around I/O and if we still see the WRITELOG waittype then it is obviously the result of poor or improper handling of Transactions in a SQL batch.

Figure #1: WRITELOG waittype in a live system


Let’s explain what WRITELOG waittype is. This is a log management system waiting for dirty data pages to write on the disk. This waittype indicates that the SPID is waiting for a transaction log I/O request to be completed. Thus if the number of WRITELOG waittype is high that means there are a number of requests in the queue which are processing slowly.
Although, many experts and even MS KB article (KB822101) indicates it is a disk bottleneck, in reality this may not be the case unless your disk system is IDE or SATA based or configured incorrectly. We need to investigate one more area before confirming that the disk is an issue.

Implicit and explicit transaction:
As the WRITELOG waittype is related to the transaction log management system, let’s try to understand the difference between the two.

SQL Server handles two types of transaction mode:
(a)Implicit transaction
(b)Explicit transaction

Implicit transaction: In Implicit transaction mode, the instance of the SQL Server Database Engine automatically starts a new transaction after the current transaction is committed or rolled back for each DML statement. We do not need to define the start of a transaction; each statement level transaction will be committed or rolled back automatically.

Implicit transaction can be turned on or off per batch basis. We can use the “SET IMPLICIT_TRANSACTIONS ON/OFF” statement to turn implicit transaction mode off. We can use the “COMMIT TRANSACTION” or “ROLLBACK TRANSACTION” statements to end each transaction.

Explicit transaction: An explicit transaction is one in which we explicitly define both the starting and ending of the transaction. To start an explicit transaction, we can use “BEGIN TRANSACTION” and then complete the transaction by using “COMMIT TRANSACTION” or “ROLLBACK TRANSACTION”.

The main difference between Implicit and Explicit transaction is that implicit transaction is auto, controlled by SQL Server Engine and statement level transaction, whereas explicit transaction is user defined and batch-scoped transaction. Some key comparisons are follows:

Indicator
Implicit Transaction
Explicit Transaction
Scope
Statement level
Batch level
Log generation
continuous chain of transactions
One transaction log entry
Log I/O
For each statement gets one I/O
Entire batch needs one I/O
WRITELOG Waittype
Sustained period of time
N/A or unless there is real disk issue
Response/duration
Longer due to several I/O activities
Fast, less I/O activities
T-Log Restore
Long, Redo/undo phase
Short redo/undo phase
Impact on T-Log
Extremely high
Less and insignificant
Write Duration
Longer and slow
Shorter and fast

The application which is utilizing the Implicit transaction mechanism actually makes the I/O system busy and degrades application response time dramatically. As a result, various disk related performance counters show significantly high values and we start believing that the disk is the bottleneck.

Observing the WRITLOG waittype:
In the following test we will review how implicit and explicit transaction incurs WRITELOG waittype. The given script can be executed in any version of SQL Server and with any kind of disk system to understand the behavior.


Complete T-SQL Code to produce the behavior:
/*Implicit vs Explicit Transaction
Author: Sushil Rout
Apply to: SQL 2005 to 2012*/

DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);

IF OBJECT_ID('tempdb..#wait') IS NOT NULL
DROP TABLE #wait
GO

/*Create a test database*/

USE [master]
GO
IF EXISTS ( SELECT name FROM sys.databases WHERE name = N'TestDB' )
BEGIN
ALTER DATABASE [TestDB] SET SINGLE_USER WITH ROLLBACK IMMEDIATE
DROP DATABASE [TestDB]
END
GO

CREATE DATABASE TestDB
GO

USE TestDB
GO

/*Create a a table*/

SET NOCOUNT ON
IF OBJECT_ID('tblLarge') IS NOT NULL
DROP TABLE tblLarge
GO

CREATE TABLE tblLarge
(
xID INT IDENTITY(1, 1) ,
sName1 VARCHAR(100) ,
sName2 VARCHAR(1000) ,
sName3 VARCHAR(400) ,
sIdentifier CHAR(100) ,
dDOB DATETIME NULL ,
nWage NUMERIC(20, 2) ,
sLicense VARCHAR(25)
)
GO

/* Perform a checkpoint*/

CHECKPOINT


/* Collect initial wait stats*/

SELECT 'w1' WaitRun ,
wait_type ,
wait_time_ms / 1000.0 AS WaitSec ,
( wait_time_ms - signal_wait_time_ms ) / 1000.0 AS ResourceSec ,
signal_wait_time_ms / 1000.0 AS SignalSec ,
waiting_tasks_count AS WaitCount
INTO #Wait
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'WRITELOG%'

GO

/* Using explicit Transaction*/

SET NOCOUNT ON
USE TestDB
GO

DECLARE @n INT
SET @n=1

BEGIN TRANSACTION
WHILE @n <= 10000
BEGIN
INSERT INTO tblLarge ( sName1 , sName2 , sName3 , sIdentifier , dDOB , nWage , sLicense )
VALUES ( LEFT(CAST(NEWID() AS VARCHAR(36)), RAND() * 50) , -- sName1
LEFT(CAST(NEWID() AS VARCHAR(36)), RAND() * 60) , -- sName2
LEFT(CAST(NEWID() AS VARCHAR(36)), RAND() * 70) , -- sName2
LEFT(CAST(NEWID() AS VARCHAR(36)), 2) , -- sIdentifier
DATEADD(dd, -RAND() * 20000, GETDATE()) , -- dDOB
( RAND() * 1000 ) , -- nWage
SUBSTRING(CAST(NEWID() AS VARCHAR(36)), 6, 7) -- sLicense
)
SET @n = @n + 1
END
COMMIT TRANSACTION

GO

/* Collect wait stats for Explicit TRN */

INSERT INTO #wait
SELECT 'w2' WaitRun ,
wait_type ,
wait_time_ms / 1000.0 AS WaitSec ,
( wait_time_ms - signal_wait_time_ms ) / 1000.0 AS ResourceSec ,
signal_wait_time_ms / 1000.0 AS SignalSec ,
waiting_tasks_count AS WaitCount
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'WRITELOG%'
GO

/* Check for Log flush for Explicit TRN*/

SELECT 'Explicit' AS Transaction_Type , SPID , Operation ,COUNT(Operation) [OperationCount] ,
[Transaction Name] , COUNT([Transaction Name]) AS [TransactionNameCount]
FROM :: fn_dblog(NULL, NULL)
WHERE operation LIKE '%LOP_BEGIN_XACT%'
AND [SPID] = @@spid
AND [SPID] IS NOT NULL
GROUP BY SPID , Operation , [Transaction Name]
GO

/* Using implicit Transaction*/

SET NOCOUNT ON
USE TestDB
GO

DECLARE @n INT
SET @n=1

WHILE @n <= 10000
BEGIN
INSERT INTO tblLarge( sName1 , sName2 , sName3 , sIdentifier , dDOB , nWage , sLicense)
VALUES ( LEFT(CAST(NEWID() AS VARCHAR(36)), RAND() * 50) , -- sName1
LEFT(CAST(NEWID() AS VARCHAR(36)), RAND() * 60) , -- sName2
LEFT(CAST(NEWID() AS VARCHAR(36)), RAND() * 70) , -- sName2
LEFT(CAST(NEWID() AS VARCHAR(36)), 2) , -- sIdentifier
DATEADD(dd, -RAND() * 20000, GETDATE()) , -- dDOB
( RAND() * 1000 ) , -- nWage
SUBSTRING(CAST(NEWID() AS VARCHAR(36)), 6, 7) -- sLicense
)
SET @n = @n + 1
END

GO

/*Collect wait stats for Implicit TRN*/

INSERT INTO #wait
SELECT 'w3' WaitRun , wait_type , wait_time_ms / 1000.0 AS WaitSec ,
( wait_time_ms - signal_wait_time_ms ) / 1000.0 AS ResourceSec ,
signal_wait_time_ms / 1000.0 AS SignalSec , waiting_tasks_count AS WaitCount
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'WRITELOG%'
GO

/*Check for Log flush*/

SELECT 'Implicit' AS Transaction_Type , SPID , Operation , COUNT(Operation) [OperationCount] ,
[Transaction Name] , COUNT([Transaction Name]) AS [TransactionNameCount]
FROM :: fn_dblog(NULL, NULL)
WHERE operation LIKE '%LOP_BEGIN_XACT%'
AND [SPID] = @@spid
AND [SPID] IS NOT NULL
GROUP BY SPID , Operation , [Transaction Name]
GO

/* Compare the waittype collection*/

SELECT a.wait_type , 'Explicit Transaction' AS Transaction_Type ,
( b.WaitSec - a.WaitSec ) AS TotalwaitSec ,
( b.ResourceSec - a.ResourceSec ) AS TotalResourceSec ,
( b.SignalSec - a.SignalSec ) AS TotalSignalSec ,
( b.WaitCount - a.WaitCount ) AS TotalWaitCount
FROM ( SELECT *
FROM #wait
WHERE waitrun = 'w1'
) a
JOIN ( SELECT * FROM #wait WHERE waitrun = 'w2'
) b ON a.wait_type = b.wait_type

UNION ALL

SELECT a.wait_type ,
'Implicit Transaction' AS Transaction_Type ,
( b.WaitSec - a.WaitSec ) AS TotalwaitSec ,
( b.ResourceSec - a.ResourceSec ) AS TotalResourceSec ,
( b.SignalSec - a.SignalSec ) AS TotalSignalSec ,
( b.WaitCount - a.WaitCount ) AS TotalWaitCount
FROM ( SELECT *
FROM #wait
WHERE waitrun = 'w2'
) a
JOIN ( SELECT * FROM #wait WHERE waitrun = 'w3'
) b ON a.wait_type = b.wait_type

GO

Figure #1: implicit vs. explicit



Figure #2: implicit vs. explicit



Figure #3: implicit vs. explicit




Analysis:
The analysis is simple and easy to understand from the output. From the above pictures we can see that the implicit transaction is always slow, signal wait is high and requires more I/O activities to complete a task.

Summary:
We saw that WRITELOG waittype does not necessarily indicate a disk throughput problem. It is the SQL Server Transaction Log management system who is doing more work to write each transaction into the log file. So before concluding, I would recommend to review your application code and see if you can use explicit transactions. I am sure that even though you use an Enterprise Class SSD system, implicit transaction may not improve the application performance dramatically.