Problem
The SAN data volume has a throughput capacity of 400MB/sec; however my query is still running slow and it is waiting on I/O (PAGEIOLATCH_SH). Windows Performance Monitor shows data volume speed of 4MB/sec. Where is the problem and how can I find the problem?
Solution
This is another summary of a great article published by R. Meyyappan at www.sqlworkshops.com. In my opinion, this is the first article that highlights and explains with working examples how PREFETCH determines the performance of a Nested Loop join. First of all, I just want to recall that Prefetch is a mechanism with which SQL Server can fire up many I/O requests in parallel for a Nested Loop join.
When SQL Server executes a Nested Loop join, it may or may not enable Prefetch accordingly to the number of rows in the outer table. If the number of rows in the outer table is greater than 25 then SQL will enable and use Prefetch to speed up query performance, but it will not if it is less than 25 rows.
In this section we are going to see different scenarios where prefetch is automatically enabled or disabled.
These examples only use two tables RegionalOrder and Orders. If you want to create the sample tables and sample data, please visit this site www.sqlworkshops.com.
The breakdown of the data in the RegionalOrders table is shown below and the Orders table contains about 6 million rows.
In this first example, I am creating a stored procedure against two tables and then execute the stored procedure. Before running the stored proceudre, I am going to include the actual execution plan.
--Example provided by www.sqlworkshops.com
--Create procedure that pulls orders based on City
--Do not forget to include the actual execution plan
CREATE PROC RegionalOrdersProc @City CHAR(20)
AS
BEGIN
DECLARE @OrderID INT, @OrderDetails CHAR(200)
SELECT @OrderID = o.OrderID, @OrderDetails = o.OrderDetails
FROM RegionalOrders ao INNER JOIN Orders o ON (o.OrderID = ao.OrderID)
WHERE City = @City
END
GO
SET STATISTICS time ON
GO
--Example provided by www.sqlworkshops.com
--Execute the procedure with parameter SmallCity1
EXEC RegionalOrdersProc 'SmallCity1'
GO
After running the stored procedure, if we right click on the Clustered Index Scan and click Properties we can see the Estimated Numbers of Rows is 24.
If we right click on Nested Loops and click Properties we do not see Prefetch, because it is disabled. This behavior was expected, because the number of rows containing the value ‘SmallCity1’ in the outer table is less than 25.
Now, if I run the same procedure with parameter ‘BigCity’ will Prefetch be enabled?
--Example provided by www.sqlworkshops.com
--Execute the procedure with parameter BigCity
--We are using cached plan
EXEC RegionalOrdersProc 'BigCity'
GO
As we can see from the below screenshot, prefetch is not enabled and the query takes around 7 seconds to execute. This is because the query used the cached plan from ‘SmallCity1’ that had prefetch disabled. Please note that even if we have 999 rows for ‘BigCity’ the Estimated Numbers of Rows is still 24.
Finally, let’s clear the procedure cache to trigger a new optimization and execute the procedure again.
DBCC freeproccache
GO
EXEC RegionalOrdersProc 'BigCity'
GO
This time, our procedure runs under a second, Prefetch is enabled and the Estimated Number of Rows is 999.
The RegionalOrdersProc can be optimized by using the below example where we are using an optimizer hint. I have also shown some other hints that could be used as well.
--Example provided by www.sqlworkshops.com
--You can fix the issue by using any of the following
--hints
--Create procedure that pulls orders based on City
DROP PROC RegionalOrdersProc
GO
CREATE PROC RegionalOrdersProc @City CHAR(20)
AS
BEGIN
DECLARE @OrderID INT, @OrderDetails CHAR(200)
SELECT @OrderID = o.OrderID, @OrderDetails = o.OrderDetails
FROM RegionalOrders ao INNER JOIN Orders o ON (o.OrderID = ao.OrderID)
WHERE City = @City
--Hinting optimizer to use SmallCity2 for estimation
OPTION (optimize FOR (@City = 'SmallCity2'))
--Hinting optimizer to estimate for the currnet parameters
--option (recompile)
--Hinting optimize not to use histogram rather
--density for estimation (average of all 3 cities)
--option (optimize for (@City UNKNOWN))
--option (optimize for UNKNOWN)
END
GO
Conclusion, this tip was mainly aimed at illustrating how Prefetch can speed up query execution and how the different number of rows can trigger this.