Seeking on a Heap, and Two Useful DMVs

Posted by Paul White on SQL Blog See other posts from SQL Blog or by Paul White
Published on Thu, 17 Feb 2011 06:23:25 GMT Indexed on 2011/02/17 7:31 UTC
Read the original article Hit count: 472

Filed under:

So far in this mini-series on seeks and scans, we have seen that a simple ‘seek’ operation can be much more complex than it first appears.  A seek can contain one or more seek predicates – each of which can either identify at most one row in a unique index (a singleton lookup) or a range of values (a range scan).  When looking at a query plan, we will often need to look at the details of the seek operator in the Properties window to see how many operations it is performing, and what type of operation each one is.  As you saw in the first post in this series, the number of hidden seeking operations can have an appreciable impact on performance.

Measuring Seeks and Scans

I mentioned in my last post that there is no way to tell from a graphical query plan whether you are seeing a singleton lookup or a range scan.  You can work it out – if you happen to know that the index is defined as unique and the seek predicate is an equality comparison, but there’s no separate property that says ‘singleton lookup’ or ‘range scan’.  This is a shame, and if I had my way, the query plan would show different icons for range scans and singleton lookups – perhaps also indicating whether the operation was one or more of those operations underneath the covers.

In light of all that, you might be wondering if there is another way to measure how many seeks of either type are occurring in your system, or for a particular query.  As is often the case, the answer is yes – we can use a couple of dynamic management views (DMVs): sys.dm_db_index_usage_stats and sys.dm_db_index_operational_stats.

Index Usage Stats

The index usage stats DMV contains counts of index operations from the perspective of the Query Executor (QE) – the SQL Server component that is responsible for executing the query plan.  It has three columns that are of particular interest to us:

  • user_seeks – the number of times an Index Seek operator appears in an executed plan
  • user_scans – the number of times a Table Scan or Index Scan operator appears in an executed plan
  • user_lookups – the number of times an RID or Key Lookup operator appears in an executed plan

An operator is counted once per execution (generating an estimated plan does not affect the totals), so an Index Seek that executes 10,000 times in a single plan execution adds 1 to the count of user seeks.  Even less intuitively, an operator is also counted once per execution even if it is not executed at all.  I will show you a demonstration of each of these things later in this post.

Index Operational Stats

The index operational stats DMV contains counts of index and table operations from the perspective of the Storage Engine (SE).  It contains a wealth of interesting information, but the two columns of interest to us right now are:

  • range_scan_count – the number of range scans (including unrestricted full scans) on a heap or index structure
  • singleton_lookup_count – the number of singleton lookups in a heap or index structure

This DMV counts each SE operation, so 10,000 singleton lookups will add 10,000 to the singleton lookup count column, and a table scan that is executed 5 times will add 5 to the range scan count.

The Test Rig

To explore the behaviour of seeks and scans in detail, we will need to create a test environment.  The scripts presented here are best run on SQL Server 2008 Developer Edition, but the majority of the tests will work just fine on SQL Server 2005.  A couple of tests use partitioning, but these will be skipped if you are not running an Enterprise-equivalent SKU.  Ok, first up we need a database:

USE     master;
GO
IF      DB_ID('ScansAndSeeks')
        IS NOT NULL
        DROP DATABASE ScansAndSeeks;
GO
CREATE  DATABASE ScansAndSeeks;
GO
USE     ScansAndSeeks;
GO
ALTER   DATABASE ScansAndSeeks
SET     ALLOW_SNAPSHOT_ISOLATION OFF
;
ALTER   DATABASE ScansAndSeeks 
SET     AUTO_CLOSE OFF,
        AUTO_SHRINK OFF,
        AUTO_CREATE_STATISTICS OFF,
        AUTO_UPDATE_STATISTICS OFF,
        PARAMETERIZATION SIMPLE,
        READ_COMMITTED_SNAPSHOT OFF,
        RESTRICTED_USER
;

Notice that several database options are set in particular ways to ensure we get meaningful and reproducible results from the DMVs.  In particular, the options to auto-create and update statistics are disabled.  There are also three stored procedures, the first of which creates a test table (which may or may not be partitioned).  The table is pretty much the same one we used yesterday:

Example Table and Data

The table has 100 rows, and both the key_col and data columns contain the same values – the integers from 1 to 100 inclusive.  The table is a heap, with a non-clustered primary key on key_col, and a non-clustered non-unique index on the data column.  The only reason I have used a heap here, rather than a clustered table, is so I can demonstrate a seek on a heap later on.  The table has an extra column (not shown because I am too lazy to update the diagram from yesterday) called padding – a CHAR(100) column that just contains 100 spaces in every row.  It’s just there to discourage SQL Server from choosing table scan over an index + RID lookup in one of the tests.

The first stored procedure is called ResetTest:

CREATE  PROCEDURE dbo.ResetTest
        @Partitioned    BIT = 'false'
AS
BEGIN
        SET     NOCOUNT ON
        ;
        IF      OBJECT_ID(N'dbo.Example', N'U')
                IS NOT NULL
        BEGIN
                DROP TABLE dbo.Example;
        END
        ;
        -- Test table is a heap
        -- Non-clustered primary key on 'key_col'
        CREATE  TABLE 
                dbo.Example
                (
                key_col     INTEGER     NOT NULL,
                data        INTEGER     NOT NULL,
                padding     CHAR(100)   NOT NULL DEFAULT SPACE(100),
                
                CONSTRAINT [PK dbo.Example key_col]
                    PRIMARY KEY NONCLUSTERED (key_col)
                )
        ;
        IF      @Partitioned = 'true'
        BEGIN
                -- Enterprise, Trial, or Developer 
                -- required for partitioning tests
                IF  SERVERPROPERTY('EngineEdition') = 3
                BEGIN
                    EXECUTE
                    ('
                    DROP    TABLE dbo.Example
                    ;
                    IF      EXISTS 
                            (
                            SELECT  1 
                            FROM    sys.partition_schemes 
                            WHERE   name = N''PS''
                            ) 
                            DROP PARTITION SCHEME PS
                    ;
                    IF      EXISTS 
                            (
                            SELECT  1 
                            FROM    sys.partition_functions 
                            WHERE   name = N''PF''
                            ) 
                            DROP PARTITION FUNCTION PF
                    ;
                    CREATE  PARTITION FUNCTION 
                            PF (INTEGER)
                    AS      RANGE RIGHT
                            FOR VALUES (20, 40, 60, 80, 100)
                    ;
                    CREATE  PARTITION SCHEME PS
                    AS      PARTITION PF
                            ALL TO ([PRIMARY])
                    ;
                    CREATE  TABLE 
                            dbo.Example
                            (
                            key_col     INTEGER     NOT NULL,
                            data        INTEGER     NOT NULL,
                            padding     CHAR(100)   NOT NULL 
                                            DEFAULT SPACE(100),
                            
                            CONSTRAINT [PK dbo.Example key_col]
                                PRIMARY KEY NONCLUSTERED (key_col)
                            )
                    ON      PS (key_col);
                    ');
            END
            ELSE
            BEGIN
                    RAISERROR('Invalid SKU for partition test', 16, 1);
                    RETURN;
            END;
        END
        ;
        -- Non-unique non-clustered index on the 'data' column
        CREATE  NONCLUSTERED INDEX 
                [IX dbo.Example data]
                ON dbo.Example (data)
        ;
        -- Add 100 rows
        INSERT  dbo.Example 
                WITH (TABLOCKX)
                (
                key_col, 
                data
                )
        SELECT  key_col = V.number, 
                data = V.number
        FROM    master.dbo.spt_values AS V
        WHERE   V.[type] = N'P'
        AND     V.number BETWEEN 1 AND 100
        ;
END;
GO

The second stored procedure, ShowStats, displays information from the Index Usage Stats and Index Operational Stats DMVs:

CREATE  PROCEDURE dbo.ShowStats
        @Partitioned    BIT = 'false'
AS
BEGIN
        -- Index Usage Stats DMV (QE)
        SELECT  index_name = ISNULL(I.name, I.type_desc),
                scans = IUS.user_scans, 
                seeks = IUS.user_seeks, 
                lookups = IUS.user_lookups
        FROM    sys.dm_db_index_usage_stats AS IUS
        JOIN    sys.indexes AS I
                ON  I.object_id = IUS.object_id
                AND I.index_id = IUS.index_id
        WHERE   IUS.database_id = DB_ID(N'ScansAndSeeks')
        AND     IUS.object_id = OBJECT_ID(N'dbo.Example', N'U')
        ORDER   BY
                I.index_id
        ;
        -- Index Operational Stats DMV (SE)
        IF      @Partitioned = 'true'
                SELECT  index_name = ISNULL(I.name, I.type_desc),
                        partitions = COUNT(IOS.partition_number),
                        range_scans = SUM(IOS.range_scan_count),
                        single_lookups = SUM(IOS.singleton_lookup_count)
                FROM    sys.dm_db_index_operational_stats
                            (
                            DB_ID(N'ScansAndSeeks'), 
                            OBJECT_ID(N'dbo.Example', N'U'), 
                            NULL, 
                            NULL
                            ) AS IOS
                JOIN    sys.indexes AS I
                        ON  I.object_id = IOS.object_id
                        AND I.index_id = IOS.index_id
                GROUP   BY
                        I.index_id, -- Key
                        I.name, 
                        I.type_desc
                ORDER   BY
                        I.index_id;
        ELSE
                SELECT  index_name = ISNULL(I.name, I.type_desc),
                        range_scans = SUM(IOS.range_scan_count),
                        single_lookups = SUM(IOS.singleton_lookup_count)
                FROM    sys.dm_db_index_operational_stats
                            (
                            DB_ID(N'ScansAndSeeks'), 
                            OBJECT_ID(N'dbo.Example', N'U'), 
                            NULL, 
                            NULL
                            ) AS IOS
                JOIN    sys.indexes AS I
                        ON  I.object_id = IOS.object_id
                        AND I.index_id = IOS.index_id
                GROUP   BY
                        I.index_id, -- Key
                        I.name, 
                        I.type_desc
                ORDER   BY
                        I.index_id;
END;

The final stored procedure, RunTest, executes a query written against the example table:

CREATE  PROCEDURE dbo.RunTest
        @SQL            VARCHAR(8000),
        @Partitioned    BIT = 'false'
AS
BEGIN
        -- No execution plan yet
        SET     STATISTICS XML OFF
        ;
        -- Reset the test environment
        EXECUTE dbo.ResetTest @Partitioned
        ;
        -- Previous call will throw an error if a partitioned
        -- test was requested, but SKU does not support it
        IF      @@ERROR = 0
        BEGIN
                -- IO statistics and plan on
                SET     STATISTICS XML, IO ON
                ;
                -- Test statement
                EXECUTE (@SQL)
                ;
                -- Plan and IO statistics off
                SET     STATISTICS XML, IO OFF
                ;
                EXECUTE dbo.ShowStats @Partitioned;
        END;
END;

The Tests

The first test is a simple scan of the heap table:

EXECUTE dbo.RunTest 
        @SQL = 'SELECT * FROM Example';

Heap Scan

The top result set comes from the Index Usage Stats DMV, so it is the Query Executor’s (QE) view.  The lower result is from Index Operational Stats, which shows statistics derived from the actions taken by the Storage Engine (SE).  We see that QE performed 1 scan operation on the heap, and SE performed a single range scan.  Let’s try a single-value equality seek on a unique index next:

EXECUTE dbo.RunTest 
        @SQL = 'SELECT key_col 
                FROM Example 
                WHERE key_col = 32';

Singleton Lookup

This time we see a single seek on the non-clustered primary key from QE, and one singleton lookup on the same index by the SE.  Now for a single-value seek on the non-unique non-clustered index:

EXECUTE dbo.RunTest 
        @SQL = 'SELECT data 
                FROM Example 
                WHERE data = 32';

Range Scan 1

QE shows a single seek on the non-clustered non-unique index, but SE shows a single range scan on that index – not the singleton lookup we saw in the previous test.  That makes sense because we know that only a single-value seek into a unique index is a singleton seek.  A single-value seek into a non-unique index might retrieve any number of rows, if you think about it.  The next query is equivalent to the IN list example seen in the first post in this series, but it is written using OR (just for variety, you understand):

EXECUTE dbo.RunTest 
        @SQL = 'SELECT data 
                FROM Example 
                WHERE data = 32 
                OR data = 33';

Range Scan 2

The plan looks the same, and there’s no difference in the stats recorded by QE, but the SE shows two range scans.  Again, these are range scans because we are looking for two values in the data column, which is covered by a non-unique index.  I’ve added a snippet from the Properties window to show that the query plan does show two seek predicates, not just one.  Now let’s rewrite the query using BETWEEN:

EXECUTE dbo.RunTest 
        @SQL = 'SELECT data 
                FROM Example 
                WHERE data BETWEEN 32 AND 33';

Range Scan 3

Notice the seek operator only has one predicate now – it’s just a single range scan from 32 to 33 in the index – as the SE output shows.  For the next test, we will look up four values in the key_col column:

EXECUTE dbo.RunTest 
        @SQL = 'SELECT key_col 
                FROM Example 
                WHERE key_col IN (2,4,6,8)';

Singleton Lookups

Just a single seek on the PK from the Query Executor, but four singleton lookups reported by the Storage Engine – and four seek predicates in the Properties window.  On to a more complex example:

EXECUTE dbo.RunTest 
        @SQL = 'SELECT * FROM Example 
                WITH (INDEX([PK dbo.Example key_col])) 
                WHERE key_col BETWEEN 1 AND 8';

RID Lookups

This time we are forcing use of the non-clustered primary key to return eight rows.  The index is not covering for this query, so the query plan includes an RID lookup into the heap to fetch the data and padding columns.  The QE reports a seek on the PK and a lookup on the heap.  The SE reports a single range scan on the PK (to find key_col values between 1 and 8), and eight singleton lookups on the heap.  Remember that a bookmark lookup (RID or Key) is a seek to a single value in a ‘unique index’ – it finds a row in the heap or cluster from a unique RID or clustering key – so that’s why lookups are always singleton lookups, not range scans.

Our next example shows what happens when a query plan operator is not executed at all:

EXECUTE dbo.RunTest 
        @SQL = 'SELECT key_col 
                FROM Example 
                WHERE key_col = 8 
                AND @@TRANCOUNT < 0';

Startup Filter

The Filter has a start-up predicate which is always false (if your @@TRANCOUNT is less than zero, call CSS immediately).  The index seek is never executed, but QE still records a single seek against the PK because the operator appears once in an executed plan.  The SE output shows no activity at all.  This next example is 2008 and above only, I’m afraid:

EXECUTE dbo.RunTest 
        @SQL = 'SELECT * 
                FROM Example 
                WHERE key_col BETWEEN 1 AND 30', 
        @Partitioned = 'true';

Heap Seek 1

This is the first example to use a partitioned table.  QE reports a single seek on the heap (yes – a seek on a heap), and the SE reports two range scans on the heap.  SQL Server knows (from the partitioning definition) that it only needs to look at partitions 1 and 2 to find all the rows where key_col is between 1 and 30 – the engine seeks to find the two partitions, and performs a range scan seek on each partition.

The final example for today is another seek on a heap – try to work out the output of the query before running it!

EXECUTE dbo.RunTest 
        @SQL = 'SELECT TOP (2) WITH TIES * 
                FROM Example 
                WHERE key_col BETWEEN 1 AND 50 
                ORDER BY $PARTITION.PF(key_col) DESC', 
        @Partitioned = 'true';

Heap Seek 2

Notice the lack of an explicit Sort operator in the query plan to enforce the ORDER BY clause, and the backward range scan.

© 2011 Paul White

email: [email protected]
twitter: @SQL_Kiwi

© SQL Blog or respective owner