Something for the weekend - Whats the most complex query?
- by simonsabin
Whenever I teach about SQL Server performance tuning I
try can get across the message that there is no such thing as a table. Does that
sound odd, well it isn't, trust me. Rather than tables you need to consider
structures. You have
1. Heaps
2. Indexes (b-trees)
Some people split indexes in
two, clustered and non-clustered, this I feel confuses the situation as
people associate clustered indexes with sorting, but don't associate non clustered indexes with sorting, this
is wrong. Clustered and non-clustered indexes are the same b-tree structure(and
even more so with SQL 2005) with the leaf pages sorted
in a linked list according to the keys of the index.. The difference is
that non clustered indexes include in their structure either, the clustered key(s), or the
row identifier for the row in the table (see http://sqlblog.com/blogs/kalen_delaney/archive/2008/03/16/nonclustered-index-keys.aspx for
more details). Beyond that they are the same, they have key columns which
are stored on the root and intermediary
pages, and included columns which are on
the leaf level.
The reason this is important is that this is how the
optimiser sees the world, this means it can use any of these structures to
resolve your query. Even if your query only accesses one table, the optimiser can access
multiple structures to get your results. One commonly sees this with a non-clustered
index scan and then a key lookup (clustered index seek), but
importantly it's not restricted to just using one non-clustered index
and the clustered index or heap, and that's the
challenge for the weekend.
So the challenge for the weekend is to produce the most complex single table
query.
For those clever bods amongst you that are thinking, great I will just use
lots of xquery functions, sorry these are the rules.
1. You have to use a table from AdventureWorks (2005 or 2008)
2. You can add whatever indexes you like, but you must document these
3. You cannot use XQuery, Spatial, HierarchyId, Full Text or any open rowset
function.
4. You can only reference your table once, i..e a FROM clause with ONE table
and no JOINs
5. No Sub queries.
The aim of this is to show how the optimiser can use multiple structures to build the results of a query and to also highlight
why the optimiser is doing that. How many structures can you get the optimiser to
use?
As an example create these two indexes on AdventureWorks2008
create
index IX_Person_Person on Person.Person (lastName,
FirstName,NameStyle,PersonType)
create
index IX_Person_Person on Person.Person(BusinessentityId,ModifiedDate)with drop_existing
select lastName,
ModifiedDate
from Person.Person
where LastName = 'Smith'
You will see that the optimiser has decided to not access the
underlying clustered index of the table but to use two indexes above to resolve the
query. This highlights how the optimiser considers all storage structures,
clustered indexes, non clustered indexes and heaps when trying to resolve a
query.
So are you up to the challenge for the weekend to produce the most complex
single table query?
The prize is a pdf version of a popular SQL Server
book, or a physical book if you live in the UK.