Let's continue exploring some of the SQL Smells from Phil's list. He has been putting together.
Datatype mis-matches in predicates that rely on implicit conversion.(Plamen Ratchev)
This is a great example poking holes in the whole theory of "If it works it's not broken" Queries will this probably will generally work and give the correct response. In fact, without careful analysis, you probably may be completely oblivious that there is even a problem. This subtle little problem will needlessly complicate queries and slow them down regardless of the indexes applied.
Consider this example:
CREATE
TABLE [dbo].[Page](
[PageId] [int] IDENTITY(1,1)
NOT
NULL,
[Title] [varchar](75)
NOT
NULL,
[Sequence] [int] NOT
NULL,
[ThemeId] [int] NOT
NULL,
[CustomCss] [text] NOT
NULL,
[CustomScript] [text] NOT
NULL,
[PageGroupId] [int] NOT
NULL;
CREATE
PROCEDURE PageSelectBySequence
(
@sequenceMin smallint
,
@sequenceMax smallint
)
AS
BEGIN
SELECT [PageId] ,
[Title] ,
[Sequence] ,
[ThemeId] ,
[CustomCss] ,
[CustomScript] ,
[PageGroupId]
FROM [CMS].[dbo].[Page]
WHERE Sequence BETWEEN @sequenceMin AND @SequenceMax
END
Note that the Sequence column is defined as int while the sequence parameter is defined as a small int. The problem is that the database may have to do a lot of type conversions to evaluate the query. In some cases, this may even negate the indexes that you have in place.
Using Correlated subqueries instead of a join (Dave_Levy/ Plamen Ratchev)
There are two main problems here. The first is a little subjective, since this is a non-standard way of expressing the query, it is harder to understand.
The other problem is much more objective and potentially problematic. You are taking much of the control away from the optimizer. Written properly, such a query may well out perform a corresponding query written with traditional joins. More likely than not, performance will degrade.
Whenever you assume that you know better than the optimizer, you will most likely be wrong. This is the fundmental problem with any hint.
Consider a query like this:
SELECT Page.Title ,
Page.Sequence ,
Page.ThemeId ,
Page.CustomCss ,
Page.CustomScript ,
PageEffectParams.Name ,
PageEffectParams.Value ,
( SELECT EffectName
FROM dbo.Effect
WHERE EffectId = dbo.PageEffects.EffectId
)
AS EffectName
FROM Page
INNER
JOIN PageEffect ON Page.PageId = PageEffects.PageId
INNER
JOIN PageEffectParam ON PageEffects.PageEffectId = PageEffectParams.PageEffectId
This can and should be written as:
SELECT Page.Title ,
Page.Sequence ,
Page.ThemeId ,
Page.CustomCss ,
Page.CustomScript ,
PageEffectParams.Name ,
PageEffectParams.Value ,
EffectName
FROM Page
INNER
JOIN PageEffect ON Page.PageId = PageEffects.PageId
INNER
JOIN PageEffectParam ON PageEffects.PageEffectId = PageEffectParams.PageEffectId
INNER
JOIN dbo.Effect ON dbo.Effects.EffectId = dbo.PageEffects.EffectId
The correlated query may just as easily show up in the where clause. It's not a good idea in the select clause or the where clause.
Few or No comments.
This one is a bit more complicated and controversial. All comments are not created equal. Some comments are helpful and need to be included. Other comments are not necessary and may indicate a problem.
I tend to follow the rule of thumb that comments that explain why are good. Comments that explain how are bad. Many people may be shocked to hear the idea of a bad comment, but hear me out.
If a comment is needed to explain what is going on or how it works, the logic is too complex and needs to be simplified.
Comments that explain why are good. Comments may explain why the sql is needed are good. Comments that explain where the sql is used are good. Comments that explain how tables are related should not be needed if the sql is well written. If they are needed, you need to consider reworking the sql or simplify your data model.
Use of functions in a WHERE clause. (Anil Das)
Calling a function in the where clause will often negate the indexing strategy. The function will be called for every record considered. This will often a force a full table scan on the tables affected.
Calling a function will not guarantee that there is a full table scan, but there is a good chance that it will. If you find that you often need to write queries using a particular function, you may need to add a column to the table that has the function already applied.