Performance considerations for common SQL queries
        Posted  
        
            by Jim Giercyk
        on Geeks with Blogs
        
        See other posts from Geeks with Blogs
        
            or by Jim Giercyk
        
        
        
        Published on Wed, 16 Oct 2013 05:06:20 GMT
        Indexed on 
            2013/10/17
            16:01 UTC
        
        
        Read the original article
        Hit count: 806
        
Originally posted on: http://geekswithblogs.net/NibblesAndBits/archive/2013/10/16/performance-considerations-for-common-sql-queries.aspx
SQL offers many different methods to produce the same results. There is a never-ending debate between SQL developers as to the “best way” or the “most efficient way” to render a result set. Sometimes these disputes even come to blows….well, I am a lover, not a fighter, so I decided to collect some data that will prove which way is the best and most efficient.
For the queries below, I downloaded the test database from SQLSkills: http://www.sqlskills.com/sql-server-resources/sql-server-demos/. There isn’t a lot of data, but enough to prove my point: dbo.member has 10,000 records, and dbo.payment has 15,554. Our result set contains 6,706 records.
The following queries produce an identical result set; the result set contains aggregate payment information for each member who has made more than 1 payment from the dbo.payment table and the first and last name of the member from the dbo.member table.
/*************/      
/* Sub Query  */       
/*************/       
SELECT  a.[Member Number] ,       
        m.lastname ,       
        m.firstname ,       
        a.[Number Of Payments] ,       
        a.[Average Payment] ,       
        a.[Total Paid]       
FROM    ( SELECT    member_no 'Member Number' ,       
                    AVG(payment_amt) 'Average Payment' ,       
                    SUM(payment_amt) 'Total Paid' ,       
                    COUNT(Payment_No) 'Number Of Payments'       
          FROM      dbo.payment       
          GROUP BY  member_no       
          HAVING    COUNT(Payment_No) > 1       
        ) a       
        JOIN dbo.member m ON a.[Member Number] = m.member_no       
        
/***************/      
/* Cross Apply  */       
/***************/       
SELECT  ca.[Member Number] ,       
        m.lastname ,       
        m.firstname ,       
        ca.[Number Of Payments] ,       
        ca.[Average Payment] ,       
        ca.[Total Paid]       
FROM    dbo.member m       
        CROSS APPLY ( SELECT    member_no 'Member Number' ,       
                                AVG(payment_amt) 'Average Payment' ,       
                                SUM(payment_amt) 'Total Paid' ,       
                                COUNT(Payment_No) 'Number Of Payments'       
                      FROM      dbo.payment       
                      WHERE     member_no = m.member_no       
                      GROUP BY  member_no       
                      HAVING    COUNT(Payment_No) > 1       
                    ) ca
   
/********/                    
/* CTEs  */       
/********/       
;       
WITH    Payments       
          AS ( SELECT   member_no 'Member Number' ,       
                        AVG(payment_amt) 'Average Payment' ,       
                        SUM(payment_amt) 'Total Paid' ,       
                        COUNT(Payment_No) 'Number Of Payments'       
               FROM     dbo.payment       
               GROUP BY member_no       
               HAVING   COUNT(Payment_No) > 1       
             ),       
        MemberInfo       
          AS ( SELECT   p.[Member Number] ,       
                        m.lastname ,       
                        m.firstname ,       
                        p.[Number Of Payments] ,       
                        p.[Average Payment] ,       
                        p.[Total Paid]       
               FROM     dbo.member m       
                        JOIN Payments p ON m.member_no = p.[Member Number]       
             )       
    SELECT  *       
    FROM    MemberInfo
   
/************************/      
/* SELECT with Grouping   */       
/************************/
SELECT  p.member_no 'Member Number' ,      
        m.lastname ,       
        m.firstname ,       
        COUNT(Payment_No) 'Number Of Payments' ,       
        AVG(payment_amt) 'Average Payment' ,       
        SUM(payment_amt) 'Total Paid'       
FROM    dbo.payment p       
        JOIN dbo.member m ON m.member_no = p.member_no       
GROUP BY p.member_no ,       
        m.lastname ,       
        m.firstname       
HAVING  COUNT(Payment_No) > 1
We can see what is going on in SQL’s brain by looking at the execution plan. The Execution Plan will demonstrate which steps and in what order SQL executes those steps, and what percentage of batch time each query takes. SO….if I execute all 4 of these queries in a single batch, I will get an idea of the relative time SQL takes to execute them, and how it renders the Execution Plan. We can settle this once and for all. Here is what SQL did with these queries:
Not only did the queries take the same amount of time to execute, SQL generated the same Execution Plan for each of them.  Everybody is right…..I guess we can all finally go to lunch together!  But wait a second, I may not be a fighter, but I AM an instigator.  
   Let’s see how a table variable stacks up.  Here is the code I executed:
/********************/
/*  Table Variable  */        
/********************/         
DECLARE @AggregateTable TABLE         
    (         
      member_no INT ,         
      AveragePayment MONEY ,         
      TotalPaid MONEY ,         
      NumberOfPayments MONEY         
    )
INSERT  @AggregateTable      
        SELECT  member_no 'Member Number' ,       
                AVG(payment_amt) 'Average Payment' ,       
                SUM(payment_amt) 'Total Paid' ,       
                COUNT(Payment_No) 'Number Of Payments'       
        FROM    dbo.payment       
        GROUP BY member_no       
        HAVING  COUNT(Payment_No) > 1       
 
SELECT  at.member_no 'Member Number' ,      
        m.lastname ,       
        m.firstname ,       
        at.NumberOfPayments 'Number Of Payments' ,       
        at.AveragePayment 'Average Payment' ,       
        at.TotalPaid 'Total Paid'       
FROM    @AggregateTable at       
        JOIN dbo.member m ON m.member_no = at.member_no
In the interest of keeping things in groupings of 4, I removed the last query from the previous batch and added the table variable query. Here’s what I got:
Since we first insert into the table variable, then we read from it, the Execution Plan renders 2 steps. BUT, the combination of the 2 steps is only 22% of the batch. It is actually faster than the other methods even though it is treated as 2 separate queries in the Execution Plan. The argument I often hear against Table Variables is that SQL only estimates 1 row for the table size in the Execution Plan. While this is true, the estimate does not come in to play until you read from the table variable. In this case, the table variable had 6,706 rows, but it still outperformed the other queries. People argue that table variables should only be used for hash or lookup tables. The fact is, you have control of what you put IN to the variable, so as long as you keep it within reason, these results suggest that a table variable is a viable alternative to sub-queries.
If anyone does volume testing on this theory, I would be interested in the results. My suspicion is that there is a breaking point where efficiency goes down the tubes immediately, and it would be interesting to see where the threshold is.
Coding SQL is a matter of style.  If you’ve been around since they introduced DB2, you were probably taught a little differently than a recent computer science graduate.  If you have a company standard, I strongly recommend you follow it.  
  If you do not have a standard, generally speaking, there is no right or wrong answer when talking about the efficiency of these types of queries, and certainly no hard-and-fast rule.  Volume and infrastructure will dictate a lot when it comes to performance, so your results may vary in your environment.  Download the database and try it!
© Geeks with Blogs or respective owner