I'm having some serious performance issues with a multi-million row table that I feel I should be able to get results from fairly quick. Here's a run down of what I have, how I'm querying it, and how long it's taking:
I'm running SQL Server 2008 Standard, so Partitioning isn't currently an option
I'm attempting to aggregate all views for all inventory for a specific account over the last 30 days.
All views are stored in the following table:
CREATE TABLE [dbo].[LogInvSearches_Daily](
[ID] [bigint] IDENTITY(1,1) NOT NULL,
[Inv_ID] [int] NOT NULL,
[Site_ID] [int] NOT NULL,
[LogCount] [int] NOT NULL,
[LogDay] [smalldatetime] NOT NULL,
CONSTRAINT [PK_LogInvSearches_Daily] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]
This table has 132,000,000 records, and is over 4 gigs.
A sample of 10 rows from the table:
ID Inv_ID Site_ID LogCount LogDay
-------------------- ----------- ----------- ----------- -----------------------
1 486752 48 14 2009-07-21 00:00:00
2 119314 51 16 2009-07-21 00:00:00
3 313678 48 25 2009-07-21 00:00:00
4 298863 0 1 2009-07-21 00:00:00
5 119996 0 2 2009-07-21 00:00:00
6 463777 534 7 2009-07-21 00:00:00
7 339976 503 2 2009-07-21 00:00:00
8 333501 570 4 2009-07-21 00:00:00
9 453955 0 12 2009-07-21 00:00:00
10 443291 0 4 2009-07-21 00:00:00
(10 row(s) affected)
I have the following index on LogInvSearches_Daily:
/****** Object: Index [IX_LogInvSearches_Daily_LogDay] Script Date: 05/12/2010 11:08:22 ******/
CREATE NONCLUSTERED INDEX [IX_LogInvSearches_Daily_LogDay] ON [dbo].[LogInvSearches_Daily]
(
[LogDay] ASC
)
INCLUDE ( [Inv_ID],
[LogCount]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
I need to pull inventory only from the Inventory for a specific account id. I have an index on the Inventory as well.
I'm using the following query to aggregate the data and give me the top 5 records. This query is currently taking 24 seconds to return the 5 rows:
StmtText
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SELECT TOP 5
Sum(LogCount) AS Views
, DENSE_RANK() OVER(ORDER BY Sum(LogCount) DESC, Inv_ID DESC) AS Rank
, Inv_ID
FROM LogInvSearches_Daily D (NOLOCK)
WHERE
LogDay DateAdd(d, -30, getdate())
AND EXISTS(
SELECT NULL FROM propertyControlCenter.dbo.Inventory (NOLOCK) WHERE Acct_ID = 18731 AND Inv_ID = D.Inv_ID
)
GROUP BY Inv_ID
(1 row(s) affected)
StmtText
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|--Top(TOP EXPRESSION:((5)))
|--Sequence Project(DEFINE:([Expr1007]=dense_rank))
|--Segment
|--Segment
|--Sort(ORDER BY:([Expr1006] DESC, [D].[Inv_ID] DESC))
|--Stream Aggregate(GROUP BY:([D].[Inv_ID]) DEFINE:([Expr1006]=SUM([LOALogs].[dbo].[LogInvSearches_Daily].[LogCount] as [D].[LogCount])))
|--Sort(ORDER BY:([D].[Inv_ID] ASC))
|--Nested Loops(Inner Join, OUTER REFERENCES:([D].[Inv_ID]))
|--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1011], [Expr1012], [Expr1010]))
| |--Compute Scalar(DEFINE:(([Expr1011],[Expr1012],[Expr1010])=GetRangeWithMismatchedTypes(dateadd(day,(-30),getdate()),NULL,(6))))
| | |--Constant Scan
| |--Index Seek(OBJECT:([LOALogs].[dbo].[LogInvSearches_Daily].[IX_LogInvSearches_Daily_LogDay] AS [D]), SEEK:([D].[LogDay] > [Expr1011] AND [D].[LogDay] < [Expr1012]) ORDERED FORWARD)
|--Index Seek(OBJECT:([propertyControlCenter].[dbo].[Inventory].[IX_Inventory_Acct_ID]), SEEK:([propertyControlCenter].[dbo].[Inventory].[Acct_ID]=(18731) AND [propertyControlCenter].[dbo].[Inventory].[Inv_ID]=[LOA
(13 row(s) affected)
I tried using a CTE to pick up the rows first and aggregate them, but that didn't run any faster, and gives me essentially the same execution plan.
(1 row(s) affected)
StmtText
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--SET SHOWPLAN_TEXT ON;
WITH getSearches AS (
SELECT
LogCount
-- , DENSE_RANK() OVER(ORDER BY Sum(LogCount) DESC, Inv_ID DESC) AS Rank
, D.Inv_ID
FROM LogInvSearches_Daily D (NOLOCK)
INNER JOIN propertyControlCenter.dbo.Inventory I (NOLOCK) ON Acct_ID = 18731 AND I.Inv_ID = D.Inv_ID
WHERE
LogDay DateAdd(d, -30, getdate())
-- GROUP BY Inv_ID
)
SELECT Sum(LogCount) AS Views, Inv_ID
FROM getSearches
GROUP BY Inv_ID
(1 row(s) affected)
StmtText
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|--Stream Aggregate(GROUP BY:([D].[Inv_ID]) DEFINE:([Expr1004]=SUM([LOALogs].[dbo].[LogInvSearches_Daily].[LogCount] as [D].[LogCount])))
|--Sort(ORDER BY:([D].[Inv_ID] ASC))
|--Nested Loops(Inner Join, OUTER REFERENCES:([D].[Inv_ID]))
|--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1008], [Expr1009], [Expr1007]))
| |--Compute Scalar(DEFINE:(([Expr1008],[Expr1009],[Expr1007])=GetRangeWithMismatchedTypes(dateadd(day,(-30),getdate()),NULL,(6))))
| | |--Constant Scan
| |--Index Seek(OBJECT:([LOALogs].[dbo].[LogInvSearches_Daily].[IX_LogInvSearches_Daily_LogDay] AS [D]), SEEK:([D].[LogDay] > [Expr1008] AND [D].[LogDay] < [Expr1009]) ORDERED FORWARD)
|--Index Seek(OBJECT:([propertyControlCenter].[dbo].[Inventory].[IX_Inventory_Acct_ID] AS [I]), SEEK:([I].[Acct_ID]=(18731) AND [I].[Inv_ID]=[LOALogs].[dbo].[LogInvSearches_Daily].[Inv_ID] as [D].[Inv_ID]) ORDERED FORWARD)
(8 row(s) affected)
(1 row(s) affected)
So given that I'm getting good Index Seeks in my execution plan, what can I do to get this running faster?
Thanks,
Dan