Search Results

Search found 85 results on 4 pages for 'nonclustered'.

Page 4/4 | < Previous Page | 1 2 3 4 

  • SQL Server - Rebuilding Indexes

    - by Renso
    Goal: Rebuild indexes in SQL server. This can be done one at a time or with the example script below to rebuild all index for a specified table or for all tables in a given database. Why? The data in indexes gets fragmented over time. That means that as the index grows, the newly added rows to the index are physically stored in other sections of the allocated database storage space. Kind of like when you load your Christmas shopping into the trunk of your car and it is full you continue to load some on the back seat, in the same way some storage buffer is created for your index but once that runs out the data is then stored in other storage space and your data in your index is no longer stored in contiguous physical pages. To access the index the database manager has to "string together" disparate fragments to create the full-index and create one contiguous set of pages for that index. Defragmentation fixes that. What does the fragmentation affect?Depending of course on how large the table is and how fragmented the data is, can cause SQL Server to perform unnecessary data reads, slowing down SQL Server’s performance.Which index to rebuild?As a rule consider that when reorganize a table's clustered index, all other non-clustered indexes on that same table will automatically be rebuilt. A table can only have one clustered index.How to rebuild all the index for one table:The DBCC DBREINDEX command will not automatically rebuild all of the indexes on a given table in a databaseHow to rebuild all indexes for all tables in a given database:USE [myDB]    -- enter your database name hereDECLARE @tableName varchar(255)DECLARE TableCursor CURSOR FORSELECT table_name FROM information_schema.tablesWHERE table_type = 'base table'OPEN TableCursorFETCH NEXT FROM TableCursor INTO @tableNameWHILE @@FETCH_STATUS = 0BEGINDBCC DBREINDEX(@tableName,' ',90)     --a fill factor of 90%FETCH NEXT FROM TableCursor INTO @tableNameENDCLOSE TableCursorDEALLOCATE TableCursorWhat does this script do?Reindexes all indexes in all tables of the given database. Each index is filled with a fill factor of 90%. While the command DBCC DBREINDEX runs and rebuilds the indexes, that the table becomes unavailable for use by your users temporarily until the rebuild has completed, so don't do this during production  hours as it will create a shared lock on the tables, although it will allow for read-only uncommitted data reads; i.e.e SELECT.What is the fill factor?Is the percentage of space on each index page for storing data when the index is created or rebuilt. It replaces the fill factor when the index was created, becoming the new default for the index and for any other nonclustered indexes rebuilt because a clustered index is rebuilt. When fillfactor is 0, DBCC DBREINDEX uses the fill factor value last specified for the index. This value is stored in the sys.indexes catalog view. If fillfactor is specified, table_name and index_name must be specified. If fillfactor is not specified, the default fill factor, 100, is used.How do I determine the level of fragmentation?Run the DBCC SHOWCONTIG command. However this requires you to specify the ID of both the table and index being. To make it a lot easier by only requiring you to specify the table name and/or index you can run this script:DECLARE@ID int,@IndexID int,@IndexName varchar(128)--Specify the table and index namesSELECT @IndexName = ‘index_name’    --name of the indexSET @ID = OBJECT_ID(‘table_name’)  -- name of the tableSELECT @IndexID = IndIDFROM sysindexesWHERE id = @ID AND name = @IndexName--Show the level of fragmentationDBCC SHOWCONTIG (@id, @IndexID)Here is an example:DBCC SHOWCONTIG scanning 'Tickets' table...Table: 'Tickets' (1829581556); index ID: 1, database ID: 13TABLE level scan performed.- Pages Scanned................................: 915- Extents Scanned..............................: 119- Extent Switches..............................: 281- Avg. Pages per Extent........................: 7.7- Scan Density [Best Count:Actual Count].......: 40.78% [115:282]- Logical Scan Fragmentation ..................: 16.28%- Extent Scan Fragmentation ...................: 99.16%- Avg. Bytes Free per Page.....................: 2457.0- Avg. Page Density (full).....................: 69.64%DBCC execution completed. If DBCC printed error messages, contact your system administrator.What's important here?The Scan Density; Ideally it should be 100%. As time goes by it drops as fragmentation occurs. When the level drops below 75%, you should consider re-indexing.Here are the results of the same table and clustered index after running the script:DBCC SHOWCONTIG scanning 'Tickets' table...Table: 'Tickets' (1829581556); index ID: 1, database ID: 13TABLE level scan performed.- Pages Scanned................................: 692- Extents Scanned..............................: 87- Extent Switches..............................: 86- Avg. Pages per Extent........................: 8.0- Scan Density [Best Count:Actual Count].......: 100.00% [87:87]- Logical Scan Fragmentation ..................: 0.00%- Extent Scan Fragmentation ...................: 22.99%- Avg. Bytes Free per Page.....................: 639.8- Avg. Page Density (full).....................: 92.10%DBCC execution completed. If DBCC printed error messages, contact your system administrator.What's different?The Scan Density has increased from 40.78% to 100%; no fragmentation on the clustered index. Note that since we rebuilt the clustered index, all other index were also rebuilt.

    Read the article

  • SQL Server 2008: FileStream Insertion Failure w/ .NET 3.5SP1

    - by James Alexander
    I've configured a db w/ a FileStream group and have a table w/ File type on it. When attempting to insert a streamed file and after I create the table row, my query to read the filepath out and the buffer returns a null file path. I can't seem to figure out why though. Here is the table creation script: /****** Object: Table [dbo].[JobInstanceFile] Script Date: 03/22/2010 18:05:36 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO SET ANSI_PADDING ON GO CREATE TABLE [dbo].[JobInstanceFile]( [JobInstanceFileId] [int] IDENTITY(1,1) NOT NULL, [JobInstanceId] [int] NOT NULL, [File] [varbinary](max) FILESTREAM NULL, [FileId] [uniqueidentifier] ROWGUIDCOL NOT NULL, [Created] [datetime] NOT NULL, CONSTRAINT [PK_JobInstanceFile] PRIMARY KEY CLUSTERED ( [JobInstanceFileId] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] FILESTREAM_ON [JobInstanceFilesGroup], UNIQUE NONCLUSTERED ( [FileId] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] FILESTREAM_ON [JobInstanceFilesGroup] GO SET ANSI_PADDING OFF GO ALTER TABLE [dbo].[JobInstanceFile] ADD DEFAULT (newid()) FOR [FileId] GO Here's my proc I call to create the row before streaming the file: /****** Object: StoredProcedure [dbo].[JobInstanceFileCreate] Script Date: 03/22/2010 18:06:23 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO create proc [dbo].[JobInstanceFileCreate] @JobInstanceId int, @Created datetime as insert into JobInstanceFile (JobInstanceId, FileId, Created) values (@JobInstanceId, newid(), @Created) select scope_identity() GO And lastly, here's the code I'm using: public int CreateJobInstanceFile(int jobInstanceId, string filePath) { using (var connection = new SqlConnection(ConfigurationManager.ConnectionStrings["ConsumerMarketingStoreFiles"].ConnectionString)) using (var fileStream = new FileStream(filePath, FileMode.Open)) { connection.Open(); var tran = connection.BeginTransaction(IsolationLevel.ReadCommitted); try { //create the JobInstanceFile instance var command = new SqlCommand("JobInstanceFileCreate", connection) { Transaction = tran }; command.CommandType = CommandType.StoredProcedure; command.Parameters.AddWithValue("@JobInstanceId", jobInstanceId); command.Parameters.AddWithValue("@Created", DateTime.Now); int jobInstanceFileId = Convert.ToInt32(command.ExecuteScalar()); //read out the filestream transaction context to stream the file for storage command.CommandText = "select [File].PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT() from JobInstanceFile where JobInstanceFileId = @JobInstanceFileId"; command.CommandType = CommandType.Text; command.Parameters.AddWithValue("@JobInstanceFileId", jobInstanceFileId); using (SqlDataReader dr = command.ExecuteReader()) { dr.Read(); //get the file path we're writing out to string writePath = dr.GetString(0); using (var writeStream = new SqlFileStream(writePath, (byte[])dr.GetValue(1), FileAccess.ReadWrite)) { //copy from one stream to another byte[] bytes = new byte[65536]; int numBytes; while ((numBytes = fileStream.Read(bytes, 0, 65536)) 0) writeStream.Write(bytes, 0, numBytes); } } tran.Commit(); return jobInstanceFileId; } catch (Exception e) { tran.Rollback(); throw e; } } } Can someone please let me know what I'm doing wrong. In the code, the following expression is returning null for the file path and shouldn't be: //get the file path we're writing out to string writePath = dr.GetString(0); The server is different then the computer the code is running on but the necessary shares appear to be in order and I have also run the following: EXEC sp_configure filestream_access_level, 2 Any help would be greatly appreciated. Thanks!

    Read the article

  • Is READ UNCOMMITTED / NOLOCK safe in this situation?

    - by Ben Challenor
    I know that snapshot isolation would fix this problem, but I'm wondering if NOLOCK is safe in this specific case so that I can avoid the overhead. I have a table that looks something like this: drop table Data create table Data ( Id BIGINT NOT NULL, Date BIGINT NOT NULL, Value BIGINT, constraint Cx primary key (Date, Id) ) create nonclustered index Ix on Data (Id, Date) There are no updates to the table, ever. Deletes can occur but they should never contend with the SELECT because they affect the other, older end of the table. Inserts are regular and page splits to the (Id, Date) index are extremely common. I have a deadlock situation between a standard INSERT and a SELECT that looks like this: select top 1 Date, Value from Data where Id = @p0 order by Date desc because the INSERT acquires a lock on Cx (Date, Id; Value) and then Ix (Id, Date), but the SELECT acquires a lock on Ix (Id, Date) and then Cx (Date, Id; Value). This is because the SELECT first seeks on Ix and then joins to a seek on Cx. Swapping the clustered and non-clustered index would break this cycle, but it is not an acceptable solution because it would introduce cycles with other (more complex) SELECTs. If I add NOLOCK to the SELECT, can it go wrong in this case? Can it return: More than one row, even though I asked for TOP 1? No rows, even though one exists and has been committed? Worst of all, a row that doesn't satisfy the WHERE clause? I've done a lot of reading about this online, but the only reproductions of over- or under-count anomalies I've seen (one, two) involve a scan. This involves only seeks. Jeff Atwood has a post about using NOLOCK that generated a good discussion. I was particularly interested in a comment by Rick Townsend: Secondly, if you read dirty data, the risk you run is of reading the entirely wrong row. For example, if your select reads an index to find your row, then the update changes the location of the rows (e.g.: due to a page split or an update to the clustered index), when your select goes to read the actual data row, it's either no longer there, or a different row altogether! Is this possible with inserts only, and no updates? If so, then I guess even my seeks on an insert-only table could be dangerous. Update: I'm trying to figure out how snapshot isolation works. It seems to be row-based, where transactions read the table (with no shared lock!), find the row they are interested in, and then see if they need to get an old version of the row from the version store in tempdb. But in my case, no row will have more than one version, so the version store seems rather pointless. And if the row was found with no shared lock, how is it different to just using NOLOCK?

    Read the article

  • Many-To-Many Query with Linq-To-NHibernate

    - by rjygraham
    Ok guys (and gals), this one has been driving me nuts all night and I'm turning to your collective wisdom for help. I'm using Fluent Nhibernate and Linq-To-NHibernate as my data access story and I have the following simplified DB structure: CREATE TABLE [dbo].[Classes]( [Id] [bigint] IDENTITY(1,1) NOT NULL, [Name] [nvarchar](100) NOT NULL, [StartDate] [datetime2](7) NOT NULL, [EndDate] [datetime2](7) NOT NULL, CONSTRAINT [PK_Classes] PRIMARY KEY CLUSTERED ( [Id] ASC ) CREATE TABLE [dbo].[Sections]( [Id] [bigint] IDENTITY(1,1) NOT NULL, [ClassId] [bigint] NOT NULL, [InternalCode] [varchar](10) NOT NULL, CONSTRAINT [PK_Sections] PRIMARY KEY CLUSTERED ( [Id] ASC ) CREATE TABLE [dbo].[SectionStudents]( [SectionId] [bigint] NOT NULL, [UserId] [uniqueidentifier] NOT NULL, CONSTRAINT [PK_SectionStudents] PRIMARY KEY CLUSTERED ( [SectionId] ASC, [UserId] ASC ) CREATE TABLE [dbo].[aspnet_Users]( [ApplicationId] [uniqueidentifier] NOT NULL, [UserId] [uniqueidentifier] NOT NULL, [UserName] [nvarchar](256) NOT NULL, [LoweredUserName] [nvarchar](256) NOT NULL, [MobileAlias] [nvarchar](16) NULL, [IsAnonymous] [bit] NOT NULL, [LastActivityDate] [datetime] NOT NULL, PRIMARY KEY NONCLUSTERED ( [UserId] ASC ) I omitted the foreign keys for brevity, but essentially this boils down to: A Class can have many Sections. A Section can belong to only 1 Class but can have many Students. A Student (aspnet_Users) can belong to many Sections. I've setup the corresponding Model classes and Fluent NHibernate Mapping classes, all that is working fine. Here's where I'm getting stuck. I need to write a query which will return the sections a student is enrolled in based on the student's UserId and the dates of the class. Here's what I've tried so far: 1. var sections = (from s in this.Session.Linq<Sections>() where s.Class.StartDate <= DateTime.UtcNow && s.Class.EndDate > DateTime.UtcNow && s.Students.First(f => f.UserId == userId) != null select s); 2. var sections = (from s in this.Session.Linq<Sections>() where s.Class.StartDate <= DateTime.UtcNow && s.Class.EndDate > DateTime.UtcNow && s.Students.Where(w => w.UserId == userId).FirstOrDefault().Id == userId select s); Obviously, 2 above will fail miserably if there are no students matching userId for classes the current date between it's start and end dates...but I just wanted to try. The filters for the Class StartDate and EndDate work fine, but the many-to-many relation with Students is proving to be difficult. Everytime I try running the query I get an ArgumentNullException with the message: Value cannot be null. Parameter name: session I've considered going down the path of making the SectionStudents relation a Model class with a reference to Section and a reference to Student instead of a many-to-many. I'd like to avoid that if I can, and I'm not even sure it would work that way. Thanks in advance to anyone who can help. Ryan

    Read the article

  • DBCC CHECKDB on VVLDB and latches (Or: My Pain is Your Gain)

    - by Argenis
      Does your CHECKDB hurt, Argenis? There is a classic blog series by Paul Randal [blog|twitter] called “CHECKDB From Every Angle” which is pretty much mandatory reading for anybody who’s even remotely considering going for the MCM certification, or its replacement (the Microsoft Certified Solutions Master: Data Platform – makes my fingers hurt just from typing it). Of particular interest is the post “Consistency Options for a VLDB” – on it, Paul provides solid, timeless advice (I use the word “timeless” because it was written in 2007, and it all applies today!) on how to perform checks on very large databases. Well, here I was trying to figure out how to make CHECKDB run faster on a restored copy of one of our databases, which happens to exceed 7TB in size. The whole thing was taking several days on multiple systems, regardless of the storage used – SAS, SATA or even SSD…and I actually didn’t pay much attention to how long it was taking, or even bothered to look at the reasons why - as long as it was finishing okay and found no consistency errors. Yes – I know. That was a huge mistake, as corruption found in a database several days after taking place could only allow for further spread of the corruption – and potentially large data loss. In the last two weeks I increased my attention towards this problem, as we noticed that CHECKDB was taking EVEN LONGER on brand new all-flash storage in the SAN! I couldn’t really explain it, and were almost ready to blame the storage vendor. The vendor told us that they could initially see the server driving decent I/O – around 450Mb/sec, and then it would settle at a very slow rate of 10Mb/sec or so. “Hum”, I thought – “CHECKDB is just not pushing the I/O subsystem hard enough”. Perfmon confirmed the vendor’s observations. Dreaded @BlobEater What was CHECKDB doing all the time while doing so little I/O? Eating Blobs. It turns out that CHECKDB was taking an extremely long time on one of our frankentables, which happens to be have 35 billion rows (yup, with a b) and sucks up several terabytes of space in the database. We do have a project ongoing to purge/split/partition this table, so it’s just a matter of time before we deal with it. But the reality today is that CHECKDB is coming to a screeching halt in performance when dealing with this particular table. Checking sys.dm_os_waiting_tasks and sys.dm_os_latch_stats showed that LATCH_EX (DBCC_OBJECT_METADATA) was by far the top wait type. I remembered hearing recently about that wait from another post that Paul Randal made, but that was related to computed-column indexes, and in fact, Paul himself reminded me of his article via twitter. But alas, our pathologic table had no non-clustered indexes on computed columns. I knew that latches are used by the database engine to do internal synchronization – but how could I help speed this up? After all, this is stuff that doesn’t have a lot of knobs to tweak. (There’s a fantastic level 500 talk by Bob Ward from Microsoft CSS [blog|twitter] called “Inside SQL Server Latches” given at PASS 2010 – and you can check it out here. DISCLAIMER: I assume no responsibility for any brain melting that might ensue from watching Bob’s talk!) Failed Hypotheses Earlier on this week I flew down to Palo Alto, CA, to visit our Headquarters – and after having a great time with my Monkey peers, I was relaxing on the plane back to Seattle watching a great talk by SQL Server MVP and fellow MCM Maciej Pilecki [twitter] called “Masterclass: A Day in the Life of a Database Transaction” where he discusses many different topics related to transaction management inside SQL Server. Very good stuff, and when I got home it was a little late – that slow DBCC CHECKDB that I had been dealing with was way in the back of my head. As I was looking at the problem at hand earlier on this week, I thought “How about I set the database to read-only?” I remembered one of the things Maciej had (jokingly) said in his talk: “if you don’t want locking and blocking, set the database to read-only” (or something to that effect, pardon my loose memory). I immediately killed the CHECKDB which had been running painfully for days, and set the database to read-only mode. Then I ran DBCC CHECKDB against it. It started going really fast (even a bit faster than before), and then throttled down again to around 10Mb/sec. All sorts of expletives went through my head at the time. Sure enough, the same latching scenario was present. Oh well. I even spent some time trying to figure out if NUMA was hurting performance. Folks on Twitter made suggestions in this regard (thanks, Lonny! [twitter]) …Eureka? This past Friday I was still scratching my head about the whole thing; I was ready to start profiling with XPERF to see if I could figure out which part of the engine was to blame and then get Microsoft to look at the evidence. After getting a bunch of good news I’ll blog about separately, I sat down for a figurative smack down with CHECKDB before the weekend. And then the light bulb went on. A sparse column. I thought that I couldn’t possibly be experiencing the same scenario that Paul blogged about back in March showing extreme latching with non-clustered indexes on computed columns. Did I even have a non-clustered index on my sparse column? As it turns out, I did. I had one filtered non-clustered index – with the sparse column as the index key (and only column). To prove that this was the problem, I went and setup a test. Yup, that'll do it The repro is very simple for this issue: I tested it on the latest public builds of SQL Server 2008 R2 SP2 (CU6) and SQL Server 2012 SP1 (CU4). First, create a test database and a test table, which only needs to contain a sparse column: CREATE DATABASE SparseColTest; GO USE SparseColTest; GO CREATE TABLE testTable (testCol smalldatetime SPARSE NULL); GO INSERT INTO testTable (testCol) VALUES (NULL); GO 1000000 That’s 1 million rows, and even though you’re inserting NULLs, that’s going to take a while. In my laptop, it took 3 minutes and 31 seconds. Next, we run DBCC CHECKDB against the database: DBCC CHECKDB('SparseColTest') WITH NO_INFOMSGS, ALL_ERRORMSGS; This runs extremely fast, as least on my test rig – 198 milliseconds. Now let’s create a filtered non-clustered index on the sparse column: CREATE NONCLUSTERED INDEX [badBadIndex] ON testTable (testCol) WHERE testCol IS NOT NULL; With the index in place now, let’s run DBCC CHECKDB one more time: DBCC CHECKDB('SparseColTest') WITH NO_INFOMSGS, ALL_ERRORMSGS; In my test system this statement completed in 11433 milliseconds. 11.43 full seconds. Quite the jump from 198 milliseconds. I went ahead and dropped the filtered non-clustered indexes on the restored copy of our production database, and ran CHECKDB against that. We went down from 7+ days to 19 hours and 20 minutes. Cue the “Argenis is not impressed” meme, please, Mr. LaRock. My pain is your gain, folks. Go check to see if you have any of such indexes – they’re likely causing your consistency checks to run very, very slow. Happy CHECKDBing, -Argenis ps: I plan to file a Connect item for this issue – I consider it a pretty serious bug in the engine. After all, filtered indexes were invented BECAUSE of the sparse column feature – and it makes a lot of sense to use them together. Watch this space and my twitter timeline for a link.

    Read the article

  • Columnstore Case Study #2: Columnstore faster than SSAS Cube at DevCon Security

    - by aspiringgeek
    Preamble This is the second in a series of posts documenting big wins encountered using columnstore indexes in SQL Server 2012 & 2014.  Many of these can be found in my big deck along with details such as internals, best practices, caveats, etc.  The purpose of sharing the case studies in this context is to provide an easy-to-consume quick-reference alternative. See also Columnstore Case Study #1: MSIT SONAR Aggregations Why Columnstore? As stated previously, If we’re looking for a subset of columns from one or a few rows, given the right indexes, SQL Server can do a superlative job of providing an answer. If we’re asking a question which by design needs to hit lots of rows—DW, reporting, aggregations, grouping, scans, etc., SQL Server has never had a good mechanism—until columnstore. Columnstore indexes were introduced in SQL Server 2012. However, they're still largely unknown. Some adoption blockers existed; yet columnstore was nonetheless a game changer for many apps.  In SQL Server 2014, potential blockers have been largely removed & they're going to profoundly change the way we interact with our data.  The purpose of this series is to share the performance benefits of columnstore & documenting columnstore is a compelling reason to upgrade to SQL Server 2014. The Customer DevCon Security provides home & business security services & has been in business for 135 years. I met DevCon personnel while speaking to the Utah County SQL User Group on 20 February 2012. (Thanks to TJ Belt (b|@tjaybelt) & Ben Miller (b|@DBADuck) for the invitation which serendipitously coincided with the height of ski season.) The App: DevCon Security Reporting: Optimized & Ad Hoc Queries DevCon users interrogate a SQL Server 2012 Analysis Services cube via SSRS. In addition, the SQL Server 2012 relational back end is the target of ad hoc queries; this DW back end is refreshed nightly during a brief maintenance window via conventional table partition switching. SSRS, SSAS, & MDX Conventional relational structures were unable to provide adequate performance for user interaction for the SSRS reports. An SSAS solution was implemented requiring personnel to ramp up technically, including learning enough MDX to satisfy requirements. Ad Hoc Queries Even though the fact table is relatively small—only 22 million rows & 33GB—the table was a typical DW table in terms of its width: 137 columns, any of which could be the target of ad hoc interrogation. As is common in DW reporting scenarios such as this, it is often nearly to optimize for such queries using conventional indexing. DevCon DBAs & developers attended PASS 2012 & were introduced to the marvels of columnstore in a session presented by Klaus Aschenbrenner (b|@Aschenbrenner) The Details Classic vs. columnstore before-&-after metrics are impressive. Scenario   Conventional Structures   Columnstore   Δ SSRS via SSAS 10 - 12 seconds 1 second >10x Ad Hoc 5-7 minutes (300 - 420 seconds) 1 - 2 seconds >100x Here are two charts characterizing this data graphically.  The first is a linear representation of Report Duration (in seconds) for Conventional Structures vs. Columnstore Indexes.  As is so often the case when we chart such significant deltas, the linear scale doesn’t expose some the dramatically improved values corresponding to the columnstore metrics.  Just to make it fair here’s the same data represented logarithmically; yet even here the values corresponding to 1 –2 seconds aren’t visible.  The Wins Performance: Even prior to columnstore implementation, at 10 - 12 seconds canned report performance against the SSAS cube was tolerable. Yet the 1 second performance afterward is clearly better. As significant as that is, imagine the user experience re: ad hoc interrogation. The difference between several minutes vs. one or two seconds is a game changer, literally changing the way users interact with their data—no mental context switching, no wondering when the results will appear, no preoccupation with the spinning mind-numbing hurry-up-&-wait indicators.  As we’ve commonly found elsewhere, columnstore indexes here provided performance improvements of one, two, or more orders of magnitude. Simplified Infrastructure: Because in this case a nonclustered columnstore index on a conventional DW table was faster than an Analysis Services cube, the entire SSAS infrastructure was rendered superfluous & was retired. PASS Rocks: Once again, the value of attending PASS is proven out. The trip to Charlotte combined with eager & enquiring minds let directly to this success story. Find out more about the next PASS Summit here, hosted this year in Seattle on November 4 - 7, 2014. DevCon BI Team Lead Nathan Allan provided this unsolicited feedback: “What we found was pretty awesome. It has been a game changer for us in terms of the flexibility we can offer people that would like to get to the data in different ways.” Summary For DW, reports, & other BI workloads, columnstore often provides significant performance enhancements relative to conventional indexing.  I have documented here, the second in a series of reports on columnstore implementations, results from DevCon Security, a live customer production app for which performance increased by factors of from 10x to 100x for all report queries, including canned queries as well as reducing time for results for ad hoc queries from 5 - 7 minutes to 1 - 2 seconds. As a result of columnstore performance, the customer retired their SSAS infrastructure. I invite you to consider leveraging columnstore in your own environment. Let me know if you have any questions.

    Read the article

  • SQL SERVER – Weekly Series – Memory Lane – #031

    - by Pinal Dave
    Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Find Table without Clustered Index – Find Table with no Primary Key Clustered index is very important concept for any table. They impact the performance very heavily. Here is a quick script to find tables without a clustered index. Replace TEXT with VARCHAR(MAX) – Stop using TEXT, NTEXT, IMAGE Data Types Question: “Is VARCHAR (MAX) big enough to store the TEXT field?” Answer: “Yes, VARCHAR(MAX) is big enough to accommodate TEXT field. TEXT, NTEXT and IMAGE data types of SQL Server 2000 will be deprecated in a future version of SQL Server, SQL Server 2005 provides backward compatibility to data types but it is recommended to use new data types which are VARHCAR (MAX), NVARCHAR (MAX) and VARBINARY (MAX).” Limiting Result Sets by Using TABLESAMPLE – Examples Introduced in SQL Server 2005, TABLESAMPLE allows you to extract a sampling of rows from a table in the FROM clause. The rows retrieved are random and they are are not in any order. This sampling can be based on a percentage of number of rows. You can use TABLESAMPLE when only a sampling of rows is necessary for the application instead of a full result set. User Defined Functions (UDF) Limitations UDF have its own advantage and usage but in this article we will see the limitation of UDF. Things UDF can not do and why Stored Procedure are considered as more flexible then UDFs. Stored Procedure are more flexibility then User Defined Functions(UDF). However, this blog post is a good read to know what are the limitations of UDF. Change Database Compatible Level – Backward Compatibility For a long time SQL Server stayed on the compatibility level of 80 which is of SQL Server 2000. However, as soon as SQL Server 2005 introduced the issue of compatibility was quite a major issue. Since that time MS has been releasing the versions at every 2-3 years, changing compatibility is a ever popular topic. In this blog post, we learn how we can do the same using T-SQL. We can also do the same using SSMS and here is the blog post for the same: Change Database Compatible Level – Backward Compatibility – Part 2 – Management Studio. Constraint on VARCHAR(MAX) Field To Limit It Certain Length How can I limit the VARCHAR(MAX) field with maximum length of 12500 characters only. His Question was valid as our application was allowed 12500 characters. First of all – this requirement is bit strange but if someone wants to do the same, they can do it as described in this blog post. 2008 UNPIVOT Table Example Understanding UNPIVOT can be very complicated at times. In this blog post, I have attempted to explain the same concept in very simple words. Create Default Constraint Over Table Column A simple straight to script blog post – I still use this blog quite many times for my own reference. UDF – Get the Day of the Week Function It took me 4 iteration to find this very simple function which can immediately get the day of the week in a single line. 2009 Find Hostname and Current Logged In User Name There are two tricks listed in this blog post where users can find out the hostname and current logged user name immediately and very easily. Interesting Observation of Logon Trigger On All Servers When I was doing a project, I made an interesting observation of executing a logon trigger multiple times. It was absolutely unexpected for me! As I was logging only once, naturally, I was expecting the entry only once. However, it did it multiple times on different threads – indeed an eccentric phenomenon at first sight! Difference Between Candidate Keys and Primary Key One needs to be very careful in selecting the Primary Key as an incorrect selection can adversely impact the database architect and future normalization. For a Candidate Key to qualify as a Primary Key, it should be Non-NULL and unique in any domain. I have observed quite often that Primary Keys are seldom changed. I would like to have your feedback on not changing a Primary Key. Create Multiple Filegroup For Single Database Why should one create multiple file group for any database and what are the advantages of the same. In this blog post, I explain the same in detail. List All Objects Created on All Filegroups in Database In this blog post we discuss the essential question – “How can I find which object belongs to which filegroup. Is there any way to know this?” 2010 DATE and TIME in SQL Server 2008 When DATE is converted to DATETIME it adds the of midnight. When TIME is converted to DATETIME it adds the date of 1900 and it is something one wants to consider if you are going to run scripts from SQL Server 2008 to earlier version with CONVERT. Disabled Index and Update Statistics If you do not need a nonclustered index, I suggest you to drop it as keeping them disabled is an overhead on your system. This is because every time the statistics are updated for system all the statistics for disabled indexes are also updated. Precision of SMALLDATETIME – A 1 Minute Precision The precision of the datatype SMALLDATETIME is 1 minute. It discards the seconds by rounding up or rounding down any seconds greater than zero. 2011 Getting Columns Headers without Result Data – SET FMTONLY ON SET FMTONLY ON returns only metadata to the client. It can be used to test the format of the response without actually running the query. When this setting is ON the resultset only have headers of the results but no data. Copy Database from Instance to Another Instance – Copy Paste in SQL Server SQL Server has a feature which copy database from one database to another database and it can be automated as well using SSIS. Make sure you have SQL Server Agent Turned on as this feature will create a job. Puzzle – SELECT * vs SELECT COUNT(*) If you have ever wondered SELECT * gives error when executed alone but SELECT COUNT(*) does not. Why? in that case, you should read this blog post. Creating All New Database with Full Recovery Model This blog post is very based on very interesting story where the user wants to do something by default for every single new database created. Model database is a secret weapon which should be used very carefully and with proper evalution. If used carefully this can be a very much beneficiary when we need a newly created database behave in certain fashion. 2012 In year 2012 I had two interesting series ran on the blog. If there is no fun in learning, the learning becomes a burden. For the same reason, I had decided to build a three part quiz around SEQUENCE. The quiz was to identify the next value of the sequence. I encourage all of you to take part in this fun quiz. Guess the Next Value – Puzzle 1 Guess the Next Value – Puzzle 2 Guess the Next Value – Puzzle 3 Can anyone remember their final day of schooling?  This is probably a silly question because – of course you can!  Many people mark this as the most exciting, happiest day of their life.  It marks the end of testing, the end of following rules set by teachers, and the beginning of finally being able to earn money and work in your chosen field. Read five part series on developer training subject Developer Training - Importance and Significance - Part 1 Developer Training – Employee Morals and Ethics – Part 2 Developer Training – Difficult Questions and Alternative Perspective - Part 3 Developer Training – Various Options for Developer Training – Part 4 Developer Training – A Conclusive Summary- Part 5 Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Spooling in SQL execution plans

    - by Rob Farley
    Sewing has never been my thing. I barely even know the terminology, and when discussing this with American friends, I even found out that half the words that Americans use are different to the words that English and Australian people use. That said – let’s talk about spools! In particular, the Spool operators that you find in some SQL execution plans. This post is for T-SQL Tuesday, hosted this month by me! I’ve chosen to write about spools because they seem to get a bad rap (even in my song I used the line “There’s spooling from a CTE, they’ve got recursion needlessly”). I figured it was worth covering some of what spools are about, and hopefully explain why they are remarkably necessary, and generally very useful. If you have a look at the Books Online page about Plan Operators, at http://msdn.microsoft.com/en-us/library/ms191158.aspx, and do a search for the word ‘spool’, you’ll notice it says there are 46 matches. 46! Yeah, that’s what I thought too... Spooling is mentioned in several operators: Eager Spool, Lazy Spool, Index Spool (sometimes called a Nonclustered Index Spool), Row Count Spool, Spool, Table Spool, and Window Spool (oh, and Cache, which is a special kind of spool for a single row, but as it isn’t used in SQL 2012, I won’t describe it any further here). Spool, Table Spool, Index Spool, Window Spool and Row Count Spool are all physical operators, whereas Eager Spool and Lazy Spool are logical operators, describing the way that the other spools work. For example, you might see a Table Spool which is either Eager or Lazy. A Window Spool can actually act as both, as I’ll mention in a moment. In sewing, cotton is put onto a spool to make it more useful. You might buy it in bulk on a cone, but if you’re going to be using a sewing machine, then you quite probably want to have it on a spool or bobbin, which allows it to be used in a more effective way. This is the picture that I want you to think about in relation to your data. I’m sure you use spools every time you use your sewing machine. I know I do. I can’t think of a time when I’ve got out my sewing machine to do some sewing and haven’t used a spool. However, I often run SQL queries that don’t use spools. You see, the data that is consumed by my query is typically in a useful state without a spool. It’s like I can just sew with my cotton despite it not being on a spool! Many of my favourite features in T-SQL do like to use spools though. This looks like a very similar query to before, but includes an OVER clause to return a column telling me the number of rows in my data set. I’ll describe what’s going on in a few paragraphs’ time. So what does a Spool operator actually do? The spool operator consumes a set of data, and stores it in a temporary structure, in the tempdb database. This structure is typically either a Table (ie, a heap), or an Index (ie, a b-tree). If no data is actually needed from it, then it could also be a Row Count spool, which only stores the number of rows that the spool operator consumes. A Window Spool is another option if the data being consumed is tightly linked to windows of data, such as when the ROWS/RANGE clause of the OVER clause is being used. You could maybe think about the type of spool being like whether the cotton is going onto a small bobbin to fit in the base of the sewing machine, or whether it’s a larger spool for the top. A Table or Index Spool is either Eager or Lazy in nature. Eager and Lazy are Logical operators, which talk more about the behaviour, rather than the physical operation. If I’m sewing, I can either be all enthusiastic and get all my cotton onto the spool before I start, or I can do it as I need it. “Lazy” might not the be the best word to describe a person – in the SQL world it describes the idea of either fetching all the rows to build up the whole spool when the operator is called (Eager), or populating the spool only as it’s needed (Lazy). Window Spools are both physical and logical. They’re eager on a per-window basis, but lazy between windows. And when is it needed? The way I see it, spools are needed for two reasons. 1 – When data is going to be needed AGAIN. 2 – When data needs to be kept away from the original source. If you’re someone that writes long stored procedures, you are probably quite aware of the second scenario. I see plenty of stored procedures being written this way – where the query writer populates a temporary table, so that they can make updates to it without risking the original table. SQL does this too. Imagine I’m updating my contact list, and some of my changes move data to later in the book. If I’m not careful, I might update the same row a second time (or even enter an infinite loop, updating it over and over). A spool can make sure that I don’t, by using a copy of the data. This problem is known as the Halloween Effect (not because it’s spooky, but because it was discovered in late October one year). As I’m sure you can imagine, the kind of spool you’d need to protect against the Halloween Effect would be eager, because if you’re only handling one row at a time, then you’re not providing the protection... An eager spool will block the flow of data, waiting until it has fetched all the data before serving it up to the operator that called it. In the query below I’m forcing the Query Optimizer to use an index which would be upset if the Name column values got changed, and we see that before any data is fetched, a spool is created to load the data into. This doesn’t stop the index being maintained, but it does mean that the index is protected from the changes that are being done. There are plenty of times, though, when you need data repeatedly. Consider the query I put above. A simple join, but then counting the number of rows that came through. The way that this has executed (be it ideal or not), is to ask that a Table Spool be populated. That’s the Table Spool operator on the top row. That spool can produce the same set of rows repeatedly. This is the behaviour that we see in the bottom half of the plan. In the bottom half of the plan, we see that the a join is being done between the rows that are being sourced from the spool – one being aggregated and one not – producing the columns that we need for the query. Table v Index When considering whether to use a Table Spool or an Index Spool, the question that the Query Optimizer needs to answer is whether there is sufficient benefit to storing the data in a b-tree. The idea of having data in indexes is great, but of course there is a cost to maintaining them. Here we’re creating a temporary structure for data, and there is a cost associated with populating each row into its correct position according to a b-tree, as opposed to simply adding it to the end of the list of rows in a heap. Using a b-tree could even result in page-splits as the b-tree is populated, so there had better be a reason to use that kind of structure. That all depends on how the data is going to be used in other parts of the plan. If you’ve ever thought that you could use a temporary index for a particular query, well this is it – and the Query Optimizer can do that if it thinks it’s worthwhile. It’s worth noting that just because a Spool is populated using an Index Spool, it can still be fetched using a Table Spool. The details about whether or not a Spool used as a source shows as a Table Spool or an Index Spool is more about whether a Seek predicate is used, rather than on the underlying structure. Recursive CTE I’ve already shown you an example of spooling when the OVER clause is used. You might see them being used whenever you have data that is needed multiple times, and CTEs are quite common here. With the definition of a set of data described in a CTE, if the query writer is leveraging this by referring to the CTE multiple times, and there’s no simplification to be leveraged, a spool could theoretically be used to avoid reapplying the CTE’s logic. Annoyingly, this doesn’t happen. Consider this query, which really looks like it’s using the same data twice. I’m creating a set of data (which is completely deterministic, by the way), and then joining it back to itself. There seems to be no reason why it shouldn’t use a spool for the set described by the CTE, but it doesn’t. On the other hand, if we don’t pull as many columns back, we might see a very different plan. You see, CTEs, like all sub-queries, are simplified out to figure out the best way of executing the whole query. My example is somewhat contrived, and although there are plenty of cases when it’s nice to give the Query Optimizer hints about how to execute queries, it usually doesn’t do a bad job, even without spooling (and you can always use a temporary table). When recursion is used, though, spooling should be expected. Consider what we’re asking for in a recursive CTE. We’re telling the system to construct a set of data using an initial query, and then use set as a source for another query, piping this back into the same set and back around. It’s very much a spool. The analogy of cotton is long gone here, as the idea of having a continual loop of cotton feeding onto a spool and off again doesn’t quite fit, but that’s what we have here. Data is being fed onto the spool, and getting pulled out a second time when the spool is used as a source. (This query is running on AdventureWorks, which has a ManagerID column in HumanResources.Employee, not AdventureWorks2012) The Index Spool operator is sucking rows into it – lazily. It has to be lazy, because at the start, there’s only one row to be had. However, as rows get populated onto the spool, the Table Spool operator on the right can return rows when asked, ending up with more rows (potentially) getting back onto the spool, ready for the next round. (The Assert operator is merely checking to see if we’ve reached the MAXRECURSION point – it vanishes if you use OPTION (MAXRECURSION 0), which you can try yourself if you like). Spools are useful. Don’t lose sight of that. Every time you use temporary tables or table variables in a stored procedure, you’re essentially doing the same – don’t get upset at the Query Optimizer for doing so, even if you think the spool looks like an expensive part of the query. I hope you’re enjoying this T-SQL Tuesday. Why not head over to my post that is hosting it this month to read about some other plan operators? At some point I’ll write a summary post – once I have you should find a comment below pointing at it. @rob_farley

    Read the article

  • Database Migration Scripts: Getting from place A to place B

    - by Phil Factor
    We’ll be looking at a typical database ‘migration’ script which uses an unusual technique to migrate existing ‘de-normalised’ data into a more correct form. So, the book-distribution business that uses the PUBS database has gradually grown organically, and has slipped into ‘de-normalisation’ habits. What’s this? A new column with a list of tags or ‘types’ assigned to books. Because books aren’t really in just one category, someone has ‘cured’ the mismatch between the database and the business requirements. This is fine, but it is now proving difficult for their new website that allows searches by tags. Any request for history book really has to look in the entire list of associated tags rather than the ‘Type’ field that only keeps the primary tag. We have other problems. The TypleList column has duplicates in there which will be affecting the reporting, and there is the danger of mis-spellings getting there. The reporting system can’t be persuaded to do reports based on the tags and the Database developers are complaining about the unCoddly things going on in their database. In your version of PUBS, this extra column doesn’t exist, so we’ve added it and put in 10,000 titles using SQL Data Generator. /* So how do we refactor this database? firstly, we create a table of all the tags. */IF  OBJECT_ID('TagName') IS NULL OR OBJECT_ID('TagTitle') IS NULL  BEGIN  CREATE TABLE  TagName (TagName_ID INT IDENTITY(1,1) PRIMARY KEY ,     Tag VARCHAR(20) NOT NULL UNIQUE)  /* ...and we insert into it all the tags from the list (remembering to take out any leading spaces */  INSERT INTO TagName (Tag)     SELECT DISTINCT LTRIM(x.y.value('.', 'Varchar(80)')) AS [Tag]     FROM     (SELECT  Title_ID,          CONVERT(XML, '<list><i>' + REPLACE(TypeList, ',', '</i><i>') + '</i></list>')          AS XMLkeywords          FROM   dbo.titles)g    CROSS APPLY XMLkeywords.nodes('/list/i/text()') AS x ( y )  /* we can then use this table to provide a table that relates tags to articles */  CREATE TABLE TagTitle   (TagTitle_ID INT IDENTITY(1, 1),   [title_id] [dbo].[tid] NOT NULL REFERENCES titles (Title_ID),   TagName_ID INT NOT NULL REFERENCES TagName (Tagname_ID)   CONSTRAINT [PK_TagTitle]       PRIMARY KEY CLUSTERED ([title_id] ASC, TagName_ID)       ON [PRIMARY])        CREATE NONCLUSTERED INDEX idxTagName_ID  ON  TagTitle (TagName_ID)  INCLUDE (TagTitle_ID,title_id)        /* ...and it is easy to fill this with the tags for each title ... */        INSERT INTO TagTitle (Title_ID, TagName_ID)    SELECT DISTINCT Title_ID, TagName_ID      FROM        (SELECT  Title_ID,          CONVERT(XML, '<list><i>' + REPLACE(TypeList, ',', '</i><i>') + '</i></list>')          AS XMLkeywords          FROM   dbo.titles)g    CROSS APPLY XMLkeywords.nodes('/list/i/text()') AS x ( y )    INNER JOIN TagName ON TagName.Tag=LTRIM(x.y.value('.', 'Varchar(80)'))    END    /* That's all there was to it. Now we can select all titles that have the military tag, just to try things out */SELECT Title FROM titles  INNER JOIN TagTitle ON titles.title_ID=TagTitle.Title_ID  INNER JOIN Tagname ON Tagname.TagName_ID=TagTitle.TagName_ID  WHERE tagname.tag='Military'/* and see the top ten most popular tags for titles */SELECT Tag, COUNT(*) FROM titles  INNER JOIN TagTitle ON titles.title_ID=TagTitle.Title_ID  INNER JOIN Tagname ON Tagname.TagName_ID=TagTitle.TagName_ID  GROUP BY Tag ORDER BY COUNT(*) DESC/* and if you still want your list of tags for each title, then here they are */SELECT title_ID, title, STUFF(  (SELECT ','+tagname.tag FROM titles thisTitle    INNER JOIN TagTitle ON titles.title_ID=TagTitle.Title_ID    INNER JOIN Tagname ON Tagname.TagName_ID=TagTitle.TagName_ID  WHERE ThisTitle.title_id=titles.title_ID  FOR XML PATH(''), TYPE).value('.', 'varchar(max)')  ,1,1,'')    FROM titles  ORDER BY title_ID So we’ve refactored our PUBS database without pain. We’ve even put in a check to prevent it being re-run once the new tables are created. Here is the diagram of the new tag relationship We’ve done both the DDL to create the tables and their associated components, and the DML to put the data in them. I could have also included the script to remove the de-normalised TypeList column, but I’d do a whole lot of tests first before doing that. Yes, I’ve left out the assertion tests too, which should check the edge cases and make sure the result is what you’d expect. One thing I can’t quite figure out is how to deal with an ordered list using this simple XML-based technique. We can ensure that, if we have to produce a list of tags, we can get the primary ‘type’ to be first in the list, but what if the entire order is significant? Thank goodness it isn’t in this case. If it were, we might have to revisit a string-splitter function that returns the ordinal position of each component in the sequence. You’ll see immediately that we can create a synchronisation script for deployment from a comparison tool such as SQL Compare, to change the schema (DDL). On the other hand, no tool could do the DML to stuff the data into the new table, since there is no way that any tool will be able to work out where the data should go. We used some pretty hairy code to deal with a slightly untypical problem. We would have to do this migration by hand, and it has to go into source control as a batch. If most of your database changes are to be deployed by an automated process, then there must be a way of over-riding this part of the data synchronisation process to do this part of the process taking the part of the script that fills the tables, Checking that the tables have not already been filled, and executing it as part of the transaction. Of course, you might prefer the approach I’ve taken with the script of creating the tables in the same batch as the data conversion process, and then using the presence of the tables to prevent the script from being re-run. The problem with scripting a refactoring change to a database is that it has to work both ways. If we install the new system and then have to rollback the changes, several books may have been added, or had their tags changed, in the meantime. Yes, you have to script any rollback! These have to be mercilessly tested, and put in source control just in case of the rollback of a deployment after it has been in place for any length of time. I’ve shown you how to do this with the part of the script .. /* and if you still want your list of tags for each title, then here they are */SELECT title_ID, title, STUFF(  (SELECT ','+tagname.tag FROM titles thisTitle    INNER JOIN TagTitle ON titles.title_ID=TagTitle.Title_ID    INNER JOIN Tagname ON Tagname.TagName_ID=TagTitle.TagName_ID  WHERE ThisTitle.title_id=titles.title_ID  FOR XML PATH(''), TYPE).value('.', 'varchar(max)')  ,1,1,'')    FROM titles  ORDER BY title_ID …which would be turned into an UPDATE … FROM script. UPDATE titles SET  typelist= ThisTaglistFROM     (SELECT title_ID, title, STUFF(    (SELECT ','+tagname.tag FROM titles thisTitle      INNER JOIN TagTitle ON titles.title_ID=TagTitle.Title_ID      INNER JOIN Tagname ON Tagname.TagName_ID=TagTitle.TagName_ID    WHERE ThisTitle.title_id=titles.title_ID    ORDER BY CASE WHEN tagname.tag=titles.[type] THEN 1 ELSE 0  END DESC    FOR XML PATH(''), TYPE).value('.', 'varchar(max)')    ,1,1,'')  AS ThisTagList  FROM titles)fINNER JOIN Titles ON f.title_ID=Titles.title_ID You’ll notice that it isn’t quite a round trip because the tags are in a different order, though we’ve managed to make sure that the primary tag is the first one as originally. So, we’ve improved the database for the poor book distributors using PUBS. It is not a major deal but you’ve got to be prepared to provide a migration script that will go both forwards and backwards. Ideally, database refactoring scripts should be able to go from any version to any other. Schema synchronization scripts can do this pretty easily, but no data synchronisation scripts can deal with serious refactoring jobs without the developers being able to specify how to deal with cases like this.

    Read the article

  • Performing Aggregate Functions on Multi-Million Row Tables

    - by Daniel Short
    I'm having some serious performance issues with a multi-million row table that I feel I should be able to get results from fairly quick. Here's a run down of what I have, how I'm querying it, and how long it's taking: I'm running SQL Server 2008 Standard, so Partitioning isn't currently an option I'm attempting to aggregate all views for all inventory for a specific account over the last 30 days. All views are stored in the following table: CREATE TABLE [dbo].[LogInvSearches_Daily]( [ID] [bigint] IDENTITY(1,1) NOT NULL, [Inv_ID] [int] NOT NULL, [Site_ID] [int] NOT NULL, [LogCount] [int] NOT NULL, [LogDay] [smalldatetime] NOT NULL, CONSTRAINT [PK_LogInvSearches_Daily] PRIMARY KEY CLUSTERED ( [ID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY] ) ON [PRIMARY] This table has 132,000,000 records, and is over 4 gigs. A sample of 10 rows from the table: ID Inv_ID Site_ID LogCount LogDay -------------------- ----------- ----------- ----------- ----------------------- 1 486752 48 14 2009-07-21 00:00:00 2 119314 51 16 2009-07-21 00:00:00 3 313678 48 25 2009-07-21 00:00:00 4 298863 0 1 2009-07-21 00:00:00 5 119996 0 2 2009-07-21 00:00:00 6 463777 534 7 2009-07-21 00:00:00 7 339976 503 2 2009-07-21 00:00:00 8 333501 570 4 2009-07-21 00:00:00 9 453955 0 12 2009-07-21 00:00:00 10 443291 0 4 2009-07-21 00:00:00 (10 row(s) affected) I have the following index on LogInvSearches_Daily: /****** Object: Index [IX_LogInvSearches_Daily_LogDay] Script Date: 05/12/2010 11:08:22 ******/ CREATE NONCLUSTERED INDEX [IX_LogInvSearches_Daily_LogDay] ON [dbo].[LogInvSearches_Daily] ( [LogDay] ASC ) INCLUDE ( [Inv_ID], [LogCount]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] I need to pull inventory only from the Inventory for a specific account id. I have an index on the Inventory as well. I'm using the following query to aggregate the data and give me the top 5 records. This query is currently taking 24 seconds to return the 5 rows: StmtText ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- SELECT TOP 5 Sum(LogCount) AS Views , DENSE_RANK() OVER(ORDER BY Sum(LogCount) DESC, Inv_ID DESC) AS Rank , Inv_ID FROM LogInvSearches_Daily D (NOLOCK) WHERE LogDay DateAdd(d, -30, getdate()) AND EXISTS( SELECT NULL FROM propertyControlCenter.dbo.Inventory (NOLOCK) WHERE Acct_ID = 18731 AND Inv_ID = D.Inv_ID ) GROUP BY Inv_ID (1 row(s) affected) StmtText ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |--Top(TOP EXPRESSION:((5))) |--Sequence Project(DEFINE:([Expr1007]=dense_rank)) |--Segment |--Segment |--Sort(ORDER BY:([Expr1006] DESC, [D].[Inv_ID] DESC)) |--Stream Aggregate(GROUP BY:([D].[Inv_ID]) DEFINE:([Expr1006]=SUM([LOALogs].[dbo].[LogInvSearches_Daily].[LogCount] as [D].[LogCount]))) |--Sort(ORDER BY:([D].[Inv_ID] ASC)) |--Nested Loops(Inner Join, OUTER REFERENCES:([D].[Inv_ID])) |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1011], [Expr1012], [Expr1010])) | |--Compute Scalar(DEFINE:(([Expr1011],[Expr1012],[Expr1010])=GetRangeWithMismatchedTypes(dateadd(day,(-30),getdate()),NULL,(6)))) | | |--Constant Scan | |--Index Seek(OBJECT:([LOALogs].[dbo].[LogInvSearches_Daily].[IX_LogInvSearches_Daily_LogDay] AS [D]), SEEK:([D].[LogDay] > [Expr1011] AND [D].[LogDay] < [Expr1012]) ORDERED FORWARD) |--Index Seek(OBJECT:([propertyControlCenter].[dbo].[Inventory].[IX_Inventory_Acct_ID]), SEEK:([propertyControlCenter].[dbo].[Inventory].[Acct_ID]=(18731) AND [propertyControlCenter].[dbo].[Inventory].[Inv_ID]=[LOA (13 row(s) affected) I tried using a CTE to pick up the rows first and aggregate them, but that didn't run any faster, and gives me essentially the same execution plan. (1 row(s) affected) StmtText ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- --SET SHOWPLAN_TEXT ON; WITH getSearches AS ( SELECT LogCount -- , DENSE_RANK() OVER(ORDER BY Sum(LogCount) DESC, Inv_ID DESC) AS Rank , D.Inv_ID FROM LogInvSearches_Daily D (NOLOCK) INNER JOIN propertyControlCenter.dbo.Inventory I (NOLOCK) ON Acct_ID = 18731 AND I.Inv_ID = D.Inv_ID WHERE LogDay DateAdd(d, -30, getdate()) -- GROUP BY Inv_ID ) SELECT Sum(LogCount) AS Views, Inv_ID FROM getSearches GROUP BY Inv_ID (1 row(s) affected) StmtText ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |--Stream Aggregate(GROUP BY:([D].[Inv_ID]) DEFINE:([Expr1004]=SUM([LOALogs].[dbo].[LogInvSearches_Daily].[LogCount] as [D].[LogCount]))) |--Sort(ORDER BY:([D].[Inv_ID] ASC)) |--Nested Loops(Inner Join, OUTER REFERENCES:([D].[Inv_ID])) |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1008], [Expr1009], [Expr1007])) | |--Compute Scalar(DEFINE:(([Expr1008],[Expr1009],[Expr1007])=GetRangeWithMismatchedTypes(dateadd(day,(-30),getdate()),NULL,(6)))) | | |--Constant Scan | |--Index Seek(OBJECT:([LOALogs].[dbo].[LogInvSearches_Daily].[IX_LogInvSearches_Daily_LogDay] AS [D]), SEEK:([D].[LogDay] > [Expr1008] AND [D].[LogDay] < [Expr1009]) ORDERED FORWARD) |--Index Seek(OBJECT:([propertyControlCenter].[dbo].[Inventory].[IX_Inventory_Acct_ID] AS [I]), SEEK:([I].[Acct_ID]=(18731) AND [I].[Inv_ID]=[LOALogs].[dbo].[LogInvSearches_Daily].[Inv_ID] as [D].[Inv_ID]) ORDERED FORWARD) (8 row(s) affected) (1 row(s) affected) So given that I'm getting good Index Seeks in my execution plan, what can I do to get this running faster? Thanks, Dan

    Read the article

< Previous Page | 1 2 3 4