Search Results

Search found 21433 results on 858 pages for 'query execution plans'.

Page 2/858 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • SQL SERVER – Merge Operations – Insert, Update, Delete in Single Execution

    - by pinaldave
    This blog post is written in response to T-SQL Tuesday hosted by Jorge Segarra (aka SQLChicken). I have been very active using these Merge operations in my development. However, I have found out from my consultancy work and friends that these amazing operations are not utilized by them most of the time. Here is my attempt to bring the necessity of using the Merge Operation to surface one more time. MERGE is a new feature that provides an efficient way to do multiple DML operations. In earlier versions of SQL Server, we had to write separate statements to INSERT, UPDATE, or DELETE data based on certain conditions; however, at present, by using the MERGE statement, we can include the logic of such data changes in one statement that even checks when the data is matched and then just update it, and similarly, when the data is unmatched, it is inserted. One of the most important advantages of MERGE statement is that the entire data are read and processed only once. In earlier versions, three different statements had to be written to process three different activities (INSERT, UPDATE or DELETE); however, by using MERGE statement, all the update activities can be done in one pass of database table. I have written about these Merge Operations earlier in my blog post over here SQL SERVER – 2008 – Introduction to Merge Statement – One Statement for INSERT, UPDATE, DELETE. I was asked by one of the readers that how do we know that this operator was doing everything in single pass and was not calling this Merge Operator multiple times. Let us run the same example which I have used earlier; I am listing the same here again for convenience. --Let’s create Student Details and StudentTotalMarks and inserted some records. USE tempdb GO CREATE TABLE StudentDetails ( StudentID INTEGER PRIMARY KEY, StudentName VARCHAR(15) ) GO INSERT INTO StudentDetails VALUES(1,'SMITH') INSERT INTO StudentDetails VALUES(2,'ALLEN') INSERT INTO StudentDetails VALUES(3,'JONES') INSERT INTO StudentDetails VALUES(4,'MARTIN') INSERT INTO StudentDetails VALUES(5,'JAMES') GO CREATE TABLE StudentTotalMarks ( StudentID INTEGER REFERENCES StudentDetails, StudentMarks INTEGER ) GO INSERT INTO StudentTotalMarks VALUES(1,230) INSERT INTO StudentTotalMarks VALUES(2,255) INSERT INTO StudentTotalMarks VALUES(3,200) GO -- Select from Table SELECT * FROM StudentDetails GO SELECT * FROM StudentTotalMarks GO -- Merge Statement MERGE StudentTotalMarks AS stm USING (SELECT StudentID,StudentName FROM StudentDetails) AS sd ON stm.StudentID = sd.StudentID WHEN MATCHED AND stm.StudentMarks > 250 THEN DELETE WHEN MATCHED THEN UPDATE SET stm.StudentMarks = stm.StudentMarks + 25 WHEN NOT MATCHED THEN INSERT(StudentID,StudentMarks) VALUES(sd.StudentID,25); GO -- Select from Table SELECT * FROM StudentDetails GO SELECT * FROM StudentTotalMarks GO -- Clean up DROP TABLE StudentDetails GO DROP TABLE StudentTotalMarks GO The Merge Join performs very well and the following result is obtained. Let us check the execution plan for the merge operator. You can click on following image to enlarge it. Let us evaluate the execution plan for the Table Merge Operator only. We can clearly see that the Number of Executions property suggests value 1. Which is quite clear that in a single PASS, the Merge Operation completes the operations of Insert, Update and Delete. I strongly suggest you all to use this operation, if possible, in your development. I have seen this operation implemented in many data warehousing applications. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Joins, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Merge

    Read the article

  • Cardinality Estimation Bug with Lookups in SQL Server 2008 onward

    - by Paul White
    Cost-based optimization stands or falls on the quality of cardinality estimates (expected row counts).  If the optimizer has incorrect information to start with, it is quite unlikely to produce good quality execution plans except by chance.  There are many ways we can provide good starting information to the optimizer, and even more ways for cardinality estimation to go wrong.  Good database people know this, and work hard to write optimizer-friendly queries with a schema and metadata (e.g. statistics) that reduce the chances of poor cardinality estimation producing a sub-optimal plan.  Today, I am going to look at a case where poor cardinality estimation is Microsoft’s fault, and not yours. SQL Server 2005 SELECT th.ProductID, th.TransactionID, th.TransactionDate FROM Production.TransactionHistory AS th WHERE th.ProductID = 1 AND th.TransactionDate BETWEEN '20030901' AND '20031231'; The query plan on SQL Server 2005 is as follows (if you are using a more recent version of AdventureWorks, you will need to change the year on the date range from 2003 to 2007): There is an Index Seek on ProductID = 1, followed by a Key Lookup to find the Transaction Date for each row, and finally a Filter to restrict the results to only those rows where Transaction Date falls in the range specified.  The cardinality estimate of 45 rows at the Index Seek is exactly correct.  The table is not very large, there are up-to-date statistics associated with the index, so this is as expected. The estimate for the Key Lookup is also exactly right.  Each lookup into the Clustered Index to find the Transaction Date is guaranteed to return exactly one row.  The plan shows that the Key Lookup is expected to be executed 45 times.  The estimate for the Inner Join output is also correct – 45 rows from the seek joining to one row each time, gives 45 rows as output. The Filter estimate is also very good: the optimizer estimates 16.9951 rows will match the specified range of transaction dates.  Eleven rows are produced by this query, but that small difference is quite normal and certainly nothing to worry about here.  All good so far. SQL Server 2008 onward The same query executed against an identical copy of AdventureWorks on SQL Server 2008 produces a different execution plan: The optimizer has pushed the Filter conditions seen in the 2005 plan down to the Key Lookup.  This is a good optimization – it makes sense to filter rows out as early as possible.  Unfortunately, it has made a bit of a mess of the cardinality estimates. The post-Filter estimate of 16.9951 rows seen in the 2005 plan has moved with the predicate on Transaction Date.  Instead of estimating one row, the plan now suggests that 16.9951 rows will be produced by each clustered index lookup – clearly not right!  This misinformation also confuses SQL Sentry Plan Explorer: Plan Explorer shows 765 rows expected from the Key Lookup (it multiplies a rounded estimate of 17 rows by 45 expected executions to give 765 rows total). Workarounds One workaround is to provide a covering non-clustered index (avoiding the lookup avoids the problem of course): CREATE INDEX nc1 ON Production.TransactionHistory (ProductID) INCLUDE (TransactionDate); With the Transaction Date filter applied as a residual predicate in the same operator as the seek, the estimate is again as expected: We could also force the use of the ultimate covering index (the clustered one): SELECT th.ProductID, th.TransactionID, th.TransactionDate FROM Production.TransactionHistory AS th WITH (INDEX(1)) WHERE th.ProductID = 1 AND th.TransactionDate BETWEEN '20030901' AND '20031231'; Summary Providing a covering non-clustered index for all possible queries is not always practical, and scanning the clustered index will rarely be optimal.  Nevertheless, these are the best workarounds we have today. In the meantime, watch out for poor cardinality estimates when a predicate is applied as part of a lookup. The worst thing is that the estimate after the lookup join in the 2008+ plans is wrong.  It’s not hopelessly wrong in this particular case (45 versus 16.9951 is not the end of the world) but it easily can be much worse, and there’s not much you can do about it.  Any decisions made by the optimizer after such a lookup could be based on very wrong information – which can only be bad news. If you think this situation should be improved, please vote for this Connect item. © 2012 Paul White – All Rights Reserved twitter: @SQL_Kiwi email: [email protected]

    Read the article

  • SQL – Quick Start with Explorer Sections of NuoDB – Query NuoDB Database

    - by Pinal Dave
    This is the third post in the series of the blog posts I am writing about NuoDB. NuoDB is very innovative and easy-to-use product. I can clearly see how one can scale-out NuoDB with so much ease and confidence. In my very first blog post we discussed how we can install NuoDB (link), and in my second post I discussed how we can manage the NuoDB database transaction engines and storage managers with a few clicks (link). Note: You can Download NuoDB from here. In this post, we will learn how we can use the Explorer feature of NuoDB to do various SQL operations. NuoDB has a browser-based Explorer, which is very powerful and has many of the features any IDE would normally have. Let us see how it works in the following step-by-step tutorial. Let us go to the NuoDBNuoDB Console by typing the following URL in your browser: http://localhost:8080/ It will bring you to the QuickStart screen. Make sure that you have created the sample database. If you have not created sample database, click on Create Database and create it successfully. Now go to the NuoDB Explorer by clicking on the main tab, and it will ask you for your domain username and password. Enter the username as a domain and password as a bird. Alternatively you can also enter username as a quickstart and password as a quickstart. Once you enter the password you will be able to see the databases. In our example we have installed the Sample Database hence you will see the Test database in our Database Hierarchy screen. When you click on database it will ask for the database login. Note that Database Login is different from Domain login and you will have to enter your database login over here. In our case the database username is dba and password is goalie. Once you enter a valid username and password it will display your database. Further expand your database and you will notice various objects in your database. Once you explore various objects, select any database and click on Open. When you click on execute, it will display the SQL script to select the data from the table. The autogenerated script displays entire result set from the database. The NuoDB Explorer is very powerful and makes the life of developers very easy. If you click on List SQL Statements it will list all the available SQL statements right away in Query Editor. You can see the popup window in following image. Here is the cool thing for geeks. You can even click on Query Plan and it will display the text based query plan as well. In case of a SELECT, the query plan will be much simpler, however, when we write complex queries it will be very interesting. We can use the query plan tab for performance tuning of the database. Here is another feature, when we click on List Tables in NuoDB Explorer.  It lists all the available tables in the query editor. This is very helpful when we are writing a long complex query. Here is a relatively complex example I have built using Inner Join syntax. Right below I have displayed the Query Plan. The query plan displays all the little details related to the query. Well, we just wrote multi-table query and executed it against the NuoDB database. You can use the NuoDB Admin section and do various analyses of the query and its performance. NuoDB is a distributed database built on a patented emergent architecture with full support for SQL and ACID guarantees.  It allows you to add Transaction Engine processes to a running system to improve the performance of your system.  You can also add a second Storage Engine to your running system for redundancy purposes.  Conversely, you can shut down processes when you don’t need the extra database resources. NuoDB also provides developers and administrators with a single intuitive interface for centrally monitoring deployments. If you have read my blog posts and have not tried out NuoDB, I strongly suggest that you download it today and catch up with the learnings with me. Trust me though the product is very powerful, it is extremely easy to learn and use. Reference: Pinal Dave (http://blog.sqlauthority.com)   Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

    Read the article

  • Mysql 100% CPU + Slow query

    - by felipeclopes
    I'm using the RDS database from amazon with a some very big tables, and yesterday I started to face 100% CPU utilisation on the server and a bunch of slow query logs that were not happening before. I tried to check the queries that were running and faced this result from the explain command +----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+ | 1 | SIMPLE | businesses | const | PRIMARY | PRIMARY | 4 | const | 1 | Using index; Using temporary; Using filesort | | 1 | SIMPLE | activities_businesses | ref | PRIMARY,index_activities_users_on_business_id,index_tweets_users_on_tweet_id_and_business_id | index_activities_users_on_business_id | 9 | const | 2252 | Using index condition; Using where | | 1 | SIMPLE | activities_b_taggings_975e9c4 | ref | taggings_idx | taggings_idx | 782 | const,myapp_production.activities_businesses.id,const | 1 | Using index condition; Using where | | 1 | SIMPLE | activities | eq_ref | PRIMARY,index_activities_on_created_at | PRIMARY | 8 | myapp_production.activities_businesses.activity_id | 1 | Using where | +----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+ Also checkin in the process list, I got something like this: +----+-----------------+-------------------------------------+----------------------------+---------+------+--------------+------------------------------------------------------------------------------------------------------+ | Id | User | Host | db | Command | Time | State | Info | +----+-----------------+-------------------------------------+----------------------------+---------+------+--------------+------------------------------------------------------------------------------------------------------+ | 1 | my_app | my_ip:57152 | my_app_production | Sleep | 0 | | NULL | | 2 | my_app | my_ip:57153 | my_app_production | Sleep | 2 | | NULL | | 3 | rdsadmin | localhost:49441 | NULL | Sleep | 9 | | NULL | | 6 | my_app | my_other_ip:47802 | my_app_production | Sleep | 242 | | NULL | | 7 | my_app | my_other_ip:47807 | my_app_production | Query | 231 | Sending data | SELECT my_fields... | | 8 | my_app | my_other_ip:47809 | my_app_production | Query | 231 | Sending data | SELECT my_fields... | | 9 | my_app | my_other_ip:47810 | my_app_production | Query | 231 | Sending data | SELECT my_fields... | | 10 | my_app | my_other_ip:47811 | my_app_production | Query | 231 | Sending data | SELECT my_fields... | | 11 | my_app | my_other_ip:47813 | my_app_production | Query | 231 | Sending data | SELECT my_fields... | ... So based on the numbers, it looks like there is no reason to have a slow query, since the worst execution plan is the one that goes through 2k rows which is not much. Edit 1 Another information that might be useful is the slow query_log SET timestamp=1401457485; SELECT my_query... # User@Host: myapp[myapp] @ ip-10-195-55-233.ec2.internal [IP] Id: 435 # Query_time: 95.830497 Lock_time: 0.000178 Rows_sent: 0 Rows_examined: 1129387 Edit 2 After profiling, I got this result. The result have approximately 250 rows with two columns each. +----------------------+----------+ | state | duration | +----------------------+----------+ | Sending data | 272 | | removing tmp table | 0 | | optimizing | 0 | | Creating sort index | 0 | | init | 0 | | cleaning up | 0 | | executing | 0 | | checking permissions | 0 | | freeing items | 0 | | Creating tmp table | 0 | | query end | 0 | | statistics | 0 | | end | 0 | | System lock | 0 | | Opening tables | 0 | | logging slow query | 0 | | Sorting result | 0 | | starting | 0 | | closing tables | 0 | | preparing | 0 | +----------------------+----------+ Edit 3 Adding query as requested SELECT activities.share_count, activities.created_at FROM `activities_businesses` INNER JOIN `businesses` ON `businesses`.`id` = `activities_businesses`.`business_id` INNER JOIN `activities` ON `activities`.`id` = `activities_businesses`.`activity_id` JOIN taggings activities_b_taggings_975e9c4 ON activities_b_taggings_975e9c4.taggable_id = activities_businesses.id AND activities_b_taggings_975e9c4.taggable_type = 'ActivitiesBusiness' AND activities_b_taggings_975e9c4.tag_id = 104 AND activities_b_taggings_975e9c4.created_at >= '2014-04-30 13:36:44' WHERE ( businesses.id = 1 ) AND ( activities.created_at > '2014-04-30 13:36:44' ) AND ( activities.created_at < '2014-05-30 12:27:03' ) ORDER BY activities.created_at; Edit 4 There may be a chance that the indexes are not being applied due to difference in column type between the taggings and the activities_businesses, on the taggable_id column. mysql> SHOW COLUMNS FROM activities_businesses; +-------------+------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-------------+------------+------+-----+---------+----------------+ | id | int(11) | NO | PRI | NULL | auto_increment | | activity_id | bigint(20) | YES | MUL | NULL | | | business_id | bigint(20) | YES | MUL | NULL | | +-------------+------------+------+-----+---------+----------------+ 3 rows in set (0.01 sec) mysql> SHOW COLUMNS FROM taggings; +---------------+--------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +---------------+--------------+------+-----+---------+----------------+ | id | int(11) | NO | PRI | NULL | auto_increment | | tag_id | int(11) | YES | MUL | NULL | | | taggable_id | bigint(20) | YES | | NULL | | | taggable_type | varchar(255) | YES | | NULL | | | tagger_id | int(11) | YES | | NULL | | | tagger_type | varchar(255) | YES | | NULL | | | context | varchar(128) | YES | | NULL | | | created_at | datetime | YES | | NULL | | +---------------+--------------+------+-----+---------+----------------+ So it is examining way more rows than it shows in the explain query, probably because some indexes are not being applied. Do you guys can help m with that?

    Read the article

  • Why Doesn’t Partition Elimination Work?

    - by Paul White
    Given a partitioned table and a simple SELECT query that compares the partitioning column to a single literal value, why does SQL Server read all the partitions when it seems obvious that only one partition needs to be examined? Sample Data The following script creates a table, partitioned on the char(3) column ‘Div’, and populates it with 100,000 rows of data: USE Sandpit; GO CREATE PARTITION FUNCTION PF ( char (3)) AS RANGE RIGHT FOR VALUES ( '1' , '2' , '3' , '4' , '5' , '6' , '7' , '8' , '9'...(read more)

    Read the article

  • SQLAuthority News – Statistics Used by the Query Optimizer in Microsoft SQL Server 2008 – Microsoft Whitepaper

    - by pinaldave
    I recently presented session on Statistics and Best Practices in Virtual Tech Days on Nov 22, 2010. The sessions was very popular and I got many questions right after the sessions. The number question I had received was where everybody can get the further information. I am very much happy that my sessions created some curiosity for one of the most important feature of the SQL Server. Statistics are the heart of the SQL Server. Microsoft has published a white paper on the subject how statistics are useful to Query Optimizer. Here is the abstract of the same white paper from Microsoft. Statistics Used by the Query Optimizer in Microsoft SQL Server 2008 Writer: Eric N. Hanson and Yavor Angelov Microsoft SQL Server 2008 collects statistical information about indexes and column data stored in the database. These statistics are used by the SQL Server query optimizer to choose the most efficient plan for retrieving or updating data. This paper describes what data is collected, where it is stored, and which commands create, update, and delete statistics. By default, SQL Server 2008 also creates and updates statistics automatically, when such an operation is considered to be useful. This paper also outlines how these defaults can be changed on different levels (column, table, and database). In addition, it presents how certain query language features, such as Transact-SQL variables, interact with use of statistics by the optimizer, and it provides guidance for using these features when writing queries so you can obtain good query performance. Link to white paper Statistics Used by the Query Optimizer in Microsoft SQL Server 2008 ?Reference: Pinal Dave (http://blog.SQLAuthority.com)   Filed under: Pinal Dave, SQL, SQL Authority, SQL Documentation, SQL Download, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL White Papers, SQLAuthority News, T SQL, Technology

    Read the article

  • Do a query only if there are no results on previous query

    - by yes123
    Hi guys: I do this query(1): (1)SELECT * FROM t1 WHERE title LIKE 'key%' LIMIT 1 I need to do a second(2) query only if this previous query has no results (2)SELECT * FROM t1 WHERE title LIKE '%key%' LIMIT 1 basically i need only 1 row who got the most close title to my key. Atm i am using an UNION query with a custom field to order it and a LIMIT 1. Problem is I don't want to do the others query if already the first made the result. Thanks

    Read the article

  • SQL SERVER – Monday Morning Puzzle – Query Returns Results Sometimes but Not Always

    - by pinaldave
    The amount of email I receive sometime it is impossible for me to answer every email. Nonetheless I try to answer pretty much every email I receive. However, quite often I receive such questions in email that I have no answer to them because either emails are not complete or they are out of my domain expertise. In recent times I received one email which had only one or two lines but indeed attracted my attention to it. The question was bit vague but it indeed made me think. The answer was not straightforward so I had to keep on writing the answer as I remember it. However, after writing the answer I do not feel satisfied. Let me put this question in front of you and see if we all can come up with a comprehensive answer. Question: I am beginner with SQL Server. I have one query, it sometime returns a result and sometime it does not return me the result. Where should I start looking for a solution and what kind of information I should send to you so you can help me with solving. I have no clue, please guide me. Well, if you read the question, it is indeed incomplete and it does not contain much of the information at all. I decided to help him and here is the answer, which I started to compose. Answer: As there are not much information in the original question, I am not confident what will solve your problem. However, here are the few things which you can try to look at and see if that solves your problem. Check parameter which is passed to the query. Is the parameter changing at various executions? Check connection string – is there some kind of logic around it? Do you have a non-deterministic component in your query logic? (In other words – does your result is based on current date time or any other time based function?) Are you facing time out while running your query? Is there any error in error log? What is the business logic in your query? Do you have all the valid permissions to all the objects used in the query? Are permissions changing or query accessing a different object in various executions? (Add your suggestions here) Meanwhile, have you ever faced this situation? If yes, do share your experience in the comment area. I will send a copy of my book SQL Server Interview Questions and Answers to one of the most interesting comment. The winner will be announced by next Monday.  Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Interview Questions and Answers, SQL Puzzle, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Spooling in SQL execution plans

    - by Rob Farley
    Sewing has never been my thing. I barely even know the terminology, and when discussing this with American friends, I even found out that half the words that Americans use are different to the words that English and Australian people use. That said – let’s talk about spools! In particular, the Spool operators that you find in some SQL execution plans. This post is for T-SQL Tuesday, hosted this month by me! I’ve chosen to write about spools because they seem to get a bad rap (even in my song I used the line “There’s spooling from a CTE, they’ve got recursion needlessly”). I figured it was worth covering some of what spools are about, and hopefully explain why they are remarkably necessary, and generally very useful. If you have a look at the Books Online page about Plan Operators, at http://msdn.microsoft.com/en-us/library/ms191158.aspx, and do a search for the word ‘spool’, you’ll notice it says there are 46 matches. 46! Yeah, that’s what I thought too... Spooling is mentioned in several operators: Eager Spool, Lazy Spool, Index Spool (sometimes called a Nonclustered Index Spool), Row Count Spool, Spool, Table Spool, and Window Spool (oh, and Cache, which is a special kind of spool for a single row, but as it isn’t used in SQL 2012, I won’t describe it any further here). Spool, Table Spool, Index Spool, Window Spool and Row Count Spool are all physical operators, whereas Eager Spool and Lazy Spool are logical operators, describing the way that the other spools work. For example, you might see a Table Spool which is either Eager or Lazy. A Window Spool can actually act as both, as I’ll mention in a moment. In sewing, cotton is put onto a spool to make it more useful. You might buy it in bulk on a cone, but if you’re going to be using a sewing machine, then you quite probably want to have it on a spool or bobbin, which allows it to be used in a more effective way. This is the picture that I want you to think about in relation to your data. I’m sure you use spools every time you use your sewing machine. I know I do. I can’t think of a time when I’ve got out my sewing machine to do some sewing and haven’t used a spool. However, I often run SQL queries that don’t use spools. You see, the data that is consumed by my query is typically in a useful state without a spool. It’s like I can just sew with my cotton despite it not being on a spool! Many of my favourite features in T-SQL do like to use spools though. This looks like a very similar query to before, but includes an OVER clause to return a column telling me the number of rows in my data set. I’ll describe what’s going on in a few paragraphs’ time. So what does a Spool operator actually do? The spool operator consumes a set of data, and stores it in a temporary structure, in the tempdb database. This structure is typically either a Table (ie, a heap), or an Index (ie, a b-tree). If no data is actually needed from it, then it could also be a Row Count spool, which only stores the number of rows that the spool operator consumes. A Window Spool is another option if the data being consumed is tightly linked to windows of data, such as when the ROWS/RANGE clause of the OVER clause is being used. You could maybe think about the type of spool being like whether the cotton is going onto a small bobbin to fit in the base of the sewing machine, or whether it’s a larger spool for the top. A Table or Index Spool is either Eager or Lazy in nature. Eager and Lazy are Logical operators, which talk more about the behaviour, rather than the physical operation. If I’m sewing, I can either be all enthusiastic and get all my cotton onto the spool before I start, or I can do it as I need it. “Lazy” might not the be the best word to describe a person – in the SQL world it describes the idea of either fetching all the rows to build up the whole spool when the operator is called (Eager), or populating the spool only as it’s needed (Lazy). Window Spools are both physical and logical. They’re eager on a per-window basis, but lazy between windows. And when is it needed? The way I see it, spools are needed for two reasons. 1 – When data is going to be needed AGAIN. 2 – When data needs to be kept away from the original source. If you’re someone that writes long stored procedures, you are probably quite aware of the second scenario. I see plenty of stored procedures being written this way – where the query writer populates a temporary table, so that they can make updates to it without risking the original table. SQL does this too. Imagine I’m updating my contact list, and some of my changes move data to later in the book. If I’m not careful, I might update the same row a second time (or even enter an infinite loop, updating it over and over). A spool can make sure that I don’t, by using a copy of the data. This problem is known as the Halloween Effect (not because it’s spooky, but because it was discovered in late October one year). As I’m sure you can imagine, the kind of spool you’d need to protect against the Halloween Effect would be eager, because if you’re only handling one row at a time, then you’re not providing the protection... An eager spool will block the flow of data, waiting until it has fetched all the data before serving it up to the operator that called it. In the query below I’m forcing the Query Optimizer to use an index which would be upset if the Name column values got changed, and we see that before any data is fetched, a spool is created to load the data into. This doesn’t stop the index being maintained, but it does mean that the index is protected from the changes that are being done. There are plenty of times, though, when you need data repeatedly. Consider the query I put above. A simple join, but then counting the number of rows that came through. The way that this has executed (be it ideal or not), is to ask that a Table Spool be populated. That’s the Table Spool operator on the top row. That spool can produce the same set of rows repeatedly. This is the behaviour that we see in the bottom half of the plan. In the bottom half of the plan, we see that the a join is being done between the rows that are being sourced from the spool – one being aggregated and one not – producing the columns that we need for the query. Table v Index When considering whether to use a Table Spool or an Index Spool, the question that the Query Optimizer needs to answer is whether there is sufficient benefit to storing the data in a b-tree. The idea of having data in indexes is great, but of course there is a cost to maintaining them. Here we’re creating a temporary structure for data, and there is a cost associated with populating each row into its correct position according to a b-tree, as opposed to simply adding it to the end of the list of rows in a heap. Using a b-tree could even result in page-splits as the b-tree is populated, so there had better be a reason to use that kind of structure. That all depends on how the data is going to be used in other parts of the plan. If you’ve ever thought that you could use a temporary index for a particular query, well this is it – and the Query Optimizer can do that if it thinks it’s worthwhile. It’s worth noting that just because a Spool is populated using an Index Spool, it can still be fetched using a Table Spool. The details about whether or not a Spool used as a source shows as a Table Spool or an Index Spool is more about whether a Seek predicate is used, rather than on the underlying structure. Recursive CTE I’ve already shown you an example of spooling when the OVER clause is used. You might see them being used whenever you have data that is needed multiple times, and CTEs are quite common here. With the definition of a set of data described in a CTE, if the query writer is leveraging this by referring to the CTE multiple times, and there’s no simplification to be leveraged, a spool could theoretically be used to avoid reapplying the CTE’s logic. Annoyingly, this doesn’t happen. Consider this query, which really looks like it’s using the same data twice. I’m creating a set of data (which is completely deterministic, by the way), and then joining it back to itself. There seems to be no reason why it shouldn’t use a spool for the set described by the CTE, but it doesn’t. On the other hand, if we don’t pull as many columns back, we might see a very different plan. You see, CTEs, like all sub-queries, are simplified out to figure out the best way of executing the whole query. My example is somewhat contrived, and although there are plenty of cases when it’s nice to give the Query Optimizer hints about how to execute queries, it usually doesn’t do a bad job, even without spooling (and you can always use a temporary table). When recursion is used, though, spooling should be expected. Consider what we’re asking for in a recursive CTE. We’re telling the system to construct a set of data using an initial query, and then use set as a source for another query, piping this back into the same set and back around. It’s very much a spool. The analogy of cotton is long gone here, as the idea of having a continual loop of cotton feeding onto a spool and off again doesn’t quite fit, but that’s what we have here. Data is being fed onto the spool, and getting pulled out a second time when the spool is used as a source. (This query is running on AdventureWorks, which has a ManagerID column in HumanResources.Employee, not AdventureWorks2012) The Index Spool operator is sucking rows into it – lazily. It has to be lazy, because at the start, there’s only one row to be had. However, as rows get populated onto the spool, the Table Spool operator on the right can return rows when asked, ending up with more rows (potentially) getting back onto the spool, ready for the next round. (The Assert operator is merely checking to see if we’ve reached the MAXRECURSION point – it vanishes if you use OPTION (MAXRECURSION 0), which you can try yourself if you like). Spools are useful. Don’t lose sight of that. Every time you use temporary tables or table variables in a stored procedure, you’re essentially doing the same – don’t get upset at the Query Optimizer for doing so, even if you think the spool looks like an expensive part of the query. I hope you’re enjoying this T-SQL Tuesday. Why not head over to my post that is hosting it this month to read about some other plan operators? At some point I’ll write a summary post – once I have you should find a comment below pointing at it. @rob_farley

    Read the article

  • Enforcing a query in MySql to use a specific index

    - by Hossein
    Hi, I have large table. consisting of only 3 columns (id(INT),bookmarkID(INT),tagID(INT)).I have two BTREE indexes one for each bookmarkID and tagID columns.This table has about 21 Million records. I am trying to run this query: SELECT bookmarkID,COUNT(bookmarkID) AS count FROM bookmark_tag_map GROUP BY tagID,bookmarkID HAVING tagID IN (-----"tagIDList"-----) AND count >= N which takes ages to return the results.I read somewhere that if make an index in which it has tagID,bookmarkID together, i will get a much faster result. I created the index after some time. Tried the query again, but it seems that this query is not using the new index that I have made.I ran EXPLAIN and saw that it is actually true. My question now is that how I can enforce a query to use a specific index? also comments on other ways to make the query faster are welcome. Thanks

    Read the article

  • Understanding #DAX Query Plans for #powerpivot and #tabular

    - by Marco Russo (SQLBI)
    Alberto Ferrari wrote a very interesting white paper about DAX query plans. We published it on a page where we'll gather articles and tools about DAX query plans: http://www.sqlbi.com/topics/query-plans/I reviewed the paper and this is the result of many months of study - we know that we just scratched the surface of this topic, also because we still don't have enough information about internal behavior of many of the operators contained in a query plan. However, by reading the paper you will start reading a query plan and you will understand how it works the optimization found by Chris Webb one month ago to the events-in-progress scenario. The white paper also contains a more optimized query (10 time faster), even if the performance depends on data distribution and the best choice really depends on the data you have. Now you should be curious enough to read the paper until the end, because the more optimized query is the last example in the paper!

    Read the article

  • DNS Query.log - Multiple query’s for ripe.net

    - by Christopher Wilson
    Currently I run a DNS server (bind9) that handles queries from clients over the internet lately I have noticed hundreds of queries from all different address's that look like this (Server IP removed) client 216.59.33.210#53: query: ripe.net IN ANY +ED (0.0.0.0) client 216.59.33.204#53: query: ripe.net IN ANY +ED (0.0.0.0) client 208.64.127.5#53: query: ripe.net IN ANY +ED (0.0.0.0) client 184.107.255.202#53: query: ripe.net IN ANY +ED (0.0.0.0) client 208.64.127.5#53: query: ripe.net IN ANY +ED (0.0.0.0) client 208.64.127.5#53: query: ripe.net IN ANY +ED (0.0.0.0) client 205.204.65.83#53: query: ripe.net IN ANY +ED (0.0.0.0) client 69.162.110.106#53: query: ripe.net IN ANY +ED (0.0.0.0) client 216.59.33.210#53: query: ripe.net IN ANY +ED (0.0.0.0) client 69.162.110.106#53: query: ripe.net IN ANY +ED (0.0.0.0) client 216.59.33.204#53: query: ripe.net IN ANY +ED (0.0.0.0) client 208.64.127.5#53: query: ripe.net IN ANY +ED (0.0.0.0) Can someone please explain why there are so many clients querying for ripe.net ?

    Read the article

  • SQLAuthority News – Download Whitepaper – Understanding and Controlling Parallel Query Processing in SQL Server

    - by pinaldave
    My recently article SQL SERVER – Reducing CXPACKET Wait Stats for High Transactional Database has received many good comments regarding MAXDOP 1 and MAXDOP 0. I really enjoyed reading the comments as the comments are received from industry leaders and gurus. I was further researching on the subject and I end up on following white paper written by Microsoft. Understanding and Controlling Parallel Query Processing in SQL Server Data warehousing and general reporting applications tend to be CPU intensive because they need to read and process a large number of rows. To facilitate quick data processing for queries that touch a large amount of data, Microsoft SQL Server exploits the power of multiple logical processors to provide parallel query processing operations such as parallel scans. Through extensive testing, we have learned that, for most large queries that are executed in a parallel fashion, SQL Server can deliver linear or nearly linear response time speedup as the number of logical processors increases. However, some queries in high parallelism scenarios perform suboptimally. There are also some parallelism issues that can occur in a multi-user parallel query workload. This white paper describes parallel performance problems you might encounter when you run such queries and workloads, and it explains why these issues occur. In addition, it presents how data warehouse developers can detect these issues, and how they can work around them or mitigate them. To review the document, please download the Understanding and Controlling Parallel Query Processing in SQL Server Word document. Note: Above abstract has been taken from here. The real question is what does the parallel queries has made life of DBA much simpler or is it looked at with potential issue related to degradation of the performance? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL White Papers, SQLAuthority News, T SQL, Technology

    Read the article

  • SQL SERVER – Quiz and Video – Introduction to Hierarchical Query using a Recursive CTE

    - by pinaldave
    This blog post is inspired from SQL Queries Joes 2 Pros: SQL Query Techniques For Microsoft SQL Server 2008 – SQL Exam Prep Series 70-433 – Volume 2.[Amazon] | [Flipkart] | [Kindle] | [IndiaPlaza] This is follow up blog post of my earlier blog post on the same subject - SQL SERVER – Introduction to Hierarchical Query using a Recursive CTE – A Primer. In the article we discussed various basics terminology of the CTE. The article further covers following important concepts of common table expression. What is a Common Table Expression (CTE) Building a Recursive CTE Identify the Anchor and Recursive Query Add the Anchor and Recursive query to a CTE Add an expression to track hierarchical level Add a self-referencing INNER JOIN statement Above six are the most important concepts related to CTE and SQL Server.  There are many more things one has to learn but without beginners fundamentals one can’t learn the advanced  concepts. Let us have small quiz and check how many of you get the fundamentals right. Quiz 1) You have an employee table with the following data. EmpID FirstName LastName MgrID 1 David Kennson 11 2 Eric Bender 11 3 Lisa Kendall 4 4 David Lonning 11 5 John Marshbank 4 6 James Newton 3 7 Sally Smith NULL You need to write a recursive CTE that shows the EmpID, FirstName, LastName, MgrID, and employee level. The CEO should be listed at Level 1. All people who work for the CEO will be listed at Level 2. All of the people who work for those people will be listed at Level 3. Which CTE code will achieve this result? WITH EmpList AS (SELECT Boss.EmpID, Boss.FName, Boss.LName, Boss.MgrID, 1 AS Lvl FROM Employee AS Boss WHERE Boss.MgrID IS NULL UNION ALL SELECT E.EmpID, E.FirstName, E.LastName, E.MgrID, EmpList.Lvl + 1 FROM Employee AS E INNER JOIN EmpList ON E.MgrID = EmpList.EmpID) SELECT * FROM EmpList WITH EmpListAS (SELECT EmpID, FirstName, LastName, MgrID, 1 as Lvl FROM Employee WHERE MgrID IS NULL UNION ALL SELECT EmpID, FirstName, LastName, MgrID, 2 as Lvl ) SELECT * FROM BossList WITH EmpList AS (SELECT EmpID, FirstName, LastName, MgrID, 1 as Lvl FROM Employee WHERE MgrID is NOT NULL UNION SELECT EmpID, FirstName, LastName, MgrID, BossList.Lvl + 1 FROM Employee INNER JOIN EmpList BossList ON Employee.MgrID = BossList.EmpID) SELECT * FROM EmpList 2) You have a table named Employee. The EmployeeID of each employee’s manager is in the ManagerID column. You need to write a recursive query that produces a list of employees and their manager. The query must also include the employee’s level in the hierarchy. You write the following code segment: WITH EmployeeList (EmployeeID, FullName, ManagerName, Level) AS ( –PICK ANSWER CODE HERE ) SELECT EmployeeID, FullName, ” AS [ManagerID], 1 AS [Level] FROM Employee WHERE ManagerID IS NULL UNION ALL SELECT emp.EmployeeID, emp.FullName mgr.FullName, 1 + 1 AS [Level] FROM Employee emp JOIN Employee mgr ON emp.ManagerID = mgr.EmployeeId SELECT EmployeeID, FullName, ” AS [ManagerID], 1 AS [Level] FROM Employee WHERE ManagerID IS NULL UNION ALL SELECT emp.EmployeeID, emp.FullName, mgr.FullName, mgr.Level + 1 FROM EmployeeList mgr JOIN Employee emp ON emp.ManagerID = mgr.EmployeeId Now make sure that you write down all the answers on the piece of paper. Watch following video and read earlier article over here. If you want to change the answer you still have chance. Solution 1) 1 2) 2 Now compare let us check the answers and compare your answers to following answers. I am very confident you will get them correct. Available at USA: Amazon India: Flipkart | IndiaPlaza Volume: 1, 2, 3, 4, 5 Please leave your feedback in the comment area for the quiz and video. Did you know all the answers of the quiz? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Joes 2 Pros, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – Relationship with Parallelism with Locks and Query Wait – Question for You

    - by Pinal Dave
    Today, I have one very simple question based on following image. A full disclaimer is that I have no idea why it is like that. I tried to reach out to few of my friends who know a lot about SQL Server but no one has any answer. Here is the question: If you go to server properties and click on Advanced you will see the following screen. Under the Parallelism section if you noticed there are four options: Cost Threshold for Parallelism Locks Max Degree of Parallelism Query Wait I can clearly understand why Cost Threshold for Parallelism and Max Degree of Parallelism belongs to Parallelism but I am not sure why we have two other options Locks and Query Wait belongs to Parallelism section. I can see that the options are ordered alphabetically but I do not understand the reason for locks and query wait to list under Parallelism. Here is the question for you – Why Locks and Query Wait options are listed under Parallelism section in SQL Server Advanced Properties? Please leave a comment with your explanation. I will publish valid answers on this blog with due credit. Reference: Pinal Dave (http://blog.sqlauthority.com)   Filed under: PostADay, SQL, SQL Authority, SQL Puzzle, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • specify query timeout when using toplink essential query hint

    - by yhzs8
    Hi, For glassfish v2, I have searched through the web and I cannot find anyway to specify query timeout when using TopLink essential query hint. We have another option to migrate to EclipseLink but that is not feasible. have tried the solution in http://forums.oracle.com/forums/thread.jspa?threadID=974732&tstart=-1 but it seems the DatabaseQuery which one could set a timeout value is actually for Toplink, not TopLink essential. Do we have some other way to instruct the JDBC driver for this timeout value other than the query hint? I need to do it on query-basis and not system-basis (which is just to change the value of DISTRIBUTED_LOCK_TIMEOUT)

    Read the article

  • mysql query query

    - by nightcoder1
    basically i need to write a query for mysql, but i have no experience in this and i cant find good tutorials on the old tinternet. i have a table called rels with columns "hosd_id" "linkedhost_id" "text link" and a table called hostlist with columns "id" "hostname" all i am trying to achieve is a query which outputs the "hostname" and "linked_id" when "host_id" is equal to "id" any help or pointers on syntax or code would be helpfull, or even a good mysql query guide

    Read the article

  • mod_rewrite to redirect URL with query string

    - by meeble
    I've searched all over stackoverflow, but none of the answers seem to be working for this situation. I have a lot of working mod_rewrite rules already in my httpd.conf file. I just recently found that Google had indexed one of my non-rewritten URLs with a query string in it: http://domain.com/?state=arizona I would like to use mod_rewrite to do a 301 redirect to this URL: http://domain.com/arizona The issue is that later on in my rewrite rules, that 2nd URL is being rewritten to pass query variables on to WordPress. It ends up getting rewritten to: http://domain.com/index.php?state=arizona Which is the proper functionality. Everything I have tried so far has either not worked at all or put me in an endless rewrite loop. This is what I have right now, which is getting stuck in a loop: RewriteCond %{QUERY_STRING} state=arizona [NC] RewriteRule .* http://domain.com/arizona [R=301,L] #older rewrite rule that passes query string based on URL: RewriteRule ^([A-Za-z-]+)$ index.php?state=$1 [L] which gives me an endless rewrite loop and takes me to this URL: http://domain.com/arizona?state=arizona I then tried this: RewriteRule .* http://domain.com/arizona? [R=301,L] which got rid of the query string in the URL, but still creates a loop.

    Read the article

  • Combining two-part SQL query into one query

    - by user332523
    Hello, I have a SQL query that I'm currently solving by doing two queries. I am wondering if there is a way to do it in a single query that makes it more efficient. Consider two tables: Transaction_Entries table and Transactions, each one defined below: Transactions - id - reference_number (varchar) Transaction_Entries - id - account_id - transaction_id (references Transactions table) Notes: There are multiple transaction entries per transaction. Some transactions are related, and will have the same reference_number string. To get all transaction entries for Account X, then I would do SELECT E.*, T.reference_number FROM Transaction_Entries E JOIN Transactions T ON (E.transaction_id=T.id) where E.account_id = X The next part is the hard part. I want to find all related transactions, regardless of the account id. First I make a list of all the unique reference numbers I found in the previous result set. Then for each one, I can query all the transactions that have that reference number. Assume that I hold all the rows from the previous query in PreviousResultSet UniqueReferenceNumbers = GetUniqueReferenceNumbers(PreviousResultSet) // in Java foreach R in UniqueReferenceNumbers // in Java SELECT * FROM Transaction_Entries where transaction_id IN (SELECT * FROM Transactions WHERE reference_number=R Any suggestions how I can put this into a single efficient query?

    Read the article

  • query optimization

    - by Gaurav
    I have a query of the form SELECT uid1,uid2 FROM friend WHERE uid1 IN (SELECT uid2 FROM friend WHERE uid1='.$user_id.') and uid2 IN (SELECT uid2 FROM friend WHERE uid1='.$user_id.') The problem now is that the nested query SELECT uid2 FROM friend WHERE uid1='.$user_id.' returns a very large number of ids(approx. 5000). The table structure of the friend table is uid1(int), uid2(int). This table is used to determine whether two users are linked together as friends. Any workaround? Can I write the query in a different way? Or is there some other way to solve this issue. I'm sure I am not the first person to face such a problem. Any help would be greatly appreciated.

    Read the article

  • Why this query is so slow?

    - by Silver Light
    This query appears in mysql slow query log: it takes 11 seconds. INSERT INTO record_visits ( record_id, visit_day ) VALUES ( '567', NOW() ); The table has 501043 records and it's structure looks like this: CREATE TABLE IF NOT EXISTS `record_visits` ( `id` int(11) NOT NULL AUTO_INCREMENT, `record_id` int(11) DEFAULT NULL, `visit_day` date DEFAULT NULL, `visit_cnt` bigint(20) DEFAULT '1', PRIMARY KEY (`id`), UNIQUE KEY `record_id_visit_day` (`record_id`,`visit_day`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8 ; What could be wrong? Why this INSERT takes so long?

    Read the article

  • Is there any way to send a column value from outer query to inner sub query? [closed]

    - by chetan
    'Discussions' table schema title description desid replyto upvote downvote views browser used a1 none 1 1 12 - bad topic b2 a1 2 3 14 sql database a3 none 4 5 34 - crome b4 a3 3 4 12 The above table has two types of content types Main Topics and Comments. Unique content identifier 'desid' used to identify that its a main topic or a comment. 'desid' starts with 'a' for Main Topic and for comment 'desid' starts with 'b'. For comment 'replyto' is the 'desid' of main topic to which this comment is associated. I like to find out the list of the top main topics that are arranged on the basis of (upvote+downvote+visits+number of comments to it) addition. The following query gives top topics list in order of (upvote+downvote+visits) select * with highest number of upvote+downvote+views by query "select * from [DB_user1212].[dbo].[discussions] where desid like 'a%' order by (upvote+downvote+visited) desc For (comments+upvote+downvote+views ) I tried select * from [DB_user1212].[dbo].[discussions] where desid like 'a%' order by ((select count(*) from [DB_user1212].[dbo].[discussions] where replyto = desid )+upvote+downvote+visited) desc but it didn't work because its not possible to send desid from outer query to inner subquery. How to solve this? Please note that I want solution in query language only.

    Read the article

  • Why is postgresql update query so slow sometimes, even with index

    - by Matija
    i have a simple update query (foo column type is BOOLEAN (default false)): update tablename set foo = true where id = 234; which "id" is set to (primary) key, and if i run "explain analyze" i got: Index Cond: (id = 234) Total runtime: 0.358 ms but still, i have plenty of unexplained queries at slow log (pgfouine), which took more than 200s (?!): Times executed: 99, Av. duration (s): 70 can anyone please explain, whats the reason for that? (1.5 mio rows in table, postgresql 8.4)

    Read the article

  • Query Optimizing Request

    - by mithilatw
    I am very sorry if this question is structured in not a very helpful manner or the question itself is not a very good one! I need to update a MSSQL table call component every 10 minutes based on information from another table call materials_progress I have nearly 60000 records in component and more than 10000 records in materials_progress I wrote an update query to do the job, but it takes longer than 4 minutes to complete execution! Here is the query : UPDATE component SET stage_id = CASE WHEN t.required_quantity <= t.total_received THEN 27 WHEN t.total_ordered < t.total_received THEN 18 ELSE 18 END FROM ( SELECT mp.job_id, mp.line_no, mp.component, l.quantity AS line_quantity, CASE WHEN mp.component_name_id = 2 THEN l.quantity*2 ELSE l.quantity END AS required_quantity, SUM(ordered) AS total_ordered, SUM(received) AS total_received , c.component_id FROM line l LEFT JOIN component c ON c.line_id = l.line_id LEFT JOIN materials_progress mp ON l.job_id = mp.job_id AND l.line_no = mp.line_no AND c.component_name_id = mp.component_name_id WHERE mp.job_id IS NOT NULL AND (mp.cancelled IS NULL OR mp.cancelled = 0) AND (mp.manual_override IS NULL OR mp.manual_override = 0) AND c.stage_id = 18 GROUP BY mp.job_id, mp.line_no, mp.component, l.quantity, mp.component_name_id, component_id ) AS t WHERE component.component_id = t.component_id I am not going to explain the scenario as it too complex.. could somebody please please tell me what makes this query this much expensive and a way to get around it? Thank you very very much in advance!!!

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >