Search Results

Search found 4109 results on 165 pages for 'plan'.

Page 11/165 | < Previous Page | 7 8 9 10 11 12 13 14 15 16 17 18 | Next Page >

Does Your Website Do the Job?

Before you even think of starting to build a website, you must put together a proper plan of how you want every part of you site to fit together, if you don't have a plan, you might as well forget about any success as failure to plan is a plan for failure. Even if you have a small one page website, you must make sure you run it to its optimum level, or you will see no rewards for your efforts whatsoever.

Read the article
Heaps of Trouble?

- by Paul White NZ

If you’re not already a regular reader of Brad Schulz’s blog, you’re missing out on some great material. In his latest entry, he is tasked with optimizing a query run against tables that have no indexes at all. The problem is, predictably, that performance is not very good. The catch is that we are not allowed to create any indexes (or even new statistics) as part of our optimization efforts. In this post, I’m going to look at the problem from a slightly different angle, and present an alternative solution to the one Brad found. Inevitably, there’s going to be some overlap between our entries, and while you don’t necessarily need to read Brad’s post before this one, I do strongly recommend that you read it at some stage; he covers some important points that I won’t cover again here. The Example We’ll use data from the AdventureWorks database, copied to temporary unindexed tables. A script to create these structures is shown below: CREATE TABLE #Custs ( CustomerID INTEGER NOT NULL, TerritoryID INTEGER NULL, CustomerType NCHAR(1) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #Prods ( ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, Name NVARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #OrdHeader ( SalesOrderID INTEGER NOT NULL, OrderDate DATETIME NOT NULL, SalesOrderNumber NVARCHAR(25) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, CustomerID INTEGER NOT NULL, ); GO CREATE TABLE #OrdDetail ( SalesOrderID INTEGER NOT NULL, OrderQty SMALLINT NOT NULL, LineTotal NUMERIC(38,6) NOT NULL, ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, ); GO INSERT #Custs ( CustomerID, TerritoryID, CustomerType ) SELECT C.CustomerID, C.TerritoryID, C.CustomerType FROM AdventureWorks.Sales.Customer C WITH (TABLOCK); GO INSERT #Prods ( ProductMainID, ProductSubID, ProductSubSubID, Name ) SELECT P.ProductID, P.ProductID, P.ProductID, P.Name FROM AdventureWorks.Production.Product P WITH (TABLOCK); GO INSERT #OrdHeader ( SalesOrderID, OrderDate, SalesOrderNumber, CustomerID ) SELECT H.SalesOrderID, H.OrderDate, H.SalesOrderNumber, H.CustomerID FROM AdventureWorks.Sales.SalesOrderHeader H WITH (TABLOCK); GO INSERT #OrdDetail ( SalesOrderID, OrderQty, LineTotal, ProductMainID, ProductSubID, ProductSubSubID ) SELECT D.SalesOrderID, D.OrderQty, D.LineTotal, D.ProductID, D.ProductID, D.ProductID FROM AdventureWorks.Sales.SalesOrderDetail D WITH (TABLOCK); The query itself is a simple join of the four tables: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #OrdDetail D ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID JOIN #OrdHeader H ON D.SalesOrderID = H.SalesOrderID JOIN #Custs C ON H.CustomerID = C.CustomerID ORDER BY P.ProductMainID ASC OPTION (RECOMPILE, MAXDOP 1); Remember that these tables have no indexes at all, and only the single-column sampled statistics SQL Server automatically creates (assuming default settings). The estimated query plan produced for the test query looks like this (click to enlarge): The Problem The problem here is one of cardinality estimation – the number of rows SQL Server expects to find at each step of the plan. The lack of indexes and useful statistical information means that SQL Server does not have the information it needs to make a good estimate. Every join in the plan shown above estimates that it will produce just a single row as output. Brad covers the factors that lead to the low estimates in his post. In reality, the join between the #Prods and #OrdDetail tables will produce 121,317 rows. It should not surprise you that this has rather dire consequences for the remainder of the query plan. In particular, it makes a nonsense of the optimizer’s decision to use Nested Loops to join to the two remaining tables. Instead of scanning the #OrdHeader and #Custs tables once (as it expected), it has to perform 121,317 full scans of each. The query takes somewhere in the region of twenty minutes to run to completion on my development machine. A Solution At this point, you may be thinking the same thing I was: if we really are stuck with no indexes, the best we can do is to use hash joins everywhere. We can force the exclusive use of hash joins in several ways, the two most common being join and query hints. A join hint means writing the query using the INNER HASH JOIN syntax; using a query hint involves adding OPTION (HASH JOIN) at the bottom of the query. The difference is that using join hints also forces the order of the join, whereas the query hint gives the optimizer freedom to reorder the joins at its discretion. Adding the OPTION (HASH JOIN) hint results in this estimated plan: That produces the correct output in around seven seconds, which is quite an improvement! As a purely practical matter, and given the rigid rules of the environment we find ourselves in, we might leave things there. (We can improve the hashing solution a bit – I’ll come back to that later on). Faster Nested Loops It might surprise you to hear that we can beat the performance of the hash join solution shown above using nested loops joins exclusively, and without breaking the rules we have been set. The key to this part is to realize that a condition like (A = B) can be expressed as (A <= B) AND (A >= B). Armed with this tremendous new insight, we can rewrite the join predicates like so: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #OrdDetail D JOIN #OrdHeader H ON D.SalesOrderID >= H.SalesOrderID AND D.SalesOrderID <= H.SalesOrderID JOIN #Custs C ON H.CustomerID >= C.CustomerID AND H.CustomerID <= C.CustomerID JOIN #Prods P ON P.ProductMainID >= D.ProductMainID AND P.ProductMainID <= D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (RECOMPILE, LOOP JOIN, MAXDOP 1, FORCE ORDER); I’ve also added LOOP JOIN and FORCE ORDER query hints to ensure that only nested loops joins are used, and that the tables are joined in the order they appear. The new estimated execution plan is: This new query runs in under 2 seconds. Why Is It Faster? The main reason for the improvement is the appearance of the eager Index Spools, which are also known as index-on-the-fly spools. If you read my Inside The Optimiser series you might be interested to know that the rule responsible is called JoinToIndexOnTheFly. An eager index spool consumes all rows from the table it sits above, and builds a index suitable for the join to seek on. Taking the index spool above the #Custs table as an example, it reads all the CustomerID and TerritoryID values with a single scan of the table, and builds an index keyed on CustomerID. The term ‘eager’ means that the spool consumes all of its input rows when it starts up. The index is built in a work table in tempdb, has no associated statistics, and only exists until the query finishes executing. The result is that each unindexed table is only scanned once, and just for the columns necessary to build the temporary index. From that point on, every execution of the inner side of the join is answered by a seek on the temporary index – not the base table. A second optimization is that the sort on ProductMainID (required by the ORDER BY clause) is performed early, on just the rows coming from the #OrdDetail table. The optimizer has a good estimate for the number of rows it needs to sort at that stage – it is just the cardinality of the table itself. The accuracy of the estimate there is important because it helps determine the memory grant given to the sort operation. Nested loops join preserves the order of rows on its outer input, so sorting early is safe. (Hash joins do not preserve order in this way, of course). The extra lazy spool on the #Prods branch is a further optimization that avoids executing the seek on the temporary index if the value being joined (the ‘outer reference’) hasn’t changed from the last row received on the outer input. It takes advantage of the fact that rows are still sorted on ProductMainID, so if duplicates exist, they will arrive at the join operator one after the other. The optimizer is quite conservative about introducing index spools into a plan, because creating and dropping a temporary index is a relatively expensive operation. It’s presence in a plan is often an indication that a useful index is missing. I want to stress that I rewrote the query in this way primarily as an educational exercise – I can’t imagine having to do something so horrible to a production system. Improving the Hash Join I promised I would return to the solution that uses hash joins. You might be puzzled that SQL Server can create three new indexes (and perform all those nested loops iterations) faster than it can perform three hash joins. The answer, again, is down to the poor information available to the optimizer. Let’s look at the hash join plan again: Two of the hash joins have single-row estimates on their build inputs. SQL Server fixes the amount of memory available for the hash table based on this cardinality estimate, so at run time the hash join very quickly runs out of memory. This results in the join spilling hash buckets to disk, and any rows from the probe input that hash to the spilled buckets also get written to disk. The join process then continues, and may again run out of memory. This is a recursive process, which may eventually result in SQL Server resorting to a bailout join algorithm, which is guaranteed to complete eventually, but may be very slow. The data sizes in the example tables are not large enough to force a hash bailout, but it does result in multiple levels of hash recursion. You can see this for yourself by tracing the Hash Warning event using the Profiler tool. The final sort in the plan also suffers from a similar problem: it receives very little memory and has to perform multiple sort passes, saving intermediate runs to disk (the Sort Warnings Profiler event can be used to confirm this). Notice also that because hash joins don’t preserve sort order, the sort cannot be pushed down the plan toward the #OrdDetail table, as in the nested loops plan. Ok, so now we understand the problems, what can we do to fix it? We can address the hash spilling by forcing a different order for the joins: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #Custs C JOIN #OrdHeader H ON H.CustomerID = C.CustomerID JOIN #OrdDetail D ON D.SalesOrderID = H.SalesOrderID ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (MAXDOP 1, HASH JOIN, FORCE ORDER); With this plan, each of the inputs to the hash joins has a good estimate, and no hash recursion occurs. The final sort still suffers from the one-row estimate problem, and we get a single-pass sort warning as it writes rows to disk. Even so, the query runs to completion in three or four seconds. That’s around half the time of the previous hashing solution, but still not as fast as the nested loops trickery. Final Thoughts SQL Server’s optimizer makes cost-based decisions, so it is vital to provide it with accurate information. We can’t really blame the performance problems highlighted here on anything other than the decision to use completely unindexed tables, and not to allow the creation of additional statistics. I should probably stress that the nested loops solution shown above is not one I would normally contemplate in the real world. It’s there primarily for its educational and entertainment value. I might perhaps use it to demonstrate to the sceptical that SQL Server itself is crying out for an index. Be sure to read Brad’s original post for more details. My grateful thanks to him for granting permission to reuse some of his material. Paul White Email: [email protected] Twitter: @PaulWhiteNZ

Read the article
How to plan my web based project before starting code ?

- by Arsheep

Me and my friend started working together as partners , we have decided to make Kick-as* website after website. We have the ideas written down like 100's of them (yes we are choosing best and easy among them first). My friend does the layout design and arranging things , and my part is coding and server management. The little problem i am facing is lack of experience in planing a project. What i do is, I just start the code straight away and along with code I make DB, like when i need a table i make it. I know this is very bad approach for a medium sized project. Here at stackoverflow i saw lots of experienced coders. Need to learn a lot from you guys :) . So can you plese help me on how to plan a project and what coding standard/structure/frameworks to be used (I do PHP code). Thanks in advance.

Read the article
How to plan mine web based project before starting code ?

- by Arsheep

Me and mine friend started working together as partners , we have decided to make Kick-as* website after website. We have the ideas written down like 100's of them (yes we are choosing best and easy among them first). Mine friend do the layout design and arranging things , and mine part is coding and server management. The little problem i am facing is lack of experience in planing a project .What i do is , i just start the code straight away and along with code I make DB , Like when i need a table i make it. I know this is very bad approach for a medium sized project. Here at stackoverflow i saw lots of experienced coders . Need to learn a lot from you guys :) . So can you plese help me on how to plan a project and what coding standard/structure/frameworks to be used (I do PHP code). Thanks in advance.

Read the article
Query Execution Plan - When is the Where clause executed?

- by Alex

I have a query like this (created by LINQ): SELECT [t0].[Id], [t0].[CreationDate], [t0].[CreatorId] FROM [dbo].[DataFTS]('test', 100) AS [t0] WHERE [t0].[CreatorId] = 1 ORDER BY [t0].[RANK] DataFTS is a full-text search table valued function. The query execution plan looks like this: SELECT (0%) - Sort (23%) - Nested Loops (Inner Join) (1%) - Sort (Top N Sort) (25%) - Stream Aggregate (0%) - Stream Aggregate (0%) - Compute Scalar (0%) - Table Valued Function (FullTextMatch) (13%) | | - Clustered Index Seek (38%) Does this mean that the WHERE clause ([CreatorId] = 1) is executed prior to the TVF ( full text search) or after the full text search? Thank you.

Read the article
Why isn't INT more efficient than UNIQUEIDENTIFIER (according to the execution plan)?

- by ck

I have a parent table and child table where the columns that join them together are the UNIQUEIDENTIFIER type. The child table has a clustered index on the column that joins it to the parent table (its PK, which is also clustered). I have created a copy of both of these tables but changed the relationship columns to be INTs instead, have rebuilt the indexes so that they are essentially the same structure and can be queried in the same way. When I query for a known 20 records from the parent table, pulling in all the related records from the child tables, I get identical query costs across both, i.e. 50/50 cost for the batches. If this is true, then my giant project to change all of the tables like this appears to be pointless, other than speeding up inserts. Can anyone provide any light on the situation? EDIT: The question is not about which is more efficient, but why is the query execution plan showing both queries as having the same cost?

Read the article
More CPU cores may not always lead to better performance – MAXDOP and query memory distribution in spotlight

- by sqlworkshops

More hardware normally delivers better performance, but there are exceptions where it can hinder performance. Understanding these exceptions and working around it is a major part of SQL Server performance tuning. When a memory allocating query executes in parallel, SQL Server distributes memory to each task that is executing part of the query in parallel. In our example the sort operator that executes in parallel divides the memory across all tasks assuming even distribution of rows. Common memory allocating queries are that perform Sort and do Hash Match operations like Hash Join or Hash Aggregation or Hash Union. In reality, how often are column values evenly distributed, think about an example; are employees working for your company distributed evenly across all the Zip codes or mainly concentrated in the headquarters? What happens when you sort result set based on Zip codes? Do all products in the catalog sell equally or are few products hot selling items? One of my customers tested the below example on a 24 core server with various MAXDOP settings and here are the results:MAXDOP 1: CPU time = 1185 ms, elapsed time = 1188 msMAXDOP 4: CPU time = 1981 ms, elapsed time = 1568 msMAXDOP 8: CPU time = 1918 ms, elapsed time = 1619 msMAXDOP 12: CPU time = 2367 ms, elapsed time = 2258 msMAXDOP 16: CPU time = 2540 ms, elapsed time = 2579 msMAXDOP 20: CPU time = 2470 ms, elapsed time = 2534 msMAXDOP 0: CPU time = 2809 ms, elapsed time = 2721 ms - all 24 cores.In the above test, when the data was evenly distributed, the elapsed time of parallel query was always lower than serial query. Why does the query get slower and slower with more CPU cores / higher MAXDOP? Maybe you can answer this question after reading the article; let me know: [email protected]. Well you get the point, let’s see an example. The best way to learn is to practice. To create the below tables and reproduce the behavior, join the mailing list by using this link: www.sqlworkshops.com/ml and I will send you the table creation script. Let’s update the Employees table with 49 out of 50 employees located in Zip code 2001. update Employees set Zip = EmployeeID / 400 + 1 where EmployeeID % 50 = 1 update Employees set Zip = 2001 where EmployeeID % 50 != 1 go update statistics Employees with fullscan go Let’s create the temporary table #FireDrill with all possible Zip codes. drop table #FireDrill go create table #FireDrill (Zip int primary key) insert into #FireDrill select distinct Zip from Employees update statistics #FireDrill with fullscan go Let’s execute the query serially with MAXDOP 1. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --First serially with MAXDOP 1 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 1) goThe query took 1011 ms to complete. The execution plan shows the 77816 KB of memory was granted while the estimated rows were 799624. No Sort Warnings in SQL Server Profiler. Now let’s execute the query in parallel with MAXDOP 0. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --In parallel with MAXDOP 0 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 0) go The query took 1912 ms to complete. The execution plan shows the 79360 KB of memory was granted while the estimated rows were 799624. The estimated number of rows between serial and parallel plan are the same. The parallel plan has slightly more memory granted due to additional overhead. Sort properties shows the rows are unevenly distributed over the 4 threads. Sort Warnings in SQL Server Profiler. Intermediate Summary: The reason for the higher duration with parallel plan was sort spill. This is due to uneven distribution of employees over Zip codes, especially concentration of 49 out of 50 employees in Zip code 2001. Now let’s update the Employees table and distribute employees evenly across all Zip codes. update Employees set Zip = EmployeeID / 400 + 1 go update statistics Employees with fullscan go Let’s execute the query serially with MAXDOP 1. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --Serially with MAXDOP 1 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 1) go The query took 751 ms to complete. The execution plan shows the 77816 KB of memory was granted while the estimated rows were 784707. No Sort Warnings in SQL Server Profiler. Now let’s execute the query in parallel with MAXDOP 0. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --In parallel with MAXDOP 0 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 0) go The query took 661 ms to complete. The execution plan shows the 79360 KB of memory was granted while the estimated rows were 784707. Sort properties shows the rows are evenly distributed over the 4 threads. No Sort Warnings in SQL Server Profiler. Intermediate Summary: When employees were distributed unevenly, concentrated on 1 Zip code, parallel sort spilled while serial sort performed well without spilling to tempdb. When the employees were distributed evenly across all Zip codes, parallel sort and serial sort did not spill to tempdb. This shows uneven data distribution may affect the performance of some parallel queries negatively. For detailed discussion of memory allocation, refer to webcasts available at www.sqlworkshops.com/webcasts. Some of you might conclude from the above execution times that parallel query is not faster even when there is no spill. Below you can see when we are joining limited amount of Zip codes, parallel query will be fasted since it can use Bitmap Filtering. Let’s update the Employees table with 49 out of 50 employees located in Zip code 2001. update Employees set Zip = EmployeeID / 400 + 1 where EmployeeID % 50 = 1 update Employees set Zip = 2001 where EmployeeID % 50 != 1 go update statistics Employees with fullscan go Let’s create the temporary table #FireDrill with limited Zip codes. drop table #FireDrill go create table #FireDrill (Zip int primary key) insert into #FireDrill select distinct Zip from Employees where Zip between 1800 and 2001 update statistics #FireDrill with fullscan go Let’s execute the query serially with MAXDOP 1. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --Serially with MAXDOP 1 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 1) go The query took 989 ms to complete. The execution plan shows the 77816 KB of memory was granted while the estimated rows were 785594. No Sort Warnings in SQL Server Profiler. Now let’s execute the query in parallel with MAXDOP 0. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --In parallel with MAXDOP 0 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 0) go The query took 1799 ms to complete. The execution plan shows the 79360 KB of memory was granted while the estimated rows were 785594. Sort Warnings in SQL Server Profiler. The estimated number of rows between serial and parallel plan are the same. The parallel plan has slightly more memory granted due to additional overhead. Intermediate Summary: The reason for the higher duration with parallel plan even with limited amount of Zip codes was sort spill. This is due to uneven distribution of employees over Zip codes, especially concentration of 49 out of 50 employees in Zip code 2001. Now let’s update the Employees table and distribute employees evenly across all Zip codes. update Employees set Zip = EmployeeID / 400 + 1 go update statistics Employees with fullscan go Let’s execute the query serially with MAXDOP 1. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --Serially with MAXDOP 1 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 1) go The query took 250 ms to complete. The execution plan shows the 9016 KB of memory was granted while the estimated rows were 79973.8. No Sort Warnings in SQL Server Profiler. Now let’s execute the query in parallel with MAXDOP 0. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --In parallel with MAXDOP 0 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e inner join #FireDrill fd on (e.Zip = fd.Zip) order by e.Zip option (maxdop 0) go The query took 85 ms to complete. The execution plan shows the 13152 KB of memory was granted while the estimated rows were 784707. No Sort Warnings in SQL Server Profiler. Here you see, parallel query is much faster than serial query since SQL Server is using Bitmap Filtering to eliminate rows before the hash join. Parallel queries are very good for performance, but in some cases it can hinder performance. If one identifies the reason for these hindrances, then it is possible to get the best out of parallelism. I covered many aspects of monitoring and tuning parallel queries in webcasts (www.sqlworkshops.com/webcasts) and articles (www.sqlworkshops.com/articles). I suggest you to watch the webcasts and read the articles to better understand how to identify and tune parallel query performance issues. Summary: One has to avoid sort spill over tempdb and the chances of spills are higher when a query executes in parallel with uneven data distribution. Parallel query brings its own advantage, reduced elapsed time and reduced work with Bitmap Filtering. So it is important to understand how to avoid spills over tempdb and when to execute a query in parallel. I explain these concepts with detailed examples in my webcasts (www.sqlworkshops.com/webcasts), I recommend you to watch them. The best way to learn is to practice. To create the above tables and reproduce the behavior, join the mailing list at www.sqlworkshops.com/ml and I will send you the relevant SQL Scripts. Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning Hands-on Workshop in London, United Kingdom during March 15-17, 2011, click here to register / Microsoft UK TechNet.These are hands-on workshops with a maximum of 12 participants and not lectures. For consulting engagements click here. Disclaimer and copyright information:This article refers to organizations and products that may be the trademarks or registered trademarks of their various owners. Copyright of this article belongs to R Meyyappan / www.sqlworkshops.com. You may freely use the ideas and concepts discussed in this article with acknowledgement (www.sqlworkshops.com), but you may not claim any of it as your own work. This article is for informational purposes only; you use any of the suggestions given here entirely at your own risk. Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning Hands-on Workshop in London, United Kingdom during March 15-17, 2011, click here to register / Microsoft UK TechNet.These are hands-on workshops with a maximum of 12 participants and not lectures. For consulting engagements click here. R Meyyappan [email protected] LinkedIn: http://at.linkedin.com/in/rmeyyappan

Read the article
MERGE Bug with Filtered Indexes

- by Paul White

A MERGE statement can fail, and incorrectly report a unique key violation when: The target table uses a unique filtered index; and No key column of the filtered index is updated; and A column from the filtering condition is updated; and Transient key violations are possible Example Tables Say we have two tables, one that is the target of a MERGE statement, and another that contains updates to be applied to the target. The target table contains three columns, an integer primary key, a single character alternate key, and a status code column. A filtered unique index exists on the alternate key, but is only enforced where the status code is ‘a’: CREATE TABLE #Target ( pk integer NOT NULL, ak character(1) NOT NULL, status_code character(1) NOT NULL, PRIMARY KEY (pk) ); CREATE UNIQUE INDEX uq1 ON #Target (ak) INCLUDE (status_code) WHERE status_code = 'a'; The changes table contains just an integer primary key (to identify the target row to change) and the new status code: CREATE TABLE #Changes ( pk integer NOT NULL, status_code character(1) NOT NULL, PRIMARY KEY (pk) ); Sample Data The sample data for the example is: INSERT #Target (pk, ak, status_code) VALUES (1, 'A', 'a'), (2, 'B', 'a'), (3, 'C', 'a'), (4, 'A', 'd'); INSERT #Changes (pk, status_code) VALUES (1, 'd'), (4, 'a'); Target Changes +-----------------------+ +------------------+ ¦ pk ¦ ak ¦ status_code ¦ ¦ pk ¦ status_code ¦ ¦----+----+-------------¦ ¦----+-------------¦ ¦ 1 ¦ A ¦ a ¦ ¦ 1 ¦ d ¦ ¦ 2 ¦ B ¦ a ¦ ¦ 4 ¦ a ¦ ¦ 3 ¦ C ¦ a ¦ +------------------+ ¦ 4 ¦ A ¦ d ¦ +-----------------------+ The target table’s alternate key (ak) column is unique, for rows where status_code = ‘a’. Applying the changes to the target will change row 1 from status ‘a’ to status ‘d’, and row 4 from status ‘d’ to status ‘a’. The result of applying all the changes will still satisfy the filtered unique index, because the ‘A’ in row 1 will be deleted from the index and the ‘A’ in row 4 will be added. Merge Test One Let’s now execute a MERGE statement to apply the changes: MERGE #Target AS t USING #Changes AS c ON c.pk = t.pk WHEN MATCHED AND c.status_code <> t.status_code THEN UPDATE SET status_code = c.status_code; The MERGE changes the two target rows as expected. The updated target table now contains: +-----------------------+ ¦ pk ¦ ak ¦ status_code ¦ ¦----+----+-------------¦ ¦ 1 ¦ A ¦ d ¦ <—changed from ‘a’ ¦ 2 ¦ B ¦ a ¦ ¦ 3 ¦ C ¦ a ¦ ¦ 4 ¦ A ¦ a ¦ <—changed from ‘d’ +-----------------------+ Merge Test Two Now let’s repopulate the changes table to reverse the updates we just performed: TRUNCATE TABLE #Changes; INSERT #Changes (pk, status_code) VALUES (1, 'a'), (4, 'd'); This will change row 1 back to status ‘a’ and row 4 back to status ‘d’. As a reminder, the current state of the tables is: Target Changes +-----------------------+ +------------------+ ¦ pk ¦ ak ¦ status_code ¦ ¦ pk ¦ status_code ¦ ¦----+----+-------------¦ ¦----+-------------¦ ¦ 1 ¦ A ¦ d ¦ ¦ 1 ¦ a ¦ ¦ 2 ¦ B ¦ a ¦ ¦ 4 ¦ d ¦ ¦ 3 ¦ C ¦ a ¦ +------------------+ ¦ 4 ¦ A ¦ a ¦ +-----------------------+ We execute the same MERGE statement: MERGE #Target AS t USING #Changes AS c ON c.pk = t.pk WHEN MATCHED AND c.status_code <> t.status_code THEN UPDATE SET status_code = c.status_code; However this time we receive the following message: Msg 2601, Level 14, State 1, Line 1 Cannot insert duplicate key row in object 'dbo.#Target' with unique index 'uq1'. The duplicate key value is (A). The statement has been terminated. Applying the changes using UPDATE Let’s now rewrite the MERGE to use UPDATE instead: UPDATE t SET status_code = c.status_code FROM #Target AS t JOIN #Changes AS c ON t.pk = c.pk WHERE c.status_code <> t.status_code; This query succeeds where the MERGE failed. The two rows are updated as expected: +-----------------------+ ¦ pk ¦ ak ¦ status_code ¦ ¦----+----+-------------¦ ¦ 1 ¦ A ¦ a ¦ <—changed back to ‘a’ ¦ 2 ¦ B ¦ a ¦ ¦ 3 ¦ C ¦ a ¦ ¦ 4 ¦ A ¦ d ¦ <—changed back to ‘d’ +-----------------------+ What went wrong with the MERGE? In this test, the MERGE query execution happens to apply the changes in the order of the ‘pk’ column. In test one, this was not a problem: row 1 is removed from the unique filtered index by changing status_code from ‘a’ to ‘d’ before row 4 is added. At no point does the table contain two rows where ak = ‘A’ and status_code = ‘a’. In test two, however, the first change was to change row 1 from status ‘d’ to status ‘a’. This change means there would be two rows in the filtered unique index where ak = ‘A’ (both row 1 and row 4 meet the index filtering criteria ‘status_code = a’). The storage engine does not allow the query processor to violate a unique key (unless IGNORE_DUP_KEY is ON, but that is a different story, and doesn’t apply to MERGE in any case). This strict rule applies regardless of the fact that if all changes were applied, there would be no unique key violation (row 4 would eventually be changed from ‘a’ to ‘d’, removing it from the filtered unique index, and resolving the key violation). Why it went wrong The query optimizer usually detects when this sort of temporary uniqueness violation could occur, and builds a plan that avoids the issue. I wrote about this a couple of years ago in my post Beware Sneaky Reads with Unique Indexes (you can read more about the details on pages 495-497 of Microsoft SQL Server 2008 Internals or in Craig Freedman’s blog post on maintaining unique indexes). To summarize though, the optimizer introduces Split, Filter, Sort, and Collapse operators into the query plan to: Split each row update into delete followed by an inserts Filter out rows that would not change the index (due to the filter on the index, or a non-updating update) Sort the resulting stream by index key, with deletes before inserts Collapse delete/insert pairs on the same index key back into an update The effect of all this is that only net changes are applied to an index (as one or more insert, update, and/or delete operations). In this case, the net effect is a single update of the filtered unique index: changing the row for ak = ‘A’ from pk = 4 to pk = 1. In case that is less than 100% clear, let’s look at the operation in test two again: Target Changes Result +-----------------------+ +------------------+ +-----------------------+ ¦ pk ¦ ak ¦ status_code ¦ ¦ pk ¦ status_code ¦ ¦ pk ¦ ak ¦ status_code ¦ ¦----+----+-------------¦ ¦----+-------------¦ ¦----+----+-------------¦ ¦ 1 ¦ A ¦ d ¦ ¦ 1 ¦ d ¦ ¦ 1 ¦ A ¦ a ¦ ¦ 2 ¦ B ¦ a ¦ ¦ 4 ¦ a ¦ ¦ 2 ¦ B ¦ a ¦ ¦ 3 ¦ C ¦ a ¦ +------------------+ ¦ 3 ¦ C ¦ a ¦ ¦ 4 ¦ A ¦ a ¦ ¦ 4 ¦ A ¦ d ¦ +-----------------------+ +-----------------------+ From the filtered index’s point of view (filtered for status_code = ‘a’ and shown in nonclustered index key order) the overall effect of the query is: Before After +---------+ +---------+ ¦ pk ¦ ak ¦ ¦ pk ¦ ak ¦ ¦----+----¦ ¦----+----¦ ¦ 4 ¦ A ¦ ¦ 1 ¦ A ¦ ¦ 2 ¦ B ¦ ¦ 2 ¦ B ¦ ¦ 3 ¦ C ¦ ¦ 3 ¦ C ¦ +---------+ +---------+ The single net change there is a change of pk from 4 to 1 for the nonclustered index entry ak = ‘A’. This is the magic performed by the split, sort, and collapse. Notice in particular how the original changes to the index key (on the ‘ak’ column) have been transformed into an update of a non-key column (pk is included in the nonclustered index). By not updating any nonclustered index keys, we are guaranteed to avoid transient key violations. The Execution Plans The estimated MERGE execution plan that produces the incorrect key-violation error looks like this (click to enlarge in a new window): The successful UPDATE execution plan is (click to enlarge in a new window): The MERGE execution plan is a narrow (per-row) update. The single Clustered Index Merge operator maintains both the clustered index and the filtered nonclustered index. The UPDATE plan is a wide (per-index) update. The clustered index is maintained first, then the Split, Filter, Sort, Collapse sequence is applied before the nonclustered index is separately maintained. There is always a wide update plan for any query that modifies the database. The narrow form is a performance optimization where the number of rows is expected to be relatively small, and is not available for all operations. One of the operations that should disallow a narrow plan is maintaining a unique index where intermediate key violations could occur. Workarounds The MERGE can be made to work (producing a wide update plan with split, sort, and collapse) by: Adding all columns referenced in the filtered index’s WHERE clause to the index key (INCLUDE is not sufficient); or Executing the query with trace flag 8790 set e.g. OPTION (QUERYTRACEON 8790). Undocumented trace flag 8790 forces a wide update plan for any data-changing query (remember that a wide update plan is always possible). Either change will produce a successfully-executing wide update plan for the MERGE that failed previously. Conclusion The optimizer fails to spot the possibility of transient unique key violations with MERGE under the conditions listed at the start of this post. It incorrectly chooses a narrow plan for the MERGE, which cannot provide the protection of a split/sort/collapse sequence for the nonclustered index maintenance. The MERGE plan may fail at execution time depending on the order in which rows are processed, and the distribution of data in the database. Worse, a previously solid MERGE query may suddenly start to fail unpredictably if a filtered unique index is added to the merge target table at any point. Connect bug filed here Tests performed on SQL Server 2012 SP1 CUI (build 11.0.3321) x64 Developer Edition © 2012 Paul White – All Rights Reserved Twitter: @SQL_Kiwi Email: [email protected]

Read the article
What is a ‘best practice’ backup plan for a website?

- by HollerTrain

I have a website which is very large and has a large user-base. I am trying to think of a 'best practice' way to create a back up or mirror website, so if something happens on domain.com, I can quickly point the site to backup.domain.com via 401 redirect. This would give me time to troubleshoot domain.com while everyone is viewing backup.domain.com and not knowing the difference. Is my method the ideal method, or have you enacted better methods to creating a backup site? I don't want to have the site go down and then get yelled at every minute while I'm trying to fix it. Ideally I would just 'flip the switch' and it would redirect the user to a backup. Any insight would be greatly appreciated.

Read the article
What is a 'best practice' backup plan for a website?

- by HollerTrain

I have a website which is very large and has a large user-base. I am trying to think of a 'best practice' way to create a back up or mirror website, so if something happens on domain.com, I can quickly point the site to backup.domain.com via 401 redirect. This would give me time to troubleshoot domain.com while everyone is viewing backup.domain.com and not knowing the difference. Is my method the ideal method, or have you enacted better methods to creating a backup site? I don't want to have the site go down and then get yelled at every minute while I'm trying to fix it. Ideally I would just 'flip the switch' and it would redirect the user to a backup. Any insight would be greatly appreciated.

Read the article
How to store 250TB of data and develop a backup/recovery plan?

- by luccio

I'm really new to this topic, so big apology for stupid questions. I have a school project and I want to know how to store 250TB of data with life-cycle for 18 months. It means every record is stored for 18 months and after this period of time can be deleted. There are 2 issues: store data backup data Due to amount of data I will probably need to combine data tapes and hard drives. I'd like to have "fast" access to 3 month old data, so ~42TB on disk. I really don't know what RAID should I use, or is here any better solution than combining disk and data tapes? Thanks for any advice, article, anything. I'm getting lost.

Read the article
Troubleshooting Application Timeouts in SQL Server

- by Tara Kizer

I recently received the following email from a blog reader: "We are having an OLTP database instance, using SQL Server 2005 with little to moderate traffic (10-20 requests/min). There are also bulk imports that occur at regular intervals in this DB and the import duration ranges between 10secs to 1 min, depending on the data size. Intermittently (2-3 times in a week), we face an issue, where queries get timed out (default of 30 secs set in application). On analyzing, we found two stored procedures, having queries with multiple table joins inside them of taking a long time (5-10 mins) in getting executed, when ideally the execution duration ranges between 5-10 secs. Execution plan of the same displayed Clustered Index Scan happening instead of Clustered Index Seek. All required Indexes are found to be present and Index fragmentation is also minimal as we Rebuild Indexes regularly alongwith Updating Statistics. With no other alternate options occuring to us, we restarted SQL server and thereafter the performance was back on track. But sometimes it was still giving timeout errors for some hits and so we also restarted IIS and that stopped the problem as of now." Rather than respond directly to the blog reader, I thought it would be more interesting to share my thoughts on this issue in a blog. There are a few things that I can think of that could cause abnormal timeouts: Blocking Bad plan in cache Outdated statistics Hardware bottleneck To determine if blocking is the issue, we can easily run sp_who/sp_who2 or a query directly on sysprocesses (select * from master..sysprocesses where blocking <> 0). If blocking is present and consistent, then you'll need to determine whether or not to kill the parent blocking process. Killing a process will cause the transaction to rollback, so you need to proceed with caution. Killing the parent blocking process is only a temporary solution, so you'll need to do more thorough analysis to figure out why the blocking was present. You should look into missing indexes and perhaps consider changing the database's isolation level to READ_COMMITTED_SNAPSHOT. The blog reader mentions that the execution plan shows a clustered index scan when a clustered index seek is normal for the stored procedure. A clustered index scan might have been chosen either because that is what is in cache already or because of out of date statistics. The blog reader mentions that bulk imports occur at regular intervals, so outdated statistics is definitely something that could cause this issue. The blog reader may need to update statistics after imports are done if the imports are changing a lot of data (greater than 10%). If the statistics are good, then the query optimizer might have chosen to scan rather than seek in a previous execution because the scan was determined to be less costly due to the value of an input parameter. If this parameter value is rare, then its execution plan in cache is what we call a bad plan. You want the best plan in cache for the most frequent parameter values. If a bad plan is a recurring problem on your system, then you should consider rewriting the stored procedure. You might want to break up the code into multiple stored procedures so that each can have a different execution plan in cache. To remove a bad plan from cache, you can recompile the stored procedure. An alternative method is to run DBCC FREEPROCACHE which drops the procedure cache. It is better to recompile stored procedures rather than dropping the procedure cache as dropping the procedure cache affects all plans in cache rather than just the ones that were bad, so there will be a temporary performance penalty until the plans are loaded into cache again. To determine if there is a hardware bottleneck occurring such as slow I/O or high CPU utilization, you will need to run Performance Monitor on the database server. Hopefully you already have a baseline of the server so you know what is normal and what is not. Be on the lookout for I/O requests taking longer than 12 milliseconds and CPU utilization over 90%. The servers that I support typically are under 30% CPU utilization, but your baseline could be higher and be within a normal range. If restarting the SQL Server service fixes the problem, then the problem was most likely due to blocking or a bad plan in the procedure cache. Rather than restarting the SQL Server service, which causes downtime, the blog reader should instead analyze the above mentioned things. Proceed with caution when restarting the SQL Server service as all transactions that have not completed will be rolled back at startup. This crash recovery process could take longer than normal if there was a long-running transaction running when the service was stopped. Until the crash recovery process is completed on the database, it is unavailable to your applications. If restarting IIS fixes the problem, then the problem might not have been inside SQL Server. Prior to taking this step, you should do analysis of the above mentioned things. If you can think of other reasons why the blog reader is facing this issue a few times a week, I'd love to hear your thoughts via a blog comment.

Read the article
How does C#'s DateTime.Now affect query plan caching in SQL Server?

- by Bill Paetzke

Given: Let's say we have a stored procedure. It reports data back to a user on a webpage. The user can set a date range. If the user sets today's date as the "end date," which includes today's data, the web app passes DateTime.Now to the sql proc. Let's say that one user runs a report--5/1/2010 to now--over and over several times. On the webpage, the user sees "5/1/2010" to "5/4/2010." But the web app passes DateTime.Now to the sql proc as the end date. So, the end date in the proc will always be different, although the user is querying a similar date range. Assume the number of records in the table and number of users are large. So any performance gains matter. Hence the importance of the question. Question: Does passing DateTime.Now as a parameter to a proc prevent SQL Server from caching the query plan? If so, then is the web app missing out on huge performance gains? Possible Solution: I thought DateTime.Today.AddDays(1) would be a possible solution. It would allow the user to get the latest data and always pass the same end date to the sql proc--"5/5/2010" in this case. Please speak to this as well. Sample proc and execution (if that helps to understand): CREATE PROCEDURE GetFooData @StartDate datetime @EndDate datetime AS SELECT * FROM Foo WHERE LogDate >= @StartDate AND LogDate < @EndDate Here's a sample execution using DateTime.Now: EXEC GetFooData '2010-05-01', '2010-05-04 15:41:27' -- passed in DateTime.Now Here's a sample execution using DateTime.Today.AddDays(1) EXEC GetFooData '2010-05-01', '2010-05-05' -- passed in DateTime.Today.AddDays(1) The same data is returned for both procs, since the current time is: 2010-05-04 15:41:27.

Read the article
How does DateTime.Now affect query plan caching in SQL Server?

- by Bill Paetzke

Question: Does passing DateTime.Now as a parameter to a proc prevent SQL Server from caching the query plan? If so, then is the web app missing out on huge performance gains? Possible Solution: I thought DateTime.Today.AddDays(1) would be a possible solution. It would pass the same end-date to the sql proc (per day). And the user would still get the latest data. Please speak to this as well. Given Example: Let's say we have a stored procedure. It reports data back to a user on a webpage. The user can set a date range. If the user sets today's date as the "end date," which includes today's data, the web app passes DateTime.Now to the sql proc. Let's say that one user runs a report--5/1/2010 to now--over and over several times. On the webpage, the user sees 5/1/2010 to 5/4/2010. But the web app passes DateTime.Now to the sql proc as the end date. So, the end date in the proc will always be different, although the user is querying a similar date range. Assume the number of records in the table and number of users are large. So any performance gains matter. Hence the importance of the question. Example proc and execution (if that helps to understand): CREATE PROCEDURE GetFooData @StartDate datetime @EndDate datetime AS SELECT * FROM Foo WHERE LogDate >= @StartDate AND LogDate < @EndDate Here's a sample execution using DateTime.Now: EXEC GetFooData '2010-05-01', '2010-05-04 15:41:27' -- passed in DateTime.Now Here's a sample execution using DateTime.Today.AddDays(1) EXEC GetFooData '2010-05-01', '2010-05-05' -- passed in DateTime.Today.AddDays(1) The same data is returned for both procs, since the current time is: 2010-05-04 15:41:27.

Read the article
Evidence-Based-Scheduling - are estimations only as accurate as the work-plan they're based on?

- by Assaf Lavie

I've been using FogBugz's Evidence Based Scheduling (for the uninitiated, Joel explains) for a while now and there's an inherent problem I can't seem to work around. The system is good at telling me the probability that a given project will be delivered at some date, given the detailed list of tasks that comprise the project. However, it does not take into account the fact that during development additional tasks always pop up. Now, there's the garbage-can approach of creating a generic task/scheduled-item for "last minute hacks" or "integration tasks", or what have you, but that clearly goes against the idea of aggregating the estimates of many small cases. It's often the case that during the development stage of a project you realize that there's a whole area your planning didn't cover, because, well, that's the nature of developing stuff that hasn't been developed before. So now your ~3 month project may very well turn into a 6 month project, but not because your estimations were off (you could be the best estimator in the world, for those task the comprised your initial work plan); rather because you ended up adding a whole bunch of new tasks that weren't there to begin with. EBS doesn't help you with that. It could, theoretically (I guess). It could, perhaps, measure the amount of work you add to a project over time and take that into consideration when estimating the time remaining on a given project. Just a thought. In other words, EBS works on a task basis, but not on a project/release basis - but the latter is what's important. It's what your boss typically cares about - delivery date, not the time it takes to finish each task along the way, and not the time it would have taken, if your planning was perfect. So the question is (yes, there's a question here, don't close it): What's your methodology when it comes to using EBS in FogBugz and how do you solve the problem above, which seems to be a main cause of schedule delays and mispredictions? Edit Some more thoughts after reading a few answers: If it comes down to having to choose which delivery date you're comfortable presenting to your higher-ups by squinting at the delivery-probability graph and choosing 80%, or 95%, or 60% (based on what, exactly?) then we've resorted to plain old buffering/factoring of our estimates. In which case, couldn't we have skipped the meticulous case by case hour-sized estimation effort step? By forcing ourselves to break down tasks that take more than a day into smaller chunks of work haven't we just deluded ourselves into thinking our planning is as tight and thorough as it could be? People may be consistently bad estimators that do not even learn from their past mistakes. In that respect, having an EBS system is certainly better than not having one. But what can we do about the fact that we're not that good in planning as well? I'm not sure it's a problem that can be solved by a similar system. Our estimates are wrong because of tendencies to be overly optimistic/pessimistic about certain tasks, and because of neglect to account for systematic delays (e.g. sick days, major bug crisis) - and usually not because we lack knowledge about the work that needs to be done. Our planning, on the other hand, is often incomplete because we simply don't have enough knowledge in this early stage; and I don't see how an EBS-like system could fill that gap. So we're back to methodology. We need to find a way to accommodate bad or incomplete work plans that's better than voodoo-multiplication.

Read the article
Android passing an arraylist back to parent activity

- by Nicklas O

Hi there. I've been searching for a simple example of this with no luck. In my android application I have two activities: 1. The main activity which is launched at startup 2. A second activity which is launched by pressing a button on the main activty. When the second activity is finished (by pressing a button) I want it to send back an ArrayList of type MyObject to the main activity and close itself, which the main activity can then do whatever with it. How would I go about achieving this? I have been trying a few things but it is crashing my application when I start the second activity. When the user presses button to launch second activity: Intent i = new Intent(MainActivity.this, secondactivity.class); startActivityForResult(i, 1); The array which is bundled back after pressing a button on the second activity: Intent intent= getIntent(); Bundle b = new Bundle(); b.putParcelableArrayList("myarraylist", mylist); intent.putExtras(b); setResult(RESULT_OK, intent); finish(); And finally a listener on the main activity (although I'm not sure of 100% when this code launches...) protected void onActivityResult(int requestCode, int resultCode, Intent data) { super.onActivityResult(requestCode, resultCode, data); if(resultCode==RESULT_OK && requestCode==1){ Bundle extras = data.getExtras(); final ArrayList<MyObject> mylist = extras.getParcelableArrayList("myarraylist"); Toast.makeText(MainActivity.this, mylist.get(0).getName(), Toast.LENGTH_SHORT).show(); } } Any ideas where I am going wrong? The onActivityResult() seems to be crashing my application. EDIT: This is my class MyObject, its called plan and has a name and an id import android.os.Parcel; import android.os.Parcelable; public class Plan implements Parcelable{ private String name; private String id; public Plan(){ } public Plan(String name, String id){ this.name = name; this.id = id; } public String getName(){ return name; } public void setName(String name){ this.name = name; } public String getId(){ return id; } public void setId(String id){ this.id = id; } public String toString(){ return "Plan ID: " + id + " Plan Name: " + name; } @Override public int describeContents() { // TODO Auto-generated method stub return 0; } @Override public void writeToParcel(Parcel dest, int flags) { dest.writeString(id); dest.writeString(name); } public static final Parcelable.Creator<Plan> CREATOR = new Parcelable.Creator<Plan>() { public Plan createFromParcel(Parcel in) { return new Plan(); } @Override public Plan[] newArray(int size) { // TODO Auto-generated method stub return new Plan[size]; } }; } This is my logcat E/AndroidRuntime( 293): java.lang.RuntimeException: Unable to instantiate activ ity ComponentInfo{com.daniel.android.groupproject/com.me.android.projec t.secondactivity}: java.lang.NullPointerException E/AndroidRuntime( 293): at android.app.ActivityThread.performLaunchActiv ity(ActivityThread.java:2417) E/AndroidRuntime( 293): at android.app.ActivityThread.handleLaunchActivi ty(ActivityThread.java:2512) E/AndroidRuntime( 293): at android.app.ActivityThread.access$2200(Activi tyThread.java:119) E/AndroidRuntime( 293): at android.app.ActivityThread$H.handleMessage(Ac tivityThread.java:1863) E/AndroidRuntime( 293): at android.os.Handler.dispatchMessage(Handler.ja va:99) E/AndroidRuntime( 293): at android.os.Looper.loop(Looper.java:123) E/AndroidRuntime( 293): at android.app.ActivityThread.main(ActivityThrea d.java:4363) E/AndroidRuntime( 293): at java.lang.reflect.Method.invokeNative(Native Method) E/AndroidRuntime( 293): at java.lang.reflect.Method.invoke(Method.java:5 21) E/AndroidRuntime( 293): at com.android.internal.os.ZygoteInit$MethodAndA rgsCaller.run(ZygoteInit.java:860) E/AndroidRuntime( 293): at com.android.internal.os.ZygoteInit.main(Zygot eInit.java:618) E/AndroidRuntime( 293): at dalvik.system.NativeStart.main(Native Method) E/AndroidRuntime( 293): Caused by: java.lang.NullPointerException E/AndroidRuntime( 293): at com.daniel.android.groupproject.login.<init>( login.java:51) E/AndroidRuntime( 293): at java.lang.Class.newInstanceImpl(Native Method ) E/AndroidRuntime( 293): at java.lang.Class.newInstance(Class.java:1479) E/AndroidRuntime( 293): at android.app.Instrumentation.newActivity(Instr umentation.java:1021) E/AndroidRuntime( 293): at android.app.ActivityThread.performLaunchActiv ity(ActivityThread.java:2409) E/AndroidRuntime( 293): ... 11 more

Read the article
How to implement Restricted access to application features

- by DroidUser

I'm currently developing a web application, that provides some 'service' to the user. The user will have to select a 'plan' according to which she/he will be allowed to perform application specific actions but up to a limit defined by the plan. A Plan will also limit access to certain features, which will not be available at all for some plans. As an example : say there are 3 plans, 2 actions throughout the application users in plan-1 can perform action-1 3 times, and they can't perform action-2 at all users in plan-2 can perform action-1 10 times, action-2 5 times users in plan-3 can perform action-1 20 times, action-2 10 times So i'm looking for the best way to get this done, and my main concerns besides implementing it, are the following(in no particular order) maintainability/changeability : the number of plans, and type of features/actions will change in the final product industry standard/best practice : for future readiness!! efficiency : ofcourse, i want fast code!! I have never done anything like this before, so i have no clue about how do i go about implementing these functionalities. Any tips/guides/patterns/resources/examples? I did read a little about ACL, RBAC, are they the patterns that i need to follow? Really any sort of feedback will help.

Read the article
Partition Wise Joins II

- by jean-pierre.dijcks

One of the things that I did not talk about in the initial partition wise join post was the effect it has on resource allocation on the database server. When Oracle applies a different join method - e.g. not PWJ - what you will see in SQL Monitor (in Enterprise Manager) or in an Explain Plan is a set of producers and a set of consumers. The producers scan the tables in the the join. If there are two tables the producers first scan one table, then the other. The producers thus provide data to the consumers, and when the consumers have the data from both scans they do the join and give the data to the query coordinator. Now that behavior means that if you choose a degree of parallelism of 4 to run such query with, Oracle will allocate 8 parallel processes. Of these 8 processes 4 are producers and 4 are consumers. The consumers only actually do work once the producers are fully done with scanning both sides of the join. In the plan above you can see that the producers access table SALES [line 11] and then do a PX SEND [line 9]. That is the producer set of processes working. The consumers receive that data [line 8] and twiddle their thumbs while the producers go on and scan CUSTOMERS. The producers send that data to the consumer indicated by PX SEND [line 5]. After receiving that data [line 4] the consumers do the actual join [line 3] and give the data to the QC [line 2]. BTW, the myth that you see twice the number of processes due to the setting PARALLEL_THREADS_PER_CPU=2 is obviously not true. The above is why you will see 2 times the processes of the DOP. In a PWJ plan the consumers are not present. Instead of producing rows and giving those to different processes, a PWJ only uses a single set of processes. Each process reads its piece of the join across the two tables and performs the join. The plan here is notably different from the initial plan. First of all the hash join is done right on top of both table scans [line 8]. This query is a little more complex than the previous so there is a bit of noise above that bit of info, but for this post, lets ignore that (sort stuff). The important piece here is that the PWJ plan typically will be faster and from a PX process number / resources typically cheaper. You may want to look out for those plans and try to get those to appear a lot... CREDITS: credits for the plans and some of the info on the plans go to Maria, as she actually produced these plans and is the expert on plans in general... You can see her talk about explaining the explain plan and other optimizer stuff over here: ODTUG in Washington DC, June 27 - July 1 On the Optimizer blog At OpenWorld in San Francisco, September 19 - 23 Happy joining and hope to see you all at ODTUG and OOW...

Read the article
Cardinality Estimation Bug with Lookups in SQL Server 2008 onward

- by Paul White

Cost-based optimization stands or falls on the quality of cardinality estimates (expected row counts). If the optimizer has incorrect information to start with, it is quite unlikely to produce good quality execution plans except by chance. There are many ways we can provide good starting information to the optimizer, and even more ways for cardinality estimation to go wrong. Good database people know this, and work hard to write optimizer-friendly queries with a schema and metadata (e.g. statistics) that reduce the chances of poor cardinality estimation producing a sub-optimal plan. Today, I am going to look at a case where poor cardinality estimation is Microsoft’s fault, and not yours. SQL Server 2005 SELECT th.ProductID, th.TransactionID, th.TransactionDate FROM Production.TransactionHistory AS th WHERE th.ProductID = 1 AND th.TransactionDate BETWEEN '20030901' AND '20031231'; The query plan on SQL Server 2005 is as follows (if you are using a more recent version of AdventureWorks, you will need to change the year on the date range from 2003 to 2007): There is an Index Seek on ProductID = 1, followed by a Key Lookup to find the Transaction Date for each row, and finally a Filter to restrict the results to only those rows where Transaction Date falls in the range specified. The cardinality estimate of 45 rows at the Index Seek is exactly correct. The table is not very large, there are up-to-date statistics associated with the index, so this is as expected. The estimate for the Key Lookup is also exactly right. Each lookup into the Clustered Index to find the Transaction Date is guaranteed to return exactly one row. The plan shows that the Key Lookup is expected to be executed 45 times. The estimate for the Inner Join output is also correct – 45 rows from the seek joining to one row each time, gives 45 rows as output. The Filter estimate is also very good: the optimizer estimates 16.9951 rows will match the specified range of transaction dates. Eleven rows are produced by this query, but that small difference is quite normal and certainly nothing to worry about here. All good so far. SQL Server 2008 onward The same query executed against an identical copy of AdventureWorks on SQL Server 2008 produces a different execution plan: The optimizer has pushed the Filter conditions seen in the 2005 plan down to the Key Lookup. This is a good optimization – it makes sense to filter rows out as early as possible. Unfortunately, it has made a bit of a mess of the cardinality estimates. The post-Filter estimate of 16.9951 rows seen in the 2005 plan has moved with the predicate on Transaction Date. Instead of estimating one row, the plan now suggests that 16.9951 rows will be produced by each clustered index lookup – clearly not right! This misinformation also confuses SQL Sentry Plan Explorer: Plan Explorer shows 765 rows expected from the Key Lookup (it multiplies a rounded estimate of 17 rows by 45 expected executions to give 765 rows total). Workarounds One workaround is to provide a covering non-clustered index (avoiding the lookup avoids the problem of course): CREATE INDEX nc1 ON Production.TransactionHistory (ProductID) INCLUDE (TransactionDate); With the Transaction Date filter applied as a residual predicate in the same operator as the seek, the estimate is again as expected: We could also force the use of the ultimate covering index (the clustered one): SELECT th.ProductID, th.TransactionID, th.TransactionDate FROM Production.TransactionHistory AS th WITH (INDEX(1)) WHERE th.ProductID = 1 AND th.TransactionDate BETWEEN '20030901' AND '20031231'; Summary Providing a covering non-clustered index for all possible queries is not always practical, and scanning the clustered index will rarely be optimal. Nevertheless, these are the best workarounds we have today. In the meantime, watch out for poor cardinality estimates when a predicate is applied as part of a lookup. The worst thing is that the estimate after the lookup join in the 2008+ plans is wrong. It’s not hopelessly wrong in this particular case (45 versus 16.9951 is not the end of the world) but it easily can be much worse, and there’s not much you can do about it. Any decisions made by the optimizer after such a lookup could be based on very wrong information – which can only be bad news. If you think this situation should be improved, please vote for this Connect item. © 2012 Paul White – All Rights Reserved twitter: @SQL_Kiwi email: [email protected]

Read the article
sys.dm_exec_query_stats interaction with recompilation

- by Sam Saffron

We use sys.dm_exec_query_stats to track down slow queries and queries that are IO offenders. This works great, we get a lot of very insightful stats. It is clear this is not as accurate as running a profiler trace, as you have no idea when SQL Server will decide to chuck out a an execution plan. We have quite a few queries where the wrong execution plan is cached. For example queries like the following: SELECT TOP 30 a.Id FROM Posts a JOIN Posts q ON q.Id = a.ParentId JOIN PostTags pt ON q.Id = pt.PostId WHERE a.PostTypeId = 2 AND a.DeletionDate IS NULL AND a.CommunityOwnedDate IS NULL AND a.CreationDate @date AND LEN(a.Body) 300 AND pt.Tag = @tag AND a.Score 0 ORDER BY a.Score DESC The problem is that the ideal plan really depends on the date selected (screenshot of ideal plan): However if the wrong plan is cached, it totally chokes when the date range is big: (notice the big fat lines) To overcome this we were recommended to use either OPTION (OPTIMIZE FOR UNKNOWN) or OPTION (RECOMPILE) OPTIMIZE FOR UNKNOWN results in a slightly better plan, which is far from optimal. Executions are tracked in sys.dm_exec_query_stats. RECOMPILE results in the best plan being chosen, however no execution counts and stats are tracked in sys.dm_exec_query_stats. Is there another DMV we could use to track stats on queries with OPTION (RECOMPILE)? Is this behavior by-design? Is there another way we can for recompilation while keeping stats tracked in sys.dm_exec_query_stats? Note: the framework will always execute parameterized queries using sp_executesql

Read the article
?Oracle????SELECT????UNDO

- by Liu Maclean(???)

????????Oracle?????(dirty read),?Oracle??????Asktom????????Oracle???????, ???undo??????????(before image)??????Consistent, ???????????????Oracle????????????? ????????? ??,??,Oracle?????????????RDBMS,???????????? ?????????2?????: _offline_rollback_segments or _corrupted_rollback_segments ?2?????????Oracle???????????ORA-600[4XXX]???????????????,???2??????Undo??Corruption????????????,?????2????????????????? ??????????????_offline_rollback_segments ? _corrupted_rollback_segments ?2?????: ???????(FORCE OPEN DATABASE) ????????????(consistent read & delayed block cleanout) ??????rollback segment??? ?????:???????Oracle????????,??????????2?????,?????????????!! _offline_rollback_segments ? _corrupted_rollback_segments ???????????: ??2???????Undo Segments(???/???)????????online ?UNDO$???????????OFFLINE??? ???instance??????????????????? ??????Undo Segments????????active transaction????????????dead??SMON???(????????SMON??(?):Recover Dead transaction) _OFFLINE_ROLLBACK_SEGMENTS(offline undo segment list)????(hidden parameter)?????: ???startup???open database???????_OFFLINE_ROLLBACK_SEGMENTS????Undo segments(???/???),?????undo segments????????alert.log???TRACE?????,???????startup?? ?????????????,?ITL?????undo segments?: ???undo segments?transaction table?????????????????? ???????????commit,?????CR??? ????undo segments????(???corrupted??,???missed??)???????????alert.log,??????? ?DML?????????????????????????????????CPU,????????????????????? _CORRUPTED_ROLLBACK_SEGMENTS(corrupted undo segment list)??????????: ?????startup?open database???_CORRUPTED_ROLLBACK_SEGMENTS????undo segments(???/???)???????? ???????_CORRUPTED_ROLLBACK_SEGMENTS???undo segments????????????commit,???undo segments???drop??? ??????????? ??????????????????,?????????????????? ??bootstrap???????????,?????????ORA-00704: bootstrap process failure??,???????????(???Oracle????:??ORA-00600:[4000] ORA-00704: bootstrap process failure????) ??????_CORRUPTED_ROLLBACK_SEGMENTS????????????????????,??????????????? Oracle???????TXChecker??????????? ???????2?????,??????????????_CORRUPTED_ROLLBACK_SEGMENTS?????SELECT????UNDO???????: SQL> alter system set event= '10513 trace name context forever, level 2' scope=spfile; System altered. SQL> alter system set "_in_memory_undo"=false scope=spfile; System altered. 10513 level 2 event????SMON ??rollback ??? dead transaction _in_memory_undo ?? in memory undo ?? SQL> startup force; ORACLE instance started. Total System Global Area 3140026368 bytes Fixed Size 2232472 bytes Variable Size 1795166056 bytes Database Buffers 1325400064 bytes Redo Buffers 17227776 bytes Database mounted. Database opened. session A: SQL> conn maclean/maclean Connected. SQL> create table maclean tablespace users as select 1 t1 from dual connect by level exec dbms_stats.gather_table_stats('','MACLEAN'); PL/SQL procedure successfully completed. SQL> set autotrace on; SQL> select sum(t1) from maclean; SUM(T1) ---------- 501 Execution Plan ---------------------------------------------------------- Plan hash value: 1679547536 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 3 | | | | 2 | TABLE ACCESS FULL| MACLEAN | 501 | 1503 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------ Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 3 consistent gets 0 physical reads 0 redo size 515 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processe ???????????,????current block, ????????,consistent gets??3? SQL> update maclean set t1=0; 501 rows updated. SQL> alter system checkpoint; System altered. ??session A?commit; ???? session: SQL> conn maclean/maclean Connected. SQL> SQL> set autotrace on; SQL> select sum(t1) from maclean; SUM(T1) ---------- 501 Execution Plan ---------------------------------------------------------- Plan hash value: 1679547536 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 3 | | | | 2 | TABLE ACCESS FULL| MACLEAN | 501 | 1503 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------ Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 505 consistent gets 0 physical reads 108 redo size 515 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed ?????? ?????????undo??CR?,???consistent gets??? 505 [oracle@vrh8 ~]$ ps -ef|grep LOCAL=YES |grep -v grep oracle 5841 5839 0 09:17 ? 00:00:00 oracleG10R25 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq))) [oracle@vrh8 ~]$ kill -9 5841 ??session A???Server Process????,???dead transaction ????smon?? select ktuxeusn, to_char(sysdate, 'DD-MON-YYYY HH24:MI:SS') "Time", ktuxesiz, ktuxesta from x$ktuxe where ktuxecfl = 'DEAD'; KTUXEUSN Time KTUXESIZ KTUXESTA ---------- -------------------- ---------- ---------------- 2 06-AUG-2012 09:20:45 7 ACTIVE ???1?active rollback segment SQL> conn maclean/maclean Connected. SQL> set autotrace on; SQL> select sum(t1) from maclean; SUM(T1) ---------- 501 Execution Plan ---------------------------------------------------------- Plan hash value: 1679547536 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 3 | | | | 2 | TABLE ACCESS FULL| MACLEAN | 501 | 1503 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------ Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 411 consistent gets 0 physical reads 108 redo size 515 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed ????? ????kill?? ???smon ??dead transaction , ???????????? ?????undo??????? ????active?rollback segment??? SQL> select segment_name from dba_rollback_segs where segment_id=2; SEGMENT_NAME ------------------------------ _SYSSMU2$ SQL> alter system set "_corrupted_rollback_segments"='_SYSSMU2$' scope=spfile; System altered. ? _corrupted_rollback_segments ?? ???2?rollback segment, ????????undo SQL> startup force; ORACLE instance started. Total System Global Area 3140026368 bytes Fixed Size 2232472 bytes Variable Size 1795166056 bytes Database Buffers 1325400064 bytes Redo Buffers 17227776 bytes Database mounted. Database opened. SQL> conn maclean/maclean Connected. SQL> set autotrace on; SQL> select sum(t1) from maclean; SUM(T1) ---------- 94 Execution Plan ---------------------------------------------------------- Plan hash value: 1679547536 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 3 | | | | 2 | TABLE ACCESS FULL| MACLEAN | 501 | 1503 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------ Statistics ---------------------------------------------------------- 228 recursive calls 0 db block gets 29 consistent gets 5 physical reads 116 redo size 514 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 4 sorts (memory) 0 sorts (disk) 1 rows processed SQL> / SUM(T1) ---------- 94 Execution Plan ---------------------------------------------------------- Plan hash value: 1679547536 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 3 | | | | 2 | TABLE ACCESS FULL| MACLEAN | 501 | 1503 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------ Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 3 consistent gets 0 physical reads 0 redo size 514 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed ?????? consistent gets???3,?????????????????,??ITL???UNDO SEGMENTS?_corrupted_rollback_segments????,???????????COMMIT??,????UNDO? ???????,?????????????????????????(????????????????????),????????????????? ???? , ?????

Read the article
Is there a way to programmatically convert a SQL Server query plan to an image?

- by Ben Challenor

I'd like to be able to convert SQL Server query plans from XML to images. Ideally a vector format, but a bitmap would do. Is there an open source library to do this? Or can I use one of the SQL Server Management Studio DLLs? Thanks.

Read the article
When a professional should plan to leave a job ?

- by Indigo Praveen

Hi All, I don't know whether this should be asked or not but I think it happens with every programmer in his/her career. The question is when should someone start for looking another job. Some guys remain in one company for 10-15-20 years, mostlay in Europe. But if we see the trend in India guys are changing their jobs only in 1-2 years. If it's happening in India then there must be something behind it. So, I want to know the impacts on someone's career of changing jobs frequently. Please share your experiences.

Read the article
How much detail should be in a project plan or spec?

- by DeanMc

I have an issue that I feel many programmers can relate to... I have worked on many small scale projects. After my initial paper brain storm I tend to start coding. What I come up with is usually a rough working model of the actual application. I design in a disconnected fashion so I am talking about underlying code libraries, user interfaces are the last thing as the library usually dictates what is needed in the UI. As my projects get bigger I worry that so should my "spec" or design document. The above paragraph, from my investigations, is echoed all across the internet in one fashion or another. When a UI is concerned there is a bit more information but it is UI specific and does not relate to code libraries. What I am beginning to realise is that maybe code is code is code. It seems from my extensive research that there is no 1:1 mapping between a design document and the code. When I need to research a topic I dump information into OneNote and from there I prioritise features into versions and then into related chunks so that development runs in a fairly linear fashion, my tasks tend to look like so: Implement Binary File Reader Implement Binary File Writer Create Object to encapsulate Data for expression to the caller Now any programmer worth his salt is aware that between those three to do items could be a potential wall of code that could expand out to multiple files. I have tried to map the complete code process for each task but I simply don't think it can be done effectively. By the time one mangles pseudo code it is essentially code anyway so the time investment is negated. So my question is this: Am I right in assuming that the best documentation is the code itself. We are all in agreement that a high level overview is needed. How high should this be? Do you design to statement, class or concept level? What works for you?

Read the article
Where can I find a good software implementation plan template?

- by Corpsekicker

This is not "programming" related as much as it is "software engineering" related. I am required to produce an implementation for additional functionality to a complete system. All I am armed with is knowledge of the existing architecture and a functional spec with visual requirements, user stories and use cases. Is there a standardised way to go about this? I suck at documentation.

Read the article

< Previous Page | 7 8 9 10 11 12 13 14 15 16 17 18 | Next Page >