Search Results

Search found 8441 results on 338 pages for 'paul de goede'.

Page 149/338 | < Previous Page | 145 146 147 148 149 150 151 152 153 154 155 156  | Next Page >

  • When is a Seek not a Seek?

    - by Paul White
    The following script creates a single-column clustered table containing the integers from 1 to 1,000 inclusive. IF OBJECT_ID(N'tempdb..#Test', N'U') IS NOT NULL DROP TABLE #Test ; GO CREATE TABLE #Test ( id INTEGER PRIMARY KEY CLUSTERED ); ; INSERT #Test (id) SELECT V.number FROM master.dbo.spt_values AS V WHERE V.[type] = N'P' AND V.number BETWEEN 1 AND 1000 ; Let’s say we need to find the rows with values from 100 to 170, excluding any values that divide exactly by 10.  One way to write that query would be: SELECT T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; That query produces a pretty efficient-looking query plan: Knowing that the source column is defined as an INTEGER, we could also express the query this way: SELECT T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; We get a similar-looking plan: If you look closely, you might notice that the line connecting the two icons is a little thinner than before.  The first query is estimated to produce 61.9167 rows – very close to the 63 rows we know the query will return.  The second query presents a tougher challenge for SQL Server because it doesn’t know how to predict the selectivity of the modulo expression (T.id % 10 > 0).  Without that last line, the second query is estimated to produce 68.1667 rows – a slight overestimate.  Adding the opaque modulo expression results in SQL Server guessing at the selectivity.  As you may know, the selectivity guess for a greater-than operation is 30%, so the final estimate is 30% of 68.1667, which comes to 20.45 rows. The second difference is that the Clustered Index Seek is costed at 99% of the estimated total for the statement.  For some reason, the final SELECT operator is assigned a small cost of 0.0000484 units; I have absolutely no idea why this is so, or what it models.  Nevertheless, we can compare the total cost for both queries: the first one comes in at 0.0033501 units, and the second at 0.0034054.  The important point is that the second query is costed very slightly higher than the first, even though it is expected to produce many fewer rows (20.45 versus 61.9167). If you run the two queries, they produce exactly the same results, and both complete so quickly that it is impossible to measure CPU usage for a single execution.  We can, however, compare the I/O statistics for a single run by running the queries with STATISTICS IO ON: Table '#Test'. Scan count 63, logical reads 126, physical reads 0. Table '#Test'. Scan count 01, logical reads 002, physical reads 0. The query with the IN list uses 126 logical reads (and has a ‘scan count’ of 63), while the second query form completes with just 2 logical reads (and a ‘scan count’ of 1).  It is no coincidence that 126 = 63 * 2, by the way.  It is almost as if the first query is doing 63 seeks, compared to one for the second query. In fact, that is exactly what it is doing.  There is no indication of this in the graphical plan, or the tool-tip that appears when you hover your mouse over the Clustered Index Seek icon.  To see the 63 seek operations, you have click on the Seek icon and look in the Properties window (press F4, or right-click and choose from the menu): The Seek Predicates list shows a total of 63 seek operations – one for each of the values from the IN list contained in the first query.  I have expanded the first seek node to show the details; it is seeking down the clustered index to find the entry with the value 101.  Each of the other 62 nodes expands similarly, and the same information is contained (even more verbosely) in the XML form of the plan. Each of the 63 seek operations starts at the root of the clustered index B-tree and navigates down to the leaf page that contains the sought key value.  Our table is just large enough to need a separate root page, so each seek incurs 2 logical reads (one for the root, and one for the leaf).  We can see the index depth using the INDEXPROPERTY function, or by using the a DMV: SELECT S.index_type_desc, S.index_depth FROM sys.dm_db_index_physical_stats ( DB_ID(N'tempdb'), OBJECT_ID(N'tempdb..#Test', N'U'), 1, 1, DEFAULT ) AS S ; Let’s look now at the Properties window when the Clustered Index Seek from the second query is selected: There is just one seek operation, which starts at the root of the index and navigates the B-tree looking for the first key that matches the Start range condition (id >= 101).  It then continues to read records at the leaf level of the index (following links between leaf-level pages if necessary) until it finds a row that does not meet the End range condition (id <= 169).  Every row that meets the seek range condition is also tested against the Residual Predicate highlighted above (id % 10 > 0), and is only returned if it matches that as well. You will not be surprised that the single seek (with a range scan and residual predicate) is much more efficient than 63 singleton seeks.  It is not 63 times more efficient (as the logical reads comparison would suggest), but it is around three times faster.  Let’s run both query forms 10,000 times and measure the elapsed time: DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON; SET STATISTICS XML OFF; ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; GO DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; On my laptop, running SQL Server 2008 build 4272 (SP2 CU2), the IN form of the query takes around 830ms and the range query about 300ms.  The main point of this post is not performance, however – it is meant as an introduction to the next few parts in this mini-series that will continue to explore scans and seeks in detail. When is a seek not a seek?  When it is 63 seeks © Paul White 2011 email: [email protected] twitter: @SQL_kiwi

    Read the article

  • Fun with Aggregates

    - by Paul White
    There are interesting things to be learned from even the simplest queries.  For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join.  The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units.  The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too?  To answer that, let’s look at some other ways to formulate this query.  This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones.  The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above.  It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join!  The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join.  In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting.  The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set.  In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL).  To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709;   -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case.  The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate.  This is something I would like to see exposed in show plan so I suggested it on Connect.  Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all.  Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans.  Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :)  The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier.  What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too).  Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places.  It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates.  As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

    Read the article

  • Heaps of Trouble?

    - by Paul White NZ
    If you’re not already a regular reader of Brad Schulz’s blog, you’re missing out on some great material.  In his latest entry, he is tasked with optimizing a query run against tables that have no indexes at all.  The problem is, predictably, that performance is not very good.  The catch is that we are not allowed to create any indexes (or even new statistics) as part of our optimization efforts. In this post, I’m going to look at the problem from a slightly different angle, and present an alternative solution to the one Brad found.  Inevitably, there’s going to be some overlap between our entries, and while you don’t necessarily need to read Brad’s post before this one, I do strongly recommend that you read it at some stage; he covers some important points that I won’t cover again here. The Example We’ll use data from the AdventureWorks database, copied to temporary unindexed tables.  A script to create these structures is shown below: CREATE TABLE #Custs ( CustomerID INTEGER NOT NULL, TerritoryID INTEGER NULL, CustomerType NCHAR(1) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #Prods ( ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, Name NVARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #OrdHeader ( SalesOrderID INTEGER NOT NULL, OrderDate DATETIME NOT NULL, SalesOrderNumber NVARCHAR(25) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, CustomerID INTEGER NOT NULL, ); GO CREATE TABLE #OrdDetail ( SalesOrderID INTEGER NOT NULL, OrderQty SMALLINT NOT NULL, LineTotal NUMERIC(38,6) NOT NULL, ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, ); GO INSERT #Custs ( CustomerID, TerritoryID, CustomerType ) SELECT C.CustomerID, C.TerritoryID, C.CustomerType FROM AdventureWorks.Sales.Customer C WITH (TABLOCK); GO INSERT #Prods ( ProductMainID, ProductSubID, ProductSubSubID, Name ) SELECT P.ProductID, P.ProductID, P.ProductID, P.Name FROM AdventureWorks.Production.Product P WITH (TABLOCK); GO INSERT #OrdHeader ( SalesOrderID, OrderDate, SalesOrderNumber, CustomerID ) SELECT H.SalesOrderID, H.OrderDate, H.SalesOrderNumber, H.CustomerID FROM AdventureWorks.Sales.SalesOrderHeader H WITH (TABLOCK); GO INSERT #OrdDetail ( SalesOrderID, OrderQty, LineTotal, ProductMainID, ProductSubID, ProductSubSubID ) SELECT D.SalesOrderID, D.OrderQty, D.LineTotal, D.ProductID, D.ProductID, D.ProductID FROM AdventureWorks.Sales.SalesOrderDetail D WITH (TABLOCK); The query itself is a simple join of the four tables: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #OrdDetail D ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID JOIN #OrdHeader H ON D.SalesOrderID = H.SalesOrderID JOIN #Custs C ON H.CustomerID = C.CustomerID ORDER BY P.ProductMainID ASC OPTION (RECOMPILE, MAXDOP 1); Remember that these tables have no indexes at all, and only the single-column sampled statistics SQL Server automatically creates (assuming default settings).  The estimated query plan produced for the test query looks like this (click to enlarge): The Problem The problem here is one of cardinality estimation – the number of rows SQL Server expects to find at each step of the plan.  The lack of indexes and useful statistical information means that SQL Server does not have the information it needs to make a good estimate.  Every join in the plan shown above estimates that it will produce just a single row as output.  Brad covers the factors that lead to the low estimates in his post. In reality, the join between the #Prods and #OrdDetail tables will produce 121,317 rows.  It should not surprise you that this has rather dire consequences for the remainder of the query plan.  In particular, it makes a nonsense of the optimizer’s decision to use Nested Loops to join to the two remaining tables.  Instead of scanning the #OrdHeader and #Custs tables once (as it expected), it has to perform 121,317 full scans of each.  The query takes somewhere in the region of twenty minutes to run to completion on my development machine. A Solution At this point, you may be thinking the same thing I was: if we really are stuck with no indexes, the best we can do is to use hash joins everywhere. We can force the exclusive use of hash joins in several ways, the two most common being join and query hints.  A join hint means writing the query using the INNER HASH JOIN syntax; using a query hint involves adding OPTION (HASH JOIN) at the bottom of the query.  The difference is that using join hints also forces the order of the join, whereas the query hint gives the optimizer freedom to reorder the joins at its discretion. Adding the OPTION (HASH JOIN) hint results in this estimated plan: That produces the correct output in around seven seconds, which is quite an improvement!  As a purely practical matter, and given the rigid rules of the environment we find ourselves in, we might leave things there.  (We can improve the hashing solution a bit – I’ll come back to that later on). Faster Nested Loops It might surprise you to hear that we can beat the performance of the hash join solution shown above using nested loops joins exclusively, and without breaking the rules we have been set. The key to this part is to realize that a condition like (A = B) can be expressed as (A <= B) AND (A >= B).  Armed with this tremendous new insight, we can rewrite the join predicates like so: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #OrdDetail D JOIN #OrdHeader H ON D.SalesOrderID >= H.SalesOrderID AND D.SalesOrderID <= H.SalesOrderID JOIN #Custs C ON H.CustomerID >= C.CustomerID AND H.CustomerID <= C.CustomerID JOIN #Prods P ON P.ProductMainID >= D.ProductMainID AND P.ProductMainID <= D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (RECOMPILE, LOOP JOIN, MAXDOP 1, FORCE ORDER); I’ve also added LOOP JOIN and FORCE ORDER query hints to ensure that only nested loops joins are used, and that the tables are joined in the order they appear.  The new estimated execution plan is: This new query runs in under 2 seconds. Why Is It Faster? The main reason for the improvement is the appearance of the eager Index Spools, which are also known as index-on-the-fly spools.  If you read my Inside The Optimiser series you might be interested to know that the rule responsible is called JoinToIndexOnTheFly. An eager index spool consumes all rows from the table it sits above, and builds a index suitable for the join to seek on.  Taking the index spool above the #Custs table as an example, it reads all the CustomerID and TerritoryID values with a single scan of the table, and builds an index keyed on CustomerID.  The term ‘eager’ means that the spool consumes all of its input rows when it starts up.  The index is built in a work table in tempdb, has no associated statistics, and only exists until the query finishes executing. The result is that each unindexed table is only scanned once, and just for the columns necessary to build the temporary index.  From that point on, every execution of the inner side of the join is answered by a seek on the temporary index – not the base table. A second optimization is that the sort on ProductMainID (required by the ORDER BY clause) is performed early, on just the rows coming from the #OrdDetail table.  The optimizer has a good estimate for the number of rows it needs to sort at that stage – it is just the cardinality of the table itself.  The accuracy of the estimate there is important because it helps determine the memory grant given to the sort operation.  Nested loops join preserves the order of rows on its outer input, so sorting early is safe.  (Hash joins do not preserve order in this way, of course). The extra lazy spool on the #Prods branch is a further optimization that avoids executing the seek on the temporary index if the value being joined (the ‘outer reference’) hasn’t changed from the last row received on the outer input.  It takes advantage of the fact that rows are still sorted on ProductMainID, so if duplicates exist, they will arrive at the join operator one after the other. The optimizer is quite conservative about introducing index spools into a plan, because creating and dropping a temporary index is a relatively expensive operation.  It’s presence in a plan is often an indication that a useful index is missing. I want to stress that I rewrote the query in this way primarily as an educational exercise – I can’t imagine having to do something so horrible to a production system. Improving the Hash Join I promised I would return to the solution that uses hash joins.  You might be puzzled that SQL Server can create three new indexes (and perform all those nested loops iterations) faster than it can perform three hash joins.  The answer, again, is down to the poor information available to the optimizer.  Let’s look at the hash join plan again: Two of the hash joins have single-row estimates on their build inputs.  SQL Server fixes the amount of memory available for the hash table based on this cardinality estimate, so at run time the hash join very quickly runs out of memory. This results in the join spilling hash buckets to disk, and any rows from the probe input that hash to the spilled buckets also get written to disk.  The join process then continues, and may again run out of memory.  This is a recursive process, which may eventually result in SQL Server resorting to a bailout join algorithm, which is guaranteed to complete eventually, but may be very slow.  The data sizes in the example tables are not large enough to force a hash bailout, but it does result in multiple levels of hash recursion.  You can see this for yourself by tracing the Hash Warning event using the Profiler tool. The final sort in the plan also suffers from a similar problem: it receives very little memory and has to perform multiple sort passes, saving intermediate runs to disk (the Sort Warnings Profiler event can be used to confirm this).  Notice also that because hash joins don’t preserve sort order, the sort cannot be pushed down the plan toward the #OrdDetail table, as in the nested loops plan. Ok, so now we understand the problems, what can we do to fix it?  We can address the hash spilling by forcing a different order for the joins: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #Custs C JOIN #OrdHeader H ON H.CustomerID = C.CustomerID JOIN #OrdDetail D ON D.SalesOrderID = H.SalesOrderID ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (MAXDOP 1, HASH JOIN, FORCE ORDER); With this plan, each of the inputs to the hash joins has a good estimate, and no hash recursion occurs.  The final sort still suffers from the one-row estimate problem, and we get a single-pass sort warning as it writes rows to disk.  Even so, the query runs to completion in three or four seconds.  That’s around half the time of the previous hashing solution, but still not as fast as the nested loops trickery. Final Thoughts SQL Server’s optimizer makes cost-based decisions, so it is vital to provide it with accurate information.  We can’t really blame the performance problems highlighted here on anything other than the decision to use completely unindexed tables, and not to allow the creation of additional statistics. I should probably stress that the nested loops solution shown above is not one I would normally contemplate in the real world.  It’s there primarily for its educational and entertainment value.  I might perhaps use it to demonstrate to the sceptical that SQL Server itself is crying out for an index. Be sure to read Brad’s original post for more details.  My grateful thanks to him for granting permission to reuse some of his material. Paul White Email: [email protected] Twitter: @PaulWhiteNZ

    Read the article

  • How To Clear An Alert - Part 2

    - by werner.de.gruyter
    There were some interesting comments and remarks on the original posting, so I decided to do a follow-up and address some of the issues that got raised... Handling Metric Errors First of all, there is a significant difference between an 'error' and an 'alert'. An 'alert' is the violation of a condition (a threshold) specified for a given metric. That means that the Agent is collecting and gathering the data for the metric, but there is a situation that requires the attention of an administrator. An 'error' on the other hand however, is a failure to collect metric data: The Agent is throwing the error because it cannot determine the value for the metric Whereas the 'alert' guarantees continuity of the metric data, an 'error' signals a big unknown. And the unknown aspect of all this is what makes an error a lot more serious than a regular alert: If you don't know what the current state of affairs is, there could be some serious issues brewing that nobody is aware of... The life-cycle of a Metric Error Clearing a metric error is pretty much the same workflow as a metric 'alert': The Agent signals the error after it failed to execute the metric The error is uploaded to the OMS/repository, where it becomes visible in the Console The error will remain active until the Agent is able to execute the metric successfully. Even though the metric is still getting scheduled and executed on a regular basis, the error will remain outstanding as long as the Agent is not capable of executing the metric correctly Knowing this, the way to fix the metric error should be obvious: Take the 'problem' away, and as soon as the metric is executed again (based on the frequency of the metric), the error will go away. The same tricks used to clear alerts can be used here too: Wait for the next scheduled execution. For those metrics that are executed regularly (like every 15 minutes or so), it's just a matter of waiting those minutes to see the updates. The 'Reevaluate Alert' button can be used to force a re-execution of the metric. In case a metric is executed once a day, this will be a better way to make sure that the underlying problem has been solved. And if it has been, the metric error will be removed, and the regular data points will be uploaded to the repository. And just in case you have to 'force' the issue a little: If you disable and re-enable a metric, it will get re-scheduled. And that means a new metric execution, and an update of the (hopefully) fixed problem. Database server-generated alerts and problem checkers There are various ways the Agent can collect metric data: Via a script or a SQL statement, reading a log file, getting a value from an SNMP OID or listening for SNMP traps or via the DBMS_SERVER_ALERTS mechanism of an Oracle database. For those alert which are generated by the database (like tablespace metrics for 10g and above databases), the Agent just 'waits' for the database to report any new findings. If the Agent has lost the current state of the server-side metrics (due to an incomplete recovery after a disaster, or after an improper use of the 'emctl clearstate' command), the Agent might be still aware of an alert that the database no longer has (or vice versa). The same goes for 'problem checker' alerts: Those metrics that only report data if there is a problem (like the 'invalid objects' metric) will also have a problem if the Agent state has been tampered with (again, the incomplete recovery, and after improper use of 'emctl clearstate' are the two main causes for this). The best way to deal with these kinds of mismatches, is to simple disable and re-enable the metric again: The disabling will clear the state of the metric, and the re-enabling will force a re-execution of the metric, so the new and updated results can get uploaded to the repository. Starting 10gR5, the Agent performs additional checks and verifications after each restart of the Agent and/or each state change of the database (shutdown/startup or failover in case of DataGuard) to catch these kinds of mismatches.

    Read the article

  • So…is it a Seek or a Scan?

    - by Paul White
    You’re probably most familiar with the terms ‘Seek’ and ‘Scan’ from the graphical plans produced by SQL Server Management Studio (SSMS).  The image to the left shows the most common ones, with the three types of scan at the top, followed by four types of seek.  You might look to the SSMS tool-tip descriptions to explain the differences between them: Not hugely helpful are they?  Both mention scans and ranges (nothing about seeks) and the Index Seek description implies that it will not scan the index entirely (which isn’t necessarily true). Recall also yesterday’s post where we saw two Clustered Index Seek operations doing very different things.  The first Seek performed 63 single-row seeking operations; and the second performed a ‘Range Scan’ (more on those later in this post).  I hope you agree that those were two very different operations, and perhaps you are wondering why there aren’t different graphical plan icons for Range Scans and Seeks?  I have often wondered about that, and the first person to mention it after yesterday’s post was Erin Stellato (twitter | blog): Before we go on to make sense of all this, let’s look at another example of how SQL Server confusingly mixes the terms ‘Scan’ and ‘Seek’ in different contexts.  The diagram below shows a very simple heap table with two columns, one of which is the non-clustered Primary Key, and the other has a non-unique non-clustered index defined on it.  The right hand side of the diagram shows a simple query, it’s associated query plan, and a couple of extracts from the SSMS tool-tip and Properties windows. Notice the ‘scan direction’ entry in the Properties window snippet.  Is this a seek or a scan?  The different references to Scans and Seeks are even more pronounced in the XML plan output that the graphical plan is based on.  This fragment is what lies behind the single Index Seek icon shown above: You’ll find the same confusing references to Seeks and Scans throughout the product and its documentation. Making Sense of Seeks Let’s forget all about scans for a moment, and think purely about seeks.  Loosely speaking, a seek is the process of navigating an index B-tree to find a particular index record, most often at the leaf level.  A seek starts at the root and navigates down through the levels of the index to find the point of interest: Singleton Lookups The simplest sort of seek predicate performs this traversal to find (at most) a single record.  This is the case when we search for a single value using a unique index and an equality predicate.  It should be readily apparent that this type of search will either find one record, or none at all.  This operation is known as a singleton lookup.  Given the example table from before, the following query is an example of a singleton lookup seek: Sadly, there’s nothing in the graphical plan or XML output to show that this is a singleton lookup – you have to infer it from the fact that this is a single-value equality seek on a unique index.  The other common examples of a singleton lookup are bookmark lookups – both the RID and Key Lookup forms are singleton lookups (an RID lookup finds a single record in a heap from the unique row locator, and a Key Lookup does much the same thing on a clustered table).  If you happen to run your query with STATISTICS IO ON, you will notice that ‘Scan Count’ is always zero for a singleton lookup. Range Scans The other type of seek predicate is a ‘seek plus range scan’, which I will refer to simply as a range scan.  The seek operation makes an initial descent into the index structure to find the first leaf row that qualifies, and then performs a range scan (either backwards or forwards in the index) until it reaches the end of the scan range. The ability of a range scan to proceed in either direction comes about because index pages at the same level are connected by a doubly-linked list – each page has a pointer to the previous page (in logical key order) as well as a pointer to the following page.  The doubly-linked list is represented by the green and red dotted arrows in the index diagram presented earlier.  One subtle (but important) point is that the notion of a ‘forward’ or ‘backward’ scan applies to the logical key order defined when the index was built.  In the present case, the non-clustered primary key index was created as follows: CREATE TABLE dbo.Example ( key_col INTEGER NOT NULL, data INTEGER NOT NULL, CONSTRAINT [PK dbo.Example key_col] PRIMARY KEY NONCLUSTERED (key_col ASC) ) ; Notice that the primary key index specifies an ascending sort order for the single key column.  This means that a forward scan of the index will retrieve keys in ascending order, while a backward scan would retrieve keys in descending key order.  If the index had been created instead on key_col DESC, a forward scan would retrieve keys in descending order, and a backward scan would return keys in ascending order. A range scan seek predicate may have a Start condition, an End condition, or both.  Where one is missing, the scan starts (or ends) at one extreme end of the index, depending on the scan direction.  Some examples might help clarify that: the following diagram shows four queries, each of which performs a single seek against a column holding every integer from 1 to 100 inclusive.  The results from each query are shown in the blue columns, and relevant attributes from the Properties window appear on the right: Query 1 specifies that all key_col values less than 5 should be returned in ascending order.  The query plan achieves this by seeking to the start of the index leaf (there is no explicit starting value) and scanning forward until the End condition (key_col < 5) is no longer satisfied (SQL Server knows it can stop looking as soon as it finds a key_col value that isn’t less than 5 because all later index entries are guaranteed to sort higher). Query 2 asks for key_col values greater than 95, in descending order.  SQL Server returns these results by seeking to the end of the index, and scanning backwards (in descending key order) until it comes across a row that isn’t greater than 95.  Sharp-eyed readers may notice that the end-of-scan condition is shown as a Start range value.  This is a bug in the XML show plan which bubbles up to the Properties window – when a backward scan is performed, the roles of the Start and End values are reversed, but the plan does not reflect that.  Oh well. Query 3 looks for key_col values that are greater than or equal to 10, and less than 15, in ascending order.  This time, SQL Server seeks to the first index record that matches the Start condition (key_col >= 10) and then scans forward through the leaf pages until the End condition (key_col < 15) is no longer met. Query 4 performs much the same sort of operation as Query 3, but requests the output in descending order.  Again, we have to mentally reverse the Start and End conditions because of the bug, but otherwise the process is the same as always: SQL Server finds the highest-sorting record that meets the condition ‘key_col < 25’ and scans backward until ‘key_col >= 20’ is no longer true. One final point to note: seek operations always have the Ordered: True attribute.  This means that the operator always produces rows in a sorted order, either ascending or descending depending on how the index was defined, and whether the scan part of the operation is forward or backward.  You cannot rely on this sort order in your queries of course (you must always specify an ORDER BY clause if order is important) but SQL Server can make use of the sort order internally.  In the four queries above, the query optimizer was able to avoid an explicit Sort operator to honour the ORDER BY clause, for example. Multiple Seek Predicates As we saw yesterday, a single index seek plan operator can contain one or more seek predicates.  These seek predicates can either be all singleton seeks or all range scans – SQL Server does not mix them.  For example, you might expect the following query to contain two seek predicates, a singleton seek to find the single record in the unique index where key_col = 10, and a range scan to find the key_col values between 15 and 20: SELECT key_col FROM dbo.Example WHERE key_col = 10 OR key_col BETWEEN 15 AND 20 ORDER BY key_col ASC ; In fact, SQL Server transforms the singleton seek (key_col = 10) to the equivalent range scan, Start:[key_col >= 10], End:[key_col <= 10].  This allows both range scans to be evaluated by a single seek operator.  To be clear, this query results in two range scans: one from 10 to 10, and one from 15 to 20. Final Thoughts That’s it for today – tomorrow we’ll look at monitoring singleton lookups and range scans, and I’ll show you a seek on a heap table. Yes, a seek.  On a heap.  Not an index! If you would like to run the queries in this post for yourself, there’s a script below.  Thanks for reading! IF OBJECT_ID(N'dbo.Example', N'U') IS NOT NULL BEGIN DROP TABLE dbo.Example; END ; -- Test table is a heap -- Non-clustered primary key on 'key_col' CREATE TABLE dbo.Example ( key_col INTEGER NOT NULL, data INTEGER NOT NULL, CONSTRAINT [PK dbo.Example key_col] PRIMARY KEY NONCLUSTERED (key_col) ) ; -- Non-unique non-clustered index on the 'data' column CREATE NONCLUSTERED INDEX [IX dbo.Example data] ON dbo.Example (data) ; -- Add 100 rows INSERT dbo.Example WITH (TABLOCKX) ( key_col, data ) SELECT key_col = V.number, data = V.number FROM master.dbo.spt_values AS V WHERE V.[type] = N'P' AND V.number BETWEEN 1 AND 100 ; -- ================ -- Singleton lookup -- ================ ; -- Single value equality seek in a unique index -- Scan count = 0 when STATISTIS IO is ON -- Check the XML SHOWPLAN SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col = 32 ; -- =========== -- Range Scans -- =========== ; -- Query 1 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col <= 5 ORDER BY E.key_col ASC ; -- Query 2 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col > 95 ORDER BY E.key_col DESC ; -- Query 3 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col >= 10 AND E.key_col < 15 ORDER BY E.key_col ASC ; -- Query 4 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col >= 20 AND E.key_col < 25 ORDER BY E.key_col DESC ; -- Final query (singleton + range = 2 range scans) SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col = 10 OR E.key_col BETWEEN 15 AND 20 ORDER BY E.key_col ASC ; -- === TIDY UP === DROP TABLE dbo.Example; © 2011 Paul White email: [email protected] twitter: @SQL_Kiwi

    Read the article

  • index.html redirecting to cgi-sys/defaultwebpage.cgi

    - by Andrew De Forest
    This problem has been persisting over the last couple of days. I thought I fixed it Friday only to get in on Monday morning and see that cPanel is still giving me issues. On Friday, all incoming traffic to my index.html page were being redirected to cgi-sys/defaultwebpage.cgi. Upon further investigation, I found that my entire index.html code had been overwritten and contained only a single meta tag, which was causing the aforementioned redirect. I re-uploaded the original index page and overwrote the one that was causing the redirects and it seemed to have fixed the problem. Fast forward to Monday. The problem began happening again and I took the same steps as before to fix it, but now I'm wondering how to permanently stop this from happening. From what I've found, it's related to cPanel. I personally did not make any changes to the server, but there are a few people with access who might have. I'm on a VPS with my own dedicated IP address. I've been hosting fine for a few months without this problem. The server is runs a traditional LAMP stack.

    Read the article

  • Hello Operator, My Switch Is Bored

    - by Paul White
    This is a post for T-SQL Tuesday #43 hosted by my good friend Rob Farley. The topic this month is Plan Operators. I haven’t taken part in T-SQL Tuesday before, but I do like to write about execution plans, so this seemed like a good time to start. This post is in two parts. The first part is primarily an excuse to use a pretty bad play on words in the title of this blog post (if you’re too young to know what a telephone operator or a switchboard is, I hate you). The second part of the post looks at an invisible query plan operator (so to speak). 1. My Switch Is Bored Allow me to present the rare and interesting execution plan operator, Switch: Books Online has this to say about Switch: Following that description, I had a go at producing a Fast Forward Cursor plan that used the TOP operator, but had no luck. That may be due to my lack of skill with cursors, I’m not too sure. The only application of Switch in SQL Server 2012 that I am familiar with requires a local partitioned view: CREATE TABLE dbo.T1 (c1 int NOT NULL CHECK (c1 BETWEEN 00 AND 24)); CREATE TABLE dbo.T2 (c1 int NOT NULL CHECK (c1 BETWEEN 25 AND 49)); CREATE TABLE dbo.T3 (c1 int NOT NULL CHECK (c1 BETWEEN 50 AND 74)); CREATE TABLE dbo.T4 (c1 int NOT NULL CHECK (c1 BETWEEN 75 AND 99)); GO CREATE VIEW V1 AS SELECT c1 FROM dbo.T1 UNION ALL SELECT c1 FROM dbo.T2 UNION ALL SELECT c1 FROM dbo.T3 UNION ALL SELECT c1 FROM dbo.T4; Not only that, but it needs an updatable local partitioned view. We’ll need some primary keys to meet that requirement: ALTER TABLE dbo.T1 ADD CONSTRAINT PK_T1 PRIMARY KEY (c1);   ALTER TABLE dbo.T2 ADD CONSTRAINT PK_T2 PRIMARY KEY (c1);   ALTER TABLE dbo.T3 ADD CONSTRAINT PK_T3 PRIMARY KEY (c1);   ALTER TABLE dbo.T4 ADD CONSTRAINT PK_T4 PRIMARY KEY (c1); We also need an INSERT statement that references the view. Even more specifically, to see a Switch operator, we need to perform a single-row insert (multi-row inserts use a different plan shape): INSERT dbo.V1 (c1) VALUES (1); And now…the execution plan: The Constant Scan manufactures a single row with no columns. The Compute Scalar works out which partition of the view the new value should go in. The Assert checks that the computed partition number is not null (if it is, an error is returned). The Nested Loops Join executes exactly once, with the partition id as an outer reference (correlated parameter). The Switch operator checks the value of the parameter and executes the corresponding input only. If the partition id is 0, the uppermost Clustered Index Insert is executed, adding a row to table T1. If the partition id is 1, the next lower Clustered Index Insert is executed, adding a row to table T2…and so on. In case you were wondering, here’s a query and execution plan for a multi-row insert to the view: INSERT dbo.V1 (c1) VALUES (1), (2); Yuck! An Eager Table Spool and four Filters! I prefer the Switch plan. My guess is that almost all the old strategies that used a Switch operator have been replaced over time, using things like a regular Concatenation Union All combined with Start-Up Filters on its inputs. Other new (relative to the Switch operator) features like table partitioning have specific execution plan support that doesn’t need the Switch operator either. This feels like a bit of a shame, but perhaps it is just nostalgia on my part, it’s hard to know. Please do let me know if you encounter a query that can still use the Switch operator in 2012 – it must be very bored if this is the only possible modern usage! 2. Invisible Plan Operators The second part of this post uses an example based on a question Dave Ballantyne asked using the SQL Sentry Plan Explorer plan upload facility. If you haven’t tried that yet, make sure you’re on the latest version of the (free) Plan Explorer software, and then click the Post to SQLPerformance.com button. That will create a site question with the query plan attached (which can be anonymized if the plan contains sensitive information). Aaron Bertrand and I keep a close eye on questions there, so if you have ever wanted to ask a query plan question of either of us, that’s a good way to do it. The problem The issue I want to talk about revolves around a query issued against a calendar table. The script below creates a simplified version and adds 100 years of per-day information to it: USE tempdb; GO CREATE TABLE dbo.Calendar ( dt date NOT NULL, isWeekday bit NOT NULL, theYear smallint NOT NULL,   CONSTRAINT PK__dbo_Calendar_dt PRIMARY KEY CLUSTERED (dt) ); GO -- Monday is the first day of the week for me SET DATEFIRST 1;   -- Add 100 years of data INSERT dbo.Calendar WITH (TABLOCKX) (dt, isWeekday, theYear) SELECT CA.dt, isWeekday = CASE WHEN DATEPART(WEEKDAY, CA.dt) IN (6, 7) THEN 0 ELSE 1 END, theYear = YEAR(CA.dt) FROM Sandpit.dbo.Numbers AS N CROSS APPLY ( VALUES (DATEADD(DAY, N.n - 1, CONVERT(date, '01 Jan 2000', 113))) ) AS CA (dt) WHERE N.n BETWEEN 1 AND 36525; The following query counts the number of weekend days in 2013: SELECT Days = COUNT_BIG(*) FROM dbo.Calendar AS C WHERE theYear = 2013 AND isWeekday = 0; It returns the correct result (104) using the following execution plan: The query optimizer has managed to estimate the number of rows returned from the table exactly, based purely on the default statistics created separately on the two columns referenced in the query’s WHERE clause. (Well, almost exactly, the unrounded estimate is 104.289 rows.) There is already an invisible operator in this query plan – a Filter operator used to apply the WHERE clause predicates. We can see it by re-running the query with the enormously useful (but undocumented) trace flag 9130 enabled: Now we can see the full picture. The whole table is scanned, returning all 36,525 rows, before the Filter narrows that down to just the 104 we want. Without the trace flag, the Filter is incorporated in the Clustered Index Scan as a residual predicate. It is a little bit more efficient than using a separate operator, but residual predicates are still something you will want to avoid where possible. The estimates are still spot on though: Anyway, looking to improve the performance of this query, Dave added the following filtered index to the Calendar table: CREATE NONCLUSTERED INDEX Weekends ON dbo.Calendar(theYear) WHERE isWeekday = 0; The original query now produces a much more efficient plan: Unfortunately, the estimated number of rows produced by the seek is now wrong (365 instead of 104): What’s going on? The estimate was spot on before we added the index! Explanation You might want to grab a coffee for this bit. Using another trace flag or two (8606 and 8612) we can see that the cardinality estimates were exactly right initially: The highlighted information shows the initial cardinality estimates for the base table (36,525 rows), the result of applying the two relational selects in our WHERE clause (104 rows), and after performing the COUNT_BIG(*) group by aggregate (1 row). All of these are correct, but that was before cost-based optimization got involved :) Cost-based optimization When cost-based optimization starts up, the logical tree above is copied into a structure (the ‘memo’) that has one group per logical operation (roughly speaking). The logical read of the base table (LogOp_Get) ends up in group 7; the two predicates (LogOp_Select) end up in group 8 (with the details of the selections in subgroups 0-6). These two groups still have the correct cardinalities as trace flag 8608 output (initial memo contents) shows: During cost-based optimization, a rule called SelToIdxStrategy runs on group 8. It’s job is to match logical selections to indexable expressions (SARGs). It successfully matches the selections (theYear = 2013, is Weekday = 0) to the filtered index, and writes a new alternative into the memo structure. The new alternative is entered into group 8 as option 1 (option 0 was the original LogOp_Select): The new alternative is to do nothing (PhyOp_NOP = no operation), but to instead follow the new logical instructions listed below the NOP. The LogOp_GetIdx (full read of an index) goes into group 21, and the LogOp_SelectIdx (selection on an index) is placed in group 22, operating on the result of group 21. The definition of the comparison ‘the Year = 2013’ (ScaOp_Comp downwards) was already present in the memo starting at group 2, so no new memo groups are created for that. New Cardinality Estimates The new memo groups require two new cardinality estimates to be derived. First, LogOp_Idx (full read of the index) gets a predicted cardinality of 10,436. This number comes from the filtered index statistics: DBCC SHOW_STATISTICS (Calendar, Weekends) WITH STAT_HEADER; The second new cardinality derivation is for the LogOp_SelectIdx applying the predicate (theYear = 2013). To get a number for this, the cardinality estimator uses statistics for the column ‘theYear’, producing an estimate of 365 rows (there are 365 days in 2013!): DBCC SHOW_STATISTICS (Calendar, theYear) WITH HISTOGRAM; This is where the mistake happens. Cardinality estimation should have used the filtered index statistics here, to get an estimate of 104 rows: DBCC SHOW_STATISTICS (Calendar, Weekends) WITH HISTOGRAM; Unfortunately, the logic has lost sight of the link between the read of the filtered index (LogOp_GetIdx) in group 22, and the selection on that index (LogOp_SelectIdx) that it is deriving a cardinality estimate for, in group 21. The correct cardinality estimate (104 rows) is still present in the memo, attached to group 8, but that group now has a PhyOp_NOP implementation. Skipping over the rest of cost-based optimization (in a belated attempt at brevity) we can see the optimizer’s final output using trace flag 8607: This output shows the (incorrect, but understandable) 365 row estimate for the index range operation, and the correct 104 estimate still attached to its PhyOp_NOP. This tree still has to go through a few post-optimizer rewrites and ‘copy out’ from the memo structure into a tree suitable for the execution engine. One step in this process removes PhyOp_NOP, discarding its 104-row cardinality estimate as it does so. To finish this section on a more positive note, consider what happens if we add an OVER clause to the query aggregate. This isn’t intended to be a ‘fix’ of any sort, I just want to show you that the 104 estimate can survive and be used if later cardinality estimation needs it: SELECT Days = COUNT_BIG(*) OVER () FROM dbo.Calendar AS C WHERE theYear = 2013 AND isWeekday = 0; The estimated execution plan is: Note the 365 estimate at the Index Seek, but the 104 lives again at the Segment! We can imagine the lost predicate ‘isWeekday = 0’ as sitting between the seek and the segment in an invisible Filter operator that drops the estimate from 365 to 104. Even though the NOP group is removed after optimization (so we don’t see it in the execution plan) bear in mind that all cost-based choices were made with the 104-row memo group present, so although things look a bit odd, it shouldn’t affect the optimizer’s plan selection. I should also mention that we can work around the estimation issue by including the index’s filtering columns in the index key: CREATE NONCLUSTERED INDEX Weekends ON dbo.Calendar(theYear, isWeekday) WHERE isWeekday = 0 WITH (DROP_EXISTING = ON); There are some downsides to doing this, including that changes to the isWeekday column may now require Halloween Protection, but that is unlikely to be a big problem for a static calendar table ;)  With the updated index in place, the original query produces an execution plan with the correct cardinality estimation showing at the Index Seek: That’s all for today, remember to let me know about any Switch plans you come across on a modern instance of SQL Server! Finally, here are some other posts of mine that cover other plan operators: Segment and Sequence Project Common Subexpression Spools Why Plan Operators Run Backwards Row Goals and the Top Operator Hash Match Flow Distinct Top N Sort Index Spools and Page Splits Singleton and Range Seeks Bitmaps Hash Join Performance Compute Scalar © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

    Read the article

  • Pyglet vs. PyQt

    - by L. De Leo
    I need to implement a really simple card game. As the game logic is written in Python I chose to stick with some Python framework even if my goal is to develop a Windows only version. I also don't like to work with .NET so I ruled out Iron Python + WPF. I tried to write a simple prototype with Pyglet but soon discovered that I will have to do a lot of stuff by hand: things like detecting mouseover events and finding which card was clicked on and moving it, etc... very low level and unnecessary for my use-case. So I thought it might be easier to do things in PyQt. Do you reckon it would be feasible to use PyQt for implementing a simple card game? Will I have higher level events I can work with?

    Read the article

  • Abstracting away the type of a property

    - by L. De Leo
    In Python luckily most of the times you don't have to write getters and setters to get access to class properties. That said sometimes you'll have to remember that a certain property is a list or whatnot and a property would save you there by abstracting the type and providing a setter to add something to such list for example rather than exposing the list directly. Where do you draw the line between exposing the type directly or wrapping its access in a property? What's the general "pythonic" advice?

    Read the article

  • black screen after waking up from suspend

    - by Daniel De Leon
    running 11.10 and on wake up after being suspended (for a long time or 2 seconds, doesn't matter) I am presented with a black screen that just has a purple bar (about the size of the top menu bar). As far as I can tell the system is unresponsive to anything outside of a manual restart by pressing the power button on the tower, after which it starts normally. using the most up to date graphics driver, also tried switching to a different version, nothing. just switched back to Linux and this is really giving me a headache, I would really like this to work. Any help would be greatly appreciated.

    Read the article

  • Something to add to your library...

    - by werner.de.gruyter
    There is a new book in town: The Grid Control Handbook. Featuring an in-depth discussion of what Grid Control is and what Grid Control can do for your IT environment. It starts right at the beginning, and guides you through the all steps of a typical deployment: From the planning phase, to installing, to the strengthening of the environment and finally (most importantly) the maintenance and daily-use of the product. And there are quite a few tips, tricks, workshops and best practices along the way to help you with some very practical day-to-day challenges. For all those using Grid Control, something definitely worth checking out!

    Read the article

  • Can find any /.ecryptfs dir to retrieve my encrypted home dir

    - by Roberto de Armas
    At firs sorry for my english, it isn't my native languaje. I've readed some questions similar but no the exactly whith the same problem. I've moved my home directory to a separated partitión (Ubuntu 11.10) following this tutorial http://www.ubuntu-es.org/node/58233 After checking that they were all my files and folders (forgetting that one of dirs was encrypted by ecryptfs) i've installed fedora 16. Well, suprised when in my home folder was an Readme.txt advising me that my folder was unmounted for security reasons and proposing to type in comand line "ecryptfs-mount-Private" (din't work) or make click on labeled icon "acces your private data desktop" (neither din't work). After three days reading all i could find on the internet, i follow The Dustin Kirkland tutorial in http://blog.dustinkirkland.com/2011/04/introducing-ecryptfs-recover-private.html, but any /.ecryptfs was found. I'm sure that the data are somewere (the size of the moved dir is identical to the original one). Any help would by greatly appreciated. Thaks a lot.

    Read the article

  • Game software design

    - by L. De Leo
    I have been working on a simple implementation of a card game in object oriented Python/HTML/Javascript and building on the top of Django. At this point the game is in its final stage of development but, while spotting a big issue about how I was keeping the application state (basically using a global variable), I reached the point that I'm stuck. The thing is that ignoring the design flaw, in a single-threaded environment such as under the Django development server, the game works perfectly. While I tried to design classes cleanly and keep methods short I now have in front of me an issue that has been keeping me busy for the last 2 days and that countless print statements and visual debugging hasn't helped me spot. The reason I think has to do with some side-effects of functions and to solve it I've been wondering if maybe refactoring the code entirely with static classes that keep no state and just passing the state around might be a good option to keep side-effects under control. Or maybe trying to program it in a functional programming style (although I'm not sure Python allows for a purely functional style). I feel that now there's already too many layers that the software (which I plan to make incredibly more complex by adding non trivial features) has already become unmanageable. How would you suggest I re-take control of my code-base that (despite being still only at < 1000 LOC) seems to have taken a life of its own?

    Read the article

  • Should devs, testers and business users have one unified test script?

    - by Carlos Jaime C. De Leon
    In development, I would normally have my own test scripts that would document the data, scenarios and execution steps that I plan to test; this is my dev test plan. When the functionality has been deployed to Test, testers test it using their own test script that they wrote. In UAT, the business user then tests using their own test plan. In retrospect, it looks like this provides a better coverage, with dev tests having a mix of black and white box testing, while testers and business users focus on black box testing. But on the other hand, this brings up distinct test cases that only are executed per stage (ie. some cases which testers thought of are only executed on Test stage) and it would like the dev missed it, which makes it a finding/bug. Is it worth consolidating the test scripts from the start? Thus using one unified test script, or is it abit difficult to do this upfront?

    Read the article

  • Why is Adobe Air so underrated for building mobile apps?

    - by Marcelo de Assis
    I worked with Adobe Flash related technologies for the last 5 years, although not being a big fan of Adobe. I see some little bugs happening in some apps, but I cannot imagine why a lot of big companies do not even think to use use Adobe Air as a good technology for their mobile apps. I see a lot of mobile developer positions asking for experts in Android or iOS , but very much less positions asking for Adobe Air, even when Adobe Air apps have the advantage of being multi-plataform, with the same app working in Blackberry, iOS and Android. Is so much easier to develop a game using Flash, than using Android SDK, for example. It really have flaws (that I never saw) or it is just some kind of mass prejudgement? I also would like to hear what a project manager or a indie developer takes when choosing a plataform for building apps.

    Read the article

  • Connect to new wireless network

    - by Shawn de Wet
    Wireless networking was working perfectly on my old adsl router. Then I moved to a new home that does not have adsl, and the router has been replaced with a new 3G one. I have worked through the following links: Connecting to a wireless network and http://ubuntuforums.org/showthread.php?t=571188 I have set the router from it's default WPA-PSK encryption to WEP encryption and also to No Encryption. But when I get to dhclient -wlan0, I keep getting No DHCPOFFERS received. Yet the 2 windows machines in my home connect fine to this wirelesss network (in all encryption settings). If I try iwlist scan I can see that the wireless network is indeed visible. Where do I start scratching to see where the problem may lie.

    Read the article

  • Which algorithm used in Advance Wars type turn based games

    - by Jan de Lange
    Has anyone tried to develop, or know of an algorithm such as used in a typical turn based game like Advance Wars, where the number of objects and the number of moves per object may be too large to search through up to a reasonable depth like one would do in a game with a smaller search base like chess? There is some path-finding needed to to engage into combat, harvest, or move to an object, so that in the next move such actions are possible. With this you can build a search tree for each item, resulting in a large tree for all items. With a cost function one can determine the best moves. Then the board flips over to the player role (min/max) and the computer searches the best player move, and flips back etc. upto a number of cycles deep. Finally it has found the best move and now it's the players turn. But he may be asleep by now... So how is this done in practice? I have found several good sources on A*, DFS, BFS, evaluation / cost functions etc. But as of yet I do not see how I can put it all together.

    Read the article

  • Firefox ignore GTK theme for localhost stuff

    - by Mario De Schaepmeester
    I know this is pretty much a duplicate of How can one make firefox ignore my GTK theme entirely?, but the answers on that one are no permanent solution. It works by launching firefox from the terminal. I would like to know a solution that works for every instance of firefox no matter how it was created. There is the possibility to edit the userContent.css file, but the settings you make randomly do not apply to some sites or in some situations, strangely, even with the !important added... I have a dark GTK theme and this results in some textboxes having a black background with black text with a userContent.css that has input, textarea { color: black !important; background-color: white !important; } Update I changed a setting in about:config from true to false, namely browser.display.use_system_colors. Everything appears normal and well now, for one exception: everything that runs on localhost. This includes PHPMyAdmin and a website I am making. I would like to know if there is a solution to this.

    Read the article

  • Accessing host LVM partition from Windows XP through Virt.manager 0.8.5 / Qemu / KVM

    - by Nico de Smidt
    Hi, requested use case is having a Windows XP SP3 guest running in 64bit Ubuntu. (Linux pcs 2.6.35-22-server #35-Ubuntu SMP Sat Oct 16 22:02:33 UTC 2010 x86_64 GNU/Linux) I want this guest to access an LVM LV on the Ubuntu disk. I've setup the following LVM config: --- Logical volume --- LV Name /dev/storage/sdc1 VG Name storage LV UUID Zg5IMC-OlqB-prL5-fgg4-3A9A-OgKP-oZ0QkJ LV Write Access read/write LV Status available # open 0 LV Size 1.01 GiB Current LE 259 Segments 1 Allocation inherit Read ahead sectors auto - currently set to 256 Block device 251:3 -- 1) I've setup a storage pool for /dev/storage 2) I've mkfs.vfat /dev/storage/sdc1 3) and made a virtual IDE disk in the virt-manager setup for the guest. Target device: IDE Disk 2 Source path: /dev/storage/sdc1 -- Now when running XP (guest) Windows sees a new disk in Disk Manager and want's to install a partition on it, since it believes the drive is empty. After formatting from within Windows I can put data on the new disk volume. -- Back in Ubuntu however I cannot access this this any more since it created a partition within an LVM Logical Volume. Running fdisk -l shows the following: root@pcs:/media# fdisk -l /dev/storage/sdc1 Disk /dev/storage/sdc1: 1086 MB, 1086324736 bytes 32 heads, 63 sectors/track, 1052 cylinders Units = cylinders of 2016 * 512 = 1032192 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x8d72e4f4 Device Boot Start End Blocks Id System /dev/storage/sdc1p1 1 1050 1058368+ c W95 FAT32 (LBA) -- which seems fine to me, but when trying to mount /dev/storage/sdc1p1 I get the following error: mount /dev/storage/sdc1p1 /media/xp mount: special device /dev/storage/sdc1p1 does not exist which makes sense since in lvdisplay sdc1p1 does not exist Main question: I want to mount the vfat partition in both Ubuntu and XP What am I missing here????? regards, and thanks for your consideration. Nico

    Read the article

  • DBCC CHECKDB on VVLDB and latches (Or: My Pain is Your Gain)

    - by Argenis
      Does your CHECKDB hurt, Argenis? There is a classic blog series by Paul Randal [blog|twitter] called “CHECKDB From Every Angle” which is pretty much mandatory reading for anybody who’s even remotely considering going for the MCM certification, or its replacement (the Microsoft Certified Solutions Master: Data Platform – makes my fingers hurt just from typing it). Of particular interest is the post “Consistency Options for a VLDB” – on it, Paul provides solid, timeless advice (I use the word “timeless” because it was written in 2007, and it all applies today!) on how to perform checks on very large databases. Well, here I was trying to figure out how to make CHECKDB run faster on a restored copy of one of our databases, which happens to exceed 7TB in size. The whole thing was taking several days on multiple systems, regardless of the storage used – SAS, SATA or even SSD…and I actually didn’t pay much attention to how long it was taking, or even bothered to look at the reasons why - as long as it was finishing okay and found no consistency errors. Yes – I know. That was a huge mistake, as corruption found in a database several days after taking place could only allow for further spread of the corruption – and potentially large data loss. In the last two weeks I increased my attention towards this problem, as we noticed that CHECKDB was taking EVEN LONGER on brand new all-flash storage in the SAN! I couldn’t really explain it, and were almost ready to blame the storage vendor. The vendor told us that they could initially see the server driving decent I/O – around 450Mb/sec, and then it would settle at a very slow rate of 10Mb/sec or so. “Hum”, I thought – “CHECKDB is just not pushing the I/O subsystem hard enough”. Perfmon confirmed the vendor’s observations. Dreaded @BlobEater What was CHECKDB doing all the time while doing so little I/O? Eating Blobs. It turns out that CHECKDB was taking an extremely long time on one of our frankentables, which happens to be have 35 billion rows (yup, with a b) and sucks up several terabytes of space in the database. We do have a project ongoing to purge/split/partition this table, so it’s just a matter of time before we deal with it. But the reality today is that CHECKDB is coming to a screeching halt in performance when dealing with this particular table. Checking sys.dm_os_waiting_tasks and sys.dm_os_latch_stats showed that LATCH_EX (DBCC_OBJECT_METADATA) was by far the top wait type. I remembered hearing recently about that wait from another post that Paul Randal made, but that was related to computed-column indexes, and in fact, Paul himself reminded me of his article via twitter. But alas, our pathologic table had no non-clustered indexes on computed columns. I knew that latches are used by the database engine to do internal synchronization – but how could I help speed this up? After all, this is stuff that doesn’t have a lot of knobs to tweak. (There’s a fantastic level 500 talk by Bob Ward from Microsoft CSS [blog|twitter] called “Inside SQL Server Latches” given at PASS 2010 – and you can check it out here. DISCLAIMER: I assume no responsibility for any brain melting that might ensue from watching Bob’s talk!) Failed Hypotheses Earlier on this week I flew down to Palo Alto, CA, to visit our Headquarters – and after having a great time with my Monkey peers, I was relaxing on the plane back to Seattle watching a great talk by SQL Server MVP and fellow MCM Maciej Pilecki [twitter] called “Masterclass: A Day in the Life of a Database Transaction” where he discusses many different topics related to transaction management inside SQL Server. Very good stuff, and when I got home it was a little late – that slow DBCC CHECKDB that I had been dealing with was way in the back of my head. As I was looking at the problem at hand earlier on this week, I thought “How about I set the database to read-only?” I remembered one of the things Maciej had (jokingly) said in his talk: “if you don’t want locking and blocking, set the database to read-only” (or something to that effect, pardon my loose memory). I immediately killed the CHECKDB which had been running painfully for days, and set the database to read-only mode. Then I ran DBCC CHECKDB against it. It started going really fast (even a bit faster than before), and then throttled down again to around 10Mb/sec. All sorts of expletives went through my head at the time. Sure enough, the same latching scenario was present. Oh well. I even spent some time trying to figure out if NUMA was hurting performance. Folks on Twitter made suggestions in this regard (thanks, Lonny! [twitter]) …Eureka? This past Friday I was still scratching my head about the whole thing; I was ready to start profiling with XPERF to see if I could figure out which part of the engine was to blame and then get Microsoft to look at the evidence. After getting a bunch of good news I’ll blog about separately, I sat down for a figurative smack down with CHECKDB before the weekend. And then the light bulb went on. A sparse column. I thought that I couldn’t possibly be experiencing the same scenario that Paul blogged about back in March showing extreme latching with non-clustered indexes on computed columns. Did I even have a non-clustered index on my sparse column? As it turns out, I did. I had one filtered non-clustered index – with the sparse column as the index key (and only column). To prove that this was the problem, I went and setup a test. Yup, that'll do it The repro is very simple for this issue: I tested it on the latest public builds of SQL Server 2008 R2 SP2 (CU6) and SQL Server 2012 SP1 (CU4). First, create a test database and a test table, which only needs to contain a sparse column: CREATE DATABASE SparseColTest; GO USE SparseColTest; GO CREATE TABLE testTable (testCol smalldatetime SPARSE NULL); GO INSERT INTO testTable (testCol) VALUES (NULL); GO 1000000 That’s 1 million rows, and even though you’re inserting NULLs, that’s going to take a while. In my laptop, it took 3 minutes and 31 seconds. Next, we run DBCC CHECKDB against the database: DBCC CHECKDB('SparseColTest') WITH NO_INFOMSGS, ALL_ERRORMSGS; This runs extremely fast, as least on my test rig – 198 milliseconds. Now let’s create a filtered non-clustered index on the sparse column: CREATE NONCLUSTERED INDEX [badBadIndex] ON testTable (testCol) WHERE testCol IS NOT NULL; With the index in place now, let’s run DBCC CHECKDB one more time: DBCC CHECKDB('SparseColTest') WITH NO_INFOMSGS, ALL_ERRORMSGS; In my test system this statement completed in 11433 milliseconds. 11.43 full seconds. Quite the jump from 198 milliseconds. I went ahead and dropped the filtered non-clustered indexes on the restored copy of our production database, and ran CHECKDB against that. We went down from 7+ days to 19 hours and 20 minutes. Cue the “Argenis is not impressed” meme, please, Mr. LaRock. My pain is your gain, folks. Go check to see if you have any of such indexes – they’re likely causing your consistency checks to run very, very slow. Happy CHECKDBing, -Argenis ps: I plan to file a Connect item for this issue – I consider it a pretty serious bug in the engine. After all, filtered indexes were invented BECAUSE of the sparse column feature – and it makes a lot of sense to use them together. Watch this space and my twitter timeline for a link.

    Read the article

  • How to build MVC Views that work with polymorphic domain model design?

    - by Johann de Swardt
    This is more of a "how would you do it" type of question. The application I'm working on is an ASP.NET MVC4 app using Razor syntax. I've got a nice domain model which has a few polymorphic classes, awesome to work with in the code, but I have a few questions regarding the MVC front-end. Views are easy to build for normal classes, but when it comes to the polymorphic ones I'm stuck on deciding how to implement them. The one (ugly) option is to build a page which handles the base type (eg. IContract) and has a bunch of if statements to check if we passed in a IServiceContract or ISupplyContract instance. Not pretty and very nasty to maintain. The other option is to build a view for each of these IContract child classes, breaking DRY principles completely. Don't like doing this for obvious reasons. Another option (also not great) is to split the view into chunks with partials and build partial views for each of the child types that are loaded into the main view for the base type, then deciding to show or hide the partial in a single if statement in the partial. Also messy. I've also been thinking about building a master page with sections for the fields that only occur in subclasses and to build views for each subclass referencing the master page. This looks like the least problematic solution? It will allow for fairly simple maintenance and it doesn't involve code duplication. What are your thoughts? Am I missing something obvious that will make our lives easier? Suggestions?

    Read the article

  • Stuck on Ubuntu screen during installation of 12.10

    - by Rick de Groot
    From 12.04 I changed to win 8 consumer preview and now i want to go to 12.10, however i cant seem get past the Ubuntu loadingscreen( with the dots going from white to yellow etc.) it has taken more then an hour now and all my DVD drive does is speed up and slow down every 3 seconds. So i tried it again, but nothing changed im stuck on the same spot. what could cause this problem? And even better, how do i solve it?

    Read the article

  • Javascript slider Image and text from php, scrollable in groups by indexes

    - by Roberto de Nobrega
    I am looking for a javascript solution that slides images with text, pulled from php. This slider will slide in groups by indexes in points. I was googling, but nothing as I need. I am going to make an example. Imagine 10 products. I need to show the principal picture, and a text below the image. It is going to show 6 products, and with points (indexes), I click and the group slides to the next group. Do you know some script.?? I know the php code, but I am a newbie with javascript.! Thanks.!! PD. I am lost of where i have to put this question. So, If this was a wrong place, let me know, and accept my apologises.! ;)

    Read the article

  • Javascript slider Image and text from php, scrollable in groups by indexes

    - by Roberto de Nobrega
    I am looking for a javascript solution that slides images with text, pulled from php. This slider will slide in groups by indexes in points. I was googling, but nothing as I need. I am going to make an example. Imagine 10 products. I need to show the principal picture, and a text below the image. It is going to show 6 products, and with points (indexes), I click and the group slides to the next group. Do you know some script.?? I know the php code, but I am a newbie with javascript.! Thanks.!! PD. I am lost of where i have to put this question. So, If this was a wrong place, let me know, and accept my apologises.! ;)

    Read the article

  • Patterns to refactor common code in multi-platform software

    - by L. De Leo
    I have a Django application and a PyQt application that share a lot of code. A big chunk of the PyQt application are copied verbatim from the Django application's views. As this is a game, I have already an engine.py module that I'm sharing among the two applications, but I was wondering how to restructure the middle layer (what in Django corresponds to the largest part of the views minus the return HttpResponse part) into its own component. In the web application the components are those of a classic Django application (with the only exception that I don't make any use of models): the game engine the url dispatcher the template the views My PyQt application is divided into: the game engine the UI definition where I declare the UI components and react to the events (basically this takes the place of the template and the url dispatcher in the Django app) the controller where I instantiate the window object and reproduce the methods that map the views in the Django app

    Read the article

< Previous Page | 145 146 147 148 149 150 151 152 153 154 155 156  | Next Page >