Search Results

Search found 621 results on 25 pages for 'optimizer transformations'.

Page 7/25 | < Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14 | Next Page >

BAM design pointers

- by Kavitha Srinivasan

In working recently with a large Oracle customer on SOA and BAM, I discovered that some BAM best practices are not quite well known as I had always assumed ! There is a doc bug out to formally incorporate those learnings but here are a few notes.. EMS-DO parity When using EMS (Enterprise Message Source) as a BAM feed, the best practice is to use one EMS to write to one Data Object. There is a possibility of collisions and duplicates when multiple EMS write to the same row of a DO at the same time. This customer had 17 EMS writing to one DO at the same time. Every sensor in their BPEL process writes to one topic but the Topic was read by 1 EMS corresponding to one sensor. They then used XSL within BAM to transform the payload into the BAM DO format. And hence for a given BPEL instance, 17 sensors fired, populated 1 JMS topic, was consumed by 17 EMS which in turn wrote to 1 DataObject.(You can image what would happen for later versions of the application that needs to send more information to BAM !). We modified their design to use one Master XSL based on sensorname for all sensors relating to a DO- say Data Object 'Orders' and were able to thus reduce the 17 EMS to 1 with a master XSL. For those of you wondering about how squeaky clean this design is, you are right ! This is indeed not squeaky clean and that brings us to yet another 'inferred' best practice. (I try very hard not to state the obvious in my blogs with the hope that everytime I blog, it is very useful but this one is an exception.) Transformations and Calculations It is optimal to do transformations within an engine like BPEL. Not only does this provide modelling ease with a nice GUI XSL mapper in JDeveloper, the XSL engine in BPEL is quite efficient at runtime as well. And so, doing XSL transformations in BAM is not quite prudent. The same is true for any non-trivial calculations as well. It is best to do all transformations,calcuations and sanitize the data in a BPEL or like layer and then send this to BAM (via JMS, WS etc.) This then delegates simply the function of report rendering and mechanics of real-time reporting to the Oracle BAM reporting tool which it is most suited to do. All nulls are not created equal Here is yet another possibly known fact but reiterated here. For an EMS with an Upsert operation: a) If Empty tags or tags with no value are sent like <Tag1/> or <Tag1></Tag1>, the DO will be overwritten with --null-- b) If Empty tags are suppressed ie not generated at all, the corresponding DO field will NOT be overwritten. The field will have whatever value existed previously. For an EMS with an Insert operation, both tags with an empty value and no tags result in –null-- being written to the DO. Hope this helps .. Happy 4th!

Read the article
July, the 31 Days of SQL Server DMO’s – Day 25 (sys.dm_db_missing_index_details)

- by Tamarick Hill

The sys.dm_db_missing_index_details Dynamic Management View is used to return information about missing indexes on your SQL Server instances. These indexes are ones that the optimizer has identified as indexes it would like to use but did not have. You may also see these same indexes indicated in other tools such as query execution plans or the Database tuning advisor. Let’s execute this DMV so we can review the information it provides us. I do not have any missing index information for my AdventureWorks2012 database, but for the purposes of illustrating the result set of this DMV, I will present the results from my msdb database. SELECT * FROM sys.dm_db_missing_index_details The first column presented is the index_handle which uniquely identifies a particular missing index. The next two columns represent the database_id and the object_id for the particular table in question. Next is the ‘equality_columns’ column which gives you a list of columns (comma separated) that would be beneficial to the optimizer for equality operations. By equality operation I mean for any queries that would use a filter or join condition such as WHERE A = B. The next column, ‘inequality_columns’, gives you a comma separated list of columns that would be beneficial to the optimizer for inequality operations. An inequality operation is anything other than A = B. For example, “WHERE A != B”, “WHERE A > B”, “WHERE A < B”, and “WHERE A <> B” would all qualify as inequality. Next is the ‘included_columns’ column which list all columns that would be beneficial to the optimizer for purposes of providing a covering index and preventing key/bookmark lookups. Lastly is the ‘statement’ column which lists the name of the table where the index is missing. This DMV can help you identify potential indexes that could be added to improve the performance of your system. However, I will advise you not to just take the output of this DMV and create an index for everything you see. Everything listed here should be analyzed and then tested on a Development or Test system before implementing into a Production environment. For more information on this DMV, please see the below Books Online link: http://msdn.microsoft.com/en-us/library/ms345434.aspx Follow me on Twitter @PrimeTimeDBA

Read the article
Showplan Operator of the Week - Merge Interval

When Fabiano agreed to undertake the epic task of describing each showplan operator, none of us quite predicted the interesting ways that the series helps to understand how the query optimizer works. With the Merge Interval, Fabiano comes up with some insights about the way that the Query optimizer handles overlapping ranges efficiently. Free trial of SQL Backup™“SQL Backup was able to cut down my backup time significantly AND achieved a 90% compression at the same time!” Joe Cheng. Download a free trial now.

Read the article
Showplan Operator of the Week - Lazy Spool

Continuing to illuminate the depths of SQL Server's Query Optimizer, Fabiano shines a light on the sixth major Showplan Operator on his list: the Lazy Spool. What does the Lazy Spool do that's so special, how does the Query Optimizer use it, and why is it so Lazy? Fabiano explains all...

Read the article
SSIS Virtual Class

- by ejohnson2010

I recorded a Virtual SSIS Class with the good folks over at SSWUG and the first airing of the class will by May 15th. This is 100% online so you can do it on your own time and from anywhere. The class will run monthly and I will be available for questions through out. You get the following 12 sessions on SSIS, each about an hour. Session 1: The SSIS Basics Session 2: Control Flow Basics Session 3: Data Flow - Sources and Destinations Session 4: Data Flow - Transformations Session 5: Advanced Transformations...(read more)

Read the article
Exploiting Data Parallelism in Ordered Data Streams

• Many compute-intensive applications involve complex transformations of ordered input data to ordered output data. While the algorithms employed in these transformations are often parallel, managing the I/O order dependence can be a challenge.

Read the article
Class member functions instantiated by traits

- by Jive Dadson

I am reluctant to say I can't figure this out, but I can't figure this out. I've googled and searched Stack Overflow, and come up empty. The abstract, and possibly overly vague form of the question is, how can I use the traits-pattern to instantiate non-virtual member functions? The question came up while modernizing a set of multivariate function optimizers that I wrote more than 10 years ago. The optimizers all operate by selecting a straight-line path through the parameter space away from the current best point (the "update"), then finding a better point on that line (the "line search"), then testing for the "done" condition, and if not done, iterating. There are different methods for doing the update, the line-search, and conceivably for the done test, and other things. Mix and match. Different update formulae require different state-variable data. For example, the LMQN update requires a vector, and the BFGS update requires a matrix. If evaluating gradients is cheap, the line-search should do so. If not, it should use function evaluations only. Some methods require more accurate line-searches than others. Those are just some examples. The original version instantiates several of the combinations by means of virtual functions. Some traits are selected by setting mode bits that are tested at runtime. Yuck. It would be trivial to define the traits with #define's and the member functions with #ifdef's and macros. But that's so twenty years ago. It bugs me that I cannot figure out a whiz-bang modern way. If there were only one trait that varied, I could use the curiously recurring template pattern. But I see no way to extend that to arbitrary combinations of traits. I tried doing it using boost::enable_if, etc.. The specialized state information was easy. I managed to get the functions done, but only by resorting to non-friend external functions that have the this-pointer as a parameter. I never even figured out how to make the functions friends, much less member functions. The compiler (VC++ 2008) always complained that things didn't match. I would yell, "SFINAE, you moron!" but the moron is probably me. Perhaps tag-dispatch is the key. I haven't gotten very deeply into that. Surely it's possible, right? If so, what is best practice? UPDATE: Here's another try at explaining it. I want the user to be able to fill out an order (manifest) for a custom optimizer, something like ordering off of a Chinese menu - one from column A, one from column B, etc.. Waiter, from column A (updaters), I'll have the BFGS update with Cholesky-decompositon sauce. From column B (line-searchers), I'll have the cubic interpolation line-search with an eta of 0.4 and a rho of 1e-4, please. Etc... UPDATE: Okay, okay. Here's the playing-around that I've done. I offer it reluctantly, because I suspect it's a completely wrong-headed approach. It runs okay under vc++ 2008. #include <boost/utility.hpp> #include <boost/type_traits/integral_constant.hpp> namespace dj { struct CBFGS { void bar() {printf("CBFGS::bar %d\n", data);} CBFGS(): data(1234){} int data; }; template<class T> struct is_CBFGS: boost::false_type{}; template<> struct is_CBFGS<CBFGS>: boost::true_type{}; struct LMQN {LMQN(): data(54.321){} void bar() {printf("LMQN::bar %lf\n", data);} double data; }; template<class T> struct is_LMQN: boost::false_type{}; template<> struct is_LMQN<LMQN> : boost::true_type{}; struct default_optimizer_traits { typedef CBFGS update_type; }; template<class traits> class Optimizer; template<class traits> void foo(typename boost::enable_if<is_LMQN<typename traits::update_type>, Optimizer<traits> >::type& self) { printf(" LMQN %lf\n", self.data); } template<class traits> void foo(typename boost::enable_if<is_CBFGS<typename traits::update_type>, Optimizer<traits> >::type& self) { printf("CBFGS %d\n", self.data); } template<class traits = default_optimizer_traits> class Optimizer{ friend typename traits::update_type; //friend void dj::foo<traits>(typename Optimizer<traits> & self); // How? public: //void foo(void); // How??? void foo() { dj::foo<traits>(*this); } void bar() { data.bar(); } //protected: // How? typedef typename traits::update_type update_type; update_type data; }; } // namespace dj int main_() { dj::Optimizer<> opt; opt.foo(); opt.bar(); std::getchar(); return 0; }

Read the article
Animation effect on C#

- by Optimizer

Can someone point me to a C# open source implementaion with a simple image animations. e.g. I feed the input image to animator, and the animation code produces a few dozen of images which if displayed sequentially looks like animation. I am not something extremely fancy - a simple DirectX filter like animations would do. Thank you.

Read the article
How to set _optimizer_search_limit and _optimizer_max_permutations in Oracle10g.

- by user52856

I am working on a product that must support both MSSQL and Oracle (10g and 11g). I have some very complex queries that seem to run without issue on MSSQL 2005/2008, but very, very slow with Oracle. The CPU on the oracle server skyrockets for long periods of time, and it seems like the optimizer may be trying to find the best execution plan for the very complex query. I did some Googling to figure out how to limit the amount of time the optimizer spends on this, and came up with _optimizer_search_limit and _optimizer_max_permutations. Both of these parameters are hidden in Oracle 10g, and setting them in init.ora doesn't seem to make any difference. How do I set these parameters in Oracle. Or am I just totally barking up the wrong tree with the assumption that the optimizer is spending several minutes finding an execution plan? Thanks.

Read the article
Using design-patterns to transform web-service model classes into local model classes and vise versa

- by Daniil Petrov

There is a web-application built with play framework 1.2.7. It contains less than 10 model classes. The main purpose of the application is a lightweight access to a complex remote application (more than 50 model classes). The remote application has its own SOAP API and we use it for synchronization of data. There is a scheduled job in the web-app which makes requests to the remote app. It gets bunches of objects from the remote model and populates corresponding objects of the local model. Currently, there are two groups of classes - the local model and the remote model (generated from wsdl schema). It is not allowed to make any modifications to the remote model. Transformations are being made in the scheduled job class. When it gets objects from the remote app it creates local objects. Recently, it was decided to add a possibility to modify the remote objects. It requires more transformations on our side. We need to transform from remote to local model when reading objects and from local to remote when changing objects. I wonder if this would be possible to use some design-patterns to reduce a number of transformations?

Read the article
Heaps of Trouble?

- by Paul White NZ

If you’re not already a regular reader of Brad Schulz’s blog, you’re missing out on some great material. In his latest entry, he is tasked with optimizing a query run against tables that have no indexes at all. The problem is, predictably, that performance is not very good. The catch is that we are not allowed to create any indexes (or even new statistics) as part of our optimization efforts. In this post, I’m going to look at the problem from a slightly different angle, and present an alternative solution to the one Brad found. Inevitably, there’s going to be some overlap between our entries, and while you don’t necessarily need to read Brad’s post before this one, I do strongly recommend that you read it at some stage; he covers some important points that I won’t cover again here. The Example We’ll use data from the AdventureWorks database, copied to temporary unindexed tables. A script to create these structures is shown below: CREATE TABLE #Custs ( CustomerID INTEGER NOT NULL, TerritoryID INTEGER NULL, CustomerType NCHAR(1) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #Prods ( ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, Name NVARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #OrdHeader ( SalesOrderID INTEGER NOT NULL, OrderDate DATETIME NOT NULL, SalesOrderNumber NVARCHAR(25) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, CustomerID INTEGER NOT NULL, ); GO CREATE TABLE #OrdDetail ( SalesOrderID INTEGER NOT NULL, OrderQty SMALLINT NOT NULL, LineTotal NUMERIC(38,6) NOT NULL, ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, ); GO INSERT #Custs ( CustomerID, TerritoryID, CustomerType ) SELECT C.CustomerID, C.TerritoryID, C.CustomerType FROM AdventureWorks.Sales.Customer C WITH (TABLOCK); GO INSERT #Prods ( ProductMainID, ProductSubID, ProductSubSubID, Name ) SELECT P.ProductID, P.ProductID, P.ProductID, P.Name FROM AdventureWorks.Production.Product P WITH (TABLOCK); GO INSERT #OrdHeader ( SalesOrderID, OrderDate, SalesOrderNumber, CustomerID ) SELECT H.SalesOrderID, H.OrderDate, H.SalesOrderNumber, H.CustomerID FROM AdventureWorks.Sales.SalesOrderHeader H WITH (TABLOCK); GO INSERT #OrdDetail ( SalesOrderID, OrderQty, LineTotal, ProductMainID, ProductSubID, ProductSubSubID ) SELECT D.SalesOrderID, D.OrderQty, D.LineTotal, D.ProductID, D.ProductID, D.ProductID FROM AdventureWorks.Sales.SalesOrderDetail D WITH (TABLOCK); The query itself is a simple join of the four tables: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #OrdDetail D ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID JOIN #OrdHeader H ON D.SalesOrderID = H.SalesOrderID JOIN #Custs C ON H.CustomerID = C.CustomerID ORDER BY P.ProductMainID ASC OPTION (RECOMPILE, MAXDOP 1); Remember that these tables have no indexes at all, and only the single-column sampled statistics SQL Server automatically creates (assuming default settings). The estimated query plan produced for the test query looks like this (click to enlarge): The Problem The problem here is one of cardinality estimation – the number of rows SQL Server expects to find at each step of the plan. The lack of indexes and useful statistical information means that SQL Server does not have the information it needs to make a good estimate. Every join in the plan shown above estimates that it will produce just a single row as output. Brad covers the factors that lead to the low estimates in his post. In reality, the join between the #Prods and #OrdDetail tables will produce 121,317 rows. It should not surprise you that this has rather dire consequences for the remainder of the query plan. In particular, it makes a nonsense of the optimizer’s decision to use Nested Loops to join to the two remaining tables. Instead of scanning the #OrdHeader and #Custs tables once (as it expected), it has to perform 121,317 full scans of each. The query takes somewhere in the region of twenty minutes to run to completion on my development machine. A Solution At this point, you may be thinking the same thing I was: if we really are stuck with no indexes, the best we can do is to use hash joins everywhere. We can force the exclusive use of hash joins in several ways, the two most common being join and query hints. A join hint means writing the query using the INNER HASH JOIN syntax; using a query hint involves adding OPTION (HASH JOIN) at the bottom of the query. The difference is that using join hints also forces the order of the join, whereas the query hint gives the optimizer freedom to reorder the joins at its discretion. Adding the OPTION (HASH JOIN) hint results in this estimated plan: That produces the correct output in around seven seconds, which is quite an improvement! As a purely practical matter, and given the rigid rules of the environment we find ourselves in, we might leave things there. (We can improve the hashing solution a bit – I’ll come back to that later on). Faster Nested Loops It might surprise you to hear that we can beat the performance of the hash join solution shown above using nested loops joins exclusively, and without breaking the rules we have been set. The key to this part is to realize that a condition like (A = B) can be expressed as (A <= B) AND (A >= B). Armed with this tremendous new insight, we can rewrite the join predicates like so: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #OrdDetail D JOIN #OrdHeader H ON D.SalesOrderID >= H.SalesOrderID AND D.SalesOrderID <= H.SalesOrderID JOIN #Custs C ON H.CustomerID >= C.CustomerID AND H.CustomerID <= C.CustomerID JOIN #Prods P ON P.ProductMainID >= D.ProductMainID AND P.ProductMainID <= D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (RECOMPILE, LOOP JOIN, MAXDOP 1, FORCE ORDER); I’ve also added LOOP JOIN and FORCE ORDER query hints to ensure that only nested loops joins are used, and that the tables are joined in the order they appear. The new estimated execution plan is: This new query runs in under 2 seconds. Why Is It Faster? The main reason for the improvement is the appearance of the eager Index Spools, which are also known as index-on-the-fly spools. If you read my Inside The Optimiser series you might be interested to know that the rule responsible is called JoinToIndexOnTheFly. An eager index spool consumes all rows from the table it sits above, and builds a index suitable for the join to seek on. Taking the index spool above the #Custs table as an example, it reads all the CustomerID and TerritoryID values with a single scan of the table, and builds an index keyed on CustomerID. The term ‘eager’ means that the spool consumes all of its input rows when it starts up. The index is built in a work table in tempdb, has no associated statistics, and only exists until the query finishes executing. The result is that each unindexed table is only scanned once, and just for the columns necessary to build the temporary index. From that point on, every execution of the inner side of the join is answered by a seek on the temporary index – not the base table. A second optimization is that the sort on ProductMainID (required by the ORDER BY clause) is performed early, on just the rows coming from the #OrdDetail table. The optimizer has a good estimate for the number of rows it needs to sort at that stage – it is just the cardinality of the table itself. The accuracy of the estimate there is important because it helps determine the memory grant given to the sort operation. Nested loops join preserves the order of rows on its outer input, so sorting early is safe. (Hash joins do not preserve order in this way, of course). The extra lazy spool on the #Prods branch is a further optimization that avoids executing the seek on the temporary index if the value being joined (the ‘outer reference’) hasn’t changed from the last row received on the outer input. It takes advantage of the fact that rows are still sorted on ProductMainID, so if duplicates exist, they will arrive at the join operator one after the other. The optimizer is quite conservative about introducing index spools into a plan, because creating and dropping a temporary index is a relatively expensive operation. It’s presence in a plan is often an indication that a useful index is missing. I want to stress that I rewrote the query in this way primarily as an educational exercise – I can’t imagine having to do something so horrible to a production system. Improving the Hash Join I promised I would return to the solution that uses hash joins. You might be puzzled that SQL Server can create three new indexes (and perform all those nested loops iterations) faster than it can perform three hash joins. The answer, again, is down to the poor information available to the optimizer. Let’s look at the hash join plan again: Two of the hash joins have single-row estimates on their build inputs. SQL Server fixes the amount of memory available for the hash table based on this cardinality estimate, so at run time the hash join very quickly runs out of memory. This results in the join spilling hash buckets to disk, and any rows from the probe input that hash to the spilled buckets also get written to disk. The join process then continues, and may again run out of memory. This is a recursive process, which may eventually result in SQL Server resorting to a bailout join algorithm, which is guaranteed to complete eventually, but may be very slow. The data sizes in the example tables are not large enough to force a hash bailout, but it does result in multiple levels of hash recursion. You can see this for yourself by tracing the Hash Warning event using the Profiler tool. The final sort in the plan also suffers from a similar problem: it receives very little memory and has to perform multiple sort passes, saving intermediate runs to disk (the Sort Warnings Profiler event can be used to confirm this). Notice also that because hash joins don’t preserve sort order, the sort cannot be pushed down the plan toward the #OrdDetail table, as in the nested loops plan. Ok, so now we understand the problems, what can we do to fix it? We can address the hash spilling by forcing a different order for the joins: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #Custs C JOIN #OrdHeader H ON H.CustomerID = C.CustomerID JOIN #OrdDetail D ON D.SalesOrderID = H.SalesOrderID ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (MAXDOP 1, HASH JOIN, FORCE ORDER); With this plan, each of the inputs to the hash joins has a good estimate, and no hash recursion occurs. The final sort still suffers from the one-row estimate problem, and we get a single-pass sort warning as it writes rows to disk. Even so, the query runs to completion in three or four seconds. That’s around half the time of the previous hashing solution, but still not as fast as the nested loops trickery. Final Thoughts SQL Server’s optimizer makes cost-based decisions, so it is vital to provide it with accurate information. We can’t really blame the performance problems highlighted here on anything other than the decision to use completely unindexed tables, and not to allow the creation of additional statistics. I should probably stress that the nested loops solution shown above is not one I would normally contemplate in the real world. It’s there primarily for its educational and entertainment value. I might perhaps use it to demonstrate to the sceptical that SQL Server itself is crying out for an index. Be sure to read Brad’s original post for more details. My grateful thanks to him for granting permission to reuse some of his material. Paul White Email: [email protected] Twitter: @PaulWhiteNZ

Read the article
New Whitepaper: Best Practices for Gathering EBS Database Statistics

- by Elke Phelps (Oracle Development)

Most Oracle Applications DBAs and E-Business Suite users understand the importance of accurate database statistics. Missing, stale or skewed statistics can adversely affect performance. Oracle E-Business Suite statistics should only be gathered using FND_STATS or the Gather Statistics concurrent request. Gathering statistics with DBMS_STATS or the desupported ANALYZE command may result in suboptimal executions plans for E-Business Suite. Our E-Business Suite Performance Team has been busy implementing and testing new features for gathering statistics using FND_STATS in Oracle E-Business Suite databases. The new features and guidelines for when and how to gather statistics are published in the following whitepaper: Best Practices for Gathering Statistics with Oracle E-Business Suite (Note 1586374.1) The new white paper details the following options for gathering statistics using FND_STATS and the Gather Statistics concurrent request:: History Mode - backup existing statistics prior to gather new statistics GATHER_AUTO Option - gather statistics for tables based upon % change Histograms - collect statistics for histograms AUTO Sampling - use the new FND_STATS feature that supports the AUTO option for using AUTO sample size Extended Statistics - use the new FND_STATS feature that supports the creation of column groups and automatic statistics collection on the column groups when table statistics are gathered Incremental Statistics - gather incremental statistics for partitioned tables The new white paper also includes examples and performance test cases for the following: Extended Optimizer Statistics Incremental Statistics Gathering Concurrent Statistics Gathering This white paper includes details about the standalone Oracle E-Business Suite Release 11i and 12 patches that are required to take advantage of this new functionality. Your feedback is welcome We would be very interested in hearing about your experiences with these new options for gathering statistics. Please feel free to post your comments here or drop us a line privately.Related Oracle OpenWorld 2013 Session Getting Optimal Performance from Oracle E-Business Suite (CON8485) Related My Oracle Support Notes Collecting Statistics with Oracle EBS 11i and R12 (Note 368252.1) Non-EBS Related Blogs, White Papers and My Oracle Support Notes Oracle Optimizer Blog Understanding Optimizer Statistic (white paper) Fixed Objects Statistics(GATHER_FIXED_OBJECTS_STATS) Considerations (Note 798257.1)

Read the article
XSLT and possible alternatives [on hold]

- by wirrbel

I had a look at XSLT for transforming one XML file into another one (HTML, etc.). Now while I see that there are benefits to XSLT (being a standardized and used tool) I am reluctant for a couple of reasons XSLT processors seem to be quite huge / resource hungry XML is a bad notation for programming and thats what XSLT is all about. It do not want to troll XSLT here though I just want to point out what I dislike about it to give you an idea of what I would expect from an alternative. Having some Lisp background I wonder whether there are better ways for tree-structure transformations based upon some lisp. I have seen references to DSSSL, sadly most links about DSSSL are dead so its already challenging to see some code that illustrates it. Is DSSSL still in use? I remember that I had installed openjade once when checking out docbook stuff. Jeff Atwood's blog post seems to hint upon using Ruby instead of XSLT. Are there any sane ways to do XML transformations similar to XSLT in a non-xml programming language? I would be open for input on Useful libraries for scripting languages that facilitate XML transformations especially (but not exclusively) lisp-like transformation languages, or Ruby, etc. A few things I found so far: A couple of places on the web have pointed out Linq as a possible alternative. Quite generally I any kind of classifications, also from those who have had the best XSLT experience. For scheme http://cs.brown.edu/~sk/Publications/Papers/Published/kk-sxslt/ and http://www.okmij.org/ftp/Scheme/xml.html

Read the article
Ray-Box Intersection during Scene traversal with matrix transforms

- by Myx

Hello: There are a few ways that I'm testing my ray-box intersections: Using the ComputeIntersectionBox(...) method, that takes a ray and a box as arguments and computes the closest intersection of the ray and the box. This method works by forming a plane with each of the faces of the box and finding an intersection with each of the planes. Once an intersection is found, a check is made whether or not the point is on the surface of the box by checking that the intersection point is between the corner points. When I look at rays after running this algorithm on two different boxes, I obtain the correct intersections. Using ComputeIntersectionScene(...) method without using the matrix transformations on a scene that has two spheres, a dodecahedron (a triangular mesh), and two boxes. ComputeIntersectionScene(...) recursively traverses all of the nodes of the scene graph and computes the closest intersection with the given ray. This test in particular does not apply any transformations that parent nodes may have that also need to be applied to their children. With this test, I also obtain the correct intersections. Using ComputeIntersectionScene(...) method WITH the matrix transformations. This test works like the one above except that before finding an intersection between the ray and a node in the scene, the ray is transformed into the node's coordinate frame using the inverse of the node's transformation matrix and after the intersection has been computed, this intersection is transformed back into the world coordinates by applying the transformation matrix to the intersection point. When testing with the third method on the same scene file as described in 2, testing with 4 rays (thus one ray intersects the one sphere, one ray the the other sphere, one ray one box, and one ray the other box), only the two spheres get intersected and the two boxes do not get intersections. When I debug looking into my ComputeIntersectionBox(...) method, it actually tells me that the ray intersects every plane on the box but each intersection point does not lie on the box. This seems to be strange behavior, since when using test 2 without transformations, I obtain the correct box intersections (thus, I believe my ray-box intersection to be correct) and when using test 3 WITH transformations, I obtain the correct sphere intersections (thus, I believe my transformed ray should be OK). Any suggestions where I could be going wrong? Thank you in advance.

Read the article
When row estimation goes wrong

- by Dave Ballantyne

Whilst working at a client site, I hit upon one of those issues that you are not sure if that this is something entirely new or a bug or a gap in your knowledge. The client had a large query that needed optimizing. The query itself looked pretty good, no udfs, UNION ALL were used rather than UNION, most of the predicates were sargable other than one or two minor ones. There were a few extra joins that could be eradicated and having fixed up the query I then started to dive into the plan. I could see all manor of spills in the hash joins and the sort operations, these are caused when SQL Server has not reserved enough memory and has to write to tempdb. A VERY expensive operation that is generally avoidable. These, however, are a symptom of a bad row estimation somewhere else, and when that bad estimation is combined with other estimation errors, chaos can ensue. Working my way back down the plan, I found the cause, and the more I thought about it the more i came convinced that the optimizer could be making a much more intelligent choice. First step is to reproduce and I was able to simplify the query down a single join between two tables, Product and ProductStatus, from a business point of view, quite fundamental, find the status of particular products to show if ‘active’ ,’inactive’ or whatever. The query itself couldn’t be any simpler The estimated plan looked like this: Ignore the “!” warning which is a missing index, but notice that Products has 27,984 rows and the join outputs 14,000. The actual plan shows how bad that estimation of 14,000 is : So every row in Products has a corresponding row in ProductStatus. This is unsurprising, in fact it is guaranteed, there is a trusted FK relationship between the two columns. There is no way that the actual output of the join can be different from the input. The optimizer is already partly aware of the foreign key meta data, and that can be seen in the simplifiction stage. If we drop the Description column from the query: the join to ProductStatus is optimized out. It serves no purpose to the query, there is no data required from the table and the optimizer knows that the FK will guarantee that a matching row will exist so it has been removed. Surely the same should be applied to the row estimations in the initial example, right ? If you think so, please upvote this connect item. So what are our options in fixing this error ? Simply changing the join to a left join will cause the optimizer to think that we could allow the rows not to exist. or a subselect would also work However, this is a client site, Im not able to change each and every query where this join takes place but there is a more global switch that will fix this error, TraceFlag 2301. This is described as, perhaps loosely, “Enable advanced decision support optimizations”. We can test this on the original query in isolation by using the “QueryTraceOn” option and lo and behold our estimated plan now has the ‘correct’ estimation. Many thanks goes to Paul White (b|t) for his help and keeping me sane through this

Read the article
Ant build from Android-generated build file fails - how to fix?

- by Eno

Building our Android app from Ant fails with this error: [apply] [apply] UNEXPECTED TOP-LEVEL ERROR: [apply] java.lang.OutOfMemoryError: Java heap space [apply] at java.util.HashMap.<init>(HashMap.java:209) [apply] at java.util.HashSet.<init>(HashSet.java:86) [apply] at com.android.dx.ssa.Dominators.compress(Dominators.java:96) [apply] at com.android.dx.ssa.Dominators.eval(Dominators.java:132) [apply] at com.android.dx.ssa.Dominators.run(Dominators.java:213) [apply] at com.android.dx.ssa.DomFront.run(DomFront.java:84) [apply] at com.android.dx.ssa.SsaConverter.placePhiFunctions(SsaConverter.java:265) [apply] at com.android.dx.ssa.SsaConverter.convertToSsaMethod(SsaConverter.java:51) [apply] at com.android.dx.ssa.Optimizer.optimize(Optimizer.java:100) [apply] at com.android.dx.ssa.Optimizer.optimize(Optimizer.java:74) [apply] at com.android.dx.dex.cf.CfTranslator.processMethods(CfTranslator.java:269) [apply] at com.android.dx.dex.cf.CfTranslator.translate0(CfTranslator.java:131) [apply] at com.android.dx.dex.cf.CfTranslator.translate(CfTranslator.java:85) [apply] at com.android.dx.command.dexer.Main.processClass(Main.java:297) [apply] at com.android.dx.command.dexer.Main.processFileBytes(Main.java:276) [apply] at com.android.dx.command.dexer.Main.access$100(Main.java:56) [apply] at com.android.dx.command.dexer.Main$1.processFileBytes(Main.java:228) [apply] at com.android.dx.cf.direct.ClassPathOpener.processArchive(ClassPathOpener.java:245) [apply] at com.android.dx.cf.direct.ClassPathOpener.processOne(ClassPathOpener.java:130) [apply] at com.android.dx.cf.direct.ClassPathOpener.process(ClassPathOpener.java:108) [apply] at com.android.dx.command.dexer.Main.processOne(Main.java:245) [apply] at com.android.dx.command.dexer.Main.processAllFiles(Main.java:183) [apply] at com.android.dx.command.dexer.Main.run(Main.java:139) [apply] at com.android.dx.command.dexer.Main.main(Main.java:120) [apply] at com.android.dx.command.Main.main(Main.java:87) BUILD FAILED Ive tried giving Ant more memory by setting ANT_OPTS="-Xms256m -Xmx512m". (This build machine has 1Gb RAM). Do I just need more memory or is there anything else I can try?

Read the article
Method of transforming 3D vectors with a matrix

- by Drew Noakes

I've been doing some reading on transforming Vector3 with matrices, and am tossing up digging deeper into the math and coding this myself versus using existing code. For whatever reason my school curriculum never included matrices, so I'm filling a gap in my knowledge. Thankfully I only need a few simple things, I think. Context is that I'm programming a robot for the RoboCup 3D league. I'm coding it in C# but it'll have to run on Mono. Ideally I wouldn't use any existing graphics libraries for this (WinForms/WPF/XNA) as all I really need is a neat subset of matrix transformations. Specifically, I need translation and x/y/z rotations, and a way of combining multiple transformations into a single matrix. This will then be applied to my own Vector3 type to produce the transformed Vector3. I've read different advice about this. For example, some model the transformation with a 4x3 matrix, others with a 4x4 matrix. Also, some examples show that you need a forth value for the vector's matrix of 1. What happens to this value when it's included in the output? [1 0 0 0] [x y z 1] * [0 1 0 0] = [a b c d] [0 0 1 0] [2 4 6 1] The parts I'm missing are: What sizes my matrices should be Compositing transformations by multiplying the transformation matrices together Transforming 3D vectors with the resulting matrix As I mostly just want to get this running, any psuedo-code would be great. Information about what matrix values perform what transformations is quite clearly defined on many pages, so need not be discussed here unless you're very keen :)

Read the article
How to fix "OutOfMemoryError: java heap space" while compiling MonoDroid App in MonoDevelop

- by Rodja

When I try to compile one of my projects, I recently get the following error: Tool /usr/bin/java execution started with arguments: -jar /Applications/android-sdk-mac_x86/platform-tools/lib/dx.jar --no-strict --dex --output=obj/Debug/android/bin/classes.dex obj/Debug/android/bin/classes /Developer/MonoAndroid/usr/lib/mandroid/platforms/android-8/mono.android.jar FlurryAnalytics/Jars/FlurryAgent.jar Jars/android-support-v4.jar UNEXPECTED TOP-LEVEL ERROR: java.lang.OutOfMemoryError: Java heap space at com.android.dx.rop.code.RegisterSpecSet.<init>(RegisterSpecSet.java:49) at com.android.dx.rop.code.RegisterSpecSet.mutableCopy(RegisterSpecSet.java:383) at com.android.dx.ssa.LocalVariableInfo.mutableCopyOfStarts(LocalVariableInfo.java:169) at com.android.dx.ssa.LocalVariableExtractor.processBlock(LocalVariableExtractor.java:104) at com.android.dx.ssa.LocalVariableExtractor.doit(LocalVariableExtractor.java:90) at com.android.dx.ssa.LocalVariableExtractor.extract(LocalVariableExtractor.java:56) at com.android.dx.ssa.SsaConverter.convertToSsaMethod(SsaConverter.java:50) at com.android.dx.ssa.Optimizer.optimize(Optimizer.java:99) at com.android.dx.ssa.Optimizer.optimize(Optimizer.java:73) at com.android.dx.dex.cf.CfTranslator.processMethods(CfTranslator.java:273) at com.android.dx.dex.cf.CfTranslator.translate0(CfTranslator.java:134) at com.android.dx.dex.cf.CfTranslator.translate(CfTranslator.java:87) at com.android.dx.command.dexer.Main.processClass(Main.java:487) at com.android.dx.command.dexer.Main.processFileBytes(Main.java:459) at com.android.dx.command.dexer.Main.access$400(Main.java:67) at com.android.dx.command.dexer.Main$1.processFileBytes(Main.java:398) at com.android.dx.cf.direct.ClassPathOpener.processArchive(ClassPathOpener.java:245) at com.android.dx.cf.direct.ClassPathOpener.processOne(ClassPathOpener.java:131) at com.android.dx.cf.direct.ClassPathOpener.process(ClassPathOpener.java:109) at com.android.dx.command.dexer.Main.processOne(Main.java:422) at com.android.dx.command.dexer.Main.processAllFiles(Main.java:333) at com.android.dx.command.dexer.Main.run(Main.java:209) at com.android.dx.command.dexer.Main.main(Main.java:174) at com.android.dx.command.Main.main(Main.java:91) Other projects build as expected. I think I need to increase the heap size for this java build step? But how?

Read the article
The SSIS tuning tip that everyone misses

- by Rob Farley

I know that everyone misses this, because I’m yet to find someone who doesn’t have a bit of an epiphany when I describe this. When tuning Data Flows in SQL Server Integration Services, people see the Data Flow as moving from the Source to the Destination, passing through a number of transformations. What people don’t consider is the Source, getting the data out of a database. Remember, the source of data for your Data Flow is not your Source Component. It’s wherever the data is, within your database, probably on a disk somewhere. You need to tune your query to optimise it for SSIS, and this is what most people fail to do. I’m not suggesting that people don’t tune their queries – there’s plenty of information out there about making sure that your queries run as fast as possible. But for SSIS, it’s not about how fast your query runs. Let me say that again, but in bolder text: The speed of an SSIS Source is not about how fast your query runs. If your query is used in a Source component for SSIS, the thing that matters is how fast it starts returning data. In particular, those first 10,000 rows to populate that first buffer, ready to pass down the rest of the transformations on its way to the Destination. Let’s look at a very simple query as an example, using the AdventureWorks database: We’re picking the different Weight values out of the Product table, and it’s doing this by scanning the table and doing a Sort. It’s a Distinct Sort, which means that the duplicates are discarded. It'll be no surprise to see that the data produced is sorted. Obvious, I know, but I'm making a comparison to what I'll do later. Before I explain the problem here, let me jump back into the SSIS world... If you’ve investigated how to tune an SSIS flow, then you’ll know that some SSIS Data Flow Transformations are known to be Blocking, some are Partially Blocking, and some are simply Row transformations. Take the SSIS Sort transformation, for example. I’m using a larger data set for this, because my small list of Weights won’t demonstrate it well enough. Seven buffers of data came out of the source, but none of them could be pushed past the Sort operator, just in case the last buffer contained the data that would be sorted into the first buffer. This is a blocking operation. Back in the land of T-SQL, we consider our Distinct Sort operator. It’s also blocking. It won’t let data through until it’s seen all of it. If you weren’t okay with blocking operations in SSIS, why would you be happy with them in an execution plan? The source of your data is not your OLE DB Source. Remember this. The source of your data is the NCIX/CIX/Heap from which it’s being pulled. Picture it like this... the data flowing from the Clustered Index, through the Distinct Sort operator, into the SELECT operator, where a series of SSIS Buffers are populated, flowing (as they get full) down through the SSIS transformations. Alright, I know that I’m taking some liberties here, because the two queries aren’t the same, but consider the visual. The data is flowing from your disk and through your execution plan before it reaches SSIS, so you could easily find that a blocking operation in your plan is just as painful as a blocking operation in your SSIS Data Flow. Luckily, T-SQL gives us a brilliant query hint to help avoid this. OPTION (FAST 10000) This hint means that it will choose a query which will optimise for the first 10,000 rows – the default SSIS buffer size. And the effect can be quite significant. First let’s consider a simple example, then we’ll look at a larger one. Consider our weights. We don’t have 10,000, so I’m going to use OPTION (FAST 1) instead. You’ll notice that the query is more expensive, using a Flow Distinct operator instead of the Distinct Sort. This operator is consuming 84% of the query, instead of the 59% we saw from the Distinct Sort. But the first row could be returned quicker – a Flow Distinct operator is non-blocking. The data here isn’t sorted, of course. It’s in the same order that it came out of the index, just with duplicates removed. As soon as a Flow Distinct sees a value that it hasn’t come across before, it pushes it out to the operator on its left. It still has to maintain the list of what it’s seen so far, but by handling it one row at a time, it can push rows through quicker. Overall, it’s a lot more work than the Distinct Sort, but if the priority is the first few rows, then perhaps that’s exactly what we want. The Query Optimizer seems to do this by optimising the query as if there were only one row coming through: This 1 row estimation is caused by the Query Optimizer imagining the SELECT operation saying “Give me one row” first, and this message being passed all the way along. The request might not make it all the way back to the source, but in my simple example, it does. I hope this simple example has helped you understand the significance of the blocking operator. Now I’m going to show you an example on a much larger data set. This data was fetching about 780,000 rows, and these are the Estimated Plans. The data needed to be Sorted, to support further SSIS operations that needed that. First, without the hint. ...and now with OPTION (FAST 10000): A very different plan, I’m sure you’ll agree. In case you’re curious, those arrows in the top one are 780,000 rows in size. In the second, they’re estimated to be 10,000, although the Actual figures end up being 780,000. The top one definitely runs faster. It finished several times faster than the second one. With the amount of data being considered, these numbers were in minutes. Look at the second one – it’s doing Nested Loops, across 780,000 rows! That’s not generally recommended at all. That’s “Go and make yourself a coffee” time. In this case, it was about six or seven minutes. The faster one finished in about a minute. But in SSIS-land, things are different. The particular data flow that was consuming this data was significant. It was being pumped into a Script Component to process each row based on previous rows, creating about a dozen different flows. The data flow would take roughly ten minutes to run – ten minutes from when the data first appeared. The query that completes faster – chosen by the Query Optimizer with no hints, based on accurate statistics (rather than pretending the numbers are smaller) – would take a minute to start getting the data into SSIS, at which point the ten-minute flow would start, taking eleven minutes to complete. The query that took longer – chosen by the Query Optimizer pretending it only wanted the first 10,000 rows – would take only ten seconds to fill the first buffer. Despite the fact that it might have taken the database another six or seven minutes to get the data out, SSIS didn’t care. Every time it wanted the next buffer of data, it was already available, and the whole process finished in about ten minutes and ten seconds. When debugging SSIS, you run the package, and sit there waiting to see the Debug information start appearing. You look for the numbers on the data flow, and seeing operators going Yellow and Green. Without the hint, I’d sit there for a minute. With the hint, just ten seconds. You can imagine which one I preferred. By adding this hint, it felt like a magic wand had been waved across the query, to make it run several times faster. It wasn’t the case at all – but it felt like it to SSIS.

Read the article
Class member functions instantiated by traits [policies, actually]

- by Jive Dadson

I am reluctant to say I can't figure this out, but I can't figure this out. I've googled and searched Stack Overflow, and come up empty. The abstract, and possibly overly vague form of the question is, how can I use the traits-pattern to instantiate member functions? [Update: I used the wrong term here. It should be "policies" rather than "traits." Traits describe existing classes. Policies prescribe synthetic classes.] The question came up while modernizing a set of multivariate function optimizers that I wrote more than 10 years ago. The optimizers all operate by selecting a straight-line path through the parameter space away from the current best point (the "update"), then finding a better point on that line (the "line search"), then testing for the "done" condition, and if not done, iterating. There are different methods for doing the update, the line-search, and conceivably for the done test, and other things. Mix and match. Different update formulae require different state-variable data. For example, the LMQN update requires a vector, and the BFGS update requires a matrix. If evaluating gradients is cheap, the line-search should do so. If not, it should use function evaluations only. Some methods require more accurate line-searches than others. Those are just some examples. The original version instantiates several of the combinations by means of virtual functions. Some traits are selected by setting mode bits that are tested at runtime. Yuck. It would be trivial to define the traits with #define's and the member functions with #ifdef's and macros. But that's so twenty years ago. It bugs me that I cannot figure out a whiz-bang modern way. If there were only one trait that varied, I could use the curiously recurring template pattern. But I see no way to extend that to arbitrary combinations of traits. I tried doing it using boost::enable_if, etc.. The specialized state information was easy. I managed to get the functions done, but only by resorting to non-friend external functions that have the this-pointer as a parameter. I never even figured out how to make the functions friends, much less member functions. The compiler (VC++ 2008) always complained that things didn't match. I would yell, "SFINAE, you moron!" but the moron is probably me. Perhaps tag-dispatch is the key. I haven't gotten very deeply into that. Surely it's possible, right? If so, what is best practice? UPDATE: Here's another try at explaining it. I want the user to be able to fill out an order (manifest) for a custom optimizer, something like ordering off of a Chinese menu - one from column A, one from column B, etc.. Waiter, from column A (updaters), I'll have the BFGS update with Cholesky-decompositon sauce. From column B (line-searchers), I'll have the cubic interpolation line-search with an eta of 0.4 and a rho of 1e-4, please. Etc... UPDATE: Okay, okay. Here's the playing-around that I've done. I offer it reluctantly, because I suspect it's a completely wrong-headed approach. It runs okay under vc++ 2008. #include <boost/utility.hpp> #include <boost/type_traits/integral_constant.hpp> namespace dj { struct CBFGS { void bar() {printf("CBFGS::bar %d\n", data);} CBFGS(): data(1234){} int data; }; template<class T> struct is_CBFGS: boost::false_type{}; template<> struct is_CBFGS<CBFGS>: boost::true_type{}; struct LMQN {LMQN(): data(54.321){} void bar() {printf("LMQN::bar %lf\n", data);} double data; }; template<class T> struct is_LMQN: boost::false_type{}; template<> struct is_LMQN<LMQN> : boost::true_type{}; // "Order form" struct default_optimizer_traits { typedef CBFGS update_type; // Selection from column A - updaters }; template<class traits> class Optimizer; template<class traits> void foo(typename boost::enable_if<is_LMQN<typename traits::update_type>, Optimizer<traits> >::type& self) { printf(" LMQN %lf\n", self.data); } template<class traits> void foo(typename boost::enable_if<is_CBFGS<typename traits::update_type>, Optimizer<traits> >::type& self) { printf("CBFGS %d\n", self.data); } template<class traits = default_optimizer_traits> class Optimizer{ friend typename traits::update_type; //friend void dj::foo<traits>(typename Optimizer<traits> & self); // How? public: //void foo(void); // How??? void foo() { dj::foo<traits>(*this); } void bar() { data.bar(); } //protected: // How? typedef typename traits::update_type update_type; update_type data; }; } // namespace dj int main() { dj::Optimizer<> opt; opt.foo(); opt.bar(); std::getchar(); return 0; }

Read the article
SQL SERVER – Challenge – Puzzle – Usage of FAST Hint

- by pinaldave

I was recently working with various SQL Server Hints. After working for a day on various hints, I realize that for one hint, I am not able to come up with good example. The hint is FAST. Let us look at the definition of the FAST hint from the Book On-Line. FAST number_rows Specifies that the query is optimized for fast retrieval of the first number_rows. This is a nonnegative integer. After the first number_rows are returned, the query continues execution and produces its full result set. Now the question is in what condition this hint can be useful. I have tried so many different combination, I have found this hint does not make much performance difference, infect I did not notice any change in time taken to load the resultset. I noticed that this hint does not change number of the page read to return result. Now when there is difference in performance is expected because if you read the what FAST hint does is that it only returns first few results FAST – which does not mean there will be difference in performance. I also understand that this hint gives the guidance/suggestions/hint to query optimizer that there are only 100 rows are in expected resultset. This tricking the optimizer to think there are only 100 rows and which (may) lead to render different execution plan than the one which it would have taken in normal case (without hint). Again, not necessarily, this will happen always. Now if you read above discussion, you will find that basic understanding of the hint is very clear to me but I still feel that I am missing something. Here are my questions: 1) In what condition this hint can be useful? What is the case, when someone want to see first few rows early because my experience suggests that when first few rows are rendered remaining rows are rendered as well. 2) Is there any way application can retrieve the fast fetched rows from SQL Server? 3) Do you use this hint in your application? Why? When? and How? Here are few examples I have attempted during the my experiment and found there is no difference in execution plan except its estimated number of rows are different leading optimizer think that the cost is less but in reality that is not the case. USE AdventureWorks GO SET STATISTICS IO ON SET STATISTICS TIME ON GO --------------------------------------------- -- Table Scan with Fast Hint SELECT * FROM Sales.SalesOrderDetail GO SELECT * FROM Sales.SalesOrderDetail OPTION (FAST 100) GO --------------------------------------------- -- Table Scan with Where on Index Key SELECT * FROM Sales.SalesOrderDetail WHERE OrderQty = 14 GO SELECT * FROM Sales.SalesOrderDetail WHERE OrderQty = 14 OPTION (FAST 100) GO --------------------------------------------- -- Table Scan with Where on Index Key SELECT * FROM Sales.SalesOrderDetail WHERE SalesOrderDetailID < 1000 GO SELECT * FROM Sales.SalesOrderDetail WHERE SalesOrderDetailID < 1000 OPTION (FAST 100) GO Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Pinal Dave, SQL, SQL Authority, SQL Puzzle, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL Contest – Win USD 300 Worth Gift – Cartoon Contest is Back

- by pinaldave

There are two excellent contests and we have lots of winning to do this year end. 1) Win USD 25 Amazon Gift Cards (10 Units) This is very simple, you just have to download SQL Server DB Optimizer. That’s it! There are only two conditions: You must have a valid email address. As USD 25 Amazon Gift Card will be sent to the same address. Download DB Optimizer between today and Dec 8, 2012. Link to Download DB Optimizer. Every day one winner will be notified about their winning USD 25 Amazon Gift Cards for next 10 days. 2) Win Star Wars R2-D2 Inflatable R/C This the coolest thing to win. I personally want one but as I am running a contest, I can’t participate. You get this cool Remote Controlled Device – you just have to answer following cartoon contest. Read the complete story and think what will be the answer provided by the smart employee. There are only two conditions: Leave your answer in the comment area of this blog post (every comment will be hidden till Dec 8, 2012). Please leave your answer in the comment area between today and Dec 8, 2012. Remember you can participate as many times as you want. Make sure that your answer is correct and creative. The most creative answer will be selected. The decision of contest owner will be final. We may have runner’s up prices but for the moment let us try to win R2-D2. Here is the cool video of R2D2. Now here is the cartoon story, please follow the story and complete the very last cartoon template. Your answer should be correct and should be creative. However, the ideal answer will not be longer than one or two sentences. Hint: (Hint) Well, Leave your answer in the comment area of this blog post. If you do not win R2D2, trust me there are chances you may win a surprise gift from me. Remember your answer should be correct and should be creative. However, the ideal answer will not be longer than one or two sentences. Last day to participate in both of the contest is Dec 8, 2012. We will announce the winner in the week of December 10. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Puzzle, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority News, SQLServer, T SQL, Technology

Read the article
Spooling in SQL execution plans

- by Rob Farley

Sewing has never been my thing. I barely even know the terminology, and when discussing this with American friends, I even found out that half the words that Americans use are different to the words that English and Australian people use. That said – let’s talk about spools! In particular, the Spool operators that you find in some SQL execution plans. This post is for T-SQL Tuesday, hosted this month by me! I’ve chosen to write about spools because they seem to get a bad rap (even in my song I used the line “There’s spooling from a CTE, they’ve got recursion needlessly”). I figured it was worth covering some of what spools are about, and hopefully explain why they are remarkably necessary, and generally very useful. If you have a look at the Books Online page about Plan Operators, at http://msdn.microsoft.com/en-us/library/ms191158.aspx, and do a search for the word ‘spool’, you’ll notice it says there are 46 matches. 46! Yeah, that’s what I thought too... Spooling is mentioned in several operators: Eager Spool, Lazy Spool, Index Spool (sometimes called a Nonclustered Index Spool), Row Count Spool, Spool, Table Spool, and Window Spool (oh, and Cache, which is a special kind of spool for a single row, but as it isn’t used in SQL 2012, I won’t describe it any further here). Spool, Table Spool, Index Spool, Window Spool and Row Count Spool are all physical operators, whereas Eager Spool and Lazy Spool are logical operators, describing the way that the other spools work. For example, you might see a Table Spool which is either Eager or Lazy. A Window Spool can actually act as both, as I’ll mention in a moment. In sewing, cotton is put onto a spool to make it more useful. You might buy it in bulk on a cone, but if you’re going to be using a sewing machine, then you quite probably want to have it on a spool or bobbin, which allows it to be used in a more effective way. This is the picture that I want you to think about in relation to your data. I’m sure you use spools every time you use your sewing machine. I know I do. I can’t think of a time when I’ve got out my sewing machine to do some sewing and haven’t used a spool. However, I often run SQL queries that don’t use spools. You see, the data that is consumed by my query is typically in a useful state without a spool. It’s like I can just sew with my cotton despite it not being on a spool! Many of my favourite features in T-SQL do like to use spools though. This looks like a very similar query to before, but includes an OVER clause to return a column telling me the number of rows in my data set. I’ll describe what’s going on in a few paragraphs’ time. So what does a Spool operator actually do? The spool operator consumes a set of data, and stores it in a temporary structure, in the tempdb database. This structure is typically either a Table (ie, a heap), or an Index (ie, a b-tree). If no data is actually needed from it, then it could also be a Row Count spool, which only stores the number of rows that the spool operator consumes. A Window Spool is another option if the data being consumed is tightly linked to windows of data, such as when the ROWS/RANGE clause of the OVER clause is being used. You could maybe think about the type of spool being like whether the cotton is going onto a small bobbin to fit in the base of the sewing machine, or whether it’s a larger spool for the top. A Table or Index Spool is either Eager or Lazy in nature. Eager and Lazy are Logical operators, which talk more about the behaviour, rather than the physical operation. If I’m sewing, I can either be all enthusiastic and get all my cotton onto the spool before I start, or I can do it as I need it. “Lazy” might not the be the best word to describe a person – in the SQL world it describes the idea of either fetching all the rows to build up the whole spool when the operator is called (Eager), or populating the spool only as it’s needed (Lazy). Window Spools are both physical and logical. They’re eager on a per-window basis, but lazy between windows. And when is it needed? The way I see it, spools are needed for two reasons. 1 – When data is going to be needed AGAIN. 2 – When data needs to be kept away from the original source. If you’re someone that writes long stored procedures, you are probably quite aware of the second scenario. I see plenty of stored procedures being written this way – where the query writer populates a temporary table, so that they can make updates to it without risking the original table. SQL does this too. Imagine I’m updating my contact list, and some of my changes move data to later in the book. If I’m not careful, I might update the same row a second time (or even enter an infinite loop, updating it over and over). A spool can make sure that I don’t, by using a copy of the data. This problem is known as the Halloween Effect (not because it’s spooky, but because it was discovered in late October one year). As I’m sure you can imagine, the kind of spool you’d need to protect against the Halloween Effect would be eager, because if you’re only handling one row at a time, then you’re not providing the protection... An eager spool will block the flow of data, waiting until it has fetched all the data before serving it up to the operator that called it. In the query below I’m forcing the Query Optimizer to use an index which would be upset if the Name column values got changed, and we see that before any data is fetched, a spool is created to load the data into. This doesn’t stop the index being maintained, but it does mean that the index is protected from the changes that are being done. There are plenty of times, though, when you need data repeatedly. Consider the query I put above. A simple join, but then counting the number of rows that came through. The way that this has executed (be it ideal or not), is to ask that a Table Spool be populated. That’s the Table Spool operator on the top row. That spool can produce the same set of rows repeatedly. This is the behaviour that we see in the bottom half of the plan. In the bottom half of the plan, we see that the a join is being done between the rows that are being sourced from the spool – one being aggregated and one not – producing the columns that we need for the query. Table v Index When considering whether to use a Table Spool or an Index Spool, the question that the Query Optimizer needs to answer is whether there is sufficient benefit to storing the data in a b-tree. The idea of having data in indexes is great, but of course there is a cost to maintaining them. Here we’re creating a temporary structure for data, and there is a cost associated with populating each row into its correct position according to a b-tree, as opposed to simply adding it to the end of the list of rows in a heap. Using a b-tree could even result in page-splits as the b-tree is populated, so there had better be a reason to use that kind of structure. That all depends on how the data is going to be used in other parts of the plan. If you’ve ever thought that you could use a temporary index for a particular query, well this is it – and the Query Optimizer can do that if it thinks it’s worthwhile. It’s worth noting that just because a Spool is populated using an Index Spool, it can still be fetched using a Table Spool. The details about whether or not a Spool used as a source shows as a Table Spool or an Index Spool is more about whether a Seek predicate is used, rather than on the underlying structure. Recursive CTE I’ve already shown you an example of spooling when the OVER clause is used. You might see them being used whenever you have data that is needed multiple times, and CTEs are quite common here. With the definition of a set of data described in a CTE, if the query writer is leveraging this by referring to the CTE multiple times, and there’s no simplification to be leveraged, a spool could theoretically be used to avoid reapplying the CTE’s logic. Annoyingly, this doesn’t happen. Consider this query, which really looks like it’s using the same data twice. I’m creating a set of data (which is completely deterministic, by the way), and then joining it back to itself. There seems to be no reason why it shouldn’t use a spool for the set described by the CTE, but it doesn’t. On the other hand, if we don’t pull as many columns back, we might see a very different plan. You see, CTEs, like all sub-queries, are simplified out to figure out the best way of executing the whole query. My example is somewhat contrived, and although there are plenty of cases when it’s nice to give the Query Optimizer hints about how to execute queries, it usually doesn’t do a bad job, even without spooling (and you can always use a temporary table). When recursion is used, though, spooling should be expected. Consider what we’re asking for in a recursive CTE. We’re telling the system to construct a set of data using an initial query, and then use set as a source for another query, piping this back into the same set and back around. It’s very much a spool. The analogy of cotton is long gone here, as the idea of having a continual loop of cotton feeding onto a spool and off again doesn’t quite fit, but that’s what we have here. Data is being fed onto the spool, and getting pulled out a second time when the spool is used as a source. (This query is running on AdventureWorks, which has a ManagerID column in HumanResources.Employee, not AdventureWorks2012) The Index Spool operator is sucking rows into it – lazily. It has to be lazy, because at the start, there’s only one row to be had. However, as rows get populated onto the spool, the Table Spool operator on the right can return rows when asked, ending up with more rows (potentially) getting back onto the spool, ready for the next round. (The Assert operator is merely checking to see if we’ve reached the MAXRECURSION point – it vanishes if you use OPTION (MAXRECURSION 0), which you can try yourself if you like). Spools are useful. Don’t lose sight of that. Every time you use temporary tables or table variables in a stored procedure, you’re essentially doing the same – don’t get upset at the Query Optimizer for doing so, even if you think the spool looks like an expensive part of the query. I hope you’re enjoying this T-SQL Tuesday. Why not head over to my post that is hosting it this month to read about some other plan operators? At some point I’ll write a summary post – once I have you should find a comment below pointing at it. @rob_farley

Read the article
Why use 3d matrix and camera in 2D world for 2d geometric figures?

- by Navy Seal

I'm working in XNA on a 2d isometric world/game and I'm using DrawUserPrimitives to draw some geometric figures... I saw some tutorials about creating dynamic shadows but I didn't understood why they use a "3d" matrix to control the transformations since the figure I'm drawing is in 2d perspective. I know I'm drawing a 2d figure in 3d but I still can't understand if I really need to work with the matrix. Is there any advantage in using a 3d Matrix to control camera and view? Any reason why I can't just update my vertex's positions by using a regular method since the view is always the same... And since I want to work only with single figures, won't this cause all the geometric figures have the same transformations simultaneously? To understand better what I mean here's a video http://www.youtube.com/watch?v=LjvsGHXaGEA&feature=player_embedded

Read the article
Day 5 of Oracle OpenWorld 2012 October 4th

- by Maria Colgan

Its the last day of Oracle OpenWorld and we have saved the very best for last. So hopefully you are still awake and functioning at this stage! Today, we present An Insider’s View of How the Optimizer Works (Session CON8457) at Moscone South - room 104. This session explains how the latest version of the optimizer works and the best ways you can influence its decisions to ensure you get optimal execution every time We really hope you have enjoy the conference so far and will stop by our session this afternoon before you head off home! +Maria Colgan

Read the article

Search Results

Search found 621 results on 25 pages for 'optimizer transformations'.

Page 7/25 | < Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14 | Next Page >

- by Kavitha Srinivasan

- by Tamarick Hill

- by ejohnson2010

- by Jive Dadson

- by Optimizer

- by user52856

- by Daniil Petrov

- by Paul White NZ

- by Elke Phelps (Oracle Development)

- by wirrbel

- by Myx

- by Dave Ballantyne

- by Eno

- by Drew Noakes

- by Rodja

- by Rob Farley

- by Jive Dadson

- by pinaldave

- by pinaldave

- by Rob Farley

- by Navy Seal

- by Maria Colgan

< Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14 | Next Page >