Search Results

Search found 4278 results on 172 pages for 'white rose'.

Page 15/172 | < Previous Page | 11 12 13 14 15 16 17 18 19 20 21 22  | Next Page >

  • Design by Contract with Microsoft .Net Code Contract

    - by Fredrik N
    I have done some talks on different events and summits about Defensive Programming and Design by Contract, last time was at Cornerstone’s Developer Summit 2010. Next time will be at SweNug (Sweden .Net User Group). I decided to write a blog post about of some stuffs I was talking about. Users are a terrible thing! Protect your self from them ”Human users have a gift for doing the worst possible thing at the worst possible time.” – Michael T. Nygard, Release It! The kind of users Michael T. Nygard are talking about is the users of a system. We also have users that uses our code, the users I’m going to focus on is the users of our code. Me and you and another developers. “Any fool can write code that a computer can understand. Good programmers write code that humans can understand.” – Martin Fowler Good programmers also writes code that humans know how to use, good programmers also make sure software behave in a predictable manner despise inputs or user actions. Design by Contract   Design by Contract (DbC) is a way for us to make a contract between us (the code writer) and the users of our code. It’s about “If you give me this, I promise to give you this”. It’s not about business validations, that is something completely different that should be part of the domain model. DbC is to make sure the users of our code uses it in a correct way, and that we can rely on the contract and write code in a way where we know that the users will follow the contract. It will make it much easier for us to write code with a contract specified. Something like the following code is something we may see often: public void DoSomething(Object value) { value.DoIKnowThatICanDoThis(); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Where “value” can be uses directly or passed to other methods and later be used. What some of us can easily forget here is that the “value” can be “null”. We will probably not passing a null value, but someone else that uses our code maybe will do it. I think most of you (including me) have passed “null” into a method because you don’t know if the argument need to be specified to a valid value etc. I bet most of you also have got the “Null reference exception”. Sometimes this “Null reference exception” can be hard and take time to fix, because we need to search among our code to see where the “null” value was passed in etc. Wouldn’t it be much better if we can as early as possible specify that the value can’t not be null, so the users of our code also know it when the users starts to use our code, and before run time execution of the code? This is where DbC comes into the picture. We can use DbC to specify what we need, and by doing so we can rely on the contract when we write our code. So the code above can actually use the DoIKnowThatICanDoThis() method on the value object without being worried that the “value” can be null. The contract between the users of the code and us writing the code, says that the “value” can’t be null.   Pre- and Postconditions   When working with DbC we are specifying pre- and postconditions.  Precondition is a condition that should be met before a query or command is executed. An example of a precondition is: “The Value argument of the method can’t be null”, and we make sure the “value” isn’t null before the method is called. Postcondition is a condition that should be met when a command or query is completed, a postcondition will make sure the result is correct. An example of a postconditon is “The method will return a list with at least 1 item”. Commands an Quires When using DbC, we need to know what a Command and a Query is, because some principles that can be good to follow are based on commands and queries. A Command is something that will not return anything, like the SQL’s CREATE, UPDATE and DELETE. There are two kinds of Commands when using DbC, the Creation commands (for example a Constructor), and Others. Others can for example be a Command to add a value to a list, remove or update a value etc. //Creation commands public Stack(int size) //Other commands public void Push(object value); public void Remove(); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; }   A Query, is something that will return something, for example an Attribute, Property or a Function, like the SQL’s SELECT.   There are two kinds of Queries, the Basic Queries  (Quires that aren’t based on another queries), and the Derived Queries, queries that is based on another queries. Here is an example of queries of a Stack: //Basic Queries public int Count; public object this[int index] { get; } //Derived Queries //Is related to Count Query public bool IsEmpty() { return Count == 0; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } To understand about some principles that are good to follow when using DbC, we need to know about the Commands and different Queries. The 6 Principles When working with DbC, it’s advisable to follow some principles to make it easier to define and use contracts. The following DbC principles are: Separate commands and queries. Separate basic queries from derived queries. For each derived query, write a postcondition that specifies what result will be returned, in terms of one or more basic queries. For each command, write a postcondition that specifies the value of every basic query. For every query and command, decide on a suitable precondition. Write invariants to define unchanging properties of objects. Before I will write about each of them I want you to now that I’m going to use .Net 4.0 Code Contract. I will in the rest of the post uses a simple Stack (Yes I know, .Net already have a Stack class) to give you the basic understanding about using DbC. A Stack is a data structure where the first item in, will be the first item out. Here is a basic implementation of a Stack where not contract is specified yet: public class Stack { private object[] _array; //Basic Queries public uint Count; public object this[uint index] { get { return _array[index]; } set { _array[index] = value; } } //Derived Queries //Is related to Count Query public bool IsEmpty() { return Count == 0; } //Is related to Count and this[] Query public object Top() { return this[Count]; } //Creation commands public Stack(uint size) { Count = 0; _array = new object[size]; } //Other commands public void Push(object value) { this[++Count] = value; } public void Remove() { this[Count] = null; Count--; } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; }   Note: The Stack is implemented in a way to demonstrate the use of Code Contract in a simple way, the implementation may not look like how you would implement it, so don’t think this is the perfect Stack implementation, only used for demonstration.   Before I will go deeper into the principles I will simply mention how we can use the .Net Code Contract. I mention before about pre- and postcondition, is about “Require” something and to “Ensure” something. When using Code Contract, we will use a static class called “Contract” and is located in he “System.Diagnostics.Contracts” namespace. The contract must be specified at the top or our member statement block. To specify a precondition with Code Contract we uses the Contract.Requires method, and to specify a postcondition, we uses the Contract.Ensure method. Here is an example where both a pre- and postcondition are used: public object Top() { Contract.Requires(Count > 0, "Stack is empty"); Contract.Ensures(Contract.Result<object>() == this[Count]); return this[Count]; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; }   The contract above requires that the Count is greater than 0, if not we can’t get the item at the Top of a Stack. We also Ensures that the results (By using the Contract.Result method, we can specify a postcondition that will check if the value returned from a method is correct) of the Top query is equal to this[Count].   1. Separate Commands and Queries   When working with DbC, it’s important to separate Command and Quires. A method should either be a command that performs an Action, or returning information to the caller, not both. By asking a question the answer shouldn’t be changed. The following is an example of a Command and a Query of a Stack: public void Push(object value) public object Top() .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; }   The Push is a command and will not return anything, just add a value to the Stack, the Top is a query to get the item at the top of the stack.   2. Separate basic queries from derived queries There are two different kinds of queries,  the basic queries that doesn’t rely on another queries, and derived queries that uses a basic query. The “Separate basic queries from derived queries” principle is about about that derived queries can be specified in terms of basic queries. So this principles is more about recognizing that a query is a derived query or a basic query. It will then make is much easier to follow the other principles. The following code shows a basic query and a derived query: //Basic Queries public uint Count; //Derived Queries //Is related to Count Query public bool IsEmpty() { return Count == 0; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; }   We can see that IsEmpty will use the Count query, and that makes the IsEmpty a Derived query.   3. For each derived query, write a postcondition that specifies what result will be returned, in terms of one or more basic queries.   When the derived query is recognize we can follow the 3ed principle. For each derived query, we can create a postcondition that specifies what result our derived query will return in terms of one or more basic queries. Remember that DbC is about contracts between the users of the code and us writing the code. So we can’t use demand that the users will pass in a valid value, we must also ensure that we will give the users what the users wants, when the user is following our contract. The IsEmpty query of the Stack will use a Count query and that will make the IsEmpty a Derived query, so we should now write a postcondition that specified what results will be returned, in terms of using a basic query and in this case the Count query, //Basic Queries public uint Count; //Derived Queries public bool IsEmpty() { Contract.Ensures(Contract.Result<bool>() == (Count == 0)); return Count == 0; } The Contract.Ensures is used to create a postcondition. The above code will make sure that the results of the IsEmpty (by using the Contract.Result to get the result of the IsEmpty method) is correct, that will say that the IsEmpty will be either true or false based on Count is equal to 0 or not. The postcondition are using a basic query, so the IsEmpty is now following the 3ed principle. We also have another Derived Query, the Top query, it will also need a postcondition and it uses all basic queries. The Result of the Top method must be the same value as the this[] query returns. //Basic Queries public uint Count; public object this[uint index] { get { return _array[index]; } set { _array[index] = value; } } //Derived Queries //Is related to Count and this[] Query public object Top() { Contract.Ensures(Contract.Result<object>() == this[Count]); return this[Count]; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; }   4. For each command, write a postcondition that specifies the value of every basic query.   For each command we will create a postconditon that specifies the value of basic queries. If we look at the Stack implementation we will have three Commands, one Creation command, the Constructor, and two others commands, Push and Remove. Those commands need a postcondition and they should include basic query to follow the 4th principle. //Creation commands public Stack(uint size) { Contract.Ensures(Count == 0); Count = 0; _array = new object[size]; } //Other commands public void Push(object value) { Contract.Ensures(Count == Contract.OldValue<uint>(Count) + 1); Contract.Ensures(this[Count] == value); this[++Count] = value; } public void Remove() { Contract.Ensures(Count == Contract.OldValue<uint>(Count) - 1); this[Count] = null; Count--; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; }   As you can see the Create command will Ensures that Count will be 0 when the Stack is created, when a Stack is created there shouldn’t be any items in the stack. The Push command will take a value and put it into the Stack, when an item is pushed into the Stack, the Count need to be increased to know the number of items added to the Stack, and we must also make sure the item is really added to the Stack. The postconditon of the Push method will make sure the that old value of the Count (by using the Contract.OldValue we can get the value a Query has before the method is called)  plus 1 will be equal to the Count query, this is the way we can ensure that the Push will increase the Count with one. We also make sure the this[] query will now contain the item we pushed into the Stack. The Remove method must make sure the Count is decreased by one when the top item is removed from the Stack. The Commands is now following the 4th principle, where each command now have a postcondition that used the value of basic queries. Note: The principle says every basic Query, the Remove only used one Query the Count, it’s because this command can’t use the this[] query because an item is removed, so the only way to make sure an item is removed is to just use the Count query, so the Remove will still follow the principle.   5. For every query and command, decide on a suitable precondition.   We have now focused only on postcondition, now time for some preconditons. The 5th principle is about deciding a suitable preconditon for every query and command. If we starts to look at one of our basic queries (will not go through all Queries and commands here, just some of them) the this[] query, we can’t pass an index that is lower then 1 (.Net arrays and list are zero based, but not the stack in this blog post ;)) and the index can’t be lesser than the number of items in the stack. So here we will need a preconditon. public object this[uint index] { get { Contract.Requires(index >= 1); Contract.Requires(index <= Count); return _array[index]; } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Think about the Contract as an documentation about how to use the code in a correct way, so if the contract could be specified elsewhere (not part of the method body), we could simply write “return _array[index]” and there is no need to check if index is greater or lesser than Count, because that is specified in a “contract”. The implementation of Code Contract, requires that the contract is specified in the code. As a developer I would rather have this contract elsewhere (Like Spec#) or implemented in a way Eiffel uses it as part of the language. Now when we have looked at one Query, we can also look at one command, the Remove command (You can see the whole implementation of the Stack at the end of this blog post, where precondition is added to more queries and commands then what I’m going to show in this section). We can only Remove an item if the Count is greater than 0. So we can write a precondition that will require that Count must be greater than 0. public void Remove() { Contract.Requires(Count > 0); Contract.Ensures(Count == Contract.OldValue<uint>(Count) - 1); this[Count] = null; Count--; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; }   6. Write invariants to define unchanging properties of objects.   The last principle is about making sure the object are feeling great! This is done by using invariants. When using Code Contract we can specify invariants by adding a method with the attribute ContractInvariantMethod, the method must be private or public and can only contains calls to Contract.Invariant. To make sure the Stack feels great, the Stack must have 0 or more items, the Count can’t never be a negative value to make sure each command and queries can be used of the Stack. Here is our invariant for the Stack object: [ContractInvariantMethod] private void ObjectInvariant() { Contract.Invariant(Count >= 0); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; }   Note: The ObjectInvariant method will be called every time after a Query or Commands is called. Here is the full example using Code Contract:   public class Stack { private object[] _array; //Basic Queries public uint Count; public object this[uint index] { get { Contract.Requires(index >= 1); Contract.Requires(index <= Count); return _array[index]; } set { Contract.Requires(index >= 1); Contract.Requires(index <= Count); _array[index] = value; } } //Derived Queries //Is related to Count Query public bool IsEmpty() { Contract.Ensures(Contract.Result<bool>() == (Count == 0)); return Count == 0; } //Is related to Count and this[] Query public object Top() { Contract.Requires(Count > 0, "Stack is empty"); Contract.Ensures(Contract.Result<object>() == this[Count]); return this[Count]; } //Creation commands public Stack(uint size) { Contract.Requires(size > 0); Contract.Ensures(Count == 0); Count = 0; _array = new object[size]; } //Other commands public void Push(object value) { Contract.Requires(value != null); Contract.Ensures(Count == Contract.OldValue<uint>(Count) + 1); Contract.Ensures(this[Count] == value); this[++Count] = value; } public void Remove() { Contract.Requires(Count > 0); Contract.Ensures(Count == Contract.OldValue<uint>(Count) - 1); this[Count] = null; Count--; } [ContractInvariantMethod] private void ObjectInvariant() { Contract.Invariant(Count >= 0); } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Summary By using Design By Contract we can make sure the users are using our code in a correct way, and we must also make sure the users will get the expected results when they uses our code. This can be done by specifying contracts. To make it easy to use Design By Contract, some principles may be good to follow like the separation of commands an queries. With .Net 4.0 we can use the Code Contract feature to specify contracts.

    Read the article

  • Add logo background color to data returned by StackAuth sites route

    - by Yi Jiang
    Given that now with Stack Exchange 2.0 the logo of some of the sites, like Web Apps, AskUbuntu, Photography, Gaming and Pro Webmasters have non-white background, I think it will be best if the StackAuth sites route can include the preferred background color for those the logo of these sites. This is especially important for sites like Photography whose logo is unreadable if the traditional white is used. Edit: Here's an example of what I mean here: As you can see, the AskUbuntu logo text totally invisible against a white background.

    Read the article

  • Vermeidung von SOA Anti-Patterns mittels AIA

    - by Hans Viehmann
    Gerade ist mir ein White Paper des Enterprise Architecture Teams in die Hände gefallen, das sich mit SOA Anti-Patterns befasst. Es ist zwar kein AIA Paper im eigentlichen Sinne, aber mit AIA hat man natürlich eine gute Unterstützung darin, die dort beschriebenen Fehler zu vermeiden. Das White Paper behandelt Themen wie: Vermeidung von SOA Silos SOA Reifegrad und Projekt-Management Ausuferndes Service Portfolio Umgang mit Referenz-Architekturen EAI 2.0 - Punkt-zu-Punkt Integration auf offenen Standards Ein Link auf das Dokument ist unten angefügt - viel Vergnügen bei der Lektüre ... Oracle White Paper: SOA Anti-Patterns.

    Read the article

  • New Whitepaper: Best Practices for Gathering EBS Database Statistics

    - by Elke Phelps (Oracle Development)
    Most Oracle Applications DBAs and E-Business Suite users understand the importance of accurate database statistics.  Missing, stale or skewed statistics can adversely affect performance.  Oracle E-Business Suite statistics should only be gathered using FND_STATS or the Gather Statistics concurrent request. Gathering statistics with DBMS_STATS or the desupported ANALYZE command may result in suboptimal executions plans for E-Business Suite. Our E-Business Suite Performance Team has been busy implementing and testing new features for gathering statistics using FND_STATS in Oracle E-Business Suite databases.  The new features and guidelines for when and how to gather statistics are published in the following whitepaper: Best Practices for Gathering Statistics with Oracle E-Business Suite (Note 1586374.1) The new white paper details the following options for gathering statistics using FND_STATS and the Gather Statistics concurrent request:: History Mode - backup existing statistics prior to gather new statistics GATHER_AUTO Option - gather statistics for tables based upon % change Histograms - collect statistics for histograms AUTO Sampling - use the new FND_STATS feature that supports the AUTO option for using AUTO sample size Extended Statistics - use the new FND_STATS feature that supports the creation of column groups and automatic statistics collection on the column groups when table statistics are gathered Incremental Statistics - gather incremental statistics for partitioned tables The new white paper also includes examples and performance test cases for the following: Extended Optimizer Statistics Incremental Statistics Gathering Concurrent Statistics Gathering This white paper includes details about the standalone Oracle E-Business Suite Release 11i and 12 patches that are required to take advantage of this new functionality. Your feedback is welcome We would be very interested in hearing about your experiences with these new options for gathering statistics.  Please feel free to post your comments here or drop us a line privately.Related Oracle OpenWorld 2013 Session Getting Optimal Performance from Oracle E-Business Suite (CON8485) Related My Oracle Support Notes Collecting Statistics with Oracle EBS 11i and R12 (Note 368252.1) Non-EBS Related Blogs, White Papers and My Oracle Support Notes  Oracle Optimizer Blog Understanding Optimizer Statistic (white paper) Fixed Objects Statistics(GATHER_FIXED_OBJECTS_STATS) Considerations (Note 798257.1)

    Read the article

  • XNA running slow when making a texture

    - by Anthony
    I'm using XNA to test an image analysis algorithm for a robot. I made a simple 3D world that has a grass, a robot, and white lines (that are represent the course). The image analysis algorithm is a modification of the Hough line detection algorithm. I have the game render 2 camera views to a render target in memory. One camera is a top down view of the robot going around the course, and the second camera is the view from the robot's perspective as it moves along. I take the rendertarget of the robot camera and convert it to a Color[,] so that I can do image analysis on it. private Color[,] TextureTo2DArray(Texture2D texture, Color[] colors1D, Color[,] colors2D) { texture.GetData(colors1D); for (int x = 0; x < texture.Width; x++) { for (int y = 0; y < texture.Height; y++) { colors2D[x, y] = colors1D[x + (y * texture.Width)]; } } return colors2D; } I want to overlay the results of the image analysis on the robot camera view. The first part of the image analysis is finding the white pixels. When I find the white pixels I create a bool[,] array showing which pixels were white and which were black. Then I want to convert it back into a texture so that I can overlay on the robot view. When I try to create the new texture showing which ones pixels were white, then the game goes super slow (around 10 hz). Can you give me some pointers as to what to do to make the game go faster. If I comment out this algorithm, then it goes back up to 60 hz. private Texture2D GenerateTexturesFromBoolArray(bool[,] boolArray,Color[] colorMap, Texture2D textureToModify) { for(int i =0;i < screenWidth;i++) { for(int j =0;j<screenHeight;j++) { if (boolArray[i, j] == true) { colorMap[i+(j*screenWidth)] = Color.Red; } else { colorMap[i + (j * screenWidth)] = Color.Transparent; } } } textureToModify.SetData<Color>(colorMap); return textureToModify; } Each Time I run draw, I must set the texture to null, so that I can modify it. public override void Draw(GameTime gameTime) { Vector2 topRightVector = ((SimulationMain)Game).spriteRectangleManager.topRightVector; Vector2 scaleFactor = ((SimulationMain)Game).config.scaleFactorScreenSizeToWindow; this.spriteBatch.Begin(); // Start the 2D drawing this.spriteBatch.Draw(this.textureFindWhite, topRightVector, null, Color.White, 0, Vector2.Zero, scaleFactor, SpriteEffects.None, 0); this.spriteBatch.End(); // Stop drawing. GraphicsDevice.Textures[0] = null; } Thanks for the help, Anthony G.

    Read the article

  • Comparison of phrases containing the same word in Google Trends

    - by alisia123
    If I compare three phrases in google trends : house sale house white house I get the following numbers: house - 91 sale house - 3 white house - 2 The question is: Is "sale house" and "white house" already included in the number 91? It is an important question, because if it is true, than: house_except_sale_house + sale_house = 91 sale_house = 3 Which means I have to compare 88 and 3, if I compare "house" and "sale house"

    Read the article

  • Onsite Interview : QA Engineer with more Emphasis on Java Skills

    - by coolrockers2007
    Hello I'm having a onsite interview for QA engineer with Startup. While phone interview the person said he would want to test my JAVA, JUnit and SQL skills on white board with more importance on Object-oriented skills, So what all can i questions can i expect ? One more important issue : How do i overcome the fear of White board interview ?. I'm very bad at White board sessions, i get fully tensed. Please suggest me tips to overcome my jinx

    Read the article

  • Read SQL Server Reporting Services Overview

    - by Editor
    Read an excellent, 14-page, general overview of Microsoft SQL Server 2008 Reporting Services entitled White Paper: Reporting Services in SQL Server 2008. Download the White Paper. (360 KB Microsoft Word file) White Paper: Reporting Services in SQL Server 2008 Microsoft SQL Server 2008 Reporting Services provides a complete server-based platform that is designed to support a wide variety [...]

    Read the article

  • How can I change my "Desktop bar"?

    - by d_Joke
    The problem: In Gnome 3.4 when I click the main menu, it's white. The text of the menu is also white. Original text: I have a problem. When I installed the new version of Gnome (3.4) the "Desktop bar" (I'm sorry, I don't know what's the real name of that bar, but is the bar on the top on Ubuntu 11.10) every time I click on the username icon, or the battery, etc., the menu comes on gray or white and the letters are white. I know maybe this is a stupid question but it annoys me. Besides, my username doest not appear on the login screen. I tried to reset my settings, I delete gnome, and check the Unsettings and CompizConfig but the problem is still there. Maybe I miss something on the process of looking on any configuration tool but I don't think so... Sorry if the question is something basic or even stupid but I'm new on Ubuntu and I'm experimenting whit it.

    Read the article

  • WPF4 Unleashed - how does converting child elements work?

    - by Kapol
    In chapter 2 of the book WPF4 Unleashed the author shows an example of how XAML processes type conversion. He states that <SolidColorBrush>White</SolidColorBrush> is equivalent to <SolidColorBrush Color="White"/> , because a string can be converted into a SolidColorBrush object. But how is that enough? It doesn't make sense to me. How does XAML know to which property should the value White be assigned?

    Read the article

  • Natural Search Engine Optimization - Don't "Game the System" Or You Will Get Banned!

    When focusing on natural search engine optimization, it is important that you keep the process "white hat." You see, when it comes to SEO, there are basically three schools of thought: White hat, Gray hat, and Black hat. As you can probably infer, white hat is following the rules, gray hat is a little in between, and black hat is going against parameters that Google and other major search engines have set for ethical SEO practices.

    Read the article

  • Why is 1px sometimes 2px when specified in Android XML?

    - by Daniel Lew
    I've got a desire for a one-pixel divider line, just for looks. I thought I could accomplish this using a View of height 1px, with a defined background. However, I'm getting some very odd behavior on different devices - sometimes the 1px ends up as 2px. Take this sample layout for example: <LinearLayout xmlns:android="http://schemas.android.com/apk/res/android" android:orientation="vertical" android:layout_width="fill_parent" android:layout_height="fill_parent"> <View android:layout_width="fill_parent" android:layout_height="1px" android:background="@android:color/white" android:layout_marginBottom="4dp" /> <View android:layout_width="fill_parent" android:layout_height="1px" android:background="@android:color/white" android:layout_marginBottom="4dp" /> <View android:layout_width="fill_parent" android:layout_height="1px" android:background="@android:color/white" android:layout_marginBottom="4dp" /> <View android:layout_width="fill_parent" android:layout_height="1px" android:background="@android:color/white" android:layout_marginBottom="4dp" /> <View android:layout_width="fill_parent" android:layout_height="1px" android:background="@android:color/white" android:layout_marginBottom="4dp" /> <View android:layout_width="fill_parent" android:layout_height="1px" android:background="@android:color/white" android:layout_marginBottom="4dp" /> <View android:layout_width="fill_parent" android:layout_height="1px" android:background="@android:color/white" android:layout_marginBottom="4dp" /> </LinearLayout> When run on my G1, this comes out fine. But on the Nexus One, it alternates between 1px lines and 2px lines. Does anyone know where this is going awry? Why does Android sometimes make 1px into 2px?

    Read the article

  • Shortest Path algorithm of a different kind

    - by Ram Bhat
    Hey guys, Lets say you have a grid like this (made randomly) Now lets say you have a car starting randomly from one of the while boxes, what would be the shortest path to go through each one of the white boxes? you can visit each white box as many times as you want and cant Jump over the black boxes. The black boxes are like walls. In simple words you can move from white box to white box only.. You can move in any direction, even diagonally.

    Read the article

  • Ajax Tabs implementation problem .

    - by SmartDev
    Hi , I have Implement ajax tabs and i have four tabs in it . In the four tabs i have four grid views with paging and sorting .The tabs are looking good i can see the grid ,but the problem is my first tab sorting works fine, where if i click on any other tab and click on the grid it goes to my first tab again . One more thing i want to change the background color of each tab. Can anyone help please here is my source code: <asp:Content ID="Content1" ContentPlaceHolderID="ContentPlaceHolder1" Runat="Server"> <asp:ScriptManager ID="ScMMyTabs" runat="server"> </asp:ScriptManager> <cc1:TabContainer ID="TCMytabs" ActiveTabIndex="0" runat="server"> <cc1:TabPanel ID="TpMyreq" runat="server" CssClass="TabBackground" HeaderText="My request"> <ContentTemplate> <table> <tr> <td> <asp:Button ID="btnexportMyRequestCsu" runat="server" Text="Export To Excel" CssClass="LabelDisplay" OnClick="btnexportMyRequestCsu_Click" /> </td> </tr> <tr> <td> <asp:GridView ID="GdvMyrequest" runat="server" CssClass="Mytabs" BackColor="White" BorderColor="White" BorderStyle="Ridge" BorderWidth="2px" CellPadding="3" CellSpacing="1" GridLines="None" OnPageIndexChanging="GdvMyrequest_PageIndexChanging" OnSorting="GdvMyrequest_Sorting" EmptyDataText="No request found for this user"> <RowStyle BackColor="#DEDFDE" ForeColor="Black" /> <FooterStyle BackColor="#C6C3C6" ForeColor="Black" /> <PagerStyle BackColor="Control" ForeColor="Gray" HorizontalAlign="Left" /> <SelectedRowStyle BackColor="#9471DE" Font-Bold="True" ForeColor="White" /> <HeaderStyle BackColor="#4A3C8C" Font-Bold="True" ForeColor="#E7E7FF" /> <PagerSettings Position="TopAndBottom" /> <Columns> <asp:TemplateField> <HeaderTemplate> Row No </HeaderTemplate> <ItemTemplate> <%# Container.DataItemIndex + 1 %> </ItemTemplate> </asp:TemplateField> </Columns> </asp:GridView> </td> </tr> <tr> <td> <asp:Label ID="lblmessmyrequestAhk" runat="server" CssClass="labelmess"></asp:Label> </td> </tr> </table> </ContentTemplate> </cc1:TabPanel> <cc1:TabPanel ID="TpMyPaymentCc" runat="server" HeaderText="Payments Credit Card" > <ContentTemplate> <table> <tr> <td> <asp:GridView ID="GdvmypaymentsCc" runat="server" CssClass="Mytabs" BackColor="White" BorderColor="White" BorderStyle="Ridge" BorderWidth="2px" CellPadding="3" CellSpacing="1" GridLines="None" OnPageIndexChanging="GdvmypaymentsCc_PageIndexChanging" OnSorting="GdvmypaymentsCc_Sorting" EmptyDataText="No Data"> <RowStyle BackColor="#DEDFDE" ForeColor="Black" /> <FooterStyle BackColor="#C6C3C6" ForeColor="Black" /> <PagerStyle BackColor="Control" ForeColor="Gray" HorizontalAlign="Left" /> <SelectedRowStyle BackColor="#9471DE" Font-Bold="True" ForeColor="White" /> <HeaderStyle BackColor="#4A3C8C" Font-Bold="True" ForeColor="#E7E7FF" /> <PagerSettings Position="TopAndBottom" /> </asp:GridView> </td> </tr> <tr> <td> <asp:Label ID="lblmessmypaymentsCsu" runat="server" CssClass="labelmess"></asp:Label> </td> </tr> </table> </ContentTemplate> </cc1:TabPanel> <cc1:TabPanel ID="TpMyPaymentsCk" runat="server" HeaderText="Payments Check" > <ContentTemplate> <asp:GridView ID="GdvmypaymentsCk" runat="server" CssClass="Mytabs" BackColor="White" BorderColor="White" BorderStyle="Ridge" BorderWidth="2px" CellPadding="3" CellSpacing="1" GridLines="None" OnPageIndexChanging="GdvmypaymentsCk_PageIndexChanging" OnSorting="GdvmypaymentsCk_Sorting" EmptyDataText="No Data"> <RowStyle BackColor="#DEDFDE" ForeColor="Black" /> <FooterStyle BackColor="#C6C3C6" ForeColor="Black" /> <PagerStyle BackColor="Control" ForeColor="Gray" HorizontalAlign="Left" /> <SelectedRowStyle BackColor="#9471DE" Font-Bold="True" ForeColor="White" /> <HeaderStyle BackColor="#4A3C8C" Font-Bold="True" ForeColor="#E7E7FF" /> <PagerSettings Position="TopAndBottom" /> </asp:GridView> </ContentTemplate> </cc1:TabPanel> <cc1:TabPanel ID="TpMyCalls" runat="server" HeaderText="Calls" > <ContentTemplate> <table> <tr> <td> <asp:GridView ID="GdvSelectcallsP" runat="server" CssClass="Mytabs" BackColor="White" BorderColor="White" BorderStyle="Ridge" BorderWidth="2px" CellPadding="3" CellSpacing="1" GridLines="None" OnPageIndexChanging="GdvSelectcallsP_PageIndexChanging" OnRowDataBound="GdvSelectcallsP_RowDataBound" OnSorting="GdvSelectcallsP_Sorting" > <RowStyle BackColor="#DEDFDE" ForeColor="Black" /> <FooterStyle BackColor="#C6C3C6" ForeColor="Black" /> <PagerStyle BackColor="#C6C3C6" ForeColor="Black" HorizontalAlign="Right" /> <SelectedRowStyle BackColor="#9471DE" Font-Bold="True" ForeColor="White" /> <HeaderStyle BackColor="#4A3C8C" Font-Bold="True" ForeColor="#E7E7FF" /> <Columns> <asp:TemplateField> <HeaderTemplate> Row No </HeaderTemplate> <ItemTemplate> <%# Container.DataItemIndex + 1 %> </ItemTemplate> </asp:TemplateField> </Columns> </asp:GridView> </td> </tr> <tr> <td> <asp:Label ID="lblmessselectcallsAhkP" runat="server" CssClass="labelmess"></asp:Label> </td> </tr> </table> </ContentTemplate> </cc1:TabPanel> </cc1:TabContainer> </asp:Content>

    Read the article

  • Test your internet connection - Emtel Mobile Internet

    After yesterday's report on Emtel Fixed Broadband (I'm still wondering where the 'fixed' part is), I did the same tests on Emtel Mobile Internet. For this I'm using the Huawei E169G HSDPA USB stick, connected to the same machine. Actually, this is my fail-safe internet connection and the system automatically switches between them if a problem, let's say timeout, etc. has been detected on the main line. For better comparison I used exactly the same servers on Speedtest.net. The results Following are the results of Rose Hill (hosted by Emtel) and respectively Frankfurt, Germany (hosted by Vodafone DE): Speedtest.net result of 31.05.2013 between Flic en Flac and Rose Hill, Mauritius (Emtel - Mobile Internet) Speedtest.net result of 31.05.2013 between Flic en Flac and Frankfurt, Germany (Emtel - Mobile Internet) As you might easily see, there is a big difference in speed between national and international connections. More interestingly are the results related to the download and upload ratio. I'm not sure whether connections over Emtel Mobile Internet are asymmetric or symmetric like the Fixed Broadband. Might be interesting to find out. The first test result actually might give us a clue that the connection could be asymmetric with a ratio of 3:1 but again I'm not sure. I'll find out and post an update on this. It depends on network coverage Later today I was on tour with my tablet, a Samsung Galaxy Tab 10.1 (model GT-P7500) running on Android 4.0.4 (Ice Cream Sandwich), and did some more tests using the Speedtest.net app. The results are actually as expected and in areas with better network coverage you will get better results after all. At least, as long as you stay inside the national networks. For anything abroad, it doesn't really matter. But see for yourselves: Speedtest.net result of 31.05.2013 between Cascavelle and servers in Rose Hill, Mauritius (Emtel - Mobile Internet), Port Louis, Mauritius and Kuala Lumpur, Malaysia It's rather shocking and frustrating to see how the speed on international destinations goes down. And the full capability of the tablet's integrated modem (HSDPA: 21 Mbps; HSUPA: 5.76 Mbps) isn't used, too. I guess, this demands more tests in other areas of the island, like Ebene, Pailles or Port Louis. I'll keep you updated... The question remains: Alternatives? After the publication of the test results on Fixed Broadband I had some exchange with others on Facebook. Sadly, it seems that there are really no alternatives to what Emtel is offering at the moment. There are the various internet packages by Mauritius Telecom feat. Orange, like ADSL, MyT and Mobile Internet, and there is Bharat Telecom with their Bees offer which is currently limited to Ebene and parts of Quatre Bornes.

    Read the article

  • Cardinality Estimation Bug with Lookups in SQL Server 2008 onward

    - by Paul White
    Cost-based optimization stands or falls on the quality of cardinality estimates (expected row counts).  If the optimizer has incorrect information to start with, it is quite unlikely to produce good quality execution plans except by chance.  There are many ways we can provide good starting information to the optimizer, and even more ways for cardinality estimation to go wrong.  Good database people know this, and work hard to write optimizer-friendly queries with a schema and metadata (e.g. statistics) that reduce the chances of poor cardinality estimation producing a sub-optimal plan.  Today, I am going to look at a case where poor cardinality estimation is Microsoft’s fault, and not yours. SQL Server 2005 SELECT th.ProductID, th.TransactionID, th.TransactionDate FROM Production.TransactionHistory AS th WHERE th.ProductID = 1 AND th.TransactionDate BETWEEN '20030901' AND '20031231'; The query plan on SQL Server 2005 is as follows (if you are using a more recent version of AdventureWorks, you will need to change the year on the date range from 2003 to 2007): There is an Index Seek on ProductID = 1, followed by a Key Lookup to find the Transaction Date for each row, and finally a Filter to restrict the results to only those rows where Transaction Date falls in the range specified.  The cardinality estimate of 45 rows at the Index Seek is exactly correct.  The table is not very large, there are up-to-date statistics associated with the index, so this is as expected. The estimate for the Key Lookup is also exactly right.  Each lookup into the Clustered Index to find the Transaction Date is guaranteed to return exactly one row.  The plan shows that the Key Lookup is expected to be executed 45 times.  The estimate for the Inner Join output is also correct – 45 rows from the seek joining to one row each time, gives 45 rows as output. The Filter estimate is also very good: the optimizer estimates 16.9951 rows will match the specified range of transaction dates.  Eleven rows are produced by this query, but that small difference is quite normal and certainly nothing to worry about here.  All good so far. SQL Server 2008 onward The same query executed against an identical copy of AdventureWorks on SQL Server 2008 produces a different execution plan: The optimizer has pushed the Filter conditions seen in the 2005 plan down to the Key Lookup.  This is a good optimization – it makes sense to filter rows out as early as possible.  Unfortunately, it has made a bit of a mess of the cardinality estimates. The post-Filter estimate of 16.9951 rows seen in the 2005 plan has moved with the predicate on Transaction Date.  Instead of estimating one row, the plan now suggests that 16.9951 rows will be produced by each clustered index lookup – clearly not right!  This misinformation also confuses SQL Sentry Plan Explorer: Plan Explorer shows 765 rows expected from the Key Lookup (it multiplies a rounded estimate of 17 rows by 45 expected executions to give 765 rows total). Workarounds One workaround is to provide a covering non-clustered index (avoiding the lookup avoids the problem of course): CREATE INDEX nc1 ON Production.TransactionHistory (ProductID) INCLUDE (TransactionDate); With the Transaction Date filter applied as a residual predicate in the same operator as the seek, the estimate is again as expected: We could also force the use of the ultimate covering index (the clustered one): SELECT th.ProductID, th.TransactionID, th.TransactionDate FROM Production.TransactionHistory AS th WITH (INDEX(1)) WHERE th.ProductID = 1 AND th.TransactionDate BETWEEN '20030901' AND '20031231'; Summary Providing a covering non-clustered index for all possible queries is not always practical, and scanning the clustered index will rarely be optimal.  Nevertheless, these are the best workarounds we have today. In the meantime, watch out for poor cardinality estimates when a predicate is applied as part of a lookup. The worst thing is that the estimate after the lookup join in the 2008+ plans is wrong.  It’s not hopelessly wrong in this particular case (45 versus 16.9951 is not the end of the world) but it easily can be much worse, and there’s not much you can do about it.  Any decisions made by the optimizer after such a lookup could be based on very wrong information – which can only be bad news. If you think this situation should be improved, please vote for this Connect item. © 2012 Paul White – All Rights Reserved twitter: @SQL_Kiwi email: [email protected]

    Read the article

  • Deleting xml file using radio value

    - by ???? ???
    i using php to delete file, but i got table loop like this: <table border="0" width="100%" cellpadding="0" cellspacing="0" id="product-table"> <tr class="bg_tableheader"> <th class="table-header-check"><a id="toggle-all" ></a> </th> <th class="table-header-check"><a href="#"><font color="white">Username</font></a> </th> <th class="table-header-check"><a href="#"><font color="white">First Name</font></a></th> <th class="table-header-check"><a href="#"><font color="white">Last Name</font></a></th> <th class="table-header-check"><a href="#"><font color="white">Email</font></a></th> <th class="table-header-check"><a href="#"><font color="white">Group</font></a></th> <th class="table-header-check"><a href="#"><font color="white">Birthday</font></a></th> <th class="table-header-check"><a href="#"><font color="white">Gender</font></a></th> <th class="table-header-check"><a href="#"><font color="white">Age</font></a></th> <th class="table-header-check"><a href="#"><font color="white">Country</font></a></th> </tr> <?php $files = glob('users/*.xml'); foreach($files as $file){ $xml = new SimpleXMLElement($file, 0, true); echo ' <tr> <td></td> <form action="" method="post"> <td class="alternate-row1"><input type="radio" name="file_name" value="'. basename($file, '.xml') .'" />'. basename($file, '.xml') .'</td> <td>'. $xml->name .'&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp</td> <td class="alternate-row1">'. $xml->lastname .'</td> <td>'. $xml->email .'</td> <td class="alternate-row1">'. $xml->level .'</td> <td>'. $xml->birthday .'</td> <td class="alternate-row1">'. $xml->gender .'</td> <td>'. $xml->age .'</td> <td class="alternate-row1">'. $xml->country .'</td> </tr>'; } ?> </table> </div> <?php if(isset($_POST['file_name'])){ unlink('users/'.$_POST['file_name']); } ?> <input type="submit" value="Delete" /> </form> so as you can see i got radio value set has basename (xml file name) but from some reason it not working, any idea why is that? Thanks in advance.

    Read the article

  • Fun with Aggregates

    - by Paul White
    There are interesting things to be learned from even the simplest queries.  For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join.  The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units.  The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too?  To answer that, let’s look at some other ways to formulate this query.  This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones.  The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above.  It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join!  The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join.  In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting.  The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set.  In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL).  To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709;   -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case.  The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate.  This is something I would like to see exposed in show plan so I suggested it on Connect.  Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all.  Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans.  Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :)  The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier.  What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too).  Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places.  It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates.  As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

    Read the article

  • I see no LOBs!

    - by Paul White
    Is it possible to see LOB (large object) logical reads from STATISTICS IO output on a table with no LOB columns? I was asked this question today by someone who had spent a good fraction of their afternoon trying to work out why this was occurring – even going so far as to re-run DBCC CHECKDB to see if any corruption had taken place.  The table in question wasn’t particularly pretty – it had grown somewhat organically over time, with new columns being added every so often as the need arose.  Nevertheless, it remained a simple structure with no LOB columns – no TEXT or IMAGE, no XML, no MAX types – nothing aside from ordinary INT, MONEY, VARCHAR, and DATETIME types.  To add to the air of mystery, not every query that ran against the table would report LOB logical reads – just sometimes – but when it did, the query often took much longer to execute. Ok, enough of the pre-amble.  I can’t reproduce the exact structure here, but the following script creates a table that will serve to demonstrate the effect: IF OBJECT_ID(N'dbo.Test', N'U') IS NOT NULL DROP TABLE dbo.Test GO CREATE TABLE dbo.Test ( row_id NUMERIC IDENTITY NOT NULL,   col01 NVARCHAR(450) NOT NULL, col02 NVARCHAR(450) NOT NULL, col03 NVARCHAR(450) NOT NULL, col04 NVARCHAR(450) NOT NULL, col05 NVARCHAR(450) NOT NULL, col06 NVARCHAR(450) NOT NULL, col07 NVARCHAR(450) NOT NULL, col08 NVARCHAR(450) NOT NULL, col09 NVARCHAR(450) NOT NULL, col10 NVARCHAR(450) NOT NULL, CONSTRAINT [PK dbo.Test row_id] PRIMARY KEY CLUSTERED (row_id) ) ; The next script loads the ten variable-length character columns with one-character strings in the first row, two-character strings in the second row, and so on down to the 450th row: WITH Numbers AS ( -- Generates numbers 1 - 450 inclusive SELECT TOP (450) n = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM master.sys.columns C1, master.sys.columns C2, master.sys.columns C3 ORDER BY n ASC ) INSERT dbo.Test WITH (TABLOCKX) SELECT REPLICATE(N'A', N.n), REPLICATE(N'B', N.n), REPLICATE(N'C', N.n), REPLICATE(N'D', N.n), REPLICATE(N'E', N.n), REPLICATE(N'F', N.n), REPLICATE(N'G', N.n), REPLICATE(N'H', N.n), REPLICATE(N'I', N.n), REPLICATE(N'J', N.n) FROM Numbers AS N ORDER BY N.n ASC ; Once those two scripts have run, the table contains 450 rows and 10 columns of data like this: Most of the time, when we query data from this table, we don’t see any LOB logical reads, for example: -- Find the maximum length of the data in -- column 5 for a range of rows SELECT result = MAX(DATALENGTH(T.col05)) FROM dbo.Test AS T WHERE row_id BETWEEN 50 AND 100 ; But with a different query… -- Read all the data in column 1 SELECT result = MAX(DATALENGTH(T.col01)) FROM dbo.Test AS T ; …suddenly we have 49 LOB logical reads, as well as the ‘normal’ logical reads we would expect. The Explanation If we had tried to create this table in SQL Server 2000, we would have received a warning message to say that future INSERT or UPDATE operations on the table might fail if the resulting row exceeded the in-row storage limit of 8060 bytes.  If we needed to store more data than would fit in an 8060 byte row (including internal overhead) we had to use a LOB column – TEXT, NTEXT, or IMAGE.  These special data types store the large data values in a separate structure, with just a small pointer left in the original row. Row Overflow SQL Server 2005 introduced a feature called row overflow, which allows one or more variable-length columns in a row to move to off-row storage if the data in a particular row would otherwise exceed 8060 bytes.  You no longer receive a warning when creating (or altering) a table that might need more than 8060 bytes of in-row storage; if SQL Server finds that it can no longer fit a variable-length column in a particular row, it will silently move one or more of these columns off the row into a separate allocation unit. Only variable-length columns can be moved in this way (for example the (N)VARCHAR, VARBINARY, and SQL_VARIANT types).  Fixed-length columns (like INTEGER and DATETIME for example) never move into ‘row overflow’ storage.  The decision to move a column off-row is done on a row-by-row basis – so data in a particular column might be stored in-row for some table records, and off-row for others. In general, if SQL Server finds that it needs to move a column into row-overflow storage, it moves the largest variable-length column record for that row.  Note that in the case of an UPDATE statement that results in the 8060 byte limit being exceeded, it might not be the column that grew that is moved! Sneaky LOBs Anyway, that’s all very interesting but I don’t want to get too carried away with the intricacies of row-overflow storage internals.  The point is that it is now possible to define a table with non-LOB columns that will silently exceed the old row-size limit and result in ordinary variable-length columns being moved to off-row storage.  Adding new columns to a table, expanding an existing column definition, or simply storing more data in a column than you used to – all these things can result in one or more variable-length columns being moved off the row. Note that row-overflow storage is logically quite different from old-style LOB and new-style MAX data type storage – individual variable-length columns are still limited to 8000 bytes each – you can just have more of them now.  Having said that, the physical mechanisms involved are very similar to full LOB storage – a column moved to row-overflow leaves a 24-byte pointer record in the row, and the ‘separate storage’ I have been talking about is structured very similarly to both old-style LOBs and new-style MAX types.  The disadvantages are also the same: when SQL Server needs a row-overflow column value it needs to follow the in-row pointer a navigate another chain of pages, just like retrieving a traditional LOB. And Finally… In the example script presented above, the rows with row_id values from 402 to 450 inclusive all exceed the total in-row storage limit of 8060 bytes.  A SELECT that references a column in one of those rows that has moved to off-row storage will incur one or more lob logical reads as the storage engine locates the data.  The results on your system might vary slightly depending on your settings, of course; but in my tests only column 1 in rows 402-450 moved off-row.  You might like to play around with the script – updating columns, changing data type lengths, and so on – to see the effect on lob logical reads and which columns get moved when.  You might even see row-overflow columns moving back in-row if they are updated to be smaller (hint: reduce the size of a column entry by at least 1000 bytes if you hope to see this). Be aware that SQL Server will not warn you when it moves ‘ordinary’ variable-length columns into overflow storage, and it can have dramatic effects on performance.  It makes more sense than ever to choose column data types sensibly.  If you make every column a VARCHAR(8000) or NVARCHAR(4000), and someone stores data that results in a row needing more than 8060 bytes, SQL Server might turn some of your column data into pseudo-LOBs – all without saying a word. Finally, some people make a distinction between ordinary LOBs (those that can hold up to 2GB of data) and the LOB-like structures created by row-overflow (where columns are still limited to 8000 bytes) by referring to row-overflow LOBs as SLOBs.  I find that quite appealing, but the ‘S’ stands for ‘small’, which makes expanding the whole acronym a little daft-sounding…small large objects anyone? © Paul White 2011 email: [email protected] twitter: @SQL_Kiwi

    Read the article

  • When is a Seek not a Seek?

    - by Paul White
    The following script creates a single-column clustered table containing the integers from 1 to 1,000 inclusive. IF OBJECT_ID(N'tempdb..#Test', N'U') IS NOT NULL DROP TABLE #Test ; GO CREATE TABLE #Test ( id INTEGER PRIMARY KEY CLUSTERED ); ; INSERT #Test (id) SELECT V.number FROM master.dbo.spt_values AS V WHERE V.[type] = N'P' AND V.number BETWEEN 1 AND 1000 ; Let’s say we need to find the rows with values from 100 to 170, excluding any values that divide exactly by 10.  One way to write that query would be: SELECT T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; That query produces a pretty efficient-looking query plan: Knowing that the source column is defined as an INTEGER, we could also express the query this way: SELECT T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; We get a similar-looking plan: If you look closely, you might notice that the line connecting the two icons is a little thinner than before.  The first query is estimated to produce 61.9167 rows – very close to the 63 rows we know the query will return.  The second query presents a tougher challenge for SQL Server because it doesn’t know how to predict the selectivity of the modulo expression (T.id % 10 > 0).  Without that last line, the second query is estimated to produce 68.1667 rows – a slight overestimate.  Adding the opaque modulo expression results in SQL Server guessing at the selectivity.  As you may know, the selectivity guess for a greater-than operation is 30%, so the final estimate is 30% of 68.1667, which comes to 20.45 rows. The second difference is that the Clustered Index Seek is costed at 99% of the estimated total for the statement.  For some reason, the final SELECT operator is assigned a small cost of 0.0000484 units; I have absolutely no idea why this is so, or what it models.  Nevertheless, we can compare the total cost for both queries: the first one comes in at 0.0033501 units, and the second at 0.0034054.  The important point is that the second query is costed very slightly higher than the first, even though it is expected to produce many fewer rows (20.45 versus 61.9167). If you run the two queries, they produce exactly the same results, and both complete so quickly that it is impossible to measure CPU usage for a single execution.  We can, however, compare the I/O statistics for a single run by running the queries with STATISTICS IO ON: Table '#Test'. Scan count 63, logical reads 126, physical reads 0. Table '#Test'. Scan count 01, logical reads 002, physical reads 0. The query with the IN list uses 126 logical reads (and has a ‘scan count’ of 63), while the second query form completes with just 2 logical reads (and a ‘scan count’ of 1).  It is no coincidence that 126 = 63 * 2, by the way.  It is almost as if the first query is doing 63 seeks, compared to one for the second query. In fact, that is exactly what it is doing.  There is no indication of this in the graphical plan, or the tool-tip that appears when you hover your mouse over the Clustered Index Seek icon.  To see the 63 seek operations, you have click on the Seek icon and look in the Properties window (press F4, or right-click and choose from the menu): The Seek Predicates list shows a total of 63 seek operations – one for each of the values from the IN list contained in the first query.  I have expanded the first seek node to show the details; it is seeking down the clustered index to find the entry with the value 101.  Each of the other 62 nodes expands similarly, and the same information is contained (even more verbosely) in the XML form of the plan. Each of the 63 seek operations starts at the root of the clustered index B-tree and navigates down to the leaf page that contains the sought key value.  Our table is just large enough to need a separate root page, so each seek incurs 2 logical reads (one for the root, and one for the leaf).  We can see the index depth using the INDEXPROPERTY function, or by using the a DMV: SELECT S.index_type_desc, S.index_depth FROM sys.dm_db_index_physical_stats ( DB_ID(N'tempdb'), OBJECT_ID(N'tempdb..#Test', N'U'), 1, 1, DEFAULT ) AS S ; Let’s look now at the Properties window when the Clustered Index Seek from the second query is selected: There is just one seek operation, which starts at the root of the index and navigates the B-tree looking for the first key that matches the Start range condition (id >= 101).  It then continues to read records at the leaf level of the index (following links between leaf-level pages if necessary) until it finds a row that does not meet the End range condition (id <= 169).  Every row that meets the seek range condition is also tested against the Residual Predicate highlighted above (id % 10 > 0), and is only returned if it matches that as well. You will not be surprised that the single seek (with a range scan and residual predicate) is much more efficient than 63 singleton seeks.  It is not 63 times more efficient (as the logical reads comparison would suggest), but it is around three times faster.  Let’s run both query forms 10,000 times and measure the elapsed time: DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON; SET STATISTICS XML OFF; ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; GO DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; On my laptop, running SQL Server 2008 build 4272 (SP2 CU2), the IN form of the query takes around 830ms and the range query about 300ms.  The main point of this post is not performance, however – it is meant as an introduction to the next few parts in this mini-series that will continue to explore scans and seeks in detail. When is a seek not a seek?  When it is 63 seeks © Paul White 2011 email: [email protected] twitter: @SQL_kiwi

    Read the article

  • Fun with Aggregates

    - by Paul White
    There are interesting things to be learned from even the simplest queries.  For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join.  The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units.  The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too?  To answer that, let’s look at some other ways to formulate this query.  This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones.  The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above.  It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join!  The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join.  In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting.  The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set.  In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL).  To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709;   -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case.  The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate.  This is something I would like to see exposed in show plan so I suggested it on Connect.  Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all.  Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans.  Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :)  The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier.  What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too).  Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places.  It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates.  As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

    Read the article

  • Heaps of Trouble?

    - by Paul White NZ
    If you’re not already a regular reader of Brad Schulz’s blog, you’re missing out on some great material.  In his latest entry, he is tasked with optimizing a query run against tables that have no indexes at all.  The problem is, predictably, that performance is not very good.  The catch is that we are not allowed to create any indexes (or even new statistics) as part of our optimization efforts. In this post, I’m going to look at the problem from a slightly different angle, and present an alternative solution to the one Brad found.  Inevitably, there’s going to be some overlap between our entries, and while you don’t necessarily need to read Brad’s post before this one, I do strongly recommend that you read it at some stage; he covers some important points that I won’t cover again here. The Example We’ll use data from the AdventureWorks database, copied to temporary unindexed tables.  A script to create these structures is shown below: CREATE TABLE #Custs ( CustomerID INTEGER NOT NULL, TerritoryID INTEGER NULL, CustomerType NCHAR(1) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #Prods ( ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, Name NVARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #OrdHeader ( SalesOrderID INTEGER NOT NULL, OrderDate DATETIME NOT NULL, SalesOrderNumber NVARCHAR(25) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, CustomerID INTEGER NOT NULL, ); GO CREATE TABLE #OrdDetail ( SalesOrderID INTEGER NOT NULL, OrderQty SMALLINT NOT NULL, LineTotal NUMERIC(38,6) NOT NULL, ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, ); GO INSERT #Custs ( CustomerID, TerritoryID, CustomerType ) SELECT C.CustomerID, C.TerritoryID, C.CustomerType FROM AdventureWorks.Sales.Customer C WITH (TABLOCK); GO INSERT #Prods ( ProductMainID, ProductSubID, ProductSubSubID, Name ) SELECT P.ProductID, P.ProductID, P.ProductID, P.Name FROM AdventureWorks.Production.Product P WITH (TABLOCK); GO INSERT #OrdHeader ( SalesOrderID, OrderDate, SalesOrderNumber, CustomerID ) SELECT H.SalesOrderID, H.OrderDate, H.SalesOrderNumber, H.CustomerID FROM AdventureWorks.Sales.SalesOrderHeader H WITH (TABLOCK); GO INSERT #OrdDetail ( SalesOrderID, OrderQty, LineTotal, ProductMainID, ProductSubID, ProductSubSubID ) SELECT D.SalesOrderID, D.OrderQty, D.LineTotal, D.ProductID, D.ProductID, D.ProductID FROM AdventureWorks.Sales.SalesOrderDetail D WITH (TABLOCK); The query itself is a simple join of the four tables: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #OrdDetail D ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID JOIN #OrdHeader H ON D.SalesOrderID = H.SalesOrderID JOIN #Custs C ON H.CustomerID = C.CustomerID ORDER BY P.ProductMainID ASC OPTION (RECOMPILE, MAXDOP 1); Remember that these tables have no indexes at all, and only the single-column sampled statistics SQL Server automatically creates (assuming default settings).  The estimated query plan produced for the test query looks like this (click to enlarge): The Problem The problem here is one of cardinality estimation – the number of rows SQL Server expects to find at each step of the plan.  The lack of indexes and useful statistical information means that SQL Server does not have the information it needs to make a good estimate.  Every join in the plan shown above estimates that it will produce just a single row as output.  Brad covers the factors that lead to the low estimates in his post. In reality, the join between the #Prods and #OrdDetail tables will produce 121,317 rows.  It should not surprise you that this has rather dire consequences for the remainder of the query plan.  In particular, it makes a nonsense of the optimizer’s decision to use Nested Loops to join to the two remaining tables.  Instead of scanning the #OrdHeader and #Custs tables once (as it expected), it has to perform 121,317 full scans of each.  The query takes somewhere in the region of twenty minutes to run to completion on my development machine. A Solution At this point, you may be thinking the same thing I was: if we really are stuck with no indexes, the best we can do is to use hash joins everywhere. We can force the exclusive use of hash joins in several ways, the two most common being join and query hints.  A join hint means writing the query using the INNER HASH JOIN syntax; using a query hint involves adding OPTION (HASH JOIN) at the bottom of the query.  The difference is that using join hints also forces the order of the join, whereas the query hint gives the optimizer freedom to reorder the joins at its discretion. Adding the OPTION (HASH JOIN) hint results in this estimated plan: That produces the correct output in around seven seconds, which is quite an improvement!  As a purely practical matter, and given the rigid rules of the environment we find ourselves in, we might leave things there.  (We can improve the hashing solution a bit – I’ll come back to that later on). Faster Nested Loops It might surprise you to hear that we can beat the performance of the hash join solution shown above using nested loops joins exclusively, and without breaking the rules we have been set. The key to this part is to realize that a condition like (A = B) can be expressed as (A <= B) AND (A >= B).  Armed with this tremendous new insight, we can rewrite the join predicates like so: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #OrdDetail D JOIN #OrdHeader H ON D.SalesOrderID >= H.SalesOrderID AND D.SalesOrderID <= H.SalesOrderID JOIN #Custs C ON H.CustomerID >= C.CustomerID AND H.CustomerID <= C.CustomerID JOIN #Prods P ON P.ProductMainID >= D.ProductMainID AND P.ProductMainID <= D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (RECOMPILE, LOOP JOIN, MAXDOP 1, FORCE ORDER); I’ve also added LOOP JOIN and FORCE ORDER query hints to ensure that only nested loops joins are used, and that the tables are joined in the order they appear.  The new estimated execution plan is: This new query runs in under 2 seconds. Why Is It Faster? The main reason for the improvement is the appearance of the eager Index Spools, which are also known as index-on-the-fly spools.  If you read my Inside The Optimiser series you might be interested to know that the rule responsible is called JoinToIndexOnTheFly. An eager index spool consumes all rows from the table it sits above, and builds a index suitable for the join to seek on.  Taking the index spool above the #Custs table as an example, it reads all the CustomerID and TerritoryID values with a single scan of the table, and builds an index keyed on CustomerID.  The term ‘eager’ means that the spool consumes all of its input rows when it starts up.  The index is built in a work table in tempdb, has no associated statistics, and only exists until the query finishes executing. The result is that each unindexed table is only scanned once, and just for the columns necessary to build the temporary index.  From that point on, every execution of the inner side of the join is answered by a seek on the temporary index – not the base table. A second optimization is that the sort on ProductMainID (required by the ORDER BY clause) is performed early, on just the rows coming from the #OrdDetail table.  The optimizer has a good estimate for the number of rows it needs to sort at that stage – it is just the cardinality of the table itself.  The accuracy of the estimate there is important because it helps determine the memory grant given to the sort operation.  Nested loops join preserves the order of rows on its outer input, so sorting early is safe.  (Hash joins do not preserve order in this way, of course). The extra lazy spool on the #Prods branch is a further optimization that avoids executing the seek on the temporary index if the value being joined (the ‘outer reference’) hasn’t changed from the last row received on the outer input.  It takes advantage of the fact that rows are still sorted on ProductMainID, so if duplicates exist, they will arrive at the join operator one after the other. The optimizer is quite conservative about introducing index spools into a plan, because creating and dropping a temporary index is a relatively expensive operation.  It’s presence in a plan is often an indication that a useful index is missing. I want to stress that I rewrote the query in this way primarily as an educational exercise – I can’t imagine having to do something so horrible to a production system. Improving the Hash Join I promised I would return to the solution that uses hash joins.  You might be puzzled that SQL Server can create three new indexes (and perform all those nested loops iterations) faster than it can perform three hash joins.  The answer, again, is down to the poor information available to the optimizer.  Let’s look at the hash join plan again: Two of the hash joins have single-row estimates on their build inputs.  SQL Server fixes the amount of memory available for the hash table based on this cardinality estimate, so at run time the hash join very quickly runs out of memory. This results in the join spilling hash buckets to disk, and any rows from the probe input that hash to the spilled buckets also get written to disk.  The join process then continues, and may again run out of memory.  This is a recursive process, which may eventually result in SQL Server resorting to a bailout join algorithm, which is guaranteed to complete eventually, but may be very slow.  The data sizes in the example tables are not large enough to force a hash bailout, but it does result in multiple levels of hash recursion.  You can see this for yourself by tracing the Hash Warning event using the Profiler tool. The final sort in the plan also suffers from a similar problem: it receives very little memory and has to perform multiple sort passes, saving intermediate runs to disk (the Sort Warnings Profiler event can be used to confirm this).  Notice also that because hash joins don’t preserve sort order, the sort cannot be pushed down the plan toward the #OrdDetail table, as in the nested loops plan. Ok, so now we understand the problems, what can we do to fix it?  We can address the hash spilling by forcing a different order for the joins: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #Custs C JOIN #OrdHeader H ON H.CustomerID = C.CustomerID JOIN #OrdDetail D ON D.SalesOrderID = H.SalesOrderID ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (MAXDOP 1, HASH JOIN, FORCE ORDER); With this plan, each of the inputs to the hash joins has a good estimate, and no hash recursion occurs.  The final sort still suffers from the one-row estimate problem, and we get a single-pass sort warning as it writes rows to disk.  Even so, the query runs to completion in three or four seconds.  That’s around half the time of the previous hashing solution, but still not as fast as the nested loops trickery. Final Thoughts SQL Server’s optimizer makes cost-based decisions, so it is vital to provide it with accurate information.  We can’t really blame the performance problems highlighted here on anything other than the decision to use completely unindexed tables, and not to allow the creation of additional statistics. I should probably stress that the nested loops solution shown above is not one I would normally contemplate in the real world.  It’s there primarily for its educational and entertainment value.  I might perhaps use it to demonstrate to the sceptical that SQL Server itself is crying out for an index. Be sure to read Brad’s original post for more details.  My grateful thanks to him for granting permission to reuse some of his material. Paul White Email: [email protected] Twitter: @PaulWhiteNZ

    Read the article

  • So…is it a Seek or a Scan?

    - by Paul White
    You’re probably most familiar with the terms ‘Seek’ and ‘Scan’ from the graphical plans produced by SQL Server Management Studio (SSMS).  The image to the left shows the most common ones, with the three types of scan at the top, followed by four types of seek.  You might look to the SSMS tool-tip descriptions to explain the differences between them: Not hugely helpful are they?  Both mention scans and ranges (nothing about seeks) and the Index Seek description implies that it will not scan the index entirely (which isn’t necessarily true). Recall also yesterday’s post where we saw two Clustered Index Seek operations doing very different things.  The first Seek performed 63 single-row seeking operations; and the second performed a ‘Range Scan’ (more on those later in this post).  I hope you agree that those were two very different operations, and perhaps you are wondering why there aren’t different graphical plan icons for Range Scans and Seeks?  I have often wondered about that, and the first person to mention it after yesterday’s post was Erin Stellato (twitter | blog): Before we go on to make sense of all this, let’s look at another example of how SQL Server confusingly mixes the terms ‘Scan’ and ‘Seek’ in different contexts.  The diagram below shows a very simple heap table with two columns, one of which is the non-clustered Primary Key, and the other has a non-unique non-clustered index defined on it.  The right hand side of the diagram shows a simple query, it’s associated query plan, and a couple of extracts from the SSMS tool-tip and Properties windows. Notice the ‘scan direction’ entry in the Properties window snippet.  Is this a seek or a scan?  The different references to Scans and Seeks are even more pronounced in the XML plan output that the graphical plan is based on.  This fragment is what lies behind the single Index Seek icon shown above: You’ll find the same confusing references to Seeks and Scans throughout the product and its documentation. Making Sense of Seeks Let’s forget all about scans for a moment, and think purely about seeks.  Loosely speaking, a seek is the process of navigating an index B-tree to find a particular index record, most often at the leaf level.  A seek starts at the root and navigates down through the levels of the index to find the point of interest: Singleton Lookups The simplest sort of seek predicate performs this traversal to find (at most) a single record.  This is the case when we search for a single value using a unique index and an equality predicate.  It should be readily apparent that this type of search will either find one record, or none at all.  This operation is known as a singleton lookup.  Given the example table from before, the following query is an example of a singleton lookup seek: Sadly, there’s nothing in the graphical plan or XML output to show that this is a singleton lookup – you have to infer it from the fact that this is a single-value equality seek on a unique index.  The other common examples of a singleton lookup are bookmark lookups – both the RID and Key Lookup forms are singleton lookups (an RID lookup finds a single record in a heap from the unique row locator, and a Key Lookup does much the same thing on a clustered table).  If you happen to run your query with STATISTICS IO ON, you will notice that ‘Scan Count’ is always zero for a singleton lookup. Range Scans The other type of seek predicate is a ‘seek plus range scan’, which I will refer to simply as a range scan.  The seek operation makes an initial descent into the index structure to find the first leaf row that qualifies, and then performs a range scan (either backwards or forwards in the index) until it reaches the end of the scan range. The ability of a range scan to proceed in either direction comes about because index pages at the same level are connected by a doubly-linked list – each page has a pointer to the previous page (in logical key order) as well as a pointer to the following page.  The doubly-linked list is represented by the green and red dotted arrows in the index diagram presented earlier.  One subtle (but important) point is that the notion of a ‘forward’ or ‘backward’ scan applies to the logical key order defined when the index was built.  In the present case, the non-clustered primary key index was created as follows: CREATE TABLE dbo.Example ( key_col INTEGER NOT NULL, data INTEGER NOT NULL, CONSTRAINT [PK dbo.Example key_col] PRIMARY KEY NONCLUSTERED (key_col ASC) ) ; Notice that the primary key index specifies an ascending sort order for the single key column.  This means that a forward scan of the index will retrieve keys in ascending order, while a backward scan would retrieve keys in descending key order.  If the index had been created instead on key_col DESC, a forward scan would retrieve keys in descending order, and a backward scan would return keys in ascending order. A range scan seek predicate may have a Start condition, an End condition, or both.  Where one is missing, the scan starts (or ends) at one extreme end of the index, depending on the scan direction.  Some examples might help clarify that: the following diagram shows four queries, each of which performs a single seek against a column holding every integer from 1 to 100 inclusive.  The results from each query are shown in the blue columns, and relevant attributes from the Properties window appear on the right: Query 1 specifies that all key_col values less than 5 should be returned in ascending order.  The query plan achieves this by seeking to the start of the index leaf (there is no explicit starting value) and scanning forward until the End condition (key_col < 5) is no longer satisfied (SQL Server knows it can stop looking as soon as it finds a key_col value that isn’t less than 5 because all later index entries are guaranteed to sort higher). Query 2 asks for key_col values greater than 95, in descending order.  SQL Server returns these results by seeking to the end of the index, and scanning backwards (in descending key order) until it comes across a row that isn’t greater than 95.  Sharp-eyed readers may notice that the end-of-scan condition is shown as a Start range value.  This is a bug in the XML show plan which bubbles up to the Properties window – when a backward scan is performed, the roles of the Start and End values are reversed, but the plan does not reflect that.  Oh well. Query 3 looks for key_col values that are greater than or equal to 10, and less than 15, in ascending order.  This time, SQL Server seeks to the first index record that matches the Start condition (key_col >= 10) and then scans forward through the leaf pages until the End condition (key_col < 15) is no longer met. Query 4 performs much the same sort of operation as Query 3, but requests the output in descending order.  Again, we have to mentally reverse the Start and End conditions because of the bug, but otherwise the process is the same as always: SQL Server finds the highest-sorting record that meets the condition ‘key_col < 25’ and scans backward until ‘key_col >= 20’ is no longer true. One final point to note: seek operations always have the Ordered: True attribute.  This means that the operator always produces rows in a sorted order, either ascending or descending depending on how the index was defined, and whether the scan part of the operation is forward or backward.  You cannot rely on this sort order in your queries of course (you must always specify an ORDER BY clause if order is important) but SQL Server can make use of the sort order internally.  In the four queries above, the query optimizer was able to avoid an explicit Sort operator to honour the ORDER BY clause, for example. Multiple Seek Predicates As we saw yesterday, a single index seek plan operator can contain one or more seek predicates.  These seek predicates can either be all singleton seeks or all range scans – SQL Server does not mix them.  For example, you might expect the following query to contain two seek predicates, a singleton seek to find the single record in the unique index where key_col = 10, and a range scan to find the key_col values between 15 and 20: SELECT key_col FROM dbo.Example WHERE key_col = 10 OR key_col BETWEEN 15 AND 20 ORDER BY key_col ASC ; In fact, SQL Server transforms the singleton seek (key_col = 10) to the equivalent range scan, Start:[key_col >= 10], End:[key_col <= 10].  This allows both range scans to be evaluated by a single seek operator.  To be clear, this query results in two range scans: one from 10 to 10, and one from 15 to 20. Final Thoughts That’s it for today – tomorrow we’ll look at monitoring singleton lookups and range scans, and I’ll show you a seek on a heap table. Yes, a seek.  On a heap.  Not an index! If you would like to run the queries in this post for yourself, there’s a script below.  Thanks for reading! IF OBJECT_ID(N'dbo.Example', N'U') IS NOT NULL BEGIN DROP TABLE dbo.Example; END ; -- Test table is a heap -- Non-clustered primary key on 'key_col' CREATE TABLE dbo.Example ( key_col INTEGER NOT NULL, data INTEGER NOT NULL, CONSTRAINT [PK dbo.Example key_col] PRIMARY KEY NONCLUSTERED (key_col) ) ; -- Non-unique non-clustered index on the 'data' column CREATE NONCLUSTERED INDEX [IX dbo.Example data] ON dbo.Example (data) ; -- Add 100 rows INSERT dbo.Example WITH (TABLOCKX) ( key_col, data ) SELECT key_col = V.number, data = V.number FROM master.dbo.spt_values AS V WHERE V.[type] = N'P' AND V.number BETWEEN 1 AND 100 ; -- ================ -- Singleton lookup -- ================ ; -- Single value equality seek in a unique index -- Scan count = 0 when STATISTIS IO is ON -- Check the XML SHOWPLAN SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col = 32 ; -- =========== -- Range Scans -- =========== ; -- Query 1 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col <= 5 ORDER BY E.key_col ASC ; -- Query 2 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col > 95 ORDER BY E.key_col DESC ; -- Query 3 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col >= 10 AND E.key_col < 15 ORDER BY E.key_col ASC ; -- Query 4 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col >= 20 AND E.key_col < 25 ORDER BY E.key_col DESC ; -- Final query (singleton + range = 2 range scans) SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col = 10 OR E.key_col BETWEEN 15 AND 20 ORDER BY E.key_col ASC ; -- === TIDY UP === DROP TABLE dbo.Example; © 2011 Paul White email: [email protected] twitter: @SQL_Kiwi

    Read the article

< Previous Page | 11 12 13 14 15 16 17 18 19 20 21 22  | Next Page >