Search Results

Search found 16126 results on 646 pages for 'wcf performance'.

Page 142/646 | < Previous Page | 138 139 140 141 142 143 144 145 146 147 148 149  | Next Page >

  • When is a SQL function not a function?

    - by Rob Farley
    Should SQL Server even have functions? (Oh yeah – this is a T-SQL Tuesday post, hosted this month by Brad Schulz) Functions serve an important part of programming, in almost any language. A function is a piece of code that is designed to return something, as opposed to a piece of code which isn’t designed to return anything (which is known as a procedure). SQL Server is no different. You can call stored procedures, even from within other stored procedures, and you can call functions and use these in other queries. Stored procedures might query something, and therefore ‘return data’, but a function in SQL is considered to have the type of the thing returned, and can be used accordingly in queries. Consider the internal GETDATE() function. SELECT GETDATE(), SomeDatetimeColumn FROM dbo.SomeTable; There’s no logical difference between the field that is being returned by the function and the field that’s being returned by the table column. Both are the datetime field – if you didn’t have inside knowledge, you wouldn’t necessarily be able to tell which was which. And so as developers, we find ourselves wanting to create functions that return all kinds of things – functions which look up values based on codes, functions which do string manipulation, and so on. But it’s rubbish. Ok, it’s not all rubbish, but it mostly is. And this isn’t even considering the SARGability impact. It’s far more significant than that. (When I say the SARGability aspect, I mean “because you’re unlikely to have an index on the result of some function that’s applied to a column, so try to invert the function and query the column in an unchanged manner”) I’m going to consider the three main types of user-defined functions in SQL Server: Scalar Inline Table-Valued Multi-statement Table-Valued I could also look at user-defined CLR functions, including aggregate functions, but not today. I figure that most people don’t tend to get around to doing CLR functions, and I’m going to focus on the T-SQL-based user-defined functions. Most people split these types of function up into two types. So do I. Except that most people pick them based on ‘scalar or table-valued’. I’d rather go with ‘inline or not’. If it’s not inline, it’s rubbish. It really is. Let’s start by considering the two kinds of table-valued function, and compare them. These functions are going to return the sales for a particular salesperson in a particular year, from the AdventureWorks database. CREATE FUNCTION dbo.FetchSales_inline(@salespersonid int, @orderyear int) RETURNS TABLE AS  RETURN (     SELECT e.LoginID as EmployeeLogin, o.OrderDate, o.SalesOrderID     FROM Sales.SalesOrderHeader AS o     LEFT JOIN HumanResources.Employee AS e     ON e.EmployeeID = o.SalesPersonID     WHERE o.SalesPersonID = @salespersonid     AND o.OrderDate >= DATEADD(year,@orderyear-2000,'20000101')     AND o.OrderDate < DATEADD(year,@orderyear-2000+1,'20000101') ) ; GO CREATE FUNCTION dbo.FetchSales_multi(@salespersonid int, @orderyear int) RETURNS @results TABLE (     EmployeeLogin nvarchar(512),     OrderDate datetime,     SalesOrderID int     ) AS BEGIN     INSERT @results (EmployeeLogin, OrderDate, SalesOrderID)     SELECT e.LoginID, o.OrderDate, o.SalesOrderID     FROM Sales.SalesOrderHeader AS o     LEFT JOIN HumanResources.Employee AS e     ON e.EmployeeID = o.SalesPersonID     WHERE o.SalesPersonID = @salespersonid     AND o.OrderDate >= DATEADD(year,@orderyear-2000,'20000101')     AND o.OrderDate < DATEADD(year,@orderyear-2000+1,'20000101')     ;     RETURN END ; GO You’ll notice that I’m being nice and responsible with the use of the DATEADD function, so that I have SARGability on the OrderDate filter. Regular readers will be hoping I’ll show what’s going on in the execution plans here. Here I’ve run two SELECT * queries with the “Show Actual Execution Plan” option turned on. Notice that the ‘Query cost’ of the multi-statement version is just 2% of the ‘Batch cost’. But also notice there’s trickery going on. And it’s nothing to do with that extra index that I have on the OrderDate column. Trickery. Look at it – clearly, the first plan is showing us what’s going on inside the function, but the second one isn’t. The second one is blindly running the function, and then scanning the results. There’s a Sequence operator which is calling the TVF operator, and then calling a Table Scan to get the results of that function for the SELECT operator. But surely it still has to do all the work that the first one is doing... To see what’s actually going on, let’s look at the Estimated plan. Now, we see the same plans (almost) that we saw in the Actuals, but we have an extra one – the one that was used for the TVF. Here’s where we see the inner workings of it. You’ll probably recognise the right-hand side of the TVF’s plan as looking very similar to the first plan – but it’s now being called by a stack of other operators, including an INSERT statement to be able to populate the table variable that the multi-statement TVF requires. And the cost of the TVF is 57% of the batch! But it gets worse. Let’s consider what happens if we don’t need all the columns. We’ll leave out the EmployeeLogin column. Here, we see that the inline function call has been simplified down. It doesn’t need the Employee table. The join is redundant and has been eliminated from the plan, making it even cheaper. But the multi-statement plan runs the whole thing as before, only removing the extra column when the Table Scan is performed. A multi-statement function is a lot more powerful than an inline one. An inline function can only be the result of a single sub-query. It’s essentially the same as a parameterised view, because views demonstrate this same behaviour of extracting the definition of the view and using it in the outer query. A multi-statement function is clearly more powerful because it can contain far more complex logic. But a multi-statement function isn’t really a function at all. It’s a stored procedure. It’s wrapped up like a function, but behaves like a stored procedure. It would be completely unreasonable to expect that a stored procedure could be simplified down to recognise that not all the columns might be needed, but yet this is part of the pain associated with this procedural function situation. The biggest clue that a multi-statement function is more like a stored procedure than a function is the “BEGIN” and “END” statements that surround the code. If you try to create a multi-statement function without these statements, you’ll get an error – they are very much required. When I used to present on this kind of thing, I even used to call it “The Dangers of BEGIN and END”, and yes, I’ve written about this type of thing before in a similarly-named post over at my old blog. Now how about scalar functions... Suppose we wanted a scalar function to return the count of these. CREATE FUNCTION dbo.FetchSales_scalar(@salespersonid int, @orderyear int) RETURNS int AS BEGIN     RETURN (         SELECT COUNT(*)         FROM Sales.SalesOrderHeader AS o         LEFT JOIN HumanResources.Employee AS e         ON e.EmployeeID = o.SalesPersonID         WHERE o.SalesPersonID = @salespersonid         AND o.OrderDate >= DATEADD(year,@orderyear-2000,'20000101')         AND o.OrderDate < DATEADD(year,@orderyear-2000+1,'20000101')     ); END ; GO Notice the evil words? They’re required. Try to remove them, you just get an error. That’s right – any scalar function is procedural, despite the fact that you wrap up a sub-query inside that RETURN statement. It’s as ugly as anything. Hopefully this will change in future versions. Let’s have a look at how this is reflected in an execution plan. Here’s a query, its Actual plan, and its Estimated plan: SELECT e.LoginID, y.year, dbo.FetchSales_scalar(p.SalesPersonID, y.year) AS NumSales FROM (VALUES (2001),(2002),(2003),(2004)) AS y (year) CROSS JOIN Sales.SalesPerson AS p LEFT JOIN HumanResources.Employee AS e ON e.EmployeeID = p.SalesPersonID; We see here that the cost of the scalar function is about twice that of the outer query. Nicely, the query optimizer has worked out that it doesn’t need the Employee table, but that’s a bit of a red herring here. There’s actually something way more significant going on. If I look at the properties of that UDF operator, it tells me that the Estimated Subtree Cost is 0.337999. If I just run the query SELECT dbo.FetchSales_scalar(281,2003); we see that the UDF cost is still unchanged. You see, this 0.0337999 is the cost of running the scalar function ONCE. But when we ran that query with the CROSS JOIN in it, we returned quite a few rows. 68 in fact. Could’ve been a lot more, if we’d had more salespeople or more years. And so we come to the biggest problem. This procedure (I don’t want to call it a function) is getting called 68 times – each one between twice as expensive as the outer query. And because it’s calling it in a separate context, there is even more overhead that I haven’t considered here. The cheek of it, to say that the Compute Scalar operator here costs 0%! I know a number of IT projects that could’ve used that kind of costing method, but that’s another story that I’m not going to go into here. Let’s look at a better way. Suppose our scalar function had been implemented as an inline one. Then it could have been expanded out like a sub-query. It could’ve run something like this: SELECT e.LoginID, y.year, (SELECT COUNT(*)     FROM Sales.SalesOrderHeader AS o     LEFT JOIN HumanResources.Employee AS e     ON e.EmployeeID = o.SalesPersonID     WHERE o.SalesPersonID = p.SalesPersonID     AND o.OrderDate >= DATEADD(year,y.year-2000,'20000101')     AND o.OrderDate < DATEADD(year,y.year-2000+1,'20000101')     ) AS NumSales FROM (VALUES (2001),(2002),(2003),(2004)) AS y (year) CROSS JOIN Sales.SalesPerson AS p LEFT JOIN HumanResources.Employee AS e ON e.EmployeeID = p.SalesPersonID; Don’t worry too much about the Scan of the SalesOrderHeader underneath a Nested Loop. If you remember from plenty of other posts on the matter, execution plans don’t push the data through. That Scan only runs once. The Index Spool sucks the data out of it and populates a structure that is used to feed the Stream Aggregate. The Index Spool operator gets called 68 times, but the Scan only once (the Number of Executions property demonstrates this). Here, the Query Optimizer has a full picture of what’s being asked, and can make the appropriate decision about how it accesses the data. It can simplify it down properly. To get this kind of behaviour from a function, we need it to be inline. But without inline scalar functions, we need to make our function be table-valued. Luckily, that’s ok. CREATE FUNCTION dbo.FetchSales_inline2(@salespersonid int, @orderyear int) RETURNS table AS RETURN (SELECT COUNT(*) as NumSales     FROM Sales.SalesOrderHeader AS o     LEFT JOIN HumanResources.Employee AS e     ON e.EmployeeID = o.SalesPersonID     WHERE o.SalesPersonID = @salespersonid     AND o.OrderDate >= DATEADD(year,@orderyear-2000,'20000101')     AND o.OrderDate < DATEADD(year,@orderyear-2000+1,'20000101') ); GO But we can’t use this as a scalar. Instead, we need to use it with the APPLY operator. SELECT e.LoginID, y.year, n.NumSales FROM (VALUES (2001),(2002),(2003),(2004)) AS y (year) CROSS JOIN Sales.SalesPerson AS p LEFT JOIN HumanResources.Employee AS e ON e.EmployeeID = p.SalesPersonID OUTER APPLY dbo.FetchSales_inline2(p.SalesPersonID, y.year) AS n; And now, we get the plan that we want for this query. All we’ve done is tell the function that it’s returning a table instead of a single value, and removed the BEGIN and END statements. We’ve had to name the column being returned, but what we’ve gained is an actual inline simplifiable function. And if we wanted it to return multiple columns, it could do that too. I really consider this function to be superior to the scalar function in every way. It does need to be handled differently in the outer query, but in many ways it’s a more elegant method there too. The function calls can be put amongst the FROM clause, where they can then be used in the WHERE or GROUP BY clauses without fear of calling the function multiple times (another horrible side effect of functions). So please. If you see BEGIN and END in a function, remember it’s not really a function, it’s a procedure. And then fix it. @rob_farley

    Read the article

  • Essbase 11.1.2 - JVM_OPTION settings for Essbase

    - by sujata
    When tuning the heap size for Essbase, there are two JVM_OPTIONS settings available for Essbase - one for the Essbase agent and one for the Essbase applications that are using custom-defined functions (CDFs), custom-defined macros (CDMs), data mining, triggers or external authentication. ESS_JVM_OPTION setting is used for the application and mainly for CDFs, CDMs, data mining, triggers, external authentication ESS_CSS_JVM_OPTION setting is used to set the heap size for the Essbase agent

    Read the article

  • Implementing fog of war in opengl es 2.0 game

    - by joxnas
    Hi game development community, this is my first question here! ;) I'm developing a tactics/strategy real time android game and I've been wondering for some time what's the best way to implement an efficient and somewhat nice looking fog of war to incorporate in it. My experience with OpenGL or Android is not vast by any means, but I think it is sufficient for what I'm asking here. So far I have thought in some solutions: Draw white circles to a dark background, corresponding to the units visibility, then render to a texture, and then drawing a quad with that texture with blend mode set to multiply. Will this approach be efficient? Will it take too much memory? (I don't know how to render to texture and then use the texture. Is it too messy?) Have a grid object with a vertex shader which has an array of uniforms having the coordinates of all units, and another array which has their visibility range. The number of units will very probably never be bigger then 100. The vertex shader needs to test for each considered vertex, if there is some unit which can see it. In order to do this it, will have to loop the array with the coordinates and do some calculations based on distance. The efficiency of this is inversely proportional to the looks of it. A more dense grid will result in a more beautiful fog of war... but will require a greater amount of vertexes to be checked. Is it possible to find a nice compromise or is this a bad solution from the start? Which solution is the best? Are there better alternatives? Which ones? Thank you for your time.

    Read the article

  • World Class Training For Them, an Amazon Gift Certificate For You

    - by Adam Machanic
    We have just two weeks to go before Paul Randal and Kimberly Tripp touch down in the Boston area to deliver their famous SQL Server Immersions course . This is going to be a truly fantastic SQL Server learning experience and we're hoping a few more people will join in the fun. This is where you come in: we have a few vacant seats remaining and we need your help spreading the word. Simply tell your friends and colleagues about the course and e-mail me (adam [at] bostonsqltraining [dot] com) the names...(read more)

    Read the article

  • Limiting DOPs &ndash; Who rules over whom?

    - by jean-pierre.dijcks
    I've gotten a couple of questions from Dan Morgan and figured I start to answer them in this way. While Dan is running on a big system he is running with Database Resource Manager and he is trying to make sure the system doesn't go crazy (remember end user are never, ever crazy!) on very high DOPs. Q: How do I control statements with very high DOPs driven from user hints in queries? A: The best way to do this is to work with DBRM and impose limits on consumer groups. The Max DOP setting you can set in DBRM allows you to overwrite the hint. Now let's go into some more detail here. Assume my object (and for simplicity we assume there is a single object - and do remember that we always pick the highest DOP when in doubt and when conflicting DOPs are available in a query) has PARALLEL 64 as its setting. Assume that the query that selects something cool from that table lives in a consumer group with a max DOP of 32. Assume no goofy things (like running out of parallel_max_servers) are happening. A query selecting from this table will run at DOP 32 because DBRM caps the DOP. As of 11.2.0.1 we also use the DBRM cap to create the original plan (at compile time) and not just enforce the cap at runtime. Now, my user is smart and writes a query with a parallel hint requesting DOP 128. This query is still capped by DBRM and DBRM overrules the hint in the statement. The statement, despite the hint, runs at DOP 32. Note that in the hinted scenario we do compile the statement with DOP 128 (the optimizer obeys the hint). This is another reason to use table decoration rather than hints. Q: What happens if I set parallel_max_servers higher than processes (e.g. the max number of processes allowed to run on my machine)? A: Processes rules. It is important to understand that processes are fixed at startup time. If you increase parallel_max_servers above the number of processes in the processes parameter you should get a warning in the alert log stating it can not take effect. As a follow up, a hinted query requesting more parallel processes than either parallel_max_servers or processes will not be able to acquire the requested number. Parallel_max_processes will prevent this. And since parallel_max_servers should be lower than max processes you can never go over either...

    Read the article

  • Why Doesn’t Partition Elimination Work?

    - by Paul White
    Given a partitioned table and a simple SELECT query that compares the partitioning column to a single literal value, why does SQL Server read all the partitions when it seems obvious that only one partition needs to be examined? Sample Data The following script creates a table, partitioned on the char(3) column ‘Div’, and populates it with 100,000 rows of data: USE Sandpit; GO CREATE PARTITION FUNCTION PF ( char (3)) AS RANGE RIGHT FOR VALUES ( '1' , '2' , '3' , '4' , '5' , '6' , '7' , '8' , '9'...(read more)

    Read the article

  • SQL Spatial: Getting “nearest” calculations working properly

    - by Rob Farley
    If you’ve ever done spatial work with SQL Server, I hope you’ve come across the ‘nearest’ problem. You have five thousand stores around the world, and you want to identify the one that’s closest to a particular place. Maybe you want the store closest to the LobsterPot office in Adelaide, at -34.925806, 138.605073. Or our new US office, at 42.524929, -87.858244. Or maybe both! You know how to do this. You don’t want to use an aggregate MIN or MAX, because you want the whole row, telling you which store it is. You want to use TOP, and if you want to find the closest store for multiple locations, you use APPLY. Let’s do this (but I’m going to use addresses in AdventureWorks2012, as I don’t have a list of stores). Oh, and before I do, let’s make sure we have a spatial index in place. I’m going to use the default options. CREATE SPATIAL INDEX spin_Address ON Person.Address(SpatialLocation); And my actual query: WITH MyLocations AS (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),                        ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))                ) t (Name, Geo)) SELECT l.Name, a.AddressLine1, a.City, s.Name AS [State], c.Name AS Country FROM MyLocations AS l CROSS APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a JOIN Person.StateProvince AS s     ON s.StateProvinceID = a.StateProvinceID JOIN Person.CountryRegion AS c     ON c.CountryRegionCode = s.CountryRegionCode ; Great! This is definitely working. I know both those City locations, even if the AddressLine1s don’t quite ring a bell. I’m sure I’ll be able to find them next time I’m in the area. But of course what I’m concerned about from a querying perspective is what’s happened behind the scenes – the execution plan. This isn’t pretty. It’s not using my index. It’s sucking every row out of the Address table TWICE (which sucks), and then it’s sorting them by the distance to find the smallest one. It’s not pretty, and it takes a while. Mind you, I do like the fact that it saw an indexed view it could use for the State and Country details – that’s pretty neat. But yeah – users of my nifty website aren’t going to like how long that query takes. The frustrating thing is that I know that I can use the index to find locations that are within a particular distance of my locations quite easily, and Microsoft recommends this for solving the ‘nearest’ problem, as described at http://msdn.microsoft.com/en-au/library/ff929109.aspx. Now, in the first example on this page, it says that the query there will use the spatial index. But when I run it on my machine, it does nothing of the sort. I’m not particularly impressed. But what we see here is that parallelism has kicked in. In my scenario, it’s split the data up into 4 threads, but it’s still slow, and not using my index. It’s disappointing. But I can persuade it with hints! If I tell it to FORCESEEK, or use my index, or even turn off the parallelism with MAXDOP 1, then I get the index being used, and it’s a thing of beauty! Part of the plan is here: It’s massive, and it’s ugly, and it uses a TVF… but it’s quick. The way it works is to hook into the GeodeticTessellation function, which is essentially finds where the point is, and works out through the spatial index cells that surround it. This then provides a framework to be able to see into the spatial index for the items we want. You can read more about it at http://msdn.microsoft.com/en-us/library/bb895265.aspx#tessellation – including a bunch of pretty diagrams. One of those times when we have a much more complex-looking plan, but just because of the good that’s going on. This tessellation stuff was introduced in SQL Server 2012. But my query isn’t using it. When I try to use the FORCESEEK hint on the Person.Address table, I get the friendly error: Msg 8622, Level 16, State 1, Line 1 Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN. And I’m almost tempted to just give up and move back to the old method of checking increasingly large circles around my location. After all, I can even leverage multiple OUTER APPLY clauses just like I did in my recent Lookup post. WITH MyLocations AS (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),                        ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))                ) t (Name, Geo)) SELECT     l.Name,     COALESCE(a1.AddressLine1,a2.AddressLine1,a3.AddressLine1),     COALESCE(a1.City,a2.City,a3.City),     s.Name AS [State],     c.Name AS Country FROM MyLocations AS l OUTER APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     WHERE l.Geo.STDistance(ad.SpatialLocation) < 1000     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a1 OUTER APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     WHERE l.Geo.STDistance(ad.SpatialLocation) < 5000     AND a1.AddressID IS NULL     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a2 OUTER APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     WHERE l.Geo.STDistance(ad.SpatialLocation) < 20000     AND a2.AddressID IS NULL     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a3 JOIN Person.StateProvince AS s     ON s.StateProvinceID = COALESCE(a1.StateProvinceID,a2.StateProvinceID,a3.StateProvinceID) JOIN Person.CountryRegion AS c     ON c.CountryRegionCode = s.CountryRegionCode ; But this isn’t friendly-looking at all, and I’d use the method recommended by Isaac Kunen, who uses a table of numbers for the expanding circles. It feels old-school though, when I’m dealing with SQL 2012 (and later) versions. So why isn’t my query doing what it’s supposed to? Remember the query... WITH MyLocations AS (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),                        ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))                ) t (Name, Geo)) SELECT l.Name, a.AddressLine1, a.City, s.Name AS [State], c.Name AS Country FROM MyLocations AS l CROSS APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a JOIN Person.StateProvince AS s     ON s.StateProvinceID = a.StateProvinceID JOIN Person.CountryRegion AS c     ON c.CountryRegionCode = s.CountryRegionCode ; Well, I just wasn’t reading http://msdn.microsoft.com/en-us/library/ff929109.aspx properly. The following requirements must be met for a Nearest Neighbor query to use a spatial index: A spatial index must be present on one of the spatial columns and the STDistance() method must use that column in the WHERE and ORDER BY clauses. The TOP clause cannot contain a PERCENT statement. The WHERE clause must contain a STDistance() method. If there are multiple predicates in the WHERE clause then the predicate containing STDistance() method must be connected by an AND conjunction to the other predicates. The STDistance() method cannot be in an optional part of the WHERE clause. The first expression in the ORDER BY clause must use the STDistance() method. Sort order for the first STDistance() expression in the ORDER BY clause must be ASC. All the rows for which STDistance returns NULL must be filtered out. Let’s start from the top. 1. Needs a spatial index on one of the columns that’s in the STDistance call. Yup, got the index. 2. No ‘PERCENT’. Yeah, I don’t have that. 3. The WHERE clause needs to use STDistance(). Ok, but I’m not filtering, so that should be fine. 4. Yeah, I don’t have multiple predicates. 5. The first expression in the ORDER BY is my distance, that’s fine. 6. Sort order is ASC, because otherwise we’d be starting with the ones that are furthest away, and that’s tricky. 7. All the rows for which STDistance returns NULL must be filtered out. But I don’t have any NULL values, so that shouldn’t affect me either. ...but something’s wrong. I do actually need to satisfy #3. And I do need to make sure #7 is being handled properly, because there are some situations (eg, differing SRIDs) where STDistance can return NULL. It says so at http://msdn.microsoft.com/en-us/library/bb933808.aspx – “STDistance() always returns null if the spatial reference IDs (SRIDs) of the geography instances do not match.” So if I simply make sure that I’m filtering out the rows that return NULL… …then it’s blindingly fast, I get the right results, and I’ve got the complex-but-brilliant plan that I wanted. It just wasn’t overly intuitive, despite being documented. @rob_farley

    Read the article

  • New DMV… not yet

    - by Michael Zilberstein
    Downloaded and installed new toy: And while reading BOL, stumbled upon new extremely useful DMV: sys.dm_exec_query_profiles . This DMV enables DBA to monitor query progress while it is being executed. Counters in the DMV are per operation per thread. So we’ll be able to monitor in real time which thread (even for parallel processing) processes which node in the plan. Or find heavy operations “post mortem”. We all know the uncomfortable feeling when some heavy query runs and the boss starts asking...(read more)

    Read the article

  • Partition Wise Joins

    - by jean-pierre.dijcks
    Some say they are the holy grail of parallel computing and PWJ is the basis for a shared nothing system and the only join method that is available on a shared nothing system (yes this is oversimplified!). The magic in Oracle is of course that is one of many ways to join data. And yes, this is the old flexibility vs. simplicity discussion all over, so I won't go there... the point is that what you must do in a shared nothing system, you can do in Oracle with the same speed and methods. The Theory A partition wise join is a join between (for simplicity) two tables that are partitioned on the same column with the same partitioning scheme. In shared nothing this is effectively hard partitioning locating data on a specific node / storage combo. In Oracle is is logical partitioning. If you now join the two tables on that partitioned column you can break up the join in smaller joins exactly along the partitions in the data. Since they are partitioned (grouped) into the same buckets, all values required to do the join live in the equivalent bucket on either sides. No need to talk to anyone else, no need to redistribute data to anyone else... in short, the optimal join method for parallel processing of two large data sets. PWJ's in Oracle Since we do not hard partition the data across nodes in Oracle we use the Partitioning option to the database to create the buckets, then set the Degree of Parallelism (or run Auto DOP - see here) and get our PWJs. The main questions always asked are: How many partitions should I create? What should my DOP be? In a shared nothing system the answer is of course, as many partitions as there are nodes which will be your DOP. In Oracle we do want you to look at the workload and concurrency, and once you know that to understand the following rules of thumb. Within Oracle we have more ways of joining of data, so it is important to understand some of the PWJ ideas and what it means if you have an uneven distribution across processes. Assume we have a simple scenario where we partition the data on a hash key resulting in 4 hash partitions (H1 -H4). We have 2 parallel processes that have been tasked with reading these partitions (P1 - P2). The work is evenly divided assuming the partitions are the same size and we can scan this in time t1 as shown below. Now assume that we have changed the system and have a 5th partition but still have our 2 workers P1 and P2. The time it takes is actually 50% more assuming the 5th partition has the same size as the original H1 - H4 partitions. In other words to scan these 5 partitions, the time t2 it takes is not 1/5th more expensive, it is a lot more expensive and some other join plans may now start to look exciting to the optimizer. Just to post the disclaimer, it is not as simple as I state it here, but you get the idea on how much more expensive this plan may now look... Based on this little example there are a few rules of thumb to follow to get the partition wise joins. First, choose a DOP that is a factor of two (2). So always choose something like 2, 4, 8, 16, 32 and so on... Second, choose a number of partitions that is larger or equal to 2* DOP. Third, make sure the number of partitions is divisible through 2 without orphans. This is also known as an even number... Fourth, choose a stable partition count strategy, which is typically hash, which can be a sub partitioning strategy rather than the main strategy (range - hash is a popular one). Fifth, make sure you do this on the join key between the two large tables you want to join (and this should be the obvious one...). Translating this into an example: DOP = 8 (determined based on concurrency or by using Auto DOP with a cap due to concurrency) says that the number of partitions >= 16. Number of hash (sub) partitions = 32, which gives each process four partitions to work on. This number is somewhat arbitrary and depends on your data and system. In this case my main reasoning is that if you get more room on the box you can easily move the DOP for the query to 16 without repartitioning... and of course it makes for no leftovers on the table... And yes, we recommend up-to-date statistics. And before you start complaining, do read this post on a cool way to do stats in 11.

    Read the article

  • Have you really fixed that problem?

    - by DavidWimbush
    The day before yesterday I saw our main live server's CPU go up to constantly 100% with just the occasional short drop to a lower level. The exact opposite of what you'd want to see. We're log shipping every 15 minutes and part of that involves calling WinRAR to compress the log backups before copying them over. (We're on SQL2005 so there's no native compression and we have bandwidth issues with the connection to our remote site.) I realised the log shipping jobs were taking about 10 minutes and that most of that was spent shipping a 'live' reporting database that is completely rebuilt every 20 minutes. (I'm just trying to keep this stuff alive until I can improve it.) We can rebuild this database in minutes if we have to fail over so I disabled log shipping of that database. The log shipping went down to less than 2 minutes and I went off to the SQL Social evening in London feeling quite pleased with myself. It was a great evening - fun, educational and thought-provoking. Thanks to Simon Sabin & co for laying that on, and thanks too to the guests for making the effort when they must have been pretty worn out after doing DevWeek all day first. The next morning I came down to earth with a bump: CPU still at 100%. WTF? I looked in the activity monitor but it was confusing because some sessions have been running for a long time so it's not a good guide what's using the CPU now. I tried the standard reports showing queries by CPU (average and total) but they only show the top 10 so they just show my big overnight archiving and data cleaning stuff. But the Profiler showed it was four queries used by our new website usage tracking system. Four simple indexes later the CPU was back where it should be: about 20% with occasional short spikes. So the moral is: even when you're convinced you've found the cause and fixed the problem, you HAVE to go back and confirm that the problem has gone. And, yes, I have checked the CPU again today and it's still looking sweet.

    Read the article

  • ORMs - Should DBAs just lighten up?

    - by simonsabin
    I did a presentation at DDD8 on the entity framework and how to stop your DBA from having a heart attack. You can find my demos and slide deck here http://sqlblogcasts.com/blogs/simons/archive/2010/01/30/Entity-Framework-how-to-stop-your-DBA-having-a-heart-attack.aspx Whilst at DDD Mike Ormond interviewed me about my view on ORMs and the battel between DBAs and Devs. To see what I said go tohttp://bit.ly/bnf1By

    Read the article

  • SSAS Native v .net Provider

    - by ACALVETT
    Recently I was investigating why a new server which is in its parallel running phase was taking significantly longer to process the daily data than the server its due to replace. The server has SQL & SSAS installed so the problem was not likely to be in the network transfer as its using shared memory. As i dug around the SQL dmv’s i noticed in sys.dm_exec_connections that the SSAS connection had a packet size of 8000 bytes instead of the usual 4096 bytes and from there i found that the datasource...(read more)

    Read the article

  • OBIEE 11.1.1 - How to Enable Caching in Internet Information Services (IIS) 7.0+

    - by Ahmed A
    Follow these steps to configure static file caching and content expiration if you are using IIS 7.0 Web Server with Oracle Business Intelligence. Tip: Install IIS URL Rewrite that enables Web administrators to create powerful outbound rules. Following are the steps to set up static file caching for IIS 7.0+ Web Server: 1. In “web.config” file for OBIEE static files virtual directory (ORACLE_HOME/bifoundation/web/app) add the following highlight in bold the outbound rule for caching:<?xml version="1.0" encoding="UTF-8"?><configuration>    <system.webServer>        <urlCompression doDynamicCompression="true" />        <rewrite>            <outboundRules>                <rule name="header1" preCondition="FilesMatch" patternSyntax="Wildcard">                    <match serverVariable="RESPONSE_CACHE_CONTROL" pattern="*" />                    <action type="Rewrite" value="max-age=604800" />                </rule>                <preConditions>    <preCondition name="FilesMatch">                        <add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/css|^text/x-javascript|^text/javascript|^image/gif|^image/jpeg|^image/png" />                    </preCondition>                </preConditions>            </outboundRules>        </rewrite>    </system.webServer></configuration>2. Restart IIS. Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;}

    Read the article

  • IBM System x3850 X5 TPC-H Benchmark

    - by jchang
    IBM just published a TPC-H SF 1000 result for their x3850 X5 , 4-way Xeon 7560 system featuring a special MAX5 memory expansion board to support 1.5TB memory. In Dec 2010, IBM also published a TPC-H SF1000 for their Power 780 system, 8-way, quad-core, (4 logical processors per physical core). In Feb 2011, Ingres published a TPC-H SF 100 on a 2-way Xeon 5680 for their VectorWise column-store engine (plus enhancements for memory architecture, SIMD and compression). The figure table below shows TPC-H...(read more)

    Read the article

  • Expected sprint completion rate and load in scrum?

    - by bjarkef
    Recently at work there have been increased focus on completion rate and load on the developers in our sprints. With completion rate I mean, if we plan 20 user stories for a sprint, what percentage of these user stories are closed at the end of the sprint. And with load I mean, if we have a sprint with 3 developers of 60 hours each, i.e. 180 hours for the sprint, how many hours worth of user stories do we schedule for the sprint. So I am really interested in others experience with this, I guess this is something everybody working with scrum deals with. My question is, what completion rate and load are expected/usual, and how are your team doing with respect to these parameters?

    Read the article

  • Feature pack for SQL Server 2005 SP4 - collection of standalone packages

    - by ssqa.net
    With the release of SQL2005Sp4 an additional task is essential for DBAs & Developers to avoid any compatibility issues with existing code agains SP4 instance. Feature pack for SQL Server 2005 SP4 is available to download which contains the standalone packages such as SQLNative Client, ADOMD, OLAPDM etc.... as it states the feature pack are built on latest versions of add-on and backward compatibility contents for SQL Server 2005. The above link provides individual file to download for each environment...(read more)

    Read the article

  • Downloads killing internet on my home network

    - by Travis
    I am currently having a problem with my wireless. Whenever I try to download anything it kills the internet for every other application(tabs within the same browser, browsers on other computers on the same network) except the process doing the download. This occurs with everything from downloading updates to iso's. I am not using a torrent. It happens when downloading upgrades, browser downloads, or anything else. This problem does not occur when I use Windows 7 on the same computer and it stops killing the internet for other computers if I turn the download/Ubuntu off. I am using an ASUS G74SX laptop running Ubuntu 12.10 with Gnome 3.6. My wireless card is an Intel Corporation Centrino Wireless-N + WiMAX 6150 (rev 67) Thanks!

    Read the article

  • Advanced donut caching: using dynamically loaded controls

    - by DigiMortal
    Yesterday I solved one caching problem with local community portal. I enabled output cache on SharePoint Server 2007 to make site faster. Although caching works fine I needed to do some additional work because there are some controls that show different content to different users. In this example I will show you how to use “donut caching” with user controls – powerful way to drive some content around cache. About donut caching Donut caching means that although you are caching your content you have some holes in it so you can still affect the output that goes to user. By example you can cache front page on your site and still show welcome message that contains correct user name. To get better idea about donut caching I suggest you to read ScottGu posting Tip/Trick: Implement "Donut Caching" with the ASP.NET 2.0 Output Cache Substitution Feature. Basically donut caching uses ASP.NET substitution control. In output this control is replaced by string you return from static method bound to substitution control. Again, take a look at ScottGu blog posting I referred above. Problem If you look at Scott’s example it is pretty plain and easy by its output. All it does is it writes out current user name as string. Here are examples of my login area for anonymous and authenticated users:    It is clear that outputting mark-up for these views as string is pretty lame to implement in code at string level. Every little change in design will end up with new version of controls library because some parts of design “live” there. Solution: using user controls I worked out easy solution to my problem. I used cache substitution and user controls together. I have three user controls: LogInControl – this is the proxy control that checks which “real” control to load. AnonymousLogInControl – template and logic for anonymous users login area. AuthenticatedLogInControl – template and logic for authenticated users login area. This is the control we render for each user separately because it contains user name and user profile fill percent. Anonymous control is not very interesting because it is only about keeping mark-up in separate file. Interesting parts are LogInControl and AuthenticatedLogInControl. Creating proxy control The first thing was to create control that has substitution area where “real” control is loaded. This proxy control should also be available to decide which control to load. The definition of control is very primitive. <%@ Control EnableViewState="false" Inherits="MyPortal.Profiles.LogInControl" %> <asp:Substitution runat="server" MethodName="ShowLogInBox" /> But code is a little bit tricky. Based on current user instance we decide which login control to load. Then we create page instance and load our control through it. When control is loaded we will call DataBind() method. In this method we evaluate all fields in loaded control (it was best choice as Load and other events will not be fired). Take a look at the code. public static string ShowLogInBox(HttpContext context) {     var user = SPContext.Current.Web.CurrentUser;     string controlName;       if (user != null)         controlName = "AuthenticatedLogInControl.ascx";     else         controlName = "AnonymousLogInControl.ascx";       var path = "~/_controltemplates/" + controlName;     var output = new StringBuilder(10000);       using(var page = new Page())     using(var ctl = page.LoadControl(path))     using(var writer = new StringWriter(output))     using(var htmlWriter = new HtmlTextWriter(writer))     {         ctl.DataBind();         ctl.RenderControl(htmlWriter);     }     return output.ToString(); } When control is bound to data we ask to render it its contents to StringBuilder. Now we have the output of control as string and we can return it from our method. Of course, notice how correct I am with resources disposing. :) The method that returns contents for substitution control is static method that has no connection with control instance because hen page is read from cache there are no instances of controls available. Conclusion As you saw it was not very hard to use donut caching with user controls. Instead of writing mark-up of controls to static method that is bound to substitution control we can still use our user controls.

    Read the article

  • Demantra USA Based Companies and SOX Compliance

    - by user702295
    A USA based company is assessing Demantra Trade Promotion Management (TPM) capability.  It appears that SOX is necessary in their case due to the nature of what TPM does and the necessity for auditability.  Do we have any detail on SOX compliance for Demantra? Answser ------- SOX compliance with regards to IT: 1.  Requires auditing of data changes done by who, what, when     a. Audit trail profiles can be set up for key financial series and view them in audit trail reports     b. One functionality we do not have which typically is asked for is user login history. We have only        active sessions, history is not available. 2.  Segregation of duties     a. With respect to TPM, you could have deduction and financial analyst for settlement be different        from promotion creator, promotion approver or sales team.     b. Budget Approver for funds can be different from funds consumer.     c. Promotion creator can be different than promotion approver     d. For a US customer you may have to write some custom scripts to capture promotion status change        and produce an external report as part of compliance. One additional requirement is transparency of forward commitments entered into with retailers / distributors for trade spending, promotions.  Outside of Demantra - Consumer Goods Trade Funds Analytics.

    Read the article

  • DBCC MEMUSAGE in 2005/8 ?

    - by steveh99999
    I used to like using undocumented command DBCC MEMUSAGE in SQL 2000 to see which tables were using space in SQL data cache. In SQL 2005, this command is not longer present. Instead a DMV – sys.dm_os_buffer_descriptors – can be used to display data cache contents,  but this doesn’t quite give you the same output as DBCC MEMUSAGE. I’m also aware that you can use Quest’s spotlight tool to view a summary of data cache contents. Using  this post by Umachandar Jayachandran  of Microsoft, I was able to create the following equivalent for SQL 2005/8. I’ve wrapped Umachandar’s original query in a CTE to produce summary information :- ;WITH memusage_CTE AS (SELECT bd.database_id, bd.file_id, bd.page_id, bd.page_type , COALESCE(p1.object_id, p2.object_id) AS object_id , COALESCE(p1.index_id, p2.index_id) AS index_id , bd.row_count, bd.free_space_in_bytes, CONVERT(TINYINT,bd.is_modified) AS 'DirtyPage' FROM sys.dm_os_buffer_descriptors AS bd JOIN sys.allocation_units AS au ON au.allocation_unit_id = bd.allocation_unit_id OUTER APPLY ( SELECT TOP(1) p.object_id, p.index_id FROM sys.partitions AS p WHERE p.hobt_id = au.container_id AND au.type IN (1, 3) ) AS p1 OUTER APPLY ( SELECT TOP(1) p.object_id, p.index_id FROM sys.partitions AS p WHERE p.partition_id = au.container_id AND au.type = 2 ) AS p2 WHERE  bd.database_id = DB_ID() AND bd.page_type IN ('DATA_PAGE', 'INDEX_PAGE') ) SELECT TOP 20 DB_NAME(database_id) AS 'Database',OBJECT_NAME(object_id,database_id) AS 'Table Name', index_id,COUNT(*) AS 'Pages in Cache', SUM(dirtyPage) AS 'Dirty Pages' FROM memusage_CTE GROUP BY database_id, object_id, index_id ORDER BY COUNT(*) DESC I’m not 100% happy with the results of the above query however… I’ve noticed that on a busy BizTalk messageBox database  it will return information on pages that contain GHOST rows – . ie where data has already been deleted but has yet to be cleaned-up by a background process – I’m need to investigate further why cache on this server apparently contains so much GHOST data… For more information on the background ghost cleanup process, see this article by Paul Randall. However, I think the results of this query should still be of interest to a DBA. I have another post to come shortly regarding an example I encountered where this information proved useful to me… I notice in SQL 2008, sys.dm_os_buffer_descriptors gained an extra column – numa_mode – I’m interested to see how this is populated and how useful this column can be on a NUMA-enabled system. I’m assuming in theory you could use this column to help analyse how your tables are spread across Numa-enabled data-cache ?

    Read the article

  • Two Free Training Webcasts Open for Registration

    - by KKline
    We've got two sessions that you need to sign up for right away. The upcoming webcast for Oracle-oriented folks has huge registration numbers. So get in while you still can before we hit the limit of what LiveMeeting can handle. Pain of the Week: SQL Server for the Oracle DBA Webcast: SQL Server for the Oracle DBA Date: Thursday, May 27, 2010 (Just a couple days hence!) Time: 8 a.m. Pacific / 11 a.m. Eastern / 4 p.m. United Kingdom / 5 p.m. Central Europe Duration: 45-60 minutes Cost: FREE In enterprise...(read more)

    Read the article

  • Query Tuning Mastery at PASS Summit 2012: The Video

    - by Adam Machanic
    An especially clever community member was kind enough to reverse-engineer the video stream for me, and came up with a direct link to the PASS TV video stream for my Query Tuning Mastery: The Art and Science of Manhandling Parallelism talk, delivered at the PASS Summit last Thursday. I'm not sure how long this link will work , but I'd like to share it for my readers who were unable to see it in person or live on the stream. Start here. Skip past the keynote, to the 149 minute mark. Enjoy!...(read more)

    Read the article

  • Idea to develop a caching server between IIS and SQL Server

    - by John
    I work on a few high traffic websites that all share the same database and that are all heavily database driven. Our SQL server is max-ed out and, although we have already implemented many changes that have helped but the server is still working too hard. We employ some caching in our website but the type of queries we use negate using SQL dependency caching. We tried SQL replication to try and kind of load balance but that didn't prove very successful because the replication process is quite demanding on the servers too and it needed to be done frequently as it is important that data is up to date. We do use a Varnish web caching server (Linux based) to take a bit of the load off both the web and database server but as a lot of the sites are customised based on the user we can only do so much. Anyway, the reason for this question... Varnish gave me an idea for a possible application that might help in this situation. Just like Varnish sits between a web browser and the web server and caches response from the web server, I was wondering about the possibility of creating something that sits between the web server and the database server. Imagine that all SQL queries go through this SQL caching server. If it's a first time query then it will get recorded, and the result requested from the SQL server and stored locally on the cache server. If it's a repeat request within a set time then the result gets retrieved from the local copy without the query being sent to the SQL server. The caching server could also take advantage of SQL dependency caching notifications. This seems like a good idea in theory. There's still the same amount of data moving back and forward from the web server, but the SQL server is relieved of the work of processing the repeat queries. I wonder about how difficult it would be to build a service that sort of emulates requests and responses from SQL server, whether SQL server's own caching is doing enough of this already that this wouldn't be a benefit, or even if someone has done this before and I haven't found it? I would welcome any feedback or any references to any relevant projects.

    Read the article

< Previous Page | 138 139 140 141 142 143 144 145 146 147 148 149  | Next Page >