Search Results

Search found 1402 results on 57 pages for 'rob fisher'.

Page 9/57 | < Previous Page | 5 6 7 8 9 10 11 12 13 14 15 16  | Next Page >

  • How many people will be with you during 24HOP?

    - by Rob Farley
    In less than a week, SQLPASS hosts another 24 Hours of PASS event, this time with an array of 24 female speakers (in honour of this month being Women’s History Month). Interestingly, the committee has had a few people ask if there are rules about how the event can be viewed, such as “How many people from any one organisation can watch it?” or “Does it matter if a few people are crowded around the same screen?” From a licensing and marketing perspective, there is value in knowing how many people are watching the event, but there are no restrictions about how the thing is viewed. In fact – if you’re planning to watch any of these events, I want to suggest an idea: Book a meeting room in your office with a projector, and watch 24HOP in there. If you’re planning to have it streaming in the background while you work, obviously this makes life a bit trickier. But if you’re planning to treat it as a training event (a 2-day conference if you like) and block out a bit of time for it (as well you should – there’s going to be some great stuff in there), then why not do it in a way that makes it so that other people can see that you’re watching it, and potentially join you. When an event like this runs, we can see how many different ‘people’ are attending each LiveMeeting session. What we can’t tell is how many actual people there are represented. Jessica Moss spoke to the Adelaide SQL Server User Group a few weeks ago via LiveMeeting, and LiveMeeting told us there were less than a dozen people attending. Really there were at least three times that number, because all the people in the room with me weren’t included. I’d love to imagine that every LiveMeeting attendee represented a crowd in a room, watching a shared screen. So there’s my challenge – don’t let your LiveMeeting session represent just you. Find a way of involving other people. At the very least, you’ll be able to discuss it with them afterwards. Now stick a comment on this post to let me know how many people are going to be joining you. :) If you’re not registered for the event yet, get yourself over to the SQLPASS site and make it happen.

    Read the article

  • T-SQL Tuesday #33: Trick Shots: Undocumented, Underdocumented, and Unknown Conspiracies!

    - by Most Valuable Yak (Rob Volk)
    Mike Fal (b | t) is hosting this month's T-SQL Tuesday on Trick Shots.  I love this choice because I've been preoccupied with sneaky/tricky/evil SQL Server stuff for a long time and have been presenting on it for the past year.  Mike's directives were "Show us a cool trick or process you developed…It doesn’t have to be useful", which most of my blogging definitely fits, and "Tell us what you learned from this trick…tell us how it gave you insight in to how SQL Server works", which is definitely a new concept.  I've done a lot of reading and watching on SQL Server Internals and even attended training, but sometimes I need to go explore on my own, using my own tools and techniques.  It's an itch I get every few months, and, well, it sure beats workin'. I've found some people to be intimidated by SQL Server's internals, and I'll admit there are A LOT of internals to keep track of, but there are tons of excellent resources that clearly document most of them, and show how knowing even the basics of internals can dramatically improve your database's performance.  It may seem like rocket science, or even brain surgery, but you don't have to be a genius to understand it. Although being an "evil genius" can help you learn some things they haven't told you about. ;) This blog post isn't a traditional "deep dive" into internals, it's more of an approach to find out how a program works.  It utilizes an extremely handy tool from an even more extremely handy suite of tools, Sysinternals.  I'm not the only one who finds Sysinternals useful for SQL Server: Argenis Fernandez (b | t), Microsoft employee and former T-SQL Tuesday host, has an excellent presentation on how to troubleshoot SQL Server using Sysinternals, and I highly recommend it.  Argenis didn't cover the Strings.exe utility, but I'll be using it to "hack" the SQL Server executable (DLL and EXE) files. Please note that I'm not promoting software piracy or applying these techniques to attack SQL Server via internal knowledge. This is strictly educational and doesn't reveal any proprietary Microsoft information.  And since Argenis works for Microsoft and demonstrated Sysinternals with SQL Server, I'll just let him take the blame for it. :P (The truth is I've used Strings.exe on SQL Server before I ever met Argenis.) Once you download and install Strings.exe you can run it from the command line.  For our purposes we'll want to run this in the Binn folder of your SQL Server instance (I'm referencing SQL Server 2012 RTM): cd "C:\Program Files\Microsoft SQL Server\MSSQL11\MSSQL\Binn" C:\Program Files\Microsoft SQL Server\MSSQL11\MSSQL\Binn> strings *sql*.dll > sqldll.txt C:\Program Files\Microsoft SQL Server\MSSQL11\MSSQL\Binn> strings *sql*.exe > sqlexe.txt   I've limited myself to DLLs and EXEs that have "sql" in their names.  There are quite a few more but I haven't examined them in any detail. (Homework assignment for you!) If you run this yourself you'll get 2 text files, one with all the extracted strings from every SQL DLL file, and the other with the SQL EXE strings.  You can open these in Notepad, but you're better off using Notepad++, EditPad, Emacs, Vim or another more powerful text editor, as these will be several megabytes in size. And when you do open it…you'll find…a TON of gibberish.  (If you think that's bad, just try opening the raw DLL or EXE file in Notepad.  And by the way, don't do this in production, or even on a running instance of SQL Server.)  Even if you don't clean up the file, you can still use your editor's search function to find a keyword like "SELECT" or some other item you expect to be there.  As dumb as this sounds, I sometimes spend my lunch break just scanning the raw text for anything interesting.  I'm boring like that. Sometimes though, having these files available can lead to some incredible learning experiences.  For me the most recent time was after reading Joe Sack's post on non-parallel plan reasons.  He mentions a new SQL Server 2012 execution plan element called NonParallelPlanReason, and demonstrates a query that generates "MaxDOPSetToOne".  Joe (formerly on the Microsoft SQL Server product team, so he knows this stuff) mentioned that this new element was not currently documented and tried a few more examples to see what other reasons could be generated. Since I'd already run Strings.exe on the SQL Server DLLs and EXE files, it was easy to run grep/find/findstr for MaxDOPSetToOne on those extracts.  Once I found which files it belonged to (sqlmin.dll) I opened the text to see if the other reasons were listed.  As you can see in my comment on Joe's blog, there were about 20 additional non-parallel reasons.  And while it's not "documentation" of this underdocumented feature, the names are pretty self-explanatory about what can prevent parallel processing. I especially like the ones about cursors – more ammo! - and am curious about the PDW compilation and Cloud DB replication reasons. One reason completely stumped me: NoParallelHekatonPlan.  What the heck is a hekaton?  Google and Wikipedia were vague, and the top results were not in English.  I found one reference to Greek, stating "hekaton" can be translated as "hundredfold"; with a little more Wikipedia-ing this leads to hecto, the prefix for "one hundred" as a unit of measure.  I'm not sure why Microsoft chose hekaton for such a plan name, but having already learned some Greek I figured I might as well dig some more in the DLL text for hekaton.  Here's what I found: hekaton_slow_param_passing Occurs when a Hekaton procedure call dispatch goes to slow parameter passing code path The reason why Hekaton parameter passing code took the slow code path hekaton_slow_param_pass_reason sp_deploy_hekaton_database sp_undeploy_hekaton_database sp_drop_hekaton_database sp_checkpoint_hekaton_database sp_restore_hekaton_database e:\sql11_main_t\sql\ntdbms\hekaton\sqlhost\sqllang\hkproc.cpp e:\sql11_main_t\sql\ntdbms\hekaton\sqlhost\sqllang\matgen.cpp e:\sql11_main_t\sql\ntdbms\hekaton\sqlhost\sqllang\matquery.cpp e:\sql11_main_t\sql\ntdbms\hekaton\sqlhost\sqllang\sqlmeta.cpp e:\sql11_main_t\sql\ntdbms\hekaton\sqlhost\sqllang\resultset.cpp Interesting!  The first 4 entries (in red) mention parameters and "slow code".  Could this be the foundation of the mythical DBCC RUNFASTER command?  Have I been passing my parameters the slow way all this time? And what about those sp_xxxx_hekaton_database procedures (in blue)? Could THEY be the secret to a faster SQL Server? Could they promise a "hundredfold" improvement in performance?  Are these special, super-undocumented DIB (databases in black)? I decided to look in the SQL Server system views for any objects with hekaton in the name, or references to them, in hopes of discovering some new code that would answer all my questions: SELECT name FROM sys.all_objects WHERE name LIKE '%hekaton%' SELECT name FROM sys.all_objects WHERE object_definition(OBJECT_ID) LIKE '%hekaton%' Which revealed: name ------------------------ (0 row(s) affected) name ------------------------ sp_createstats sp_recompile sp_updatestats (3 row(s) affected)   Hmm.  Well that didn't find much.  Looks like these procedures are seriously undocumented, unknown, perhaps forbidden knowledge. Maybe a part of some unspeakable evil? (No, I'm not paranoid, I just like mysteries and thought that punching this up with that kind of thing might keep you reading.  I know I'd fall asleep without it.) OK, so let's check out those 3 procedures and see what they reveal when I search for "Hekaton": sp_createstats: -- filter out local temp tables, Hekaton tables, and tables for which current user has no permissions -- Note that OBJECTPROPERTY returns NULL on type="IT" tables, thus we only call it on type='U' tables   OK, that's interesting, let's go looking down a little further: ((@table_type<>'U') or (0 = OBJECTPROPERTY(@table_id, 'TableIsInMemory'))) and -- Hekaton table   Wellllll, that tells us a few new things: There's such a thing as Hekaton tables (UPDATE: I'm not the only one to have found them!) They are not standard user tables and probably not in memory UPDATE: I misinterpreted this because I didn't read all the code when I wrote this blog post. The OBJECTPROPERTY function has an undocumented TableIsInMemory option Let's check out sp_recompile: -- (3) Must not be a Hekaton procedure.   And once again go a little further: if (ObjectProperty(@objid, 'IsExecuted') <> 0 AND ObjectProperty(@objid, 'IsInlineFunction') = 0 AND ObjectProperty(@objid, 'IsView') = 0 AND -- Hekaton procedure cannot be recompiled -- Make them go through schema version bumping branch, which will fail ObjectProperty(@objid, 'ExecIsCompiledProc') = 0)   And now we learn that hekaton procedures also exist, they can't be recompiled, there's a "schema version bumping branch" somewhere, and OBJECTPROPERTY has another undocumented option, ExecIsCompiledProc.  (If you experiment with this you'll find this option returns null, I think it only works when called from a system object.) This is neat! Sadly sp_updatestats doesn't reveal anything new, the comments about hekaton are the same as sp_createstats.  But we've ALSO discovered undocumented features for the OBJECTPROPERTY function, which we can now search for: SELECT name, object_definition(OBJECT_ID) FROM sys.all_objects WHERE object_definition(OBJECT_ID) LIKE '%OBJECTPROPERTY(%'   I'll leave that to you as more homework.  I should add that searching the system procedures was recommended long ago by the late, great Ken Henderson, in his Guru's Guide books, as a great way to find undocumented features.  That seems to be really good advice! Now if you're a programmer/hacker, you've probably been drooling over the last 5 entries for hekaton (in green), because these are the names of source code files for SQL Server!  Does this mean we can access the source code for SQL Server?  As The Oracle suggested to Neo, can we return to The Source??? Actually, no. Well, maybe a little bit.  While you won't get the actual source code from the compiled DLL and EXE files, you'll get references to source files, debugging symbols, variables and module names, error messages, and even the startup flags for SQL Server.  And if you search for "DBCC" or "CHECKDB" you'll find a really nice section listing all the DBCC commands, including the undocumented ones.  Granted those are pretty easy to find online, but you may be surprised what those web sites DIDN'T tell you! (And neither will I, go look for yourself!)  And as we saw earlier, you'll also find execution plan elements, query processing rules, and who knows what else.  It's also instructive to see how Microsoft organizes their source directories, how various components (storage engine, query processor, Full Text, AlwaysOn/HADR) are split into smaller modules. There are over 2000 source file references, go do some exploring! So what did we learn?  We can pull strings out of executable files, search them for known items, browse them for unknown items, and use the results to examine internal code to learn even more things about SQL Server.  We've even learned how to use command-line utilities!  We are now 1337 h4X0rz!  (Not really.  I hate that leetspeak crap.) Although, I must confess I might've gone too far with the "conspiracy" part of this post.  I apologize for that, it's just my overactive imagination.  There's really no hidden agenda or conspiracy regarding SQL Server internals.  It's not The Matrix.  It's not like you'd find anything like that in there: Attach Matrix Database DM_MATRIX_COMM_PIPELINES MATRIXXACTPARTICIPANTS dm_matrix_agents   Alright, enough of this paranoid ranting!  Microsoft are not really evil!  It's not like they're The Borg from Star Trek: ALTER FEDERATION DROP ALTER FEDERATION SPLIT DROP FEDERATION   #tsql2sday

    Read the article

  • SQL Spatial: Getting “nearest” calculations working properly

    - by Rob Farley
    If you’ve ever done spatial work with SQL Server, I hope you’ve come across the ‘nearest’ problem. You have five thousand stores around the world, and you want to identify the one that’s closest to a particular place. Maybe you want the store closest to the LobsterPot office in Adelaide, at -34.925806, 138.605073. Or our new US office, at 42.524929, -87.858244. Or maybe both! You know how to do this. You don’t want to use an aggregate MIN or MAX, because you want the whole row, telling you which store it is. You want to use TOP, and if you want to find the closest store for multiple locations, you use APPLY. Let’s do this (but I’m going to use addresses in AdventureWorks2012, as I don’t have a list of stores). Oh, and before I do, let’s make sure we have a spatial index in place. I’m going to use the default options. CREATE SPATIAL INDEX spin_Address ON Person.Address(SpatialLocation); And my actual query: WITH MyLocations AS (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),                        ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))                ) t (Name, Geo)) SELECT l.Name, a.AddressLine1, a.City, s.Name AS [State], c.Name AS Country FROM MyLocations AS l CROSS APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a JOIN Person.StateProvince AS s     ON s.StateProvinceID = a.StateProvinceID JOIN Person.CountryRegion AS c     ON c.CountryRegionCode = s.CountryRegionCode ; Great! This is definitely working. I know both those City locations, even if the AddressLine1s don’t quite ring a bell. I’m sure I’ll be able to find them next time I’m in the area. But of course what I’m concerned about from a querying perspective is what’s happened behind the scenes – the execution plan. This isn’t pretty. It’s not using my index. It’s sucking every row out of the Address table TWICE (which sucks), and then it’s sorting them by the distance to find the smallest one. It’s not pretty, and it takes a while. Mind you, I do like the fact that it saw an indexed view it could use for the State and Country details – that’s pretty neat. But yeah – users of my nifty website aren’t going to like how long that query takes. The frustrating thing is that I know that I can use the index to find locations that are within a particular distance of my locations quite easily, and Microsoft recommends this for solving the ‘nearest’ problem, as described at http://msdn.microsoft.com/en-au/library/ff929109.aspx. Now, in the first example on this page, it says that the query there will use the spatial index. But when I run it on my machine, it does nothing of the sort. I’m not particularly impressed. But what we see here is that parallelism has kicked in. In my scenario, it’s split the data up into 4 threads, but it’s still slow, and not using my index. It’s disappointing. But I can persuade it with hints! If I tell it to FORCESEEK, or use my index, or even turn off the parallelism with MAXDOP 1, then I get the index being used, and it’s a thing of beauty! Part of the plan is here: It’s massive, and it’s ugly, and it uses a TVF… but it’s quick. The way it works is to hook into the GeodeticTessellation function, which is essentially finds where the point is, and works out through the spatial index cells that surround it. This then provides a framework to be able to see into the spatial index for the items we want. You can read more about it at http://msdn.microsoft.com/en-us/library/bb895265.aspx#tessellation – including a bunch of pretty diagrams. One of those times when we have a much more complex-looking plan, but just because of the good that’s going on. This tessellation stuff was introduced in SQL Server 2012. But my query isn’t using it. When I try to use the FORCESEEK hint on the Person.Address table, I get the friendly error: Msg 8622, Level 16, State 1, Line 1 Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN. And I’m almost tempted to just give up and move back to the old method of checking increasingly large circles around my location. After all, I can even leverage multiple OUTER APPLY clauses just like I did in my recent Lookup post. WITH MyLocations AS (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),                        ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))                ) t (Name, Geo)) SELECT     l.Name,     COALESCE(a1.AddressLine1,a2.AddressLine1,a3.AddressLine1),     COALESCE(a1.City,a2.City,a3.City),     s.Name AS [State],     c.Name AS Country FROM MyLocations AS l OUTER APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     WHERE l.Geo.STDistance(ad.SpatialLocation) < 1000     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a1 OUTER APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     WHERE l.Geo.STDistance(ad.SpatialLocation) < 5000     AND a1.AddressID IS NULL     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a2 OUTER APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     WHERE l.Geo.STDistance(ad.SpatialLocation) < 20000     AND a2.AddressID IS NULL     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a3 JOIN Person.StateProvince AS s     ON s.StateProvinceID = COALESCE(a1.StateProvinceID,a2.StateProvinceID,a3.StateProvinceID) JOIN Person.CountryRegion AS c     ON c.CountryRegionCode = s.CountryRegionCode ; But this isn’t friendly-looking at all, and I’d use the method recommended by Isaac Kunen, who uses a table of numbers for the expanding circles. It feels old-school though, when I’m dealing with SQL 2012 (and later) versions. So why isn’t my query doing what it’s supposed to? Remember the query... WITH MyLocations AS (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),                        ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))                ) t (Name, Geo)) SELECT l.Name, a.AddressLine1, a.City, s.Name AS [State], c.Name AS Country FROM MyLocations AS l CROSS APPLY (     SELECT TOP (1) *     FROM Person.Address AS ad     ORDER BY l.Geo.STDistance(ad.SpatialLocation)     ) AS a JOIN Person.StateProvince AS s     ON s.StateProvinceID = a.StateProvinceID JOIN Person.CountryRegion AS c     ON c.CountryRegionCode = s.CountryRegionCode ; Well, I just wasn’t reading http://msdn.microsoft.com/en-us/library/ff929109.aspx properly. The following requirements must be met for a Nearest Neighbor query to use a spatial index: A spatial index must be present on one of the spatial columns and the STDistance() method must use that column in the WHERE and ORDER BY clauses. The TOP clause cannot contain a PERCENT statement. The WHERE clause must contain a STDistance() method. If there are multiple predicates in the WHERE clause then the predicate containing STDistance() method must be connected by an AND conjunction to the other predicates. The STDistance() method cannot be in an optional part of the WHERE clause. The first expression in the ORDER BY clause must use the STDistance() method. Sort order for the first STDistance() expression in the ORDER BY clause must be ASC. All the rows for which STDistance returns NULL must be filtered out. Let’s start from the top. 1. Needs a spatial index on one of the columns that’s in the STDistance call. Yup, got the index. 2. No ‘PERCENT’. Yeah, I don’t have that. 3. The WHERE clause needs to use STDistance(). Ok, but I’m not filtering, so that should be fine. 4. Yeah, I don’t have multiple predicates. 5. The first expression in the ORDER BY is my distance, that’s fine. 6. Sort order is ASC, because otherwise we’d be starting with the ones that are furthest away, and that’s tricky. 7. All the rows for which STDistance returns NULL must be filtered out. But I don’t have any NULL values, so that shouldn’t affect me either. ...but something’s wrong. I do actually need to satisfy #3. And I do need to make sure #7 is being handled properly, because there are some situations (eg, differing SRIDs) where STDistance can return NULL. It says so at http://msdn.microsoft.com/en-us/library/bb933808.aspx – “STDistance() always returns null if the spatial reference IDs (SRIDs) of the geography instances do not match.” So if I simply make sure that I’m filtering out the rows that return NULL… …then it’s blindingly fast, I get the right results, and I’ve got the complex-but-brilliant plan that I wanted. It just wasn’t overly intuitive, despite being documented. @rob_farley

    Read the article

  • Stored Procedures with SSRS? Hmm… not so much

    - by Rob Farley
    Little Bobby Tables’ mother says you should always sanitise your data input. Except that I think she’s wrong. The SQL Injection aspect is for another post, where I’ll show you why I think SQL Injection is the same kind of attack as many other attacks, such as the old buffer overflow, but here I want to have a bit of a whinge about the way that some people sanitise data input, and even have a whinge about people who insist on using stored procedures for SSRS reports. Let me say that again, in case you missed it the first time: I want to have a whinge about people who insist on using stored procedures for SSRS reports. Let’s look at the data input sanitisation aspect – except that I’m going to call it ‘parameter validation’. I’m talking about code that looks like this: create procedure dbo.GetMonthSummaryPerSalesPerson(@eomdate datetime) as begin     /* First check that @eomdate is a valid date */     if isdate(@eomdate) != 1     begin         select 'Please enter a valid date' as ErrorMessage;         return;     end     /* Then check that time has passed since @eomdate */     if datediff(day,@eomdate,sysdatetime()) < 5     begin         select 'Sorry - EOM is not complete yet' as ErrorMessage;         return;     end         /* If those checks have succeeded, return the data */     select SalesPersonID, count(*) as NumSales, sum(TotalDue) as TotalSales     from Sales.SalesOrderHeader     where OrderDate >= dateadd(month,-1,@eomdate)         and OrderDate < @eomdate     group by SalesPersonID     order by SalesPersonID; end Notice that the code checks that a date has been entered. Seriously??!! This must only be to check for NULL values being passed in, because anything else would have to be a valid datetime to avoid an error. The other check is maybe fair enough, but I still don’t like it. The two problems I have with this stored procedure are the result sets and the small fact that the stored procedure even exists in the first place. But let’s consider the first one of these problems for starters. I’ll get to the second one in a moment. If you read Jes Borland (@grrl_geek)’s recent post about returning multiple result sets in Reporting Services, you’ll be aware that Reporting Services doesn’t support multiple results sets from a single query. And when it says ‘single query’, it includes ‘stored procedure call’. It’ll only handle the first result set that comes back. But that’s okay – we have RETURN statements, so our stored procedure will only ever return a single result set.  Sometimes that result set might contain a single field called ErrorMessage, but it’s still only one result set. Except that it’s not okay, because Reporting Services needs to know what fields to expect. Your report needs to hook into your fields, so SSRS needs to have a way to get that information. For stored procs, it uses an option called FMTONLY. When Reporting Services tries to figure out what fields are going to be returned by a query (or stored procedure call), it doesn’t want to have to run the whole thing. That could take ages. (Maybe it’s seen some of the stored procedures I’ve had to deal with over the years!) So it turns on FMTONLY before it makes the call (and turns it off again afterwards). FMTONLY is designed to be able to figure out the shape of the output, without actually running the contents. It’s very useful, you might think. set fmtonly on exec dbo.GetMonthSummaryPerSalesPerson '20030401'; set fmtonly off Without the FMTONLY lines, this stored procedure returns a result set that has three columns and fourteen rows. But with FMTONLY turned on, those rows don’t come back. But what I do get back hurts Reporting Services. It doesn’t run the stored procedure at all. It just looks for anything that could be returned and pushes out a result set in that shape. Despite the fact that I’ve made sure that the logic will only ever return a single result set, the FMTONLY option kills me by returning three of them. It would have been much better to push these checks down into the query itself. alter procedure dbo.GetMonthSummaryPerSalesPerson(@eomdate datetime) as begin     select SalesPersonID, count(*) as NumSales, sum(TotalDue) as TotalSales     from Sales.SalesOrderHeader     where     /* Make sure that @eomdate is valid */         isdate(@eomdate) = 1     /* And that it's sufficiently past */     and datediff(day,@eomdate,sysdatetime()) >= 5     /* And now use it in the filter as appropriate */     and OrderDate >= dateadd(month,-1,@eomdate)     and OrderDate < @eomdate     group by SalesPersonID     order by SalesPersonID; end Now if we run it with FMTONLY turned on, we get the single result set back. But let’s consider the execution plan when we pass in an invalid date. First let’s look at one that returns data. I’ve got a semi-useful index in place on OrderDate, which includes the SalesPersonID and TotalDue fields. It does the job, despite a hefty Sort operation. …compared to one that uses a future date: You might notice that the estimated costs are similar – the Index Seek is still 28%, the Sort is still 71%. But the size of that arrow coming out of the Index Seek is a whole bunch smaller. The coolest thing here is what’s going on with that Index Seek. Let’s look at some of the properties of it. Glance down it with me… Estimated CPU cost of 0.0005728, 387 estimated rows, estimated subtree cost of 0.0044385, ForceSeek false, Number of Executions 0. That’s right – it doesn’t run. So much for reading plans right-to-left... The key is the Filter on the left of it. It has a Startup Expression Predicate in it, which means that it doesn’t call anything further down the plan (to the right) if the predicate evaluates to false. Using this method, we can make sure that our stored procedure contains a single query, and therefore avoid any problems with multiple result sets. If we wanted, we could always use UNION ALL to make sure that we can return an appropriate error message. alter procedure dbo.GetMonthSummaryPerSalesPerson(@eomdate datetime) as begin     select SalesPersonID, count(*) as NumSales, sum(TotalDue) as TotalSales, /*Placeholder: */ '' as ErrorMessage     from Sales.SalesOrderHeader     where     /* Make sure that @eomdate is valid */         isdate(@eomdate) = 1     /* And that it's sufficiently past */     and datediff(day,@eomdate,sysdatetime()) >= 5     /* And now use it in the filter as appropriate */     and OrderDate >= dateadd(month,-1,@eomdate)     and OrderDate < @eomdate     group by SalesPersonID     /* Now include the error messages */     union all     select 0, 0, 0, 'Please enter a valid date' as ErrorMessage     where isdate(@eomdate) != 1     union all     select 0, 0, 0, 'Sorry - EOM is not complete yet' as ErrorMessage     where datediff(day,@eomdate,sysdatetime()) < 5     order by SalesPersonID; end But still I don’t like it, because it’s now a stored procedure with a single query. And I don’t like stored procedures that should be functions. That’s right – I think this should be a function, and SSRS should call the function. And I apologise to those of you who are now planning a bonfire for me. Guy Fawkes’ night has already passed this year, so I think you miss out. (And I’m not going to remind you about when the PASS Summit is in 2012.) create function dbo.GetMonthSummaryPerSalesPerson(@eomdate datetime) returns table as return (     select SalesPersonID, count(*) as NumSales, sum(TotalDue) as TotalSales, '' as ErrorMessage     from Sales.SalesOrderHeader     where     /* Make sure that @eomdate is valid */         isdate(@eomdate) = 1     /* And that it's sufficiently past */     and datediff(day,@eomdate,sysdatetime()) >= 5     /* And now use it in the filter as appropriate */     and OrderDate >= dateadd(month,-1,@eomdate)     and OrderDate < @eomdate     group by SalesPersonID     union all     select 0, 0, 0, 'Please enter a valid date' as ErrorMessage     where isdate(@eomdate) != 1     union all     select 0, 0, 0, 'Sorry - EOM is not complete yet' as ErrorMessage     where datediff(day,@eomdate,sysdatetime()) < 5 ); We’ve had to lose the ORDER BY – but that’s fine, as that’s a client thing anyway. We can have our reports leverage this stored query still, but we’re recognising that it’s a query, not a procedure. A procedure is designed to DO stuff, not just return data. We even get entries in sys.columns that confirm what the shape of the result set actually is, which makes sense, because a table-valued function is the right mechanism to return data. And we get so much more flexibility with this. If you haven’t seen the simplification stuff that I’ve preached on before, jump over to http://bit.ly/SimpleRob and watch the video of when I broke a microphone and nearly fell off the stage in Wales. You’ll see the impact of being able to have a simplifiable query. You can also read the procedural functions post I wrote recently, if you didn’t follow the link from a few paragraphs ago. So if we want the list of SalesPeople that made any kind of sales in a given month, we can do something like: select SalesPersonID from dbo.GetMonthSummaryPerSalesPerson(@eomonth) order by SalesPersonID; This doesn’t need to look up the TotalDue field, which makes a simpler plan. select * from dbo.GetMonthSummaryPerSalesPerson(@eomonth) where SalesPersonID is not null order by SalesPersonID; This one can avoid having to do the work on the rows that don’t have a SalesPersonID value, pushing the predicate into the Index Seek rather than filtering the results that come back to the report. If we had joins involved, we might see some of those being simplified out. We also get the ability to include query hints in individual reports. We shift from having a single-use stored procedure to having a reusable stored query – and isn’t that one of the main points of modularisation? Stored procedures in Reporting Services are just a bit limited for my liking. They’re useful in plenty of ways, but if you insist on using stored procedures all the time rather that queries that use functions – that’s rubbish. @rob_farley

    Read the article

  • Visualising data a different way with Pivot collections

    - by Rob Farley
    Roger’s been doing a great job extending PivotViewer recently, and you can find the list of LobsterPot pivots at http://pivot.lobsterpot.com.au Many months back, the TED Talk that Gary Flake did about Pivot caught my imagination, and I did some research into it. At the time, most of what we did with Pivot was geared towards what we could do for clients, including making Pivot collections based on students at a school, and using it to browse PDF invoices by their various properties. We had actual commercial work based on Pivot collections back then, and it was all kinds of fun. Later, we made some collections for events that were happening, and even got featured in the TechEd Australia keynote. But I’m getting ahead of myself... let me explain the concept. A Pivot collection is an XML file (with .cxml extension) which lists Items, each linking to an image that’s stored in a Deep Zoom format (this means that it contains tiles like Bing Maps, so that the browser can request only the ones of interest according to the zoom level). This collection can be shown in a Silverlight application that uses the PivotViewer control, or in the Pivot Browser that’s available from getpivot.com. Filtering and sorting the items according to their facets (attributes, such as size, age, category, etc), the PivotViewer rearranges the way that these are shown in a very dynamic way. To quote Gary Flake, this lets us “see patterns which are otherwise hidden”. This browsing mechanism is very suited to a number of different methods, because it’s just that – browsing. It’s not searching, it’s more akin to window-shopping than doing an internet search. When we decided to put something together for the conferences such as TechEd Australia 2010 and the PASS Summit 2010, we did some screen-scraping to provide a different view of data that was already available online. Nick Hodge and Michael Kordahi from Microsoft liked the idea a lot, and after a bit of tweaking, we produced one that Michael used in the TechEd Australia keynote to show the variety of talks on offer. It’s interesting to see a pattern in this data: The Office track has the most sessions, but if the Interactive Sessions and Instructor-Led Labs are removed, it drops down to only the sixth most popular track, with Cloud Computing taking over. This is something which just isn’t obvious when you look an ordinary search tool. You get a much better feel for the data when moving around it like this. The more observant amongst you will have noticed some difference in the collection that Michael is demonstrating in the picture above with the screenshots I’ve shown. That’s because it’s been extended some more. At the SQLBits conference in the UK this year, I had some interesting discussions with the guys from Xpert360, particularly Phil Carter, who I’d met in 2009 at an earlier SQLBits conference. They had got around to producing a Pivot collection based on the SQLBits data, which we had been planning to do but ran out of time. We discussed some of ways that Pivot could be used, including the ways that my old friend Howard Dierking had extended it for the MSDN Magazine. I’m not suggesting I influenced Xpert360 at all, but they certainly inspired us with some of their posts on the matter So with LobsterPot guys David Gardiner and Roger Noble both having dabbled in Pivot collections (and Dave doing some for clients), I set Roger to work on extending it some more. He’s used various events and so on to be able to make an environment that allows us to do quick deployment of new collections, as well as showing the data in a grid view which behaves as if it were simply a third view of the data (the other two being the array of images and the ‘histogram’ view). I see PivotViewer as being a significant step in data visualisation – so much so that I feature it when I deliver talks on Spatial Data Visualisation methods. Any time when there is information that can be conveyed through an image, you have to ask yourself how best to show that image, and whether that image is the focal point. For Spatial data, the image is most often a map, and the map becomes the central mode for navigation. I show Pivot with postcode areas, since I can browse the postcodes based on their data, and many of the images are recognisable (to locals of South Australia). Naturally, the images could link through to the map itself, and so on, but generally people think of Spatial data in terms of navigating a map, which doesn’t always gel with the information you’re trying to extract. Roger’s even looking into ways to hook PivotViewer into the Bing Maps API, in a similar way to the Deep Earth project, displaying different levels of map detail according to how ‘zoomed in’ the images are. Some of the work that Dave did with one of the schools was generating the Deep Zoom tiles “on the fly”, based on images stored in a database, and Roger has produced a collection which uses images from flickr, that lets you move from one search term to another. Pulling the images down from flickr.com isn’t particularly ideal from a performance aspect, and flickr doesn’t store images in a small-enough format to really lend itself to this use, but you might agree that it’s an interesting concept which compares nicely to using Maps. I’m looking forward to future versions of the PivotViewer control, and hope they provide many more events that can be used, and even more hooks into it. Naturally, LobsterPot could help provide your business with a PivotViewer experience, but you can probably do a lot of it yourself too. There’s a thorough guide at getpivot.com, which is how we got into it. For some examples of what we’ve done, have a look at http://pivot.lobsterpot.com.au. I’d like to see PivotViewer really catch on a data visualisation tool.

    Read the article

  • Slide-decks from recent Adelaide SQL Server UG meetings

    - by Rob Farley
    The UK has been well represented this summer at the Adelaide SQL Server User Group, with presentations from Chris Testa-O’Neill (isn’t that the right link? Maybe try this one) and Martin Cairney. The slides are available here and here. I thought I’d particularly mention Martin’s, and how it’s relevant to this month’s T-SQL Tuesday. Martin spoke about Policy-Based Management and the Enterprise Policy Management Framework – something which is remarkably under-used, and yet which can really impact your ability to look after environments. If you have policies set up, then you can easily test each of your SQL instances to see if they are still satisfying a set of policies as defined. Automation (the topic of this month’s T-SQL Tuesday) should mean that your life is made easier, thereby enabling to you to do more. It shouldn’t remove the human element, but should remove (most of) the human errors. People still need to manage the situation, and work out what needs to be done, etc. We haven’t reached a point where computers can replace people, but they are very good at replace the mundaneness and monotony of our jobs. They’ve made our lives more interesting (although many would rightly argue that they have also made our lives more complex) by letting us focus on the stuff that changes. Martin named his talk Put Your Feet Up, which nicely expresses the fact that managing systems shouldn’t be about running around checking things all the time. It must be about having systems in place which tell you when things aren’t going well. It’s never quite as simple as being able to actually put your feet up, but certainly no system should require constant attention. It’s definitely a policy we at LobsterPot adhere to, whether it’s an alert to let us know that an ETL package has run successfully, or a script that generates some code for a report. If things can be automated, it reduces the chance of error, reduces the repetitive nature of work, and in general, keeps both consultants and clients much happier.

    Read the article

  • SyFy Channel Original Movie Title Generator

    - by Most Valuable Yak (Rob Volk)
    Saw this linked on reddit today and couldn't resist going through all the combinations: create table #pre(name varchar(20))create table #post(name varchar(20), pre varchar(10))insert #pre select 'Dino' union all select'Alien' union all select'Shark' union all select'Raptor' union all select'Tractor' union all select'Arachno' union all select'Cyber' union all select'Robo' union all select'Choco' union all select'Chupa' union all select'Grizzly' union all select'Mega' union all select'Were' union all select'Sabre' union all select'Man' insert #post select 'dactyl','a' union all select'pus','to' union all select'conda','a' union all select'droid',null union all select'dile','o' union all select'bear',null union all select'vampire',null union all select'squito',null union all select'saurus','a' union all select'wolf',null union all select'ghost',null union all select'viper',null union all select'cabra','a' union all select'yeti',null union all select'shark',null select a.name +case when right(a.name,1) not like '[aeiouy]' and b.pre is not null then b.pre else '' end +b.namefrom #pre a cross join #post bwhere a.name<>b.name -- optional, to eliminate the "SharkShark" optionorder by 1  Which one is your favorite?  I like most of the -squito versions, especially Chupasquito and Grizzlysquito.

    Read the article

  • Slide-decks from recent Adelaide SQL Server UG meetings

    - by Rob Farley
    The UK has been well represented this summer at the Adelaide SQL Server User Group, with presentations from Chris Testa-O’Neill (isn’t that the right link? Maybe try this one) and Martin Cairney. The slides are available here and here. I thought I’d particularly mention Martin’s, and how it’s relevant to this month’s T-SQL Tuesday. Martin spoke about Policy-Based Management and the Enterprise Policy Management Framework – something which is remarkably under-used, and yet which can really impact your ability to look after environments. If you have policies set up, then you can easily test each of your SQL instances to see if they are still satisfying a set of policies as defined. Automation (the topic of this month’s T-SQL Tuesday) should mean that your life is made easier, thereby enabling to you to do more. It shouldn’t remove the human element, but should remove (most of) the human errors. People still need to manage the situation, and work out what needs to be done, etc. We haven’t reached a point where computers can replace people, but they are very good at replace the mundaneness and monotony of our jobs. They’ve made our lives more interesting (although many would rightly argue that they have also made our lives more complex) by letting us focus on the stuff that changes. Martin named his talk Put Your Feet Up, which nicely expresses the fact that managing systems shouldn’t be about running around checking things all the time. It must be about having systems in place which tell you when things aren’t going well. It’s never quite as simple as being able to actually put your feet up, but certainly no system should require constant attention. It’s definitely a policy we at LobsterPot adhere to, whether it’s an alert to let us know that an ETL package has run successfully, or a script that generates some code for a report. If things can be automated, it reduces the chance of error, reduces the repetitive nature of work, and in general, keeps both consultants and clients much happier.

    Read the article

  • T-SQL Tuesday #53-Matt's Making Me Do This!

    - by Most Valuable Yak (Rob Volk)
    Hello everyone! It's that time again, time for T-SQL Tuesday, the wonderful blog series started by Adam Machanic (b|t). This month we are hosted by Matt Velic (b|t) who asks the question, "Why So Serious?", in celebration of April Fool's Day. He asks the contributors for their dirty tricks. And for some reason that escapes me, he and Jeff Verheul (b|t) seem to think I might be able to write about those. Shocked, I am! Nah, not really. They're absolutely right, this one is gonna be fun! I took some inspiration from Matt's suggestions, namely Resource Governor and Login Triggers.  I've done some interesting login trigger stuff for a presentation, but nothing yet with Resource Governor. Best way to learn it! One of my oldest pet peeves is abuse of the sa login. Don't get me wrong, I use it too, but typically only as SQL Agent job owner. It's been a while since I've been stuck with it, but back when I started using SQL Server, EVERY application needed sa to function. It was hard-coded and couldn't be changed. (welllllll, that is if you didn't use a hex editor on the EXE file, but who would do such a thing?) My standard warning applies: don't run anything on this page in production. In fact, back up whatever server you're testing this on, including the master database. Snapshotting a VM is a good idea. Also make sure you have other sysadmin level logins on that server. So here's a standard template for a logon trigger to address those pesky sa users: CREATE TRIGGER SA_LOGIN_PRIORITY ON ALL SERVER WITH ENCRYPTION, EXECUTE AS N'sa' AFTER LOGON AS IF ORIGINAL_LOGIN()<>N'sa' OR APP_NAME() LIKE N'SQL Agent%' RETURN; -- interesting stuff goes here GO   What can you do for "interesting stuff"? Books Online limits itself to merely rolling back the logon, which will throw an error (and alert the person that the logon trigger fired).  That's a good use for logon triggers, but really not tricky enough for this blog.  Some of my suggestions are below: WAITFOR DELAY '23:59:59';   Or: EXEC sp_MSforeach_db 'EXEC sp_detach_db ''?'';'   Or: EXEC msdb.dbo.sp_add_job @job_name=N'`', @enabled=1, @start_step_id=1, @notify_level_eventlog=0, @delete_level=3; EXEC msdb.dbo.sp_add_jobserver @job_name=N'`', @server_name=@@SERVERNAME; EXEC msdb.dbo.sp_add_jobstep @job_name=N'`', @step_id=1, @step_name=N'`', @command=N'SHUTDOWN;'; EXEC msdb.dbo.sp_start_job @job_name=N'`';   Really, I don't want to spoil your own exploration, try it yourself!  The thing I really like about these is it lets me promote the idea that "sa is SLOW, sa is BUGGY, don't use sa!".  Before we get into Resource Governor, make sure to drop or disable that logon trigger. They don't work well in combination. (Had to redo all the following code when SSMS locked up) Resource Governor is a feature that lets you control how many resources a single session can consume. The main goal is to limit the damage from a runaway query. But we're not here to read about its main goal or normal usage! I'm trying to make people stop using sa BECAUSE IT'S SLOW! Here's how RG can do that: USE master; GO CREATE FUNCTION dbo.SA_LOGIN_PRIORITY() RETURNS sysname WITH SCHEMABINDING, ENCRYPTION AS BEGIN RETURN CASE WHEN ORIGINAL_LOGIN()=N'sa' AND APP_NAME() NOT LIKE N'SQL Agent%' THEN N'SA_LOGIN_PRIORITY' ELSE N'default' END END GO CREATE RESOURCE POOL SA_LOGIN_PRIORITY WITH ( MIN_CPU_PERCENT = 0 ,MAX_CPU_PERCENT = 1 ,CAP_CPU_PERCENT = 1 ,AFFINITY SCHEDULER = (0) ,MIN_MEMORY_PERCENT = 0 ,MAX_MEMORY_PERCENT = 1 -- ,MIN_IOPS_PER_VOLUME = 1 ,MAX_IOPS_PER_VOLUME = 1 -- uncomment for SQL Server 2014 ); CREATE WORKLOAD GROUP SA_LOGIN_PRIORITY WITH ( IMPORTANCE = LOW ,REQUEST_MAX_MEMORY_GRANT_PERCENT = 1 ,REQUEST_MAX_CPU_TIME_SEC = 1 ,REQUEST_MEMORY_GRANT_TIMEOUT_SEC = 1 ,MAX_DOP = 1 ,GROUP_MAX_REQUESTS = 1 ) USING SA_LOGIN_PRIORITY; ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION=dbo.SA_LOGIN_PRIORITY); ALTER RESOURCE GOVERNOR RECONFIGURE;   From top to bottom: Create a classifier function to determine which pool the session should go to. More info on classifier functions. Create the pool and provide a generous helping of resources for the sa login. Create the workload group and further prioritize those resources for the sa login. Apply the classifier function and reconfigure RG to use it. I have to say this one is a bit sneakier than the logon trigger, least of all you don't get any error messages.  I heartily recommend testing it in Management Studio, and click around the UI a lot, there's some fun behavior there. And DEFINITELY try it on SQL 2014 with the IO settings included!  You'll notice I made allowances for SQL Agent jobs owned by sa, they'll go into the default workload group.  You can add your own overrides to the classifier function if needed. Some interesting ideas I didn't have time for but expect you to get to before me: Set up different pools/workgroups with different settings and randomize which one the classifier chooses Do the same but base it on time of day (Books Online example covers this)... Or, which workstation it connects from. This can be modified for certain special people in your office who either don't listen, or are attracted (and attractive) to you. And if things go wrong you can always use the following from another sysadmin or Dedicated Admin connection: ALTER RESOURCE GOVERNOR DISABLE;   That will let you go in and either fix (or drop) the pools, workgroups and classifier function. So now that you know these types of things are possible, and if you are tired of your team using sa when they shouldn't, I expect you'll enjoy playing with these quite a bit! Unfortunately, the aforementioned Dedicated Admin Connection kinda poops on the party here.  Books Online for both topics will tell you that the DAC will not fire either feature. So if you have a crafty user who does their research, they can still sneak in with sa and do their bidding without being hampered. Of course, you can still detect their login via various methods, like a server trace, SQL Server Audit, extended events, and enabling "Audit Successful Logins" on the server.  These all have their downsides: traces take resources, extended events and SQL Audit can't fire off actions, and enabling successful logins will bloat your error log very quickly.  SQL Audit is also limited unless you have Enterprise Edition, and Resource Governor is Enterprise-only.  And WORST OF ALL, these features are all available and visible through the SSMS UI, so even a doofus developer or manager could find them. Fortunately there are Event Notifications! Event notifications are becoming one of my favorite features of SQL Server (keep an eye out for more blogs from me about them). They are practically unknown and heinously underutilized.  They are also a great gateway drug to using Service Broker, another great but underutilized feature. Hopefully this will get you to start using them, or at least your enemies in the office will once they read this, and then you'll have to learn them in order to fix things. So here's the setup: USE msdb; GO CREATE PROCEDURE dbo.SA_LOGIN_PRIORITY_act WITH ENCRYPTION AS DECLARE @x XML, @message nvarchar(max); RECEIVE @x=CAST(message_body AS XML) FROM SA_LOGIN_PRIORITY_q; IF @x.value('(//LoginName)[1]','sysname')=N'sa' AND @x.value('(//ApplicationName)[1]','sysname') NOT LIKE N'SQL Agent%' BEGIN -- interesting activation procedure stuff goes here END GO CREATE QUEUE SA_LOGIN_PRIORITY_q WITH STATUS=ON, RETENTION=OFF, ACTIVATION (PROCEDURE_NAME=dbo.SA_LOGIN_PRIORITY_act, MAX_QUEUE_READERS=1, EXECUTE AS OWNER); CREATE SERVICE SA_LOGIN_PRIORITY_s ON QUEUE SA_LOGIN_PRIORITY_q([http://schemas.microsoft.com/SQL/Notifications/PostEventNotification]); CREATE EVENT NOTIFICATION SA_LOGIN_PRIORITY_en ON SERVER WITH FAN_IN FOR AUDIT_LOGIN TO SERVICE N'SA_LOGIN_PRIORITY_s', N'current database' GO   From top to bottom: Create activation procedure for event notification queue. Create queue to accept messages from event notification, and activate the procedure to process those messages when received. Create service to send messages to that queue. Create event notification on AUDIT_LOGIN events that fire the service. I placed this in msdb as it is an available system database and already has Service Broker enabled by default. You should change this to another database if you can guarantee it won't get dropped. So what to put in place for "interesting activation procedure code"?  Hmmm, so far I haven't addressed Matt's suggestion of writing a lengthy script to send an annoying message: SET @[email protected]('(//HostName)[1]','sysname') + N' tried to log in to server ' + @x.value('(//ServerName)[1]','sysname') + N' as SA at ' + @x.value('(//StartTime)[1]','sysname') + N' using the ' + @x.value('(//ApplicationName)[1]','sysname') + N' program. That''s why you''re getting this message and the attached pornography which' + N' is bloating your inbox and violating company policy, among other things. If you know' + N' this person you can go to their desk and hit them, or use the following SQL to end their session: KILL ' + @x.value('(//SPID)[1]','sysname') + N'; Hopefully they''re in the middle of a huge query that they need to finish right away.' EXEC msdb.dbo.sp_send_dbmail @recipients=N'[email protected]', @subject=N'SA Login Alert', @query_result_width=32767, @body=@message, @query=N'EXEC sp_readerrorlog;', @attach_query_result_as_file=1, @query_attachment_filename=N'UtterlyGrossPorn_SeriouslyDontOpenIt.jpg' I'm not sure I'd call that a lengthy script, but the attachment should get pretty big, and I'm sure the email admins will love storing multiple copies of it.  The nice thing is that this also fires on Dedicated Admin connections! You can even identify DAC connections from the event data returned, I leave that as an exercise for you. You can use that info to change the action taken by the activation procedure, and since it's a stored procedure, it can pretty much do anything! Except KILL the SPID, or SHUTDOWN the server directly.  I'm still working on those.

    Read the article

  • Don’t miss the Oracle Webcast: Enabling Effective Decision Making with “One Source of the Truth” at BB&T

    - by Rob Reynolds
    Webcast Date:  September 17th, 2012  -  9 a.m. PT / 12 p.m. ET  BB&T Corporation (NYSE: BBT) is one of the largest financial services holding companies in the United States. One of their IT goals is to provide “one source of truth” to enable more effective decision making at the corporate and local level. By using Oracle’s Hyperion Enterprise Planning Suite and Oracle Essbase, BB&T streamlined their planning and financial reporting processes. Large volumes of data were consolidated into a single reporting solution giving stakeholders more timely and accurate information. By providing a central and automated collaboration tool, BB&T is able to prepare more accurate financial forecasts, rapidly consolidate large amounts of data, and make more informed decisions. Join us on September 17th for a live webcast to hear BB&T’s journey to achieve “One Source of Truth” and learn how Oracle’s Hyperion Planning Suite and Oracle’s Essbase allows you to: Adopt best practices like rolling forecasts and driver-based planning Reduce the time and effort dedicated to the annual budget process Reduce the time and effort dedicated to the annual budget process Remove forecasting uncertainty with predictive modeling capabilities Rapidly analyze shifting market conditions with a powerful calculation engine Prioritize resources effectively with complete visibility into all potential risks Link strategy and execution with integrated strategic, financial and operational planning Register here.

    Read the article

  • Using SQL Execution Plans to discover the Swedish alphabet

    - by Rob Farley
    SQL Server is quite remarkable in a bunch of ways. In this post, I’m using the way that the Query Optimizer handles LIKE to keep it SARGable, the Execution Plans that result, Collations, and PowerShell to come up with the Swedish alphabet. SARGability is the ability to seek for items in an index according to a particular set of criteria. If you don’t have SARGability in play, you need to scan the whole index (or table if you don’t have an index). For example, I can find myself in the phonebook easily, because it’s sorted by LastName and I can find Farley in there by moving to the Fs, and so on. I can’t find everyone in my suburb easily, because the phonebook isn’t sorted that way. I can’t even find people who have six letters in their last name, because also the book is sorted by LastName, it’s not sorted by LEN(LastName). This is all stuff I’ve looked at before, including in the talk I gave at SQLBits in October 2010. If I try to find everyone who’s names start with F, I can do that using a query a bit like: SELECT LastName FROM dbo.PhoneBook WHERE LEFT(LastName,1) = 'F'; Unfortunately, the Query Optimizer doesn’t realise that all the entries that satisfy LEFT(LastName,1) = 'F' will be together, and it has to scan the whole table to find them. But if I write: SELECT LastName FROM dbo.PhoneBook WHERE LastName LIKE 'F%'; then SQL is smart enough to understand this, and performs an Index Seek instead. To see why, I look further into the plan, in particular, the properties of the Index Seek operator. The ToolTip shows me what I’m after: You’ll see that it does a Seek to find any entries that are at least F, but not yet G. There’s an extra Predicate in there (a Residual Predicate if you like), which checks that each LastName is really LIKE F% – I suppose it doesn’t consider that the Seek Predicate is quite enough – but most of the benefit is seen by its working out the Seek Predicate, filtering to just the “at least F but not yet G” section of the data. This got me curious though, particularly about where the G comes from, and whether I could leverage it to create the Swedish alphabet. I know that in the Swedish language, there are three extra letters that appear at the end of the alphabet. One of them is ä that appears in the word Västerås. It turns out that Västerås is quite hard to find in an index when you’re looking it up in a Swedish map. I talked about this briefly in my five-minute talk on Collation from SQLPASS (the one which was slightly less than serious). So by looking at the plan, I can work out what the next letter is in the alphabet of the collation used by the column. In other words, if my alphabet were Swedish, I’d be able to tell what the next letter after F is – just in case it’s not G. It turns out it is… Yes, the Swedish letter after F is G. But I worked this out by using a copy of my PhoneBook table that used the Finnish_Swedish_CI_AI collation. I couldn’t find how the Query Optimizer calculates the G, and my friend Paul White (@SQL_Kiwi) tells me that it’s frustratingly internal to the QO. He’s particularly smart, even if he is from New Zealand. To investigate further, I decided to do some PowerShell, leveraging the Get-SqlPlan function that I blogged about recently (make sure you also have the SqlServerCmdletSnapin100 snap-in added). I started by indicating that I was going to use Finnish_Swedish_CI_AI as my collation of choice, and that I’d start whichever letter cam straight after the number 9. I figure that this is a cheat’s way of guessing the first letter of the alphabet (but it doesn’t actually work in Unicode – luckily I’m using varchar not nvarchar. Actually, there are a few aspects of this code that only work using ASCII, so apologies if you were wanting to apply it to Greek, Japanese, etc). I also initialised my $alphabet variable. $collation = 'Finnish_Swedish_CI_AI'; $firstletter = '9'; $alphabet = ''; Now I created the table for my test. A single field would do, and putting a Clustered Index on it would suffice for the Seeks. Invoke-Sqlcmd -server . -data tempdb -query "create table dbo.collation_test (col varchar(10) collate $collation primary key);" Now I get into the looping. $c = $firstletter; $stillgoing = $true; while ($stillgoing) { I construct the query I want, seeking for entries which start with whatever $c has reached, and get the plan for it: $query = "select col from dbo.collation_test where col like '$($c)%';"; [xml] $pl = get-sqlplan $query "." "tempdb"; At this point, my $pl variable is a scary piece of XML, representing the execution plan. A bit of hunting through it showed me that the EndRange element contained what I was after, and that if it contained NULL, then I was done. $stillgoing = ($pl.ShowPlanXML.BatchSequence.Batch.Statements.StmtSimple.QueryPlan.RelOp.IndexScan.SeekPredicates.SeekPredicateNew.SeekKeys.EndRange -ne $null); Now I could grab the value out of it (which came with apostrophes that needed stripping), and append that to my $alphabet variable.   if ($stillgoing)   {  $c=$pl.ShowPlanXML.BatchSequence.Batch.Statements.StmtSimple.QueryPlan.RelOp.IndexScan.SeekPredicates.SeekPredicateNew.SeekKeys.EndRange.RangeExpressions.ScalarOperator.ScalarString.Replace("'","");     $alphabet += $c;   } Finally, finishing the loop, dropping the table, and showing my alphabet! } Invoke-Sqlcmd -server . -data tempdb -query "drop table dbo.collation_test;"; $alphabet; When I run all this, I see that the Swedish alphabet is ABCDEFGHIJKLMNOPQRSTUVXYZÅÄÖ, which matches what I see at Wikipedia. Interesting to see that the letters on the end are still there, even with Case Insensitivity. Turns out they’re not just “letters with accents”, they’re letters in their own right. I’m sure you gave up reading long ago, and really aren’t that fazed about the idea of doing this using PowerShell. I chose PowerShell because I’d already come up with an easy way of grabbing the estimated plan for a query, and PowerShell does allow for easy navigation of XML. I find the most interesting aspect of this as the fact that the Query Optimizer uses the next letter of the alphabet to maintain the SARGability of LIKE. I’m hoping they do something similar for a whole bunch of operations. Oh, and the fact that you know how to find stuff in the IKEA catalogue. Footnote: If you are interested in whether this works in other languages, you might want to consider the following screenshot, which shows that in principle, it should work with Japanese. It might be a bit harder to run this in PowerShell though, as I’m not sure how it translates. In Hiragana, the Japanese alphabet starts ?, ?, ?, ?, ?, ...

    Read the article

  • Plan Operator Tuesday round-up

    - by Rob Farley
    Eighteen posts for T-SQL Tuesday #43 this month, discussing Plan Operators. I put them together and made the following clickable plan. It’s 1000px wide, so I hope you have a monitor wide enough. Let me explain this plan for you (people’s names are the links to the articles on their blogs – the same links as in the plan above). It was clearly a SELECT statement. Wayne Sheffield (@dbawayne) wrote about that, so we start with a SELECT physical operator, leveraging the logical operator Wayne Sheffield. The SELECT operator calls the Paul White operator, discussed by Jason Brimhall (@sqlrnnr) in his post. The Paul White operator is quite remarkable, and can consume three streams of data. Let’s look at those streams. The first pulls data from a Table Scan – Boris Hristov (@borishristov)’s post – using parallel threads (Bradley Ball – @sqlballs) that pull the data eagerly through a Table Spool (Oliver Asmus – @oliverasmus). A scalar operation is also performed on it, thanks to Jeffrey Verheul (@devjef)’s Compute Scalar operator. The second stream of data applies Evil (I figured that must mean a procedural TVF, but could’ve been anything), courtesy of Jason Strate (@stratesql). It performs this Evil on the merging of parallel streams (Steve Jones – @way0utwest), which suck data out of a Switch (Paul White – @sql_kiwi). This Switch operator is consuming data from up to four lookups, thanks to Kalen Delaney (@sqlqueen), Rick Krueger (@dataogre), Mickey Stuewe (@sqlmickey) and Kathi Kellenberger (@auntkathi). Unfortunately Kathi’s name is a bit long and has been truncated, just like in real plans. The last stream performs a join of two others via a Nested Loop (Matan Yungman – @matanyungman). One pulls data from a Spool (my post – @rob_farley) populated from a Table Scan (Jon Morisi). The other applies a catchall operator (the catchall is because Tamera Clark (@tameraclark) didn’t specify any particular operator, and a catchall is what gets shown when SSMS doesn’t know what to show. Surprisingly, it’s showing the yellow one, which is about cursors. Hopefully that’s not what Tamera planned, but anyway...) to the output from an Index Seek operator (Sebastian Meine – @sqlity). Lastly, I think everyone put in 110% effort, so that’s what all the operators cost. That didn’t leave anything for me, unfortunately, but that’s okay. Also, because he decided to use the Paul White operator, Jason Brimhall gets 0%, and his 110% was given to Paul’s Switch operator post. I hope you’ve enjoyed this T-SQL Tuesday, and have learned something extra about Plan Operators. Keep your eye out for next month’s one by watching the Twitter Hashtag #tsql2sday, and why not contribute a post to the party? Big thanks to Adam Machanic as usual for starting all this. @rob_farley

    Read the article

  • 24 hours to pass until 24 Hours of PASS

    - by Rob Farley
    There’s a bunch of stuff going on at the moment in the SQL world, so if you’ve missed this particular piece of news, let me tell you a bit about it. Twice a year, the SQL community puts on its biggest virtual event – 24 Hours of PASS. And the next one is tomorrow – March 21st, 2012. Twenty-four sessions, back-to-back, featuring a selection of some of the best presenters in the SQL world, speakers from all over the world, coming together in an online collaboration that so far has well over thirty thousand registrations across the presentations. Some people are signed up for all 24 sessions, some only one. Traditionally, LiveMeeting has been used as the platform for this event, but this year we’re going with a new platform – IBTalk. It promises big, and we’re hoping it won’t let us down. LiveMeeting has been great, and we thank Microsoft for providing it as a platform for the past few years. However, as the event has grown, we’ve found that a new idea is necessary. Last year a search was done for a new platform, and IBTalk ticked the right boxes. The feedback from the presenters and moderators so far has been overwhelmingly positive, and we’re hoping that this is going to really enhance the user experience. One of my favourite features of the platform is the language side. It provides a pretty good translation service. Users who join a session will see a flag on the left of the screen. If they click it, they can change the language to one of 15 on offer. Picking this changes all the labels on everything. It even translates the text in the Q&A window. What this means is that someone from Brazil can ask their question in Portuguese, and the presenter will see it in English. Then if the answer is typed in English, the questioner will be able to see the answer, also in Portuguese. Or they can switch to English to see it as the answerer typed it. I know there’s always the risk of bad translations going on, but I’ve heard good things about this translation service. But there’s more – IBTalk are providing staff to type up closed captioning live during the event. So if English isn’t your first language, don’t worry! Picking your language will also let you see subtitles in your chosen language. I’m hoping that this event is the start of PASS being able to reach people from all corners of the world. Wouldn’t it be great to find that this event is successful, and that the next 24HOP (later in the year, our Summit Preview event) has just as many non-English speakers tuning in as English speakers? If you haven’t been planning which sessions you’re going to attend, you really should get over to sqlpass.org/24hours and have a look through what’s on offer. There’s some amazing material from some of the industry’s brightest, covering a wide range of topics, from classic SQL areas to the brand new SQL 2012 features. There really should be something for every SQL professional. Check the time zones though – if you’re in the US you might be on Summer time, and an hour closer to GMT than normal. Massive thanks must go to Microsoft, SQL Sentry and Idera for sponsoring this event. Without sponsors we wouldn’t be able to put any of this on. These companies are helping 24HOP continue to grow into an event for the whole world. See you tomorrow! @rob_farley | #24hop | #sqlpass

    Read the article

  • 24 hours to pass until 24 Hours of PASS

    - by Rob Farley
    There’s a bunch of stuff going on at the moment in the SQL world, so if you’ve missed this particular piece of news, let me tell you a bit about it. Twice a year, the SQL community puts on its biggest virtual event – 24 Hours of PASS. And the next one is tomorrow – March 21st, 2012. Twenty-four sessions, back-to-back, featuring a selection of some of the best presenters in the SQL world, speakers from all over the world, coming together in an online collaboration that so far has well over thirty thousand registrations across the presentations. Some people are signed up for all 24 sessions, some only one. Traditionally, LiveMeeting has been used as the platform for this event, but this year we’re going with a new platform – IBTalk. It promises big, and we’re hoping it won’t let us down. LiveMeeting has been great, and we thank Microsoft for providing it as a platform for the past few years. However, as the event has grown, we’ve found that a new idea is necessary. Last year a search was done for a new platform, and IBTalk ticked the right boxes. The feedback from the presenters and moderators so far has been overwhelmingly positive, and we’re hoping that this is going to really enhance the user experience. One of my favourite features of the platform is the language side. It provides a pretty good translation service. Users who join a session will see a flag on the left of the screen. If they click it, they can change the language to one of 15 on offer. Picking this changes all the labels on everything. It even translates the text in the Q&A window. What this means is that someone from Brazil can ask their question in Portuguese, and the presenter will see it in English. Then if the answer is typed in English, the questioner will be able to see the answer, also in Portuguese. Or they can switch to English to see it as the answerer typed it. I know there’s always the risk of bad translations going on, but I’ve heard good things about this translation service. But there’s more – IBTalk are providing staff to type up closed captioning live during the event. So if English isn’t your first language, don’t worry! Picking your language will also let you see subtitles in your chosen language. I’m hoping that this event is the start of PASS being able to reach people from all corners of the world. Wouldn’t it be great to find that this event is successful, and that the next 24HOP (later in the year, our Summit Preview event) has just as many non-English speakers tuning in as English speakers? If you haven’t been planning which sessions you’re going to attend, you really should get over to sqlpass.org/24hours and have a look through what’s on offer. There’s some amazing material from some of the industry’s brightest, covering a wide range of topics, from classic SQL areas to the brand new SQL 2012 features. There really should be something for every SQL professional. Check the time zones though – if you’re in the US you might be on Summer time, and an hour closer to GMT than normal. Massive thanks must go to Microsoft, SQL Sentry and Idera for sponsoring this event. Without sponsors we wouldn’t be able to put any of this on. These companies are helping 24HOP continue to grow into an event for the whole world. See you tomorrow! @rob_farley | #24hop | #sqlpass

    Read the article

  • New Exadata Book Available Soon

    - by Rob Reynolds
    Oracle Press is set to released the first book on data warehouse performance and Exadata on March 14th. Achieving Extreme Performance with Oracle Exadata , by my colleagues Rick Greenwald, Robert Stackowiak, Maqsood Alam, and Mans Bhuller will be available at your favorite booksellers next week. I've seen a sneak peak of the content in this book and its a great way to fully grasp the power of Exadata and how to best apply it to achieve extreme data warehouse performance. From the publisher's description: Achieving Extreme Performance with Oracle Exadata and the Sun Oracle Database Machine is filled with best practices for deployments, hardware sizing, architecting the database machine environments for maximum availability, and backup and recovery. Oracle Database 11gR2 features used within these offerings, as well as migration options and paths for Oracle and non-Oracle databases to Oracle Exadata are covered. This Oracle Press guide also discusses architecture, administration, maintenance, monitoring, and tuning of Oracle Exadata Storage Servers and the Sun Oracle Database Machine. If your company is considering Exadata, or if you need more horsepower out of your data warehouse, I highly recommend grabbing a copy of this book next week.

    Read the article

  • Exploding maps in Reporting Services 2008 R2

    - by Rob Farley
    Kaboom! Well, that was the imagery that secretly appeared in my mind when I saw “USA By State Exploded” in the list of installed maps in Report Builder 3.0 – part of the spatial offering of SQL Server Reporting Server 2008 R2. Alas, it just means that the borders are bigger. Clicking on it showed me. Unfortunately, I’m not interested in maps of the US. None of my clients are there (at least, not yet – feel free to get in touch if you want to change this ‘feature’ of my company). So instead, I’ve recently been getting hold of some data for Australian areas. I’ve just bought some PostCode shapes for South Australia, and will use this in demos for conferences and for showing clients how this kind of report can really impact their reporting. One of the companies I was talking about getting shape files sent me a sample. So I chose the “ESRI shapefile” option you see above, and browsed to my file. It appeared in the window like this: Australians will immediately recognise this as the area around Wollongong, just south of Sydney. Well, apart from me. I didn’t. I had to put a Bing Maps layer behind it to work that out, but that’s not for this post. The thing that I discovered was that if I selected the Exploded USA option (but without clicking Next), and then chose my shape file, then my area around Wollongong would be exploded too! Huh! I think this is actually a bug, but a potentially useful one! Some further investigation (involving creating two identical reports, one with this exploded view, one without), showed that the Exploded View is done by reducing the ScaleFactor property of the PolygonLayer in the map control. The Exploded version has it below 1. If you set to above one, your shapes overlap. I discovered this by accident… I guess I hadn’t looked through all the PolygonLayer options to work out what they all do. And because this post is about Reporting, it can qualify for this month’s T-SQL Tuesday, hosted by Aaron Nelson (@sqlvariant). Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • Install of AppFabric RC stops AppFabric Monitoring (some traps for young players)

    - by Rob Addis
    I uninstalled AppFabric Beta 2 and installed AppFabric RC. The AppFabricEventCollection Service is started (runs under Local Service which is a dbo_owner on the Monitoring Database to prove this wasn’t the issue). The SQLServerAgent Service is started. Nothing is being written to the Monitoring DB Staging Table and thus nothing is being written to the Event tables or seen in the AppFabric Dashboard. Nothing has been written to the following event logs     - Microsoft-Windows-Application Server-System Services\Admin     - Microsoft-Windows-Application Server-System Services\Operational The Microsoft-Windows-Application Server-System Services\Debug event log is not shown in the event viewer. The WCF configuration appears fine the connection string to the Monitoring DB is correct. Monitoring is set to “Trouble Shooting” and no errors are shown on the “Configure WCF and WF for Application” dialog. So the problem seems to lie with either AppFabric which writes to the event log or the AppFabricEventCollection Service. I thought I was flummoxed... However one of my colleagues said have you checked the etwProviderId? I was using a config which was created under AppFabric  Beta 2 which had a different etwProviderId. So I deleted the following section and all other references to AppFabric monitoring from the web.config and then recreated them using IIS the “Configure WCF and WF for Application” dialog and set the level to TroubleShooting.         <diagnostics etwProviderId="6b44a7ff-9db4-4723-b8cf-1b584bf1591b">             <endToEndTracing propagateActivity="true" messageFlowTracing="true" />         </diagnostics>   I then called a service to create some log entries. Still nothing was written to the Monitoring DB Staging Table... I checked the Microsoft-Windows-Application Server-System Services\Admin event log. It had the following entry... Configuration error. Please see the details to correct the problem. \rDetailed information:\r Filename: \\?\C:\Users\xxx\Documents\dotnetdev\Frameworks\SOA\xxx.SOA.Framework\xxx.SOA.Framework.MockServices\SimpleServiceParent\web.config Error: Cannot read configuration file due to insufficient permissions    System.UnauthorizedAccessException: Filename: \\?\C:\Users\xxx\Documents\dotnetdev\Frameworks\SOA\xxx.SOA.Framework\IAG.SOA.Framework.MockServices\SimpleServiceParent\web.config Error: Cannot read configuration file due to insufficient permissions   And guess who the user was... Local Service yes yes I should have used a better User in the AppFabric RC setup to run the AppFabricEventCollection Service under! So I changed the user to a more appropriate one and removed Local Service as a DBO and hay presto!

    Read the article

  • LobsterPot Solutions in the USA

    - by Rob Farley
    We’re expanding! I’m thrilled to announce that Microsoft Gold Partner LobsterPot Solutions has started another branch appointing the amazing Ted Krueger (5-time SQL MVP awardee) as the US lead. Ted is well-known in the SQL Server world, having written books on indexing, consulting and on being a DBA (not to mention contributing chapters to both MVP Deep Dives books). He is an expert on replication and high availability, and strong in the Business Intelligence space – vast experience which is both broad and deep. Ted is based in the south east corner of Wisconsin, just north of Chicago. He has been a consultant for eons and has helped many clients with their projects and problems, taking the role as both technical lead and consulting lead. He is also tireless in supporting and developing the SQL Server community, presenting at conferences across America, and helping people through his blog, Twitter and more. Despite all this – it’s neither his technical excellence with SQL Server nor his consulting skill that made me want him to lead LobsterPot’s US venture. I wanted Ted because of his values. In the time I’ve known Ted, I’ve found his integrity to be excellent, and found him to be morally beyond reproach. This is the biggest priority I have when finding people to represent the LobsterPot brand. I have no qualms in recommending Ted’s character or work ethic. It’s not just my thoughts on him – all my trusted friends that know Ted agree about this. So last week, LobsterPot Solutions LLC was formed in the United States, and in a couple of weeks, we will be open for business! LobsterPot Solutions can be contacted via email at [email protected], on the web at either www.lobsterpot.com.au or www.lobsterpotsolutions.com, and on Twitter as @lobsterpot_au and @lobsterpot_us. Ted Kruger blogs at LessThanDot, and can also be found on Twitter and LinkedIn. This post is cross-posted from http://lobsterpotsolutions.com/lobsterpot-solutions-in-the-usa

    Read the article

  • New in MySQL Enterprise Edition: Policy-based Auditing!

    - by Rob Young
    Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} For those with an interest in MySQL, this weekend's MySQL Connect conference in San Francisco has gotten off to a great start. On Saturday Tomas announced the feature complete MySQL 5.6 Release Candidate that is now available for Community adoption and testing. This announcement marks the sprint to GA that should be ready for release within the next 90 days. You can get a quick summary of the key 5.6 features here or better yet download the 5.6 RC (under “Development Releases”), review what's new and try it out for yourself! There were also product related announcements around MySQL Cluster 7.3 and MySQL Enterprise Edition . This latter announcement is of particular interest if you are faced with internal and regulatory compliance requirements as it addresses and solves a pain point that is shared by most developers and DBAs; new, out of the box compliance for MySQL applications via policy-based audit logging of user and query level activity. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} One of the most common requests we get for the MySQL roadmap is for quick and easy logging of audit events. This is mainly due to how web-based applications have evolved from nice-to-have enablers to mission-critical revenue generation and the important role MySQL plays in the new dynamic. In today’s virtual marketplace, PCI compliance guidelines ensure credit card data is secure within e-commerce apps; from a corporate standpoint, Sarbanes-Oxely, HIPAA and other regulations guard the medical, financial, public sector and other personal data centric industries. For supporting applications audit policies and controls that monitor the eyes and hands that have viewed and acted upon the most sensitive of data is most commonly implemented on the back-end database. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} With this in mind, MySQL 5.5 introduced an open audit plugin API that enables all MySQL users to write their own auditing plugins based on application specific requirements. While the supporting docs are very complete and provide working code samples, writing an audit plugin requires time and low-level expertise to develop, test, implement and maintain. To help those who don't have the time and/or expertise to develop such a plugin, Oracle now ships MySQL 5.5.28 and higher with an easy to use, out-of-the-box auditing solution; MySQL Enterprise Audit. MySQL Enterprise Audit The premise behind MySQL Enterprise Audit is simple; we wanted to provide an easy to use, policy-based auditing solution that enables you to quickly and seamlessly add compliance to their MySQL applications. MySQL Enterprise Audit meets this requirement by enabling you to: 1. Easily install the needed components. Installation requires an upgrade to MySQL 5.5.28 (Enterprise edition), which can be downloaded from the My Oracle Support portal or the Oracle Software Delivery Cloud. After installation, you simply add the following to your my.cnf file to register and enable the audit plugin: [mysqld] plugin-load=audit_log.so (keep in mind the audit_log suffix is platform dependent, so .dll on Windows, etc.) or alternatively you can load the plugin at runtime: mysql> INSTALL PLUGIN audit_log SONAME 'audit_log.so'; 2. Dynamically enable and disable the audit stream for a specific MySQL server. A new global variable called audit_log_policy allows you to dynamically enable and disable audit stream logging for a specific MySQL server. The variable parameters are described below. 3. Define audit policy based on what needs to be logged (everything, logins, queries, or nothing), by server. The new audit_log_policy variable uses the following valid, descriptively named values to enable, disable audit stream logging and to filter the audit events that are logged to the audit stream: "ALL" - enable audit stream and log all events "LOGINS" - enable audit stream and log only login events "QUERIES" - enable audit stream and log only querie events "NONE" - disable audit stream 4. Manage audit log files using basic MySQL log rotation features. A new global variable, audit_log_rotate_on_size, allows you to automate the rotation and archival of audit stream log files based on size with archived log files renamed and appended with datetime stamp when a new file is opened for logging. 5. Integrate the MySQL audit stream with MySQL, Oracle tools and other third-party solutions. The MySQL audit stream is written as XML, using UFT-8 and can be easily formatted for viewing using a standard XML parser. This enables you to leverage tools from MySQL and others to view the contents. The audit stream was also developed to meet the Oracle database audit stream specification so combined Oracle/MySQL shops can import and manage MySQL audit images using the same Oracle tools they use for their Oracle databases. So assuming a successful MySQL 5.5.28 upgrade or installation, a common set up and use case scenario might look something like this: Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} It should be noted that MySQL Enterprise Audit was designed to be transparent at the application layer by allowing you to control the mix of log output buffering and asynchronous or synchronous disk writes to minimize the associated overhead that comes when the audit stream is enabled. The net result is that, depending on the chosen audit stream log stream options, most application users will see little to no difference in response times when the audit stream is enabled. So what are your next steps? Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Get all of the grainy details on MySQL Enterprise Audit, including all of the additional configuration options from the MySQL documentation. MySQL Enterprise Edition customers can download MySQL 5.5.28 with the Audit extension for production use from the My Oracle Support portal. Everyone can download MySQL 5.5.28 with the Audit extension for evaluation from the Oracle Software Delivery Cloud. Learn more about MySQL Enterprise Edition. As always, thanks for your continued support of MySQL!

    Read the article

  • In GLSL is it possible to offset vertices based on height map colour?

    - by Rob
    I am attempting to generate some terrain based upon a heightmap. I have generated a 32 x 32 grid and a corresponding height map - In my vertex shader I am trying to offset the position of the Y axis based upon the colour of the heightmap, white vertices being higher than black ones. //Vertex Shader Code #version 330 uniform mat4 modelMatrix; uniform mat4 viewMatrix; uniform mat4 projectionMatrix; uniform sampler2D heightmap; layout (location=0) in vec4 vertexPos; layout (location=1) in vec4 vertexColour; layout (location=3) in vec2 vertexTextureCoord; layout (location=4) in float offset; out vec4 fragCol; out vec4 fragPos; out vec2 fragTex; void main() { // Retreive the current pixel's colour vec4 hmColour = texture(heightmap,vertexTextureCoord); // Offset the y position by the value of current texel's colour value ? vec4 offset = vec4(vertexPos.x , vertexPos.y + hmColour.r, vertexPos.z , 1.0); // Final Position gl_Position = projectionMatrix * viewMatrix * modelMatrix * offset; // Data sent to Fragment Shader. fragCol = vertexColour; fragPos = vertexPos; fragTex = vertexTextureCoord; } However the code I have produced only creates a grid with none of the y vertices higher than any others. This is the C++ code that generates the grid and texture co-orientates which I believe to be correct as the texture is mapped to the grid, hence the white blob in the middle. The grid-lines are generated in the fragment shader, sorry for any confusion. I have tried multiplying the r value of hmColour by 1000 unfortunately that had no effect. The only other problem it could be is that the texture coordinate data is incorrect ? for (int z = 0; z < MAP_Z ; z++) { for(int x = 0; x < MAP_X ; x++) { //Generate Vertex Buffer vertexData[iVertex++] = float (x) * MAP_X; vertexData[iVertex++] = 0; vertexData[iVertex++] = -(float) (z) * MAP_Z; //Colour Buffer NOT NEEDED colourData[iColour++] = 255.0f; // R colourData[iColour++] = 1.0f; // G colourData[iColour++] = 0.0f; // B //Texture Buffer textureData[iTexture++] = (float ) x * (1.0f / MAP_X); textureData[iTexture++] = (float ) z * (1.0f / MAP_Z); } } The heightmap texture I am trying to use appears like so (without grid-lines). This is the corresponding fragment shader // Fragment Shader Code #version 330 uniform sampler2D hmTexture; layout (location=0) out vec4 fragColour; in vec2 fragTex; in vec4 pos; void main(void) { vec2 line = fragTex * 32; // Without Gridlines fragColour = texture(hmTexture,fragTex); // With grid lines // + mix(vec4(0.0, 0.0, 1.0, 0.0), vec4(1.0, 1.0, 1.0, 1.0), // smoothstep(0.05,fract(line.y), 0.99) * smoothstep(0.05,fract(line.x),0.99)); }

    Read the article

  • To maximize chances of functional programming employment

    - by Rob Agar
    Given that the future of programming is functional, at some point in the nearish future I want to be paid to code in a functional language, preferably Haskell. Assuming I have a firm grasp of the language, plus all the basic programmer attributes (good communication skills/sense of humour/hygiene etc), what should I concentrate on learning to maximize my chances? Are there any particularly sought after libraries I should know? Alternatively, would another language be a better bet, say F#? (I'm not too fussed about the kind of programming work, so long as it's reasonably interesting and reasonably well paid, and with nice people)

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

  • Suggestions on managing social media accounts

    - by Rob
    As a company we now have Facebook, LinkedIN, Twitter and now Google+, is there a way to easily manage all these accounts without having to log into them individually? Things like posting content to each one is becoming a full time job in itself, is there a way to post once that in turn posts to all other accounts? I used to use http://ping.fm/ a long time ago, has there been any advancements in something similar to this? With friend lists, news feeds etc etc for each one, I wish there was a way to manage them all in one place with a service/tool!

    Read the article

  • Joins in single-table queries

    - by Rob Farley
    Tables are only metadata. They don’t store data. I’ve written something about this before, but I want to take a viewpoint of this idea around the topic of joins, especially since it’s the topic for T-SQL Tuesday this month. Hosted this time by Sebastian Meine (@sqlity), who has a whole series on joins this month. Good for him – it’s a great topic. In that last post I discussed the fact that we write queries against tables, but that the engine turns it into a plan against indexes. My point wasn’t simply that a table is actually just a Clustered Index (or heap, which I consider just a special type of index), but that data access always happens against indexes – never tables – and we should be thinking about the indexes (specifically the non-clustered ones) when we write our queries. I described the scenario of looking up phone numbers, and how it never really occurs to us that there is a master list of phone numbers, because we think in terms of the useful non-clustered indexes that the phone companies provide us, but anyway – that’s not the point of this post. So a table is metadata. It stores information about the names of columns and their data types. Nullability, default values, constraints, triggers – these are all things that define the table, but the data isn’t stored in the table. The data that a table describes is stored in a heap or clustered index, but it goes further than this. All the useful data is going to live in non-clustered indexes. Remember this. It’s important. Stop thinking about tables, and start thinking about indexes. So let’s think about tables as indexes. This applies even in a world created by someone else, who doesn’t have the best indexes in mind for you. I’m sure you don’t need me to explain Covering Index bit – the fact that if you don’t have sufficient columns “included” in your index, your query plan will either have to do a Lookup, or else it’ll give up using your index and use one that does have everything it needs (even if that means scanning it). If you haven’t seen that before, drop me a line and I’ll run through it with you. Or go and read a post I did a long while ago about the maths involved in that decision. So – what I’m going to tell you is that a Lookup is a join. When I run SELECT CustomerID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 285; against the AdventureWorks2012 get the following plan: I’m sure you can see the join. Don’t look in the query, it’s not there. But you should be able to see the join in the plan. It’s an Inner Join, implemented by a Nested Loop. It’s pulling data in from the Index Seek, and joining that to the results of a Key Lookup. It clearly is – the QO wouldn’t call it that if it wasn’t really one. It behaves exactly like any other Nested Loop (Inner Join) operator, pulling rows from one side and putting a request in from the other. You wouldn’t have a problem accepting it as a join if the query were slightly different, such as SELECT sod.OrderQty FROM Sales.SalesOrderHeader AS soh JOIN Sales.SalesOrderDetail as sod on sod.SalesOrderID = soh.SalesOrderID WHERE soh.SalesPersonID = 285; Amazingly similar, of course. This one is an explicit join, the first example was just as much a join, even thought you didn’t actually ask for one. You need to consider this when you’re thinking about your queries. But it gets more interesting. Consider this query: SELECT SalesOrderID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 276 AND CustomerID = 29522; It doesn’t look like there’s a join here either, but look at the plan. That’s not some Lookup in action – that’s a proper Merge Join. The Query Optimizer has worked out that it can get the data it needs by looking in two separate indexes and then doing a Merge Join on the data that it gets. Both indexes used are ordered by the column that’s indexed (one on SalesPersonID, one on CustomerID), and then by the CIX key SalesOrderID. Just like when you seek in the phone book to Farley, the Farleys you have are ordered by FirstName, these seek operations return the data ordered by the next field. This order is SalesOrderID, even though you didn’t explicitly put that column in the index definition. The result is two datasets that are ordered by SalesOrderID, making them very mergeable. Another example is the simple query SELECT CustomerID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 276; This one prefers a Hash Match to a standard lookup even! This isn’t just ordinary index intersection, this is something else again! Just like before, we could imagine it better with two whole tables, but we shouldn’t try to distinguish between joining two tables and joining two indexes. The Query Optimizer can see (using basic maths) that it’s worth doing these particular operations using these two less-than-ideal indexes (because of course, the best indexese would be on both columns – a composite such as (SalesPersonID, CustomerID – and it would have the SalesOrderID column as part of it as the CIX key still). You need to think like this too. Not in terms of excusing single-column indexes like the ones in AdventureWorks2012, but in terms of having a picture about how you’d like your queries to run. If you start to think about what data you need, where it’s coming from, and how it’s going to be used, then you will almost certainly write better queries. …and yes, this would include when you’re dealing with regular joins across multiples, not just against joins within single table queries.

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

< Previous Page | 5 6 7 8 9 10 11 12 13 14 15 16  | Next Page >