Search Results

Search found 1427 results on 58 pages for 'rob reynolds'.

Page 10/58 | < Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17 | Next Page >

24 hours to pass until 24 Hours of PASS

- by Rob Farley

There’s a bunch of stuff going on at the moment in the SQL world, so if you’ve missed this particular piece of news, let me tell you a bit about it. Twice a year, the SQL community puts on its biggest virtual event – 24 Hours of PASS. And the next one is tomorrow – March 21st, 2012. Twenty-four sessions, back-to-back, featuring a selection of some of the best presenters in the SQL world, speakers from all over the world, coming together in an online collaboration that so far has well over thirty thousand registrations across the presentations. Some people are signed up for all 24 sessions, some only one. Traditionally, LiveMeeting has been used as the platform for this event, but this year we’re going with a new platform – IBTalk. It promises big, and we’re hoping it won’t let us down. LiveMeeting has been great, and we thank Microsoft for providing it as a platform for the past few years. However, as the event has grown, we’ve found that a new idea is necessary. Last year a search was done for a new platform, and IBTalk ticked the right boxes. The feedback from the presenters and moderators so far has been overwhelmingly positive, and we’re hoping that this is going to really enhance the user experience. One of my favourite features of the platform is the language side. It provides a pretty good translation service. Users who join a session will see a flag on the left of the screen. If they click it, they can change the language to one of 15 on offer. Picking this changes all the labels on everything. It even translates the text in the Q&A window. What this means is that someone from Brazil can ask their question in Portuguese, and the presenter will see it in English. Then if the answer is typed in English, the questioner will be able to see the answer, also in Portuguese. Or they can switch to English to see it as the answerer typed it. I know there’s always the risk of bad translations going on, but I’ve heard good things about this translation service. But there’s more – IBTalk are providing staff to type up closed captioning live during the event. So if English isn’t your first language, don’t worry! Picking your language will also let you see subtitles in your chosen language. I’m hoping that this event is the start of PASS being able to reach people from all corners of the world. Wouldn’t it be great to find that this event is successful, and that the next 24HOP (later in the year, our Summit Preview event) has just as many non-English speakers tuning in as English speakers? If you haven’t been planning which sessions you’re going to attend, you really should get over to sqlpass.org/24hours and have a look through what’s on offer. There’s some amazing material from some of the industry’s brightest, covering a wide range of topics, from classic SQL areas to the brand new SQL 2012 features. There really should be something for every SQL professional. Check the time zones though – if you’re in the US you might be on Summer time, and an hour closer to GMT than normal. Massive thanks must go to Microsoft, SQL Sentry and Idera for sponsoring this event. Without sponsors we wouldn’t be able to put any of this on. These companies are helping 24HOP continue to grow into an event for the whole world. See you tomorrow! @rob_farley | #24hop | #sqlpass

Read the article
24 hours to pass until 24 Hours of PASS

- by Rob Farley

There’s a bunch of stuff going on at the moment in the SQL world, so if you’ve missed this particular piece of news, let me tell you a bit about it. Twice a year, the SQL community puts on its biggest virtual event – 24 Hours of PASS. And the next one is tomorrow – March 21st, 2012. Twenty-four sessions, back-to-back, featuring a selection of some of the best presenters in the SQL world, speakers from all over the world, coming together in an online collaboration that so far has well over thirty thousand registrations across the presentations. Some people are signed up for all 24 sessions, some only one. Traditionally, LiveMeeting has been used as the platform for this event, but this year we’re going with a new platform – IBTalk. It promises big, and we’re hoping it won’t let us down. LiveMeeting has been great, and we thank Microsoft for providing it as a platform for the past few years. However, as the event has grown, we’ve found that a new idea is necessary. Last year a search was done for a new platform, and IBTalk ticked the right boxes. The feedback from the presenters and moderators so far has been overwhelmingly positive, and we’re hoping that this is going to really enhance the user experience. One of my favourite features of the platform is the language side. It provides a pretty good translation service. Users who join a session will see a flag on the left of the screen. If they click it, they can change the language to one of 15 on offer. Picking this changes all the labels on everything. It even translates the text in the Q&A window. What this means is that someone from Brazil can ask their question in Portuguese, and the presenter will see it in English. Then if the answer is typed in English, the questioner will be able to see the answer, also in Portuguese. Or they can switch to English to see it as the answerer typed it. I know there’s always the risk of bad translations going on, but I’ve heard good things about this translation service. But there’s more – IBTalk are providing staff to type up closed captioning live during the event. So if English isn’t your first language, don’t worry! Picking your language will also let you see subtitles in your chosen language. I’m hoping that this event is the start of PASS being able to reach people from all corners of the world. Wouldn’t it be great to find that this event is successful, and that the next 24HOP (later in the year, our Summit Preview event) has just as many non-English speakers tuning in as English speakers? If you haven’t been planning which sessions you’re going to attend, you really should get over to sqlpass.org/24hours and have a look through what’s on offer. There’s some amazing material from some of the industry’s brightest, covering a wide range of topics, from classic SQL areas to the brand new SQL 2012 features. There really should be something for every SQL professional. Check the time zones though – if you’re in the US you might be on Summer time, and an hour closer to GMT than normal. Massive thanks must go to Microsoft, SQL Sentry and Idera for sponsoring this event. Without sponsors we wouldn’t be able to put any of this on. These companies are helping 24HOP continue to grow into an event for the whole world. See you tomorrow! @rob_farley | #24hop | #sqlpass

Read the article
Using SQL Execution Plans to discover the Swedish alphabet

- by Rob Farley

SQL Server is quite remarkable in a bunch of ways. In this post, I’m using the way that the Query Optimizer handles LIKE to keep it SARGable, the Execution Plans that result, Collations, and PowerShell to come up with the Swedish alphabet. SARGability is the ability to seek for items in an index according to a particular set of criteria. If you don’t have SARGability in play, you need to scan the whole index (or table if you don’t have an index). For example, I can find myself in the phonebook easily, because it’s sorted by LastName and I can find Farley in there by moving to the Fs, and so on. I can’t find everyone in my suburb easily, because the phonebook isn’t sorted that way. I can’t even find people who have six letters in their last name, because also the book is sorted by LastName, it’s not sorted by LEN(LastName). This is all stuff I’ve looked at before, including in the talk I gave at SQLBits in October 2010. If I try to find everyone who’s names start with F, I can do that using a query a bit like: SELECT LastName FROM dbo.PhoneBook WHERE LEFT(LastName,1) = 'F'; Unfortunately, the Query Optimizer doesn’t realise that all the entries that satisfy LEFT(LastName,1) = 'F' will be together, and it has to scan the whole table to find them. But if I write: SELECT LastName FROM dbo.PhoneBook WHERE LastName LIKE 'F%'; then SQL is smart enough to understand this, and performs an Index Seek instead. To see why, I look further into the plan, in particular, the properties of the Index Seek operator. The ToolTip shows me what I’m after: You’ll see that it does a Seek to find any entries that are at least F, but not yet G. There’s an extra Predicate in there (a Residual Predicate if you like), which checks that each LastName is really LIKE F% – I suppose it doesn’t consider that the Seek Predicate is quite enough – but most of the benefit is seen by its working out the Seek Predicate, filtering to just the “at least F but not yet G” section of the data. This got me curious though, particularly about where the G comes from, and whether I could leverage it to create the Swedish alphabet. I know that in the Swedish language, there are three extra letters that appear at the end of the alphabet. One of them is ä that appears in the word Västerås. It turns out that Västerås is quite hard to find in an index when you’re looking it up in a Swedish map. I talked about this briefly in my five-minute talk on Collation from SQLPASS (the one which was slightly less than serious). So by looking at the plan, I can work out what the next letter is in the alphabet of the collation used by the column. In other words, if my alphabet were Swedish, I’d be able to tell what the next letter after F is – just in case it’s not G. It turns out it is… Yes, the Swedish letter after F is G. But I worked this out by using a copy of my PhoneBook table that used the Finnish_Swedish_CI_AI collation. I couldn’t find how the Query Optimizer calculates the G, and my friend Paul White (@SQL_Kiwi) tells me that it’s frustratingly internal to the QO. He’s particularly smart, even if he is from New Zealand. To investigate further, I decided to do some PowerShell, leveraging the Get-SqlPlan function that I blogged about recently (make sure you also have the SqlServerCmdletSnapin100 snap-in added). I started by indicating that I was going to use Finnish_Swedish_CI_AI as my collation of choice, and that I’d start whichever letter cam straight after the number 9. I figure that this is a cheat’s way of guessing the first letter of the alphabet (but it doesn’t actually work in Unicode – luckily I’m using varchar not nvarchar. Actually, there are a few aspects of this code that only work using ASCII, so apologies if you were wanting to apply it to Greek, Japanese, etc). I also initialised my $alphabet variable. $collation = 'Finnish_Swedish_CI_AI'; $firstletter = '9'; $alphabet = ''; Now I created the table for my test. A single field would do, and putting a Clustered Index on it would suffice for the Seeks. Invoke-Sqlcmd -server . -data tempdb -query "create table dbo.collation_test (col varchar(10) collate $collation primary key);" Now I get into the looping. $c = $firstletter; $stillgoing = $true; while ($stillgoing) { I construct the query I want, seeking for entries which start with whatever $c has reached, and get the plan for it: $query = "select col from dbo.collation_test where col like '$($c)%';"; [xml] $pl = get-sqlplan $query "." "tempdb"; At this point, my $pl variable is a scary piece of XML, representing the execution plan. A bit of hunting through it showed me that the EndRange element contained what I was after, and that if it contained NULL, then I was done. $stillgoing = ($pl.ShowPlanXML.BatchSequence.Batch.Statements.StmtSimple.QueryPlan.RelOp.IndexScan.SeekPredicates.SeekPredicateNew.SeekKeys.EndRange -ne $null); Now I could grab the value out of it (which came with apostrophes that needed stripping), and append that to my $alphabet variable. if ($stillgoing) { $c=$pl.ShowPlanXML.BatchSequence.Batch.Statements.StmtSimple.QueryPlan.RelOp.IndexScan.SeekPredicates.SeekPredicateNew.SeekKeys.EndRange.RangeExpressions.ScalarOperator.ScalarString.Replace("'",""); $alphabet += $c; } Finally, finishing the loop, dropping the table, and showing my alphabet! } Invoke-Sqlcmd -server . -data tempdb -query "drop table dbo.collation_test;"; $alphabet; When I run all this, I see that the Swedish alphabet is ABCDEFGHIJKLMNOPQRSTUVXYZÅÄÖ, which matches what I see at Wikipedia. Interesting to see that the letters on the end are still there, even with Case Insensitivity. Turns out they’re not just “letters with accents”, they’re letters in their own right. I’m sure you gave up reading long ago, and really aren’t that fazed about the idea of doing this using PowerShell. I chose PowerShell because I’d already come up with an easy way of grabbing the estimated plan for a query, and PowerShell does allow for easy navigation of XML. I find the most interesting aspect of this as the fact that the Query Optimizer uses the next letter of the alphabet to maintain the SARGability of LIKE. I’m hoping they do something similar for a whole bunch of operations. Oh, and the fact that you know how to find stuff in the IKEA catalogue. Footnote: If you are interested in whether this works in other languages, you might want to consider the following screenshot, which shows that in principle, it should work with Japanese. It might be a bit harder to run this in PowerShell though, as I’m not sure how it translates. In Hiragana, the Japanese alphabet starts ?, ?, ?, ?, ?, ...

Read the article
Plan Operator Tuesday round-up

- by Rob Farley

Eighteen posts for T-SQL Tuesday #43 this month, discussing Plan Operators. I put them together and made the following clickable plan. It’s 1000px wide, so I hope you have a monitor wide enough. Let me explain this plan for you (people’s names are the links to the articles on their blogs – the same links as in the plan above). It was clearly a SELECT statement. Wayne Sheffield (@dbawayne) wrote about that, so we start with a SELECT physical operator, leveraging the logical operator Wayne Sheffield. The SELECT operator calls the Paul White operator, discussed by Jason Brimhall (@sqlrnnr) in his post. The Paul White operator is quite remarkable, and can consume three streams of data. Let’s look at those streams. The first pulls data from a Table Scan – Boris Hristov (@borishristov)’s post – using parallel threads (Bradley Ball – @sqlballs) that pull the data eagerly through a Table Spool (Oliver Asmus – @oliverasmus). A scalar operation is also performed on it, thanks to Jeffrey Verheul (@devjef)’s Compute Scalar operator. The second stream of data applies Evil (I figured that must mean a procedural TVF, but could’ve been anything), courtesy of Jason Strate (@stratesql). It performs this Evil on the merging of parallel streams (Steve Jones – @way0utwest), which suck data out of a Switch (Paul White – @sql_kiwi). This Switch operator is consuming data from up to four lookups, thanks to Kalen Delaney (@sqlqueen), Rick Krueger (@dataogre), Mickey Stuewe (@sqlmickey) and Kathi Kellenberger (@auntkathi). Unfortunately Kathi’s name is a bit long and has been truncated, just like in real plans. The last stream performs a join of two others via a Nested Loop (Matan Yungman – @matanyungman). One pulls data from a Spool (my post – @rob_farley) populated from a Table Scan (Jon Morisi). The other applies a catchall operator (the catchall is because Tamera Clark (@tameraclark) didn’t specify any particular operator, and a catchall is what gets shown when SSMS doesn’t know what to show. Surprisingly, it’s showing the yellow one, which is about cursors. Hopefully that’s not what Tamera planned, but anyway...) to the output from an Index Seek operator (Sebastian Meine – @sqlity). Lastly, I think everyone put in 110% effort, so that’s what all the operators cost. That didn’t leave anything for me, unfortunately, but that’s okay. Also, because he decided to use the Paul White operator, Jason Brimhall gets 0%, and his 110% was given to Paul’s Switch operator post. I hope you’ve enjoyed this T-SQL Tuesday, and have learned something extra about Plan Operators. Keep your eye out for next month’s one by watching the Twitter Hashtag #tsql2sday, and why not contribute a post to the party? Big thanks to Adam Machanic as usual for starting all this. @rob_farley

Read the article
LobsterPot Solutions in the USA

- by Rob Farley

We’re expanding! I’m thrilled to announce that Microsoft Gold Partner LobsterPot Solutions has started another branch appointing the amazing Ted Krueger (5-time SQL MVP awardee) as the US lead. Ted is well-known in the SQL Server world, having written books on indexing, consulting and on being a DBA (not to mention contributing chapters to both MVP Deep Dives books). He is an expert on replication and high availability, and strong in the Business Intelligence space – vast experience which is both broad and deep. Ted is based in the south east corner of Wisconsin, just north of Chicago. He has been a consultant for eons and has helped many clients with their projects and problems, taking the role as both technical lead and consulting lead. He is also tireless in supporting and developing the SQL Server community, presenting at conferences across America, and helping people through his blog, Twitter and more. Despite all this – it’s neither his technical excellence with SQL Server nor his consulting skill that made me want him to lead LobsterPot’s US venture. I wanted Ted because of his values. In the time I’ve known Ted, I’ve found his integrity to be excellent, and found him to be morally beyond reproach. This is the biggest priority I have when finding people to represent the LobsterPot brand. I have no qualms in recommending Ted’s character or work ethic. It’s not just my thoughts on him – all my trusted friends that know Ted agree about this. So last week, LobsterPot Solutions LLC was formed in the United States, and in a couple of weeks, we will be open for business! LobsterPot Solutions can be contacted via email at [email protected], on the web at either www.lobsterpot.com.au or www.lobsterpotsolutions.com, and on Twitter as @lobsterpot_au and @lobsterpot_us. Ted Kruger blogs at LessThanDot, and can also be found on Twitter and LinkedIn. This post is cross-posted from http://lobsterpotsolutions.com/lobsterpot-solutions-in-the-usa

Read the article
Exploding maps in Reporting Services 2008 R2

- by Rob Farley

Kaboom! Well, that was the imagery that secretly appeared in my mind when I saw “USA By State Exploded” in the list of installed maps in Report Builder 3.0 – part of the spatial offering of SQL Server Reporting Server 2008 R2. Alas, it just means that the borders are bigger. Clicking on it showed me. Unfortunately, I’m not interested in maps of the US. None of my clients are there (at least, not yet – feel free to get in touch if you want to change this ‘feature’ of my company). So instead, I’ve recently been getting hold of some data for Australian areas. I’ve just bought some PostCode shapes for South Australia, and will use this in demos for conferences and for showing clients how this kind of report can really impact their reporting. One of the companies I was talking about getting shape files sent me a sample. So I chose the “ESRI shapefile” option you see above, and browsed to my file. It appeared in the window like this: Australians will immediately recognise this as the area around Wollongong, just south of Sydney. Well, apart from me. I didn’t. I had to put a Bing Maps layer behind it to work that out, but that’s not for this post. The thing that I discovered was that if I selected the Exploded USA option (but without clicking Next), and then chose my shape file, then my area around Wollongong would be exploded too! Huh! I think this is actually a bug, but a potentially useful one! Some further investigation (involving creating two identical reports, one with this exploded view, one without), showed that the Exploded View is done by reducing the ScaleFactor property of the PolygonLayer in the map control. The Exploded version has it below 1. If you set to above one, your shapes overlap. I discovered this by accident… I guess I hadn’t looked through all the PolygonLayer options to work out what they all do. And because this post is about Reporting, it can qualify for this month’s T-SQL Tuesday, hosted by Aaron Nelson (@sqlvariant). Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
In GLSL is it possible to offset vertices based on height map colour?

- by Rob

I am attempting to generate some terrain based upon a heightmap. I have generated a 32 x 32 grid and a corresponding height map - In my vertex shader I am trying to offset the position of the Y axis based upon the colour of the heightmap, white vertices being higher than black ones. //Vertex Shader Code #version 330 uniform mat4 modelMatrix; uniform mat4 viewMatrix; uniform mat4 projectionMatrix; uniform sampler2D heightmap; layout (location=0) in vec4 vertexPos; layout (location=1) in vec4 vertexColour; layout (location=3) in vec2 vertexTextureCoord; layout (location=4) in float offset; out vec4 fragCol; out vec4 fragPos; out vec2 fragTex; void main() { // Retreive the current pixel's colour vec4 hmColour = texture(heightmap,vertexTextureCoord); // Offset the y position by the value of current texel's colour value ? vec4 offset = vec4(vertexPos.x , vertexPos.y + hmColour.r, vertexPos.z , 1.0); // Final Position gl_Position = projectionMatrix * viewMatrix * modelMatrix * offset; // Data sent to Fragment Shader. fragCol = vertexColour; fragPos = vertexPos; fragTex = vertexTextureCoord; } However the code I have produced only creates a grid with none of the y vertices higher than any others. This is the C++ code that generates the grid and texture co-orientates which I believe to be correct as the texture is mapped to the grid, hence the white blob in the middle. The grid-lines are generated in the fragment shader, sorry for any confusion. I have tried multiplying the r value of hmColour by 1000 unfortunately that had no effect. The only other problem it could be is that the texture coordinate data is incorrect ? for (int z = 0; z < MAP_Z ; z++) { for(int x = 0; x < MAP_X ; x++) { //Generate Vertex Buffer vertexData[iVertex++] = float (x) * MAP_X; vertexData[iVertex++] = 0; vertexData[iVertex++] = -(float) (z) * MAP_Z; //Colour Buffer NOT NEEDED colourData[iColour++] = 255.0f; // R colourData[iColour++] = 1.0f; // G colourData[iColour++] = 0.0f; // B //Texture Buffer textureData[iTexture++] = (float ) x * (1.0f / MAP_X); textureData[iTexture++] = (float ) z * (1.0f / MAP_Z); } } The heightmap texture I am trying to use appears like so (without grid-lines). This is the corresponding fragment shader // Fragment Shader Code #version 330 uniform sampler2D hmTexture; layout (location=0) out vec4 fragColour; in vec2 fragTex; in vec4 pos; void main(void) { vec2 line = fragTex * 32; // Without Gridlines fragColour = texture(hmTexture,fragTex); // With grid lines // + mix(vec4(0.0, 0.0, 1.0, 0.0), vec4(1.0, 1.0, 1.0, 1.0), // smoothstep(0.05,fract(line.y), 0.99) * smoothstep(0.05,fract(line.x),0.99)); }

Read the article
New in MySQL Enterprise Edition: Policy-based Auditing!

- by Rob Young

Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} For those with an interest in MySQL, this weekend's MySQL Connect conference in San Francisco has gotten off to a great start. On Saturday Tomas announced the feature complete MySQL 5.6 Release Candidate that is now available for Community adoption and testing. This announcement marks the sprint to GA that should be ready for release within the next 90 days. You can get a quick summary of the key 5.6 features here or better yet download the 5.6 RC (under “Development Releases”), review what's new and try it out for yourself! There were also product related announcements around MySQL Cluster 7.3 and MySQL Enterprise Edition . This latter announcement is of particular interest if you are faced with internal and regulatory compliance requirements as it addresses and solves a pain point that is shared by most developers and DBAs; new, out of the box compliance for MySQL applications via policy-based audit logging of user and query level activity. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} One of the most common requests we get for the MySQL roadmap is for quick and easy logging of audit events. This is mainly due to how web-based applications have evolved from nice-to-have enablers to mission-critical revenue generation and the important role MySQL plays in the new dynamic. In today’s virtual marketplace, PCI compliance guidelines ensure credit card data is secure within e-commerce apps; from a corporate standpoint, Sarbanes-Oxely, HIPAA and other regulations guard the medical, financial, public sector and other personal data centric industries. For supporting applications audit policies and controls that monitor the eyes and hands that have viewed and acted upon the most sensitive of data is most commonly implemented on the back-end database. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} With this in mind, MySQL 5.5 introduced an open audit plugin API that enables all MySQL users to write their own auditing plugins based on application specific requirements. While the supporting docs are very complete and provide working code samples, writing an audit plugin requires time and low-level expertise to develop, test, implement and maintain. To help those who don't have the time and/or expertise to develop such a plugin, Oracle now ships MySQL 5.5.28 and higher with an easy to use, out-of-the-box auditing solution; MySQL Enterprise Audit. MySQL Enterprise Audit The premise behind MySQL Enterprise Audit is simple; we wanted to provide an easy to use, policy-based auditing solution that enables you to quickly and seamlessly add compliance to their MySQL applications. MySQL Enterprise Audit meets this requirement by enabling you to: 1. Easily install the needed components. Installation requires an upgrade to MySQL 5.5.28 (Enterprise edition), which can be downloaded from the My Oracle Support portal or the Oracle Software Delivery Cloud. After installation, you simply add the following to your my.cnf file to register and enable the audit plugin: [mysqld] plugin-load=audit_log.so (keep in mind the audit_log suffix is platform dependent, so .dll on Windows, etc.) or alternatively you can load the plugin at runtime: mysql> INSTALL PLUGIN audit_log SONAME 'audit_log.so'; 2. Dynamically enable and disable the audit stream for a specific MySQL server. A new global variable called audit_log_policy allows you to dynamically enable and disable audit stream logging for a specific MySQL server. The variable parameters are described below. 3. Define audit policy based on what needs to be logged (everything, logins, queries, or nothing), by server. The new audit_log_policy variable uses the following valid, descriptively named values to enable, disable audit stream logging and to filter the audit events that are logged to the audit stream: "ALL" - enable audit stream and log all events "LOGINS" - enable audit stream and log only login events "QUERIES" - enable audit stream and log only querie events "NONE" - disable audit stream 4. Manage audit log files using basic MySQL log rotation features. A new global variable, audit_log_rotate_on_size, allows you to automate the rotation and archival of audit stream log files based on size with archived log files renamed and appended with datetime stamp when a new file is opened for logging. 5. Integrate the MySQL audit stream with MySQL, Oracle tools and other third-party solutions. The MySQL audit stream is written as XML, using UFT-8 and can be easily formatted for viewing using a standard XML parser. This enables you to leverage tools from MySQL and others to view the contents. The audit stream was also developed to meet the Oracle database audit stream specification so combined Oracle/MySQL shops can import and manage MySQL audit images using the same Oracle tools they use for their Oracle databases. So assuming a successful MySQL 5.5.28 upgrade or installation, a common set up and use case scenario might look something like this: Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} It should be noted that MySQL Enterprise Audit was designed to be transparent at the application layer by allowing you to control the mix of log output buffering and asynchronous or synchronous disk writes to minimize the associated overhead that comes when the audit stream is enabled. The net result is that, depending on the chosen audit stream log stream options, most application users will see little to no difference in response times when the audit stream is enabled. So what are your next steps? Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Get all of the grainy details on MySQL Enterprise Audit, including all of the additional configuration options from the MySQL documentation. MySQL Enterprise Edition customers can download MySQL 5.5.28 with the Audit extension for production use from the My Oracle Support portal. Everyone can download MySQL 5.5.28 with the Audit extension for evaluation from the Oracle Software Delivery Cloud. Learn more about MySQL Enterprise Edition. As always, thanks for your continued support of MySQL!

Read the article
Install of AppFabric RC stops AppFabric Monitoring (some traps for young players)

- by Rob Addis

I uninstalled AppFabric Beta 2 and installed AppFabric RC. The AppFabricEventCollection Service is started (runs under Local Service which is a dbo_owner on the Monitoring Database to prove this wasn’t the issue). The SQLServerAgent Service is started. Nothing is being written to the Monitoring DB Staging Table and thus nothing is being written to the Event tables or seen in the AppFabric Dashboard. Nothing has been written to the following event logs - Microsoft-Windows-Application Server-System Services\Admin - Microsoft-Windows-Application Server-System Services\Operational The Microsoft-Windows-Application Server-System Services\Debug event log is not shown in the event viewer. The WCF configuration appears fine the connection string to the Monitoring DB is correct. Monitoring is set to “Trouble Shooting” and no errors are shown on the “Configure WCF and WF for Application” dialog. So the problem seems to lie with either AppFabric which writes to the event log or the AppFabricEventCollection Service. I thought I was flummoxed... However one of my colleagues said have you checked the etwProviderId? I was using a config which was created under AppFabric Beta 2 which had a different etwProviderId. So I deleted the following section and all other references to AppFabric monitoring from the web.config and then recreated them using IIS the “Configure WCF and WF for Application” dialog and set the level to TroubleShooting. <diagnostics etwProviderId="6b44a7ff-9db4-4723-b8cf-1b584bf1591b"> <endToEndTracing propagateActivity="true" messageFlowTracing="true" /> </diagnostics> I then called a service to create some log entries. Still nothing was written to the Monitoring DB Staging Table... I checked the Microsoft-Windows-Application Server-System Services\Admin event log. It had the following entry... Configuration error. Please see the details to correct the problem. \rDetailed information:\r Filename: \\?\C:\Users\xxx\Documents\dotnetdev\Frameworks\SOA\xxx.SOA.Framework\xxx.SOA.Framework.MockServices\SimpleServiceParent\web.config Error: Cannot read configuration file due to insufficient permissions System.UnauthorizedAccessException: Filename: \\?\C:\Users\xxx\Documents\dotnetdev\Frameworks\SOA\xxx.SOA.Framework\IAG.SOA.Framework.MockServices\SimpleServiceParent\web.config Error: Cannot read configuration file due to insufficient permissions And guess who the user was... Local Service yes yes I should have used a better User in the AppFabric RC setup to run the AppFabricEventCollection Service under! So I changed the user to a more appropriate one and removed Local Service as a DBO and hay presto!

Read the article
Summit Old, Summit New, Summit Borrowed...

- by Rob Farley

PASS Summit is coming up, and I thought I’d post a few things. Summit Old... At the PASS Summit, you will get the chance to hear presentations by the SQL Server establishment. Just about every big name in the SQL Server world is a regular at the PASS Summit, so you will get to hear and meet people like Kalen Delaney (@sqlqueen) (who just recently got awarded MVP status for the 20th year running), and from all around the world such as the UK’s Chris Webb (@technitrain) or Pinal Dave (@pinaldave) from India. Almost all the household names in SQL Server will be there, including a large contingent from Microsoft. The PASS Summit is by far the best place to meet the legends of SQL Server. And they’re not all old. Some are, but most of them are younger than you might think. ...Summit New... The hottest topics are often about the newest technologies (such as SQL Server 2012). But you will almost certainly learn new stuff about older versions too. But that’s not what I wanted to pick on for this point. There are many new speakers at every PASS Summit, and content that has not been covered in other places. This year, for example, LobsterPot’s Roger Noble (@roger_noble) is giving a presentation for the first time. He’s a regular around the Australian circuit, but this is his first time presenting to a US audience. New Zealand’s Paul White (@sql_kiwi) is attending his first PASS Summit, and will be giving over four hours of incredibly deep stuff that has never been presented anywhere in the US before (I can’t say the world, because he did present similar material in Adelaide earlier in the year). ...Summit Borrowed... No, I’m not talking about plagiarism – the talks you’ll hear are all their own work. But you will get a lot of stuff you’ll be able to take back and apply at work. The PASS Summit sessions are not full of sales-pitches, telling you about how great things could be if only you’d buy some third-party vendor product. It’s simply not that kind of conference, and PASS doesn’t allow that kind of talk to take place. Instead, you’ll be taught techniques, and be able to download scripts and slides to let you perform that magic back at work when you get home. You will definitely find plenty of ideas to borrow at the PASS Summit. ...Summit Blue Yeah – and there’s karaoke. Blue - Jason - SQL Karaoke - YouTube

Read the article
To maximize chances of functional programming employment

- by Rob Agar

Given that the future of programming is functional, at some point in the nearish future I want to be paid to code in a functional language, preferably Haskell. Assuming I have a firm grasp of the language, plus all the basic programmer attributes (good communication skills/sense of humour/hygiene etc), what should I concentrate on learning to maximize my chances? Are there any particularly sought after libraries I should know? Alternatively, would another language be a better bet, say F#? (I'm not too fussed about the kind of programming work, so long as it's reasonably interesting and reasonably well paid, and with nice people)

Read the article
The blocking nature of aggregates

- by Rob Farley

I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here. In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

Read the article
Joins in single-table queries

- by Rob Farley

Tables are only metadata. They don’t store data. I’ve written something about this before, but I want to take a viewpoint of this idea around the topic of joins, especially since it’s the topic for T-SQL Tuesday this month. Hosted this time by Sebastian Meine (@sqlity), who has a whole series on joins this month. Good for him – it’s a great topic. In that last post I discussed the fact that we write queries against tables, but that the engine turns it into a plan against indexes. My point wasn’t simply that a table is actually just a Clustered Index (or heap, which I consider just a special type of index), but that data access always happens against indexes – never tables – and we should be thinking about the indexes (specifically the non-clustered ones) when we write our queries. I described the scenario of looking up phone numbers, and how it never really occurs to us that there is a master list of phone numbers, because we think in terms of the useful non-clustered indexes that the phone companies provide us, but anyway – that’s not the point of this post. So a table is metadata. It stores information about the names of columns and their data types. Nullability, default values, constraints, triggers – these are all things that define the table, but the data isn’t stored in the table. The data that a table describes is stored in a heap or clustered index, but it goes further than this. All the useful data is going to live in non-clustered indexes. Remember this. It’s important. Stop thinking about tables, and start thinking about indexes. So let’s think about tables as indexes. This applies even in a world created by someone else, who doesn’t have the best indexes in mind for you. I’m sure you don’t need me to explain Covering Index bit – the fact that if you don’t have sufficient columns “included” in your index, your query plan will either have to do a Lookup, or else it’ll give up using your index and use one that does have everything it needs (even if that means scanning it). If you haven’t seen that before, drop me a line and I’ll run through it with you. Or go and read a post I did a long while ago about the maths involved in that decision. So – what I’m going to tell you is that a Lookup is a join. When I run SELECT CustomerID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 285; against the AdventureWorks2012 get the following plan: I’m sure you can see the join. Don’t look in the query, it’s not there. But you should be able to see the join in the plan. It’s an Inner Join, implemented by a Nested Loop. It’s pulling data in from the Index Seek, and joining that to the results of a Key Lookup. It clearly is – the QO wouldn’t call it that if it wasn’t really one. It behaves exactly like any other Nested Loop (Inner Join) operator, pulling rows from one side and putting a request in from the other. You wouldn’t have a problem accepting it as a join if the query were slightly different, such as SELECT sod.OrderQty FROM Sales.SalesOrderHeader AS soh JOIN Sales.SalesOrderDetail as sod on sod.SalesOrderID = soh.SalesOrderID WHERE soh.SalesPersonID = 285; Amazingly similar, of course. This one is an explicit join, the first example was just as much a join, even thought you didn’t actually ask for one. You need to consider this when you’re thinking about your queries. But it gets more interesting. Consider this query: SELECT SalesOrderID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 276 AND CustomerID = 29522; It doesn’t look like there’s a join here either, but look at the plan. That’s not some Lookup in action – that’s a proper Merge Join. The Query Optimizer has worked out that it can get the data it needs by looking in two separate indexes and then doing a Merge Join on the data that it gets. Both indexes used are ordered by the column that’s indexed (one on SalesPersonID, one on CustomerID), and then by the CIX key SalesOrderID. Just like when you seek in the phone book to Farley, the Farleys you have are ordered by FirstName, these seek operations return the data ordered by the next field. This order is SalesOrderID, even though you didn’t explicitly put that column in the index definition. The result is two datasets that are ordered by SalesOrderID, making them very mergeable. Another example is the simple query SELECT CustomerID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 276; This one prefers a Hash Match to a standard lookup even! This isn’t just ordinary index intersection, this is something else again! Just like before, we could imagine it better with two whole tables, but we shouldn’t try to distinguish between joining two tables and joining two indexes. The Query Optimizer can see (using basic maths) that it’s worth doing these particular operations using these two less-than-ideal indexes (because of course, the best indexese would be on both columns – a composite such as (SalesPersonID, CustomerID – and it would have the SalesOrderID column as part of it as the CIX key still). You need to think like this too. Not in terms of excusing single-column indexes like the ones in AdventureWorks2012, but in terms of having a picture about how you’d like your queries to run. If you start to think about what data you need, where it’s coming from, and how it’s going to be used, then you will almost certainly write better queries. …and yes, this would include when you’re dealing with regular joins across multiples, not just against joins within single table queries.

Read the article
The blocking nature of aggregates

- by Rob Farley

I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here. In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

Read the article
Visually stunning maps and PivotViewer

- by Rob Farley

One of the things about PivotViewer is that it runs in the Silverlight platform and can be extended recently. One of my guys at LobsterPot, Roger Noble, has used this to incorporate a Bing Maps layer, showing items which have Latitude and Longitude values there. We’re already talking to a hospital about using this to allow them to browse their patient data, including showing the patients on a map according to which bed they’re in. Interesting times – this will involve having custom tiles instead of the ones from Bing Maps, but the idea is similar. Of course, we’ll be using Bing Maps to show where the patients live. I should also mention that this is a work-in-progress still. Figuring out how to use PivotViewer isn’t trivial, and we’ve done quite a lot of experimenting to see how to get things working. If you find bugs, please feel free to let me know (rob_farley at hotmail will usually reach me), and we’ll add them to our list. Here are some screenshots that I made recently using the collection at http://pivot.lobsterpot.com.au/flickr – by selecting a tag, you can get a new bunch of images. A couple of images that were taken in Iceland. Some from St Mary’s Lighthouse near Newcastle, UK. And some from around Big Ben in London. I’d recommend using either Firefox or Internet Explorer if you choose to browse this yourself. It seems the Chrome browser support for Silverlight doesn’t quite handle things as nicely as we’d all like. I imagine that at some point, we may enhance the Flickr collection, to be able to search on more than tags, but as a sample collection, it seems to work quite well.

Read the article
What's New in 5.6 RC and more from MySQL Connect conference

- by Rob Young

Keeping with the tradition of great MySQL Community events, the first annual MySQL Connect conference is now in the books. It was great to see so many familiar faces in the crowd and at the podium sharing their ideas and thoughts on the evolution of MySQL under Oracle. The headliner of the conference was Tomas' keynote announcement of the fully featured and fully enabled MySQL 5.6 Release Candidate. This new article on the MySQL DevZone summarizes all of the great new features ready for Community adoption, all MySQL Engineering blogs and where and how to download all of the bits. As always, early adoption and feedback on the 5.6 RC is appreciated and the sooner we get your feedback the sooner we release the "ready for production" sanctioned GA product. Also available now, Cluster 7.3 provides support for Foreign Keys, node.js NoSQL access to underlying data and a new Auto Installer that helps you quickly and easily get up and running with Cluster 7.2 and 7.3. The 7.3 downloads are provided in the first 7.3 Development Milestone Release (under "Development Releases" tab) and via the MySQL Labs. Oracle also announced key new additions to MySQL Enterprise Edition: New policy-based compliance Auditing. MySQL Enterprise Edition Audit adds policy-based auditing compliance to existing MySQL applications without the need to change any code. This new plugin is available for MySQL 5.5.28 and higher; existing MySQL Enterprise Edition customers can download the upgrade from the My Oracle Support portal and all can download for evaluation from Oracle's Software Delivery Cloud. New MySQL Enterprise High Available additions provide even more options for ensuring MySQL applications remain available and running a their peak: Oracle Linux + DRBD Oracle Solaris Clustering for MySQL All in all, the first MySQL Connect conference was a great success and with refinements planned in response to attendee, sponsor and speaker feedback we expect it to grow and improve going forward. As always, thanks for your continued support of MySQL!

Read the article
Database Mirroring on SQL Server Express Edition

- by Most Valuable Yak (Rob Volk)

Like most SQL Server users I'm rather frustrated by Microsoft's insistence on making the really cool features only available in Enterprise Edition. And it really doesn't help that they changed the licensing for SQL 2012 to be core-based, so now it's like 4 times as expensive! It almost makes you want to go with Oracle. That, and a desire to have Larry Ellison do things to your orifices. And since they've introduced Availability Groups, and marked database mirroring as deprecated, you'd think they'd make make mirroring available in all editions. Alas…they don't…officially anyway. Thanks to my constant poking around in places I'm not "supposed" to, I've discovered the low-level code that implements database mirroring, and found that it's available in all editions! It turns out that the query processor in all SQL Server editions prepends a simple check before every edition-specific DDL statement: IF CAST(SERVERPROPERTY('Edition') as nvarchar(max)) NOT LIKE '%e%e%e% Edition%' print 'Lame' else print 'Cool' If that statement returns true, it fails. (the print statements are just placeholders) Go ahead and test it on Standard, Workgroup, and Express editions compared to an Enterprise or Developer edition instance (which support everything). Once again thanks to Argenis Fernandez (b | t) and his awesome sessions on using Sysinternals, I was able to watch the exact process SQL Server performs when setting up a mirror. Surprisingly, it's not actually implemented in SQL Server! Some of it is, but that's something of a smokescreen, the real meat of it is simple filesystem primitives. The NTFS filesystem supports links, both hard links and symbolic, so that you can create two entries for the same file in different directories and/or different names. You can create them using the MKLINK command in a command prompt: mklink /D D:\SkyDrive\Data D:\Data mklink /D D:\SkyDrive\Log D:\Log This creates a symbolic link from my data and log folders to my Skydrive folder. Any file saved in either location will instantly appear in the other. And since my Skydrive will be automatically synchronized with the cloud, any changes I make will be copied instantly (depending on my internet bandwidth of course). So what does this have to do with database mirroring? Well, it seems that the mirroring endpoint that you have to create between mirror and principal servers is really nothing more than a Skydrive link. Although it doesn't actually use Skydrive, it performs the same function. So in effect, the following statement: ALTER DATABASE Mir SET PARTNER='TCP://MyOtherServer.domain.com:5022' Is turned into: mklink /D "D:\Data" "\\MyOtherServer.domain.com\5022$" The 5022$ "port" is actually a hidden system directory on the principal and mirror servers. I haven't quite figured out how the log files are included in this, or why you have to SET PARTNER on both principal and mirror servers, except maybe that mklink has to do something special when linking across servers. I couldn't get the above statement to work correctly, but found that doing mklink to a local Skydrive folder gave me similar functionality. To wrap this up, all you have to do is the following: Install Skydrive on both SQL Servers (principal and mirror) and set the local Skydrive folder (D:\SkyDrive in these examples) On the principal server, run mklink /D on the data and log folders to point to SkyDrive: mklink /D D:\SkyDrive\Data D:\Data On the mirror server, run the complementary linking: mklink /D D:\Data D:\SkyDrive\Data Create your database and make sure the files map to the principal data and log folders (D:\Data and D:\Log) Viola! Your databases are kept in sync on multiple servers! One wrinkle you will encounter is that the mirror server will show the data and log files, but you won't be able to attach them to the mirror SQL instance while they are attached to the principal. I think this is a bug in the Skydrive, but as it turns out that's fine: you can't access a mirror while it's hosted on the principal either. So you don't quite get automatic failover, but you can attach the files to the mirror if the principal goes offline. It's also not exactly synchronous, but it's better than nothing, and easier than either replication or log shipping with a lot less latency. I will end this with the obvious "not supported by Microsoft" and "Don't do this in production without an updated resume" spiel that you should by now assume with every one of my blog posts, especially considering the date.

Read the article
T-SQL Tuesday #31 - Logging Tricks with CONTEXT_INFO

- by Most Valuable Yak (Rob Volk)

This month's T-SQL Tuesday is being hosted by Aaron Nelson [b | t], fellow Atlantan (the city in Georgia, not the famous sunken city, or the resort in the Bahamas) and covers the topic of logging (the recording of information, not the harvesting of trees) and maintains the fine T-SQL Tuesday tradition begun by Adam Machanic [b | t] (the SQL Server guru, not the guy who fixes cars, check the spelling again, there will be a quiz later). This is a trick I learned from Fernando Guerrero [b | t] waaaaaay back during the PASS Summit 2004 in sunny, hurricane-infested Orlando, during his session on Secret SQL Server (not sure if that's the correct title, and I haven't used parentheses in this paragraph yet). CONTEXT_INFO is a neat little feature that's existed since SQL Server 2000 and perhaps even earlier. It lets you assign data to the current session/connection, and maintains that data until you disconnect or change it. In addition to the CONTEXT_INFO() function, you can also query the context_info column in sys.dm_exec_sessions, or even sysprocesses if you're still running SQL Server 2000, if you need to see it for another session. While you're limited to 128 bytes, one big advantage that CONTEXT_INFO has is that it's independent of any transactions. If you've ever logged to a table in a transaction and then lost messages when it rolled back, you can understand how aggravating it can be. CONTEXT_INFO also survives across multiple SQL batches (GO separators) in the same connection, so for those of you who were going to suggest "just log to a table variable, they don't get rolled back": HA-HA, I GOT YOU! Since GO starts a new batch all variable declarations are lost. Here's a simple example I recently used at work. I had to test database mirroring configurations for disaster recovery scenarios and measure the network throughput. I also needed to log how long it took for the script to run and include the mirror settings for the database in question. I decided to use AdventureWorks as my database model, and Adam Machanic's Big Adventure script to provide a fairly large workload that's repeatable and easily scalable. My test would consist of several copies of AdventureWorks running the Big Adventure script while I mirrored the databases (or not). Since Adam's script contains several batches, I decided CONTEXT_INFO would have to be used. As it turns out, I only needed to grab the start time at the beginning, I could get the rest of the data at the end of the process. The code is pretty small: declare @time binary(128)=cast(getdate() as binary(8)) set context_info @time ... rest of Big Adventure code ... go use master; insert mirror_test(server,role,partner,db,state,safety,start,duration) select @@servername, mirroring_role_desc, mirroring_partner_instance, db_name(database_id), mirroring_state_desc, mirroring_safety_level_desc, cast(cast(context_info() as binary(8)) as datetime), datediff(s,cast(cast(context_info() as binary(8)) as datetime),getdate()) from sys.database_mirroring where db_name(database_id) like 'Adv%'; I declared @time as a binary(128) since CONTEXT_INFO is defined that way. I couldn't convert GETDATE() to binary(128) as it would pad the first 120 bytes as 0x00. To keep the CAST functions simple and avoid using SUBSTRING, I decided to CAST GETDATE() as binary(8) and let SQL Server do the implicit conversion. It's not the safest way perhaps, but it works on my machine. :) As I mentioned earlier, you can query system views for sessions and get their CONTEXT_INFO. With a little boilerplate code this can be used to monitor long-running procedures, in case you need to kill a process, or are just curious how long certain parts take. In this example, I added code to Adam's Big Adventure script to set CONTEXT_INFO messages at strategic places I want to monitor. (His code is in UPPERCASE as it was in the original, mine is all lowercase): declare @msg binary(128) set @msg=cast('Altering bigProduct.ProductID' as binary(128)) set context_info @msg go ALTER TABLE bigProduct ALTER COLUMN ProductID INT NOT NULL GO set context_info 0x0 go declare @msg1 binary(128) set @msg1=cast('Adding pk_bigProduct Constraint' as binary(128)) set context_info @msg1 go ALTER TABLE bigProduct ADD CONSTRAINT pk_bigProduct PRIMARY KEY (ProductID) GO set context_info 0x0 go declare @msg2 binary(128) set @msg2=cast('Altering bigTransactionHistory.TransactionID' as binary(128)) set context_info @msg2 go ALTER TABLE bigTransactionHistory ALTER COLUMN TransactionID INT NOT NULL GO set context_info 0x0 go declare @msg3 binary(128) set @msg3=cast('Adding pk_bigTransactionHistory Constraint' as binary(128)) set context_info @msg3 go ALTER TABLE bigTransactionHistory ADD CONSTRAINT pk_bigTransactionHistory PRIMARY KEY NONCLUSTERED(TransactionID) GO set context_info 0x0 go declare @msg4 binary(128) set @msg4=cast('Creating IX_ProductId_TransactionDate Index' as binary(128)) set context_info @msg4 go CREATE NONCLUSTERED INDEX IX_ProductId_TransactionDate ON bigTransactionHistory(ProductId,TransactionDate) INCLUDE(Quantity,ActualCost) GO set context_info 0x0 This doesn't include the entire script, only those portions that altered a table or created an index. One annoyance is that SET CONTEXT_INFO requires a literal or variable, you can't use an expression. And since GO starts a new batch I need to declare a variable in each one. And of course I have to use CAST because it won't implicitly convert varchar to binary. And even though context_info is a nullable column, you can't SET CONTEXT_INFO NULL, so I have to use SET CONTEXT_INFO 0x0 to clear the message after the statement completes. And if you're thinking of turning this into a UDF, you can't, although a stored procedure would work. So what does all this aggravation get you? As the code runs, if I want to see which stage the session is at, I can run the following (assuming SPID 51 is the one I want): select CAST(context_info as varchar(128)) from sys.dm_exec_sessions where session_id=51 Since SQL Server 2005 introduced the new system and dynamic management views (DMVs) there's not as much need for tagging a session with these kinds of messages. You can get the session start time and currently executing statement from them, and neatly presented if you use Adam's sp_whoisactive utility (and you absolutely should be using it). Of course you can always use xp_cmdshell, a CLR function, or some other tricks to log information outside of a SQL transaction. All the same, I've used this trick to monitor long-running reports at a previous job, and I still think CONTEXT_INFO is a great feature, especially if you're still using SQL Server 2000 or want to supplement your instrumentation. If you'd like an exercise, consider adding the system time to the messages in the last example, and an automated job to query and parse it from the system tables. That would let you track how long each statement ran without having to run Profiler. #TSQL2sDay

Read the article
Suggestions on managing social media accounts

- by Rob

As a company we now have Facebook, LinkedIN, Twitter and now Google+, is there a way to easily manage all these accounts without having to log into them individually? Things like posting content to each one is becoming a full time job in itself, is there a way to post once that in turn posts to all other accounts? I used to use http://ping.fm/ a long time ago, has there been any advancements in something similar to this? With friend lists, news feeds etc etc for each one, I wish there was a way to manage them all in one place with a service/tool!

Read the article
Top 5 Developer Enabling Nuggets in MySQL 5.6

- by Rob Young

MySQL 5.6 is truly a better MySQL and reflects Oracle's commitment to the evolution of the most popular and widelyused open source database on the planet. The feature-complete 5.6 release candidate was announced at MySQL Connect in late September and the production-ready, generally available ("GA") product should be available in early 2013. While the message around 5.6 has been focused mainly on mass appeal, advanced topics like performance/scale, high availability, and self-healing replication clusters, MySQL 5.6 also provides many developer-friendly nuggets that are designed to enable those who are building the next generation of web-based and embedded applications and services. Boiling down the 5.6 feature set into a smaller set, of simple, easy to use goodies designed with developer agility in mind, these things deserve a quick look:Subquery Optimizations Using semi-JOINs and late materialization, the MySQL 5.6 Optimizer delivers greatly improved subquery performance. Specifically, the optimizer is now more efficient in handling subqueries in the FROM clause; materialization of subqueries in the FROM clause is now postponed until their contents are needed during execution. Additionally, the optimizer may add an index to derived tables during execution to speed up row retrieval. Internal tests run using the DBT-3 benchmark Query #13, shown below, demonstrate an order of magnitude improvement in execution times (from days to seconds) over previous versions. select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)from customer, orders, lineitemwhere o_orderkey in ( select l_orderkey from lineitem group by l_orderkey having sum(l_quantity) > 313 ) and c_custkey = o_custkey and o_orderkey = l_orderkeygroup by c_name, c_custkey, o_orderkey, o_orderdate, o_totalpriceorder by o_totalprice desc, o_orderdateLIMIT 100;What does this mean for developers? For starters, simplified subqueries can now be coded instead of complex joins for cross table lookups: SELECT title FROM film WHERE film_id IN (SELECT film_id FROM film_actor GROUP BY film_id HAVING count(*) > 12); And even more importantly subqueries embedded in packaged applications no longer need to be re-written into joins. This is good news for both ISVs and their customers who have access to the underlying queries and who have spent development cycles writing, testing and maintaining their own versions of re-written queries across updated versions of a packaged app.The details are in the MySQL 5.6 docs. Online DDL OperationsToday's web-based applications are designed to rapidly evolve and adapt to meet business and revenue-generationrequirements. As a result, development SLAs are now most often measured in minutes vs days or weeks. For example, when an application must quickly support new product lines or new products within existing product lines, the backend database schema must adapt in kind, and most commonly while the application remains available for normal business operations. MySQL 5.6 supports this level of online schema flexibility and agility by providing the following new ALTER TABLE online DDL syntax additions: CREATE INDEX DROP INDEX Change AUTO_INCREMENT value for a column ADD/DROP FOREIGN KEY Rename COLUMN Change ROW FORMAT, KEY_BLOCK_SIZE for a table Change COLUMN NULL, NOT_NULL Add, drop, reorder COLUMN Again, the details are in the MySQL 5.6 docs. Key-value access to InnoDB via Memcached APIMany of the next generation of web, cloud, social and mobile applications require fast operations against simple Key/Value pairs. At the same time, they must retain the ability to run complex queries against the same data, as well as ensure the data is protected with ACID guarantees. With the new NoSQL API for InnoDB, developers have allthe benefits of a transactional RDBMS, coupled with the performance capabilities of Key/Value store.MySQL 5.6 provides simple, key-value interaction with InnoDB data via the familiar Memcached API. Implemented via a new Memcached daemon plug-in to mysqld, the new Memcached protocol is mapped directly to the native InnoDB API and enables developers to use existing Memcached clients to bypass the expense of query parsing and go directly to InnoDB data for lookups and transactional compliant updates. The API makes it possible to re-use standard Memcached libraries and clients, while extending Memcached functionality by integrating a persistent, crash-safe, transactional database back-end. The implementation is shown here:So does this option provide a performance benefit over SQL? Internal performance benchmarks using a customized Java application and test harness show some very promising results with a 9X improvement in overall throughput for SET/INSERT operations:You can follow the InnoDB team blog for the methodology, implementation and internal test cases that generated these results here. How to get started with Memcached API to InnoDB is here. New Instrumentation in Performance SchemaThe MySQL Performance Schema was introduced in MySQL 5.5 and is designed to provide point in time metrics for key performance indicators. MySQL 5.6 improves the Performance Schema in answer to the most common DBA and Developer problems. New instrumentations include: Statements/Stages What are my most resource intensive queries? Where do they spend time? Table/Index I/O, Table Locks Which application tables/indexes cause the most load or contention? Users/Hosts/Accounts Which application users, hosts, accounts are consuming the most resources? Network I/O What is the network load like? How long do sessions idle? Summaries Aggregated statistics grouped by statement, thread, user, host, account or object. The MySQL 5.6 Performance Schema is now enabled by default in the my.cnf file with optimized and auto-tune settings that minimize overhead (< 5%, but mileage will vary), so using the Performance Schema ona production server to monitor the most common application use cases is less of an issue. In addition, new atomic levels of instrumentation enable the capture of granular levels of resource consumption by users, hosts, accounts, applications, etc. for billing and chargeback purposes in cloud computing environments.The MySQL docs are an excellent resource for all that is available and that can be done with the 5.6 Performance Schema. Better Condition Handling - GET DIAGNOSTICSMySQL 5.6 enables developers to easily check for error conditions and code for exceptions by introducing the new MySQL Diagnostics Area and corresponding GET DIAGNOSTICS interface command. The Diagnostic Area can be populated via multiple options and provides 2 kinds of information:Statement - which provides affected row count and number of conditions that occurredCondition - which provides error codes and messages for all conditions that were returned by a previous operation The addressable items for each are: The new GET DIAGNOSTICS command provides a standard interface into the Diagnostics Area and can be used via the CLI or from within application code to easily retrieve and handle the results of the most recent statement execution. An example of how it is used might be:mysql> DROP TABLE test.no_such_table; ERROR 1051 (42S02): Unknown table 'test.no_such_table' mysql> GET DIAGNOSTICS CONDITION 1 -> @p1 = RETURNED_SQLSTATE, @p2 = MESSAGE_TEXT; mysql> SELECT @p1, @p2; +-------+------------------------------------+| @p1 | @p2 | +-------+------------------------------------+| 42S02 | Unknown table 'test.no_such_table' | +-------+------------------------------------+ Options for leveraging the MySQL Diagnotics Area and GET DIAGNOSTICS are detailed in the MySQL Docs.While the above is a summary of some of the key developer enabling 5.6 features, it is by no means exhaustive. You can dig deeper into what MySQL 5.6 has to offer by reading this developer zone article or checking out "What's New in MySQL 5.6" in the MySQL docs.BONUS ALERT! If you are developing on Windows or are considering MySQL as an alternative to SQL Server for your next project, application or shipping product, you should check out the MySQL Installer for Windows. The installer includes the MySQL 5.6 RC database, all drivers, Visual Studio and Excel plugins, tray monitor and development tools all a single download and GUI installer. So what are your next steps? Register for Dec. 13 "MySQL 5.6: Building the Next Generation of Web-Based Applications and Services" live web event. Hurry! Seats are limited. Download the MySQL 5.6 Release Candidate (look under the Development Releases tab) Provide Feedback <link to http://bugs.mysql.com/> Join the Developer discussion on the MySQL Forums Explore all MySQL Products and Developer Tools As always, thanks for your continued support of MySQL!

Read the article
I spoke at SQL Saturday #77 and all I got was this really awesome speaker's shirt!

- by Most Valuable Yak (Rob Volk)

Yeah, it was 2 weeks ago, but I'm finally blogging about something! I presented Revenge: The SQL! at SQL Saturday #77 in Pensacola on June 4. The session abstract is here, and you can download the slides from that page too. You can see how I look in the speaker's shirt here. Overall it went pretty well. I discovered a new bit of evil just that morning and in a carefully considered, agonizing decision-making process that was full documented, tested, and approved…nah, I just went ahead and added it at the last minute. Which worked out even better than (not) planned, since it screwed me up a bit and made my point perfectly. I had a few fans in the audience, and one of them recorded it for blackmail material posterity. I'd like to thank Karla Landrum (blog | twitter) and all the volunteers for putting together such a great event, and for being kind enough to let me present. (Note to Karla: I'll get the next $100 to you as soon as I can. Might need a few extra days on the next $100.) Thanks to Audrey (blog | twitter), Peg, and Dorothy for attending and keeping the heckling down. Thanks also to Aaron (blog | twitter) for providing room and board and also not heckling. Thanks to Julie (blog | twitter) for coming up with the title for the presentation. (boo to Julie for getting sick and bailing out on us) And thanks to all of them for listening to a preview and offering their suggestions and advice! Cross your fingers that I get accepted at SQL Saturday 81 in Birmingham, SQL Saturday 85 in Orlando, or SQL Saturday 89 in Atlanta, or just attend them anyway!

Read the article
HSSFS Part 2.1 - Parsing @@VERSION

- by Most Valuable Yak (Rob Volk)

For Part 2 of the Handy SQL Server Function Series I decided to tackle parsing useful information from the @@VERSION function, because I am an idiot. It turns out I was confused about CHARINDEX() vs. PATINDEX() and it pretty much invalidated my original solution. All is not lost though, this mistake turned out to be informative for me, and hopefully for you. Referring back to the "Version" view in the prelude I started with the following query to extract the version number: SELECT DISTINCT SQLVersion, SUBSTRING(VersionString,PATINDEX('%-%',VersionString)+2, 12) VerNum FROM VERSION I used PATINDEX() to find the first hyphen "-" character in the string, since the version number appears 2 positions after it, and got these results: SQLVersion VerNum ----------- ------------ 2000 8.00.2055 (I 2005 9.00.3080.00 2005 9.00.4053.00 2008 10.50.1600.1 As you can see it was good enough for most of the values, but not for the SQL 2000 @@VERSION. You'll notice it has only 3 version sections/octets where the others have 4, and the SUBSTRING() grabbed the non-numeric characters after. To properly parse the version number will require a non-fixed value for the 3rd parameter of SUBSTRING(), which is the number of characters to extract. The best value is the position of the first space to occur after the version number (VN), the trick is to figure out how to find it. Here's where my confusion about PATINDEX() came about. The CHARINDEX() function has a handy optional 3rd parameter: CHARINDEX (expression1 ,expression2 [ ,start_location ] ) While PATINDEX(): PATINDEX ('%pattern%',expression ) Does not. I had expected to use PATINDEX() to start searching for a space AFTER the position of the VN, but it doesn't work that way. Since there are plenty of spaces before the VN, I thought I'd try PATINDEX() on another character that doesn't appear before, and tried "(": SELECT SQLVersion, SUBSTRING(VersionString,PATINDEX('%-%',VersionString)+2, PATINDEX('%(%',VersionString)) FROM VERSION Unfortunately this messes up the length calculation and yields: SQLVersion VerNum ----------- --------------------------- 2000 8.00.2055 (Intel X86) Dec 16 2008 19:4 2005 9.00.3080.00 (Intel X86) Sep 6 2009 01: 2005 9.00.4053.00 (Intel X86) May 26 2009 14: 2008 10.50.1600.1 (Intel X86) Apr 2008 10.50.1600.1 (X64) Apr 2 20 Yuck. The problem is that PATINDEX() returns position, and SUBSTRING() needs length, so I have to subtract the VN starting position: SELECT SQLVersion, SUBSTRING(VersionString,PATINDEX('%-%',VersionString)+2, PATINDEX('%(%',VersionString)-PATINDEX('%-%',VersionString)) VerNum FROM VERSION And the results are: SQLVersion VerNum ----------- -------------------------------------------------------- 2000 8.00.2055 (I 2005 9.00.4053.00 (I Msg 537, Level 16, State 2, Line 1 Invalid length parameter passed to the LEFT or SUBSTRING function. Ummmm, whoops. Turns out SQL Server 2008 R2 includes "(RTM)" before the VN, and that causes the length to turn negative. So now that that blew up, I started to think about matching digit and dot (.) patterns. Sadly, a quick look at the first set of results will quickly scuttle that idea, since different versions have different digit patterns and lengths. At this point (which took far longer than I wanted) I decided to cut my losses and redo the query using CHARINDEX(), which I'll cover in Part 2.2. So to do a little post-mortem on this technique: PATINDEX() doesn't have the flexibility to match the digit pattern of the version number; PATINDEX() doesn't have a "start" parameter like CHARINDEX(), that allows us to skip over parts of the string; The SUBSTRING() expression is getting pretty complicated for this relatively simple task! This doesn't mean that PATINDEX() isn't useful, it's just not a good fit for this particular problem. I'll include a version in the next post that extracts the version number properly. UPDATE: Sorry if you saw the unformatted version of this earlier, I'm on a quest to find blog software that ACTUALLY WORKS.

Read the article
Spooling in SQL execution plans

- by Rob Farley

Sewing has never been my thing. I barely even know the terminology, and when discussing this with American friends, I even found out that half the words that Americans use are different to the words that English and Australian people use. That said – let’s talk about spools! In particular, the Spool operators that you find in some SQL execution plans. This post is for T-SQL Tuesday, hosted this month by me! I’ve chosen to write about spools because they seem to get a bad rap (even in my song I used the line “There’s spooling from a CTE, they’ve got recursion needlessly”). I figured it was worth covering some of what spools are about, and hopefully explain why they are remarkably necessary, and generally very useful. If you have a look at the Books Online page about Plan Operators, at http://msdn.microsoft.com/en-us/library/ms191158.aspx, and do a search for the word ‘spool’, you’ll notice it says there are 46 matches. 46! Yeah, that’s what I thought too... Spooling is mentioned in several operators: Eager Spool, Lazy Spool, Index Spool (sometimes called a Nonclustered Index Spool), Row Count Spool, Spool, Table Spool, and Window Spool (oh, and Cache, which is a special kind of spool for a single row, but as it isn’t used in SQL 2012, I won’t describe it any further here). Spool, Table Spool, Index Spool, Window Spool and Row Count Spool are all physical operators, whereas Eager Spool and Lazy Spool are logical operators, describing the way that the other spools work. For example, you might see a Table Spool which is either Eager or Lazy. A Window Spool can actually act as both, as I’ll mention in a moment. In sewing, cotton is put onto a spool to make it more useful. You might buy it in bulk on a cone, but if you’re going to be using a sewing machine, then you quite probably want to have it on a spool or bobbin, which allows it to be used in a more effective way. This is the picture that I want you to think about in relation to your data. I’m sure you use spools every time you use your sewing machine. I know I do. I can’t think of a time when I’ve got out my sewing machine to do some sewing and haven’t used a spool. However, I often run SQL queries that don’t use spools. You see, the data that is consumed by my query is typically in a useful state without a spool. It’s like I can just sew with my cotton despite it not being on a spool! Many of my favourite features in T-SQL do like to use spools though. This looks like a very similar query to before, but includes an OVER clause to return a column telling me the number of rows in my data set. I’ll describe what’s going on in a few paragraphs’ time. So what does a Spool operator actually do? The spool operator consumes a set of data, and stores it in a temporary structure, in the tempdb database. This structure is typically either a Table (ie, a heap), or an Index (ie, a b-tree). If no data is actually needed from it, then it could also be a Row Count spool, which only stores the number of rows that the spool operator consumes. A Window Spool is another option if the data being consumed is tightly linked to windows of data, such as when the ROWS/RANGE clause of the OVER clause is being used. You could maybe think about the type of spool being like whether the cotton is going onto a small bobbin to fit in the base of the sewing machine, or whether it’s a larger spool for the top. A Table or Index Spool is either Eager or Lazy in nature. Eager and Lazy are Logical operators, which talk more about the behaviour, rather than the physical operation. If I’m sewing, I can either be all enthusiastic and get all my cotton onto the spool before I start, or I can do it as I need it. “Lazy” might not the be the best word to describe a person – in the SQL world it describes the idea of either fetching all the rows to build up the whole spool when the operator is called (Eager), or populating the spool only as it’s needed (Lazy). Window Spools are both physical and logical. They’re eager on a per-window basis, but lazy between windows. And when is it needed? The way I see it, spools are needed for two reasons. 1 – When data is going to be needed AGAIN. 2 – When data needs to be kept away from the original source. If you’re someone that writes long stored procedures, you are probably quite aware of the second scenario. I see plenty of stored procedures being written this way – where the query writer populates a temporary table, so that they can make updates to it without risking the original table. SQL does this too. Imagine I’m updating my contact list, and some of my changes move data to later in the book. If I’m not careful, I might update the same row a second time (or even enter an infinite loop, updating it over and over). A spool can make sure that I don’t, by using a copy of the data. This problem is known as the Halloween Effect (not because it’s spooky, but because it was discovered in late October one year). As I’m sure you can imagine, the kind of spool you’d need to protect against the Halloween Effect would be eager, because if you’re only handling one row at a time, then you’re not providing the protection... An eager spool will block the flow of data, waiting until it has fetched all the data before serving it up to the operator that called it. In the query below I’m forcing the Query Optimizer to use an index which would be upset if the Name column values got changed, and we see that before any data is fetched, a spool is created to load the data into. This doesn’t stop the index being maintained, but it does mean that the index is protected from the changes that are being done. There are plenty of times, though, when you need data repeatedly. Consider the query I put above. A simple join, but then counting the number of rows that came through. The way that this has executed (be it ideal or not), is to ask that a Table Spool be populated. That’s the Table Spool operator on the top row. That spool can produce the same set of rows repeatedly. This is the behaviour that we see in the bottom half of the plan. In the bottom half of the plan, we see that the a join is being done between the rows that are being sourced from the spool – one being aggregated and one not – producing the columns that we need for the query. Table v Index When considering whether to use a Table Spool or an Index Spool, the question that the Query Optimizer needs to answer is whether there is sufficient benefit to storing the data in a b-tree. The idea of having data in indexes is great, but of course there is a cost to maintaining them. Here we’re creating a temporary structure for data, and there is a cost associated with populating each row into its correct position according to a b-tree, as opposed to simply adding it to the end of the list of rows in a heap. Using a b-tree could even result in page-splits as the b-tree is populated, so there had better be a reason to use that kind of structure. That all depends on how the data is going to be used in other parts of the plan. If you’ve ever thought that you could use a temporary index for a particular query, well this is it – and the Query Optimizer can do that if it thinks it’s worthwhile. It’s worth noting that just because a Spool is populated using an Index Spool, it can still be fetched using a Table Spool. The details about whether or not a Spool used as a source shows as a Table Spool or an Index Spool is more about whether a Seek predicate is used, rather than on the underlying structure. Recursive CTE I’ve already shown you an example of spooling when the OVER clause is used. You might see them being used whenever you have data that is needed multiple times, and CTEs are quite common here. With the definition of a set of data described in a CTE, if the query writer is leveraging this by referring to the CTE multiple times, and there’s no simplification to be leveraged, a spool could theoretically be used to avoid reapplying the CTE’s logic. Annoyingly, this doesn’t happen. Consider this query, which really looks like it’s using the same data twice. I’m creating a set of data (which is completely deterministic, by the way), and then joining it back to itself. There seems to be no reason why it shouldn’t use a spool for the set described by the CTE, but it doesn’t. On the other hand, if we don’t pull as many columns back, we might see a very different plan. You see, CTEs, like all sub-queries, are simplified out to figure out the best way of executing the whole query. My example is somewhat contrived, and although there are plenty of cases when it’s nice to give the Query Optimizer hints about how to execute queries, it usually doesn’t do a bad job, even without spooling (and you can always use a temporary table). When recursion is used, though, spooling should be expected. Consider what we’re asking for in a recursive CTE. We’re telling the system to construct a set of data using an initial query, and then use set as a source for another query, piping this back into the same set and back around. It’s very much a spool. The analogy of cotton is long gone here, as the idea of having a continual loop of cotton feeding onto a spool and off again doesn’t quite fit, but that’s what we have here. Data is being fed onto the spool, and getting pulled out a second time when the spool is used as a source. (This query is running on AdventureWorks, which has a ManagerID column in HumanResources.Employee, not AdventureWorks2012) The Index Spool operator is sucking rows into it – lazily. It has to be lazy, because at the start, there’s only one row to be had. However, as rows get populated onto the spool, the Table Spool operator on the right can return rows when asked, ending up with more rows (potentially) getting back onto the spool, ready for the next round. (The Assert operator is merely checking to see if we’ve reached the MAXRECURSION point – it vanishes if you use OPTION (MAXRECURSION 0), which you can try yourself if you like). Spools are useful. Don’t lose sight of that. Every time you use temporary tables or table variables in a stored procedure, you’re essentially doing the same – don’t get upset at the Query Optimizer for doing so, even if you think the spool looks like an expensive part of the query. I hope you’re enjoying this T-SQL Tuesday. Why not head over to my post that is hosting it this month to read about some other plan operators? At some point I’ll write a summary post – once I have you should find a comment below pointing at it. @rob_farley

Read the article
Am I correctly handling duplicate URLs for my homepage?

- by Rob Goldstein

I own a Job Search site named www.conservationjobboard.com and have a concern about how the domain is viewed by search engines. The issue is that when the site was first designed, the default page was left as default.php, but the homepage was actually JobBoard.php. To handle this, the default.php page performed a redirect to the JobBoard.php file when www.conservationjobboard.com/ was requested. The main problem resulted because the redirect was a temporary redirect causing search engines to index conservationjobboard.com/ and conservationjobboard.com/JobBoard.php as 2 separate pages. This has since been corrected to use the .htaccess file so that JobBoard.php is now the default file for the root directory eliminating the need for the redirect. Problem is that search engines still show both URL's in search results (one including JobBoard.php and one that ends with /). Another potential problem is that some of my early backlinks are to conservationjobboard.com/JobBoard.php while the rest are to conservationjobboard.com The 2 outstanding questions are as follows: 1. Is my domain still being penalized by search engines like Google for having duplicate homepage URL's? 2. Are all of the back links to my homepage being considered as the same now or is the total number of back links being split between the 2 different URL's? If you think there are still issues with how we have this set-up, I was wondering if you could give me advice on what we should do differently. Thanks.

Read the article
24 Hours of PASS: Whine, Whine, Whine

- by Most Valuable Yak (Rob Volk)

24 Hours of Pass (or 24HOP) is a great program offered by PASS to provide free, online training for anyone who wants to learn more about SQL Server. They routinely have the best SQL Server presenters available for these sessions, and attract hundreds, perhaps even a thousand attendees from around the world. This is definitely one of the best things they've started doing in the past few years, and every session I've attended has been excellent. So why am I so grumpy about it? I'm not really, pretty much everything here is a minor annoyance that I can deal with. However since they're so minor they seem to be things that can be easily corrected and would make the process much better. First off, this is my biggest gripe, the registration page: https://www323.livemeeting.com/lrs/8000181573/Registration.aspx?pageName=lj6378f4fhf5hpdm What grinds my gears about this? I have to scroll alllllllllllllllllllllllllllll the way to bottom to actually register for the sessions. This wouldn't be so bad except all the details of the session, including the presenter, is in a separate list at the top. Both lists contain info the other does not, and scrolling between them to determine "Should I make time to listen to this? Who is speaking at this time anyway?" is really unnecessary. My preference would be to keep the top list and add the checkboxes and schedule info in separate columns. This is a full-width design, so there's plenty of space for this data, which is pretty small anyway. The other huge benefit is halving the size of the page, which improves performance and lowers bandwidth usage considerably. And if you know HTML/ASP.Net, and you view the page source, you can find PLENTY of other things that can be reduced even further. (not just viewstate) One nice thing that PASS does is send iCal reminders to your email address so you can accept them to your calendar. Again, they leave off the presenter in the appointment details, while still duplicating the meeting title in the body. Sometimes I make decisions based on speaker rather than content (Natalie Portman is reading the Yellow Pages??? I'M THERE!) and having the speaker in the iCal is helpful. Next minor annoyances are the necessity for providing a company name, and the survey questions. I know PASS needs to market themselves effectively, and they need information to do that, and since this is a free event it's really not worth complaining about, but why ask the survey question twice? (once at registration, once again when joining the LiveMeeting) Same thing for the company name. All of this should be tied to email address, so that's all I should need to enter when joining the LiveMeeting. The last one is also minor, but it irks me in this day and age of multiple browsers and the decline of Internet Explorer as a dominant platform. The registration page was originally created in Visual Studio 2003, and has a lot of IE-specific crud representative of the browser situation of 2003. (IE5 references? really? and is the aforementioned viewstate big enough?) This causes some grief with other browsers like Firefox, Chrome, and sometimes IE8 or 9. And don't get me started on using the page on a Mac or in Safari. My main point is that PASS is an international organization, welcoming everyone from all levels of SQL Server proficiency, and in that spirit I think it would help to accommodate a wider range of browser software, especially since the registration page is extremely simple. I recognize that this page is not hosted on the PASS website and may be maintained by some division of Microsoft, but to me that's even worse if MS can't update their own pages. They've deprecated IE6, so they don't need to maintain support on their own websites anymore. OK, I'll shut up now. #sqlpass #24HOP

Read the article

Search Results

Search found 1427 results on 58 pages for 'rob reynolds'.

Page 10/58 | < Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17 | Next Page >

- by Rob Farley

- by Rob Farley

- by Rob Farley

- by Rob Farley

- by Rob Farley

- by Rob Farley

- by Rob

- by Rob Young

- by Rob Addis

- by Rob Farley

- by Rob Agar

- by Rob Farley

- by Rob Farley

- by Rob Farley

- by Rob Farley

- by Rob Young

- by Most Valuable Yak (Rob Volk)

- by Most Valuable Yak (Rob Volk)

- by Rob

- by Rob Young

- by Most Valuable Yak (Rob Volk)

- by Most Valuable Yak (Rob Volk)

- by Rob Farley

- by Rob Goldstein

- by Most Valuable Yak (Rob Volk)

< Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17 | Next Page >