Search Results

Search found 1373 results on 55 pages for 'rob ellis'.

Page 9/55 | < Previous Page | 5 6 7 8 9 10 11 12 13 14 15 16  | Next Page >

  • To maximize chances of functional programming employment

    - by Rob Agar
    Given that the future of programming is functional, at some point in the nearish future I want to be paid to code in a functional language, preferably Haskell. Assuming I have a firm grasp of the language, plus all the basic programmer attributes (good communication skills/sense of humour/hygiene etc), what should I concentrate on learning to maximize my chances? Are there any particularly sought after libraries I should know? Alternatively, would another language be a better bet, say F#? (I'm not too fussed about the kind of programming work, so long as it's reasonably interesting and reasonably well paid, and with nice people)

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

  • Suggestions on managing social media accounts

    - by Rob
    As a company we now have Facebook, LinkedIN, Twitter and now Google+, is there a way to easily manage all these accounts without having to log into them individually? Things like posting content to each one is becoming a full time job in itself, is there a way to post once that in turn posts to all other accounts? I used to use http://ping.fm/ a long time ago, has there been any advancements in something similar to this? With friend lists, news feeds etc etc for each one, I wish there was a way to manage them all in one place with a service/tool!

    Read the article

  • Joins in single-table queries

    - by Rob Farley
    Tables are only metadata. They don’t store data. I’ve written something about this before, but I want to take a viewpoint of this idea around the topic of joins, especially since it’s the topic for T-SQL Tuesday this month. Hosted this time by Sebastian Meine (@sqlity), who has a whole series on joins this month. Good for him – it’s a great topic. In that last post I discussed the fact that we write queries against tables, but that the engine turns it into a plan against indexes. My point wasn’t simply that a table is actually just a Clustered Index (or heap, which I consider just a special type of index), but that data access always happens against indexes – never tables – and we should be thinking about the indexes (specifically the non-clustered ones) when we write our queries. I described the scenario of looking up phone numbers, and how it never really occurs to us that there is a master list of phone numbers, because we think in terms of the useful non-clustered indexes that the phone companies provide us, but anyway – that’s not the point of this post. So a table is metadata. It stores information about the names of columns and their data types. Nullability, default values, constraints, triggers – these are all things that define the table, but the data isn’t stored in the table. The data that a table describes is stored in a heap or clustered index, but it goes further than this. All the useful data is going to live in non-clustered indexes. Remember this. It’s important. Stop thinking about tables, and start thinking about indexes. So let’s think about tables as indexes. This applies even in a world created by someone else, who doesn’t have the best indexes in mind for you. I’m sure you don’t need me to explain Covering Index bit – the fact that if you don’t have sufficient columns “included” in your index, your query plan will either have to do a Lookup, or else it’ll give up using your index and use one that does have everything it needs (even if that means scanning it). If you haven’t seen that before, drop me a line and I’ll run through it with you. Or go and read a post I did a long while ago about the maths involved in that decision. So – what I’m going to tell you is that a Lookup is a join. When I run SELECT CustomerID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 285; against the AdventureWorks2012 get the following plan: I’m sure you can see the join. Don’t look in the query, it’s not there. But you should be able to see the join in the plan. It’s an Inner Join, implemented by a Nested Loop. It’s pulling data in from the Index Seek, and joining that to the results of a Key Lookup. It clearly is – the QO wouldn’t call it that if it wasn’t really one. It behaves exactly like any other Nested Loop (Inner Join) operator, pulling rows from one side and putting a request in from the other. You wouldn’t have a problem accepting it as a join if the query were slightly different, such as SELECT sod.OrderQty FROM Sales.SalesOrderHeader AS soh JOIN Sales.SalesOrderDetail as sod on sod.SalesOrderID = soh.SalesOrderID WHERE soh.SalesPersonID = 285; Amazingly similar, of course. This one is an explicit join, the first example was just as much a join, even thought you didn’t actually ask for one. You need to consider this when you’re thinking about your queries. But it gets more interesting. Consider this query: SELECT SalesOrderID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 276 AND CustomerID = 29522; It doesn’t look like there’s a join here either, but look at the plan. That’s not some Lookup in action – that’s a proper Merge Join. The Query Optimizer has worked out that it can get the data it needs by looking in two separate indexes and then doing a Merge Join on the data that it gets. Both indexes used are ordered by the column that’s indexed (one on SalesPersonID, one on CustomerID), and then by the CIX key SalesOrderID. Just like when you seek in the phone book to Farley, the Farleys you have are ordered by FirstName, these seek operations return the data ordered by the next field. This order is SalesOrderID, even though you didn’t explicitly put that column in the index definition. The result is two datasets that are ordered by SalesOrderID, making them very mergeable. Another example is the simple query SELECT CustomerID FROM Sales.SalesOrderHeader WHERE SalesPersonID = 276; This one prefers a Hash Match to a standard lookup even! This isn’t just ordinary index intersection, this is something else again! Just like before, we could imagine it better with two whole tables, but we shouldn’t try to distinguish between joining two tables and joining two indexes. The Query Optimizer can see (using basic maths) that it’s worth doing these particular operations using these two less-than-ideal indexes (because of course, the best indexese would be on both columns – a composite such as (SalesPersonID, CustomerID – and it would have the SalesOrderID column as part of it as the CIX key still). You need to think like this too. Not in terms of excusing single-column indexes like the ones in AdventureWorks2012, but in terms of having a picture about how you’d like your queries to run. If you start to think about what data you need, where it’s coming from, and how it’s going to be used, then you will almost certainly write better queries. …and yes, this would include when you’re dealing with regular joins across multiples, not just against joins within single table queries.

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

  • Visually stunning maps and PivotViewer

    - by Rob Farley
    One of the things about PivotViewer is that it runs in the Silverlight platform and can be extended recently. One of my guys at LobsterPot, Roger Noble, has used this to incorporate a Bing Maps layer, showing items which have  Latitude and Longitude values there. We’re already talking to a hospital about using this to allow them to browse their patient data, including showing the patients on a map according to which bed they’re in. Interesting times – this will involve having custom tiles instead of the ones from Bing Maps, but the idea is similar. Of course, we’ll be using Bing Maps to show where the patients live. I should also mention that this is a work-in-progress still. Figuring out how to use PivotViewer isn’t trivial, and we’ve done quite a lot of experimenting to see how to get things working. If you find bugs, please feel free to let me know (rob_farley at hotmail will usually reach me), and we’ll add them to our list. Here are some screenshots that I made recently using the collection at http://pivot.lobsterpot.com.au/flickr – by selecting a tag, you can get a new bunch of images. A couple of images that were taken in Iceland. Some from St Mary’s Lighthouse near Newcastle, UK. And some from around Big Ben in London. I’d recommend using either Firefox or Internet Explorer if you choose to browse this yourself. It seems the Chrome browser support for Silverlight doesn’t quite handle things as nicely as we’d all like. I imagine that at some point, we may enhance the Flickr collection, to be able to search on more than tags, but as a sample collection, it seems to work quite well.

    Read the article

  • T-SQL Tuesday #31 - Logging Tricks with CONTEXT_INFO

    - by Most Valuable Yak (Rob Volk)
    This month's T-SQL Tuesday is being hosted by Aaron Nelson [b | t], fellow Atlantan (the city in Georgia, not the famous sunken city, or the resort in the Bahamas) and covers the topic of logging (the recording of information, not the harvesting of trees) and maintains the fine T-SQL Tuesday tradition begun by Adam Machanic [b | t] (the SQL Server guru, not the guy who fixes cars, check the spelling again, there will be a quiz later). This is a trick I learned from Fernando Guerrero [b | t] waaaaaay back during the PASS Summit 2004 in sunny, hurricane-infested Orlando, during his session on Secret SQL Server (not sure if that's the correct title, and I haven't used parentheses in this paragraph yet).  CONTEXT_INFO is a neat little feature that's existed since SQL Server 2000 and perhaps even earlier.  It lets you assign data to the current session/connection, and maintains that data until you disconnect or change it.  In addition to the CONTEXT_INFO() function, you can also query the context_info column in sys.dm_exec_sessions, or even sysprocesses if you're still running SQL Server 2000, if you need to see it for another session. While you're limited to 128 bytes, one big advantage that CONTEXT_INFO has is that it's independent of any transactions.  If you've ever logged to a table in a transaction and then lost messages when it rolled back, you can understand how aggravating it can be.  CONTEXT_INFO also survives across multiple SQL batches (GO separators) in the same connection, so for those of you who were going to suggest "just log to a table variable, they don't get rolled back":  HA-HA, I GOT YOU!  Since GO starts a new batch all variable declarations are lost. Here's a simple example I recently used at work.  I had to test database mirroring configurations for disaster recovery scenarios and measure the network throughput.  I also needed to log how long it took for the script to run and include the mirror settings for the database in question.  I decided to use AdventureWorks as my database model, and Adam Machanic's Big Adventure script to provide a fairly large workload that's repeatable and easily scalable.  My test would consist of several copies of AdventureWorks running the Big Adventure script while I mirrored the databases (or not). Since Adam's script contains several batches, I decided CONTEXT_INFO would have to be used.  As it turns out, I only needed to grab the start time at the beginning, I could get the rest of the data at the end of the process.   The code is pretty small: declare @time binary(128)=cast(getdate() as binary(8)) set context_info @time   ... rest of Big Adventure code ...   go use master; insert mirror_test(server,role,partner,db,state,safety,start,duration) select @@servername, mirroring_role_desc, mirroring_partner_instance, db_name(database_id), mirroring_state_desc, mirroring_safety_level_desc, cast(cast(context_info() as binary(8)) as datetime), datediff(s,cast(cast(context_info() as binary(8)) as datetime),getdate()) from sys.database_mirroring where db_name(database_id) like 'Adv%';   I declared @time as a binary(128) since CONTEXT_INFO is defined that way.  I couldn't convert GETDATE() to binary(128) as it would pad the first 120 bytes as 0x00.  To keep the CAST functions simple and avoid using SUBSTRING, I decided to CAST GETDATE() as binary(8) and let SQL Server do the implicit conversion.  It's not the safest way perhaps, but it works on my machine. :) As I mentioned earlier, you can query system views for sessions and get their CONTEXT_INFO.  With a little boilerplate code this can be used to monitor long-running procedures, in case you need to kill a process, or are just curious  how long certain parts take.  In this example, I added code to Adam's Big Adventure script to set CONTEXT_INFO messages at strategic places I want to monitor.  (His code is in UPPERCASE as it was in the original, mine is all lowercase): declare @msg binary(128) set @msg=cast('Altering bigProduct.ProductID' as binary(128)) set context_info @msg go ALTER TABLE bigProduct ALTER COLUMN ProductID INT NOT NULL GO set context_info 0x0 go declare @msg1 binary(128) set @msg1=cast('Adding pk_bigProduct Constraint' as binary(128)) set context_info @msg1 go ALTER TABLE bigProduct ADD CONSTRAINT pk_bigProduct PRIMARY KEY (ProductID) GO set context_info 0x0 go declare @msg2 binary(128) set @msg2=cast('Altering bigTransactionHistory.TransactionID' as binary(128)) set context_info @msg2 go ALTER TABLE bigTransactionHistory ALTER COLUMN TransactionID INT NOT NULL GO set context_info 0x0 go declare @msg3 binary(128) set @msg3=cast('Adding pk_bigTransactionHistory Constraint' as binary(128)) set context_info @msg3 go ALTER TABLE bigTransactionHistory ADD CONSTRAINT pk_bigTransactionHistory PRIMARY KEY NONCLUSTERED(TransactionID) GO set context_info 0x0 go declare @msg4 binary(128) set @msg4=cast('Creating IX_ProductId_TransactionDate Index' as binary(128)) set context_info @msg4 go CREATE NONCLUSTERED INDEX IX_ProductId_TransactionDate ON bigTransactionHistory(ProductId,TransactionDate) INCLUDE(Quantity,ActualCost) GO set context_info 0x0   This doesn't include the entire script, only those portions that altered a table or created an index.  One annoyance is that SET CONTEXT_INFO requires a literal or variable, you can't use an expression.  And since GO starts a new batch I need to declare a variable in each one.  And of course I have to use CAST because it won't implicitly convert varchar to binary.  And even though context_info is a nullable column, you can't SET CONTEXT_INFO NULL, so I have to use SET CONTEXT_INFO 0x0 to clear the message after the statement completes.  And if you're thinking of turning this into a UDF, you can't, although a stored procedure would work. So what does all this aggravation get you?  As the code runs, if I want to see which stage the session is at, I can run the following (assuming SPID 51 is the one I want): select CAST(context_info as varchar(128)) from sys.dm_exec_sessions where session_id=51   Since SQL Server 2005 introduced the new system and dynamic management views (DMVs) there's not as much need for tagging a session with these kinds of messages.  You can get the session start time and currently executing statement from them, and neatly presented if you use Adam's sp_whoisactive utility (and you absolutely should be using it).  Of course you can always use xp_cmdshell, a CLR function, or some other tricks to log information outside of a SQL transaction.  All the same, I've used this trick to monitor long-running reports at a previous job, and I still think CONTEXT_INFO is a great feature, especially if you're still using SQL Server 2000 or want to supplement your instrumentation.  If you'd like an exercise, consider adding the system time to the messages in the last example, and an automated job to query and parse it from the system tables.  That would let you track how long each statement ran without having to run Profiler. #TSQL2sDay

    Read the article

  • HSSFS Part 2.1 - Parsing @@VERSION

    - by Most Valuable Yak (Rob Volk)
    For Part 2 of the Handy SQL Server Function Series I decided to tackle parsing useful information from the @@VERSION function, because I am an idiot.  It turns out I was confused about CHARINDEX() vs. PATINDEX() and it pretty much invalidated my original solution.  All is not lost though, this mistake turned out to be informative for me, and hopefully for you. Referring back to the "Version" view in the prelude I started with the following query to extract the version number: SELECT DISTINCT SQLVersion, SUBSTRING(VersionString,PATINDEX('%-%',VersionString)+2, 12) VerNum FROM VERSION I used PATINDEX() to find the first hyphen "-" character in the string, since the version number appears 2 positions after it, and got these results: SQLVersion VerNum ----------- ------------ 2000 8.00.2055 (I 2005 9.00.3080.00 2005 9.00.4053.00 2008 10.50.1600.1 As you can see it was good enough for most of the values, but not for the SQL 2000 @@VERSION.  You'll notice it has only 3 version sections/octets where the others have 4, and the SUBSTRING() grabbed the non-numeric characters after.  To properly parse the version number will require a non-fixed value for the 3rd parameter of SUBSTRING(), which is the number of characters to extract. The best value is the position of the first space to occur after the version number (VN), the trick is to figure out how to find it.  Here's where my confusion about PATINDEX() came about.  The CHARINDEX() function has a handy optional 3rd parameter: CHARINDEX (expression1 ,expression2 [ ,start_location ] ) While PATINDEX(): PATINDEX ('%pattern%',expression ) Does not.  I had expected to use PATINDEX() to start searching for a space AFTER the position of the VN, but it doesn't work that way.  Since there are plenty of spaces before the VN, I thought I'd try PATINDEX() on another character that doesn't appear before, and tried "(": SELECT SQLVersion, SUBSTRING(VersionString,PATINDEX('%-%',VersionString)+2, PATINDEX('%(%',VersionString)) FROM VERSION Unfortunately this messes up the length calculation and yields: SQLVersion VerNum ----------- --------------------------- 2000 8.00.2055 (Intel X86) Dec 16 2008 19:4 2005 9.00.3080.00 (Intel X86) Sep 6 2009 01: 2005 9.00.4053.00 (Intel X86) May 26 2009 14: 2008 10.50.1600.1 (Intel X86) Apr 2008 10.50.1600.1 (X64) Apr 2 20 Yuck.  The problem is that PATINDEX() returns position, and SUBSTRING() needs length, so I have to subtract the VN starting position: SELECT SQLVersion, SUBSTRING(VersionString,PATINDEX('%-%',VersionString)+2, PATINDEX('%(%',VersionString)-PATINDEX('%-%',VersionString)) VerNum FROM VERSION And the results are: SQLVersion VerNum ----------- -------------------------------------------------------- 2000 8.00.2055 (I 2005 9.00.4053.00 (I Msg 537, Level 16, State 2, Line 1 Invalid length parameter passed to the LEFT or SUBSTRING function. Ummmm, whoops.  Turns out SQL Server 2008 R2 includes "(RTM)" before the VN, and that causes the length to turn negative. So now that that blew up, I started to think about matching digit and dot (.) patterns.  Sadly, a quick look at the first set of results will quickly scuttle that idea, since different versions have different digit patterns and lengths. At this point (which took far longer than I wanted) I decided to cut my losses and redo the query using CHARINDEX(), which I'll cover in Part 2.2.  So to do a little post-mortem on this technique: PATINDEX() doesn't have the flexibility to match the digit pattern of the version number; PATINDEX() doesn't have a "start" parameter like CHARINDEX(), that allows us to skip over parts of the string; The SUBSTRING() expression is getting pretty complicated for this relatively simple task! This doesn't mean that PATINDEX() isn't useful, it's just not a good fit for this particular problem.  I'll include a version in the next post that extracts the version number properly. UPDATE: Sorry if you saw the unformatted version of this earlier, I'm on a quest to find blog software that ACTUALLY WORKS.

    Read the article

  • Summit Old, Summit New, Summit Borrowed...

    - by Rob Farley
    PASS Summit is coming up, and I thought I’d post a few things. Summit Old... At the PASS Summit, you will get the chance to hear presentations by the SQL Server establishment. Just about every big name in the SQL Server world is a regular at the PASS Summit, so you will get to hear and meet people like Kalen Delaney (@sqlqueen) (who just recently got awarded MVP status for the 20th year running), and from all around the world such as the UK’s Chris Webb (@technitrain) or Pinal Dave (@pinaldave) from India. Almost all the household names in SQL Server will be there, including a large contingent from Microsoft. The PASS Summit is by far the best place to meet the legends of SQL Server. And they’re not all old. Some are, but most of them are younger than you might think. ...Summit New... The hottest topics are often about the newest technologies (such as SQL Server 2012). But you will almost certainly learn new stuff about older versions too. But that’s not what I wanted to pick on for this point. There are many new speakers at every PASS Summit, and content that has not been covered in other places. This year, for example, LobsterPot’s Roger Noble (@roger_noble) is giving a presentation for the first time. He’s a regular around the Australian circuit, but this is his first time presenting to a US audience. New Zealand’s Paul White (@sql_kiwi) is attending his first PASS Summit, and will be giving over four hours of incredibly deep stuff that has never been presented anywhere in the US before (I can’t say the world, because he did present similar material in Adelaide earlier in the year). ...Summit Borrowed... No, I’m not talking about plagiarism – the talks you’ll hear are all their own work. But you will get a lot of stuff you’ll be able to take back and apply at work. The PASS Summit sessions are not full of sales-pitches, telling you about how great things could be if only you’d buy some third-party vendor product. It’s simply not that kind of conference, and PASS doesn’t allow that kind of talk to take place. Instead, you’ll be taught techniques, and be able to download scripts and slides to let you perform that magic back at work when you get home. You will definitely find plenty of ideas to borrow at the PASS Summit. ...Summit Blue Yeah – and there’s karaoke. Blue - Jason - SQL Karaoke - YouTube

    Read the article

  • Am I correctly handling duplicate URLs for my homepage?

    - by Rob Goldstein
    I own a Job Search site named www.conservationjobboard.com and have a concern about how the domain is viewed by search engines. The issue is that when the site was first designed, the default page was left as default.php, but the homepage was actually JobBoard.php. To handle this, the default.php page performed a redirect to the JobBoard.php file when www.conservationjobboard.com/ was requested. The main problem resulted because the redirect was a temporary redirect causing search engines to index conservationjobboard.com/ and conservationjobboard.com/JobBoard.php as 2 separate pages. This has since been corrected to use the .htaccess file so that JobBoard.php is now the default file for the root directory eliminating the need for the redirect. Problem is that search engines still show both URL's in search results (one including JobBoard.php and one that ends with /). Another potential problem is that some of my early backlinks are to conservationjobboard.com/JobBoard.php while the rest are to conservationjobboard.com The 2 outstanding questions are as follows: 1. Is my domain still being penalized by search engines like Google for having duplicate homepage URL's? 2. Are all of the back links to my homepage being considered as the same now or is the total number of back links being split between the 2 different URL's? If you think there are still issues with how we have this set-up, I was wondering if you could give me advice on what we should do differently. Thanks.

    Read the article

  • MySQL 5.5.18 Debian packaging now available

    - by Rob Young
    I am happy to announce that MySQL 5.5.18 is now available via Debian native packaging.  We have gotten many requests for this and our build and release teams have pulled together to ensure that our DEB packages are delivered with the highest quality.  You can download MySQL 5.5.18 Debian 5 and 6 packages from the MySQL Community Download page or from the My Oracle Support portal. As always, thanks for your continued support of MySQL!

    Read the article

  • What's New in 5.6 RC and more from MySQL Connect conference

    - by Rob Young
    Keeping with the tradition of great MySQL Community events, the first annual MySQL Connect conference is now in the books.  It was great to see so many familiar faces in the crowd and at the podium sharing their ideas and thoughts on the evolution of MySQL under Oracle. The headliner of the conference was Tomas' keynote announcement of the fully featured and fully enabled MySQL 5.6 Release Candidate.  This new article on the MySQL DevZone summarizes all of the great new features ready for Community adoption, all MySQL Engineering blogs and where and how to download all of the bits. As always, early adoption and feedback on the 5.6 RC is appreciated and the sooner we get your feedback the sooner we release the "ready for production" sanctioned GA product.    Also available now, Cluster 7.3 provides support for Foreign Keys, node.js NoSQL access to underlying data and a new Auto Installer that helps you quickly and easily get up and running with Cluster 7.2 and 7.3.  The 7.3 downloads are provided in the first 7.3 Development Milestone Release (under "Development Releases" tab) and via the MySQL Labs. Oracle also announced key new additions to MySQL Enterprise Edition: New policy-based compliance Auditing. MySQL Enterprise Edition Audit adds policy-based auditing compliance to existing MySQL applications without the need to change any code.  This new plugin is available for MySQL 5.5.28 and higher; existing MySQL Enterprise Edition customers can download the upgrade from the My Oracle Support portal and all can download for evaluation from Oracle's Software Delivery Cloud. New MySQL Enterprise High Available additions provide even more options for ensuring MySQL applications remain available and running a their peak: Oracle Linux + DRBD Oracle Solaris Clustering for MySQL All in all, the first MySQL Connect conference was a great success and with refinements planned in response to attendee, sponsor and speaker feedback we expect it to grow and improve going forward. As always, thanks for your continued support of MySQL!

    Read the article

  • Spooling in SQL execution plans

    - by Rob Farley
    Sewing has never been my thing. I barely even know the terminology, and when discussing this with American friends, I even found out that half the words that Americans use are different to the words that English and Australian people use. That said – let’s talk about spools! In particular, the Spool operators that you find in some SQL execution plans. This post is for T-SQL Tuesday, hosted this month by me! I’ve chosen to write about spools because they seem to get a bad rap (even in my song I used the line “There’s spooling from a CTE, they’ve got recursion needlessly”). I figured it was worth covering some of what spools are about, and hopefully explain why they are remarkably necessary, and generally very useful. If you have a look at the Books Online page about Plan Operators, at http://msdn.microsoft.com/en-us/library/ms191158.aspx, and do a search for the word ‘spool’, you’ll notice it says there are 46 matches. 46! Yeah, that’s what I thought too... Spooling is mentioned in several operators: Eager Spool, Lazy Spool, Index Spool (sometimes called a Nonclustered Index Spool), Row Count Spool, Spool, Table Spool, and Window Spool (oh, and Cache, which is a special kind of spool for a single row, but as it isn’t used in SQL 2012, I won’t describe it any further here). Spool, Table Spool, Index Spool, Window Spool and Row Count Spool are all physical operators, whereas Eager Spool and Lazy Spool are logical operators, describing the way that the other spools work. For example, you might see a Table Spool which is either Eager or Lazy. A Window Spool can actually act as both, as I’ll mention in a moment. In sewing, cotton is put onto a spool to make it more useful. You might buy it in bulk on a cone, but if you’re going to be using a sewing machine, then you quite probably want to have it on a spool or bobbin, which allows it to be used in a more effective way. This is the picture that I want you to think about in relation to your data. I’m sure you use spools every time you use your sewing machine. I know I do. I can’t think of a time when I’ve got out my sewing machine to do some sewing and haven’t used a spool. However, I often run SQL queries that don’t use spools. You see, the data that is consumed by my query is typically in a useful state without a spool. It’s like I can just sew with my cotton despite it not being on a spool! Many of my favourite features in T-SQL do like to use spools though. This looks like a very similar query to before, but includes an OVER clause to return a column telling me the number of rows in my data set. I’ll describe what’s going on in a few paragraphs’ time. So what does a Spool operator actually do? The spool operator consumes a set of data, and stores it in a temporary structure, in the tempdb database. This structure is typically either a Table (ie, a heap), or an Index (ie, a b-tree). If no data is actually needed from it, then it could also be a Row Count spool, which only stores the number of rows that the spool operator consumes. A Window Spool is another option if the data being consumed is tightly linked to windows of data, such as when the ROWS/RANGE clause of the OVER clause is being used. You could maybe think about the type of spool being like whether the cotton is going onto a small bobbin to fit in the base of the sewing machine, or whether it’s a larger spool for the top. A Table or Index Spool is either Eager or Lazy in nature. Eager and Lazy are Logical operators, which talk more about the behaviour, rather than the physical operation. If I’m sewing, I can either be all enthusiastic and get all my cotton onto the spool before I start, or I can do it as I need it. “Lazy” might not the be the best word to describe a person – in the SQL world it describes the idea of either fetching all the rows to build up the whole spool when the operator is called (Eager), or populating the spool only as it’s needed (Lazy). Window Spools are both physical and logical. They’re eager on a per-window basis, but lazy between windows. And when is it needed? The way I see it, spools are needed for two reasons. 1 – When data is going to be needed AGAIN. 2 – When data needs to be kept away from the original source. If you’re someone that writes long stored procedures, you are probably quite aware of the second scenario. I see plenty of stored procedures being written this way – where the query writer populates a temporary table, so that they can make updates to it without risking the original table. SQL does this too. Imagine I’m updating my contact list, and some of my changes move data to later in the book. If I’m not careful, I might update the same row a second time (or even enter an infinite loop, updating it over and over). A spool can make sure that I don’t, by using a copy of the data. This problem is known as the Halloween Effect (not because it’s spooky, but because it was discovered in late October one year). As I’m sure you can imagine, the kind of spool you’d need to protect against the Halloween Effect would be eager, because if you’re only handling one row at a time, then you’re not providing the protection... An eager spool will block the flow of data, waiting until it has fetched all the data before serving it up to the operator that called it. In the query below I’m forcing the Query Optimizer to use an index which would be upset if the Name column values got changed, and we see that before any data is fetched, a spool is created to load the data into. This doesn’t stop the index being maintained, but it does mean that the index is protected from the changes that are being done. There are plenty of times, though, when you need data repeatedly. Consider the query I put above. A simple join, but then counting the number of rows that came through. The way that this has executed (be it ideal or not), is to ask that a Table Spool be populated. That’s the Table Spool operator on the top row. That spool can produce the same set of rows repeatedly. This is the behaviour that we see in the bottom half of the plan. In the bottom half of the plan, we see that the a join is being done between the rows that are being sourced from the spool – one being aggregated and one not – producing the columns that we need for the query. Table v Index When considering whether to use a Table Spool or an Index Spool, the question that the Query Optimizer needs to answer is whether there is sufficient benefit to storing the data in a b-tree. The idea of having data in indexes is great, but of course there is a cost to maintaining them. Here we’re creating a temporary structure for data, and there is a cost associated with populating each row into its correct position according to a b-tree, as opposed to simply adding it to the end of the list of rows in a heap. Using a b-tree could even result in page-splits as the b-tree is populated, so there had better be a reason to use that kind of structure. That all depends on how the data is going to be used in other parts of the plan. If you’ve ever thought that you could use a temporary index for a particular query, well this is it – and the Query Optimizer can do that if it thinks it’s worthwhile. It’s worth noting that just because a Spool is populated using an Index Spool, it can still be fetched using a Table Spool. The details about whether or not a Spool used as a source shows as a Table Spool or an Index Spool is more about whether a Seek predicate is used, rather than on the underlying structure. Recursive CTE I’ve already shown you an example of spooling when the OVER clause is used. You might see them being used whenever you have data that is needed multiple times, and CTEs are quite common here. With the definition of a set of data described in a CTE, if the query writer is leveraging this by referring to the CTE multiple times, and there’s no simplification to be leveraged, a spool could theoretically be used to avoid reapplying the CTE’s logic. Annoyingly, this doesn’t happen. Consider this query, which really looks like it’s using the same data twice. I’m creating a set of data (which is completely deterministic, by the way), and then joining it back to itself. There seems to be no reason why it shouldn’t use a spool for the set described by the CTE, but it doesn’t. On the other hand, if we don’t pull as many columns back, we might see a very different plan. You see, CTEs, like all sub-queries, are simplified out to figure out the best way of executing the whole query. My example is somewhat contrived, and although there are plenty of cases when it’s nice to give the Query Optimizer hints about how to execute queries, it usually doesn’t do a bad job, even without spooling (and you can always use a temporary table). When recursion is used, though, spooling should be expected. Consider what we’re asking for in a recursive CTE. We’re telling the system to construct a set of data using an initial query, and then use set as a source for another query, piping this back into the same set and back around. It’s very much a spool. The analogy of cotton is long gone here, as the idea of having a continual loop of cotton feeding onto a spool and off again doesn’t quite fit, but that’s what we have here. Data is being fed onto the spool, and getting pulled out a second time when the spool is used as a source. (This query is running on AdventureWorks, which has a ManagerID column in HumanResources.Employee, not AdventureWorks2012) The Index Spool operator is sucking rows into it – lazily. It has to be lazy, because at the start, there’s only one row to be had. However, as rows get populated onto the spool, the Table Spool operator on the right can return rows when asked, ending up with more rows (potentially) getting back onto the spool, ready for the next round. (The Assert operator is merely checking to see if we’ve reached the MAXRECURSION point – it vanishes if you use OPTION (MAXRECURSION 0), which you can try yourself if you like). Spools are useful. Don’t lose sight of that. Every time you use temporary tables or table variables in a stored procedure, you’re essentially doing the same – don’t get upset at the Query Optimizer for doing so, even if you think the spool looks like an expensive part of the query. I hope you’re enjoying this T-SQL Tuesday. Why not head over to my post that is hosting it this month to read about some other plan operators? At some point I’ll write a summary post – once I have you should find a comment below pointing at it. @rob_farley

    Read the article

  • I spoke at SQL Saturday #77 and all I got was this really awesome speaker's shirt!

    - by Most Valuable Yak (Rob Volk)
    Yeah, it was 2 weeks ago, but I'm finally blogging about something! I presented Revenge: The SQL! at SQL Saturday #77 in Pensacola on June 4.  The session abstract is here, and you can download the slides from that page too.  You can see how I look in the speaker's shirt here. Overall it went pretty well.  I discovered a new bit of evil just that morning and in a carefully considered, agonizing decision-making process that was full documented, tested, and approved…nah, I just went ahead and added it at the last minute.  Which worked out even better than (not) planned, since it screwed me up a bit and made my point perfectly.  I had a few fans in the audience, and one of them recorded it for blackmail material posterity. I'd like to thank Karla Landrum (blog | twitter) and all the volunteers for putting together such a great event, and for being kind enough to let me present. (Note to Karla: I'll get the next $100 to you as soon as I can.  Might need a few extra days on the next $100.) Thanks to Audrey (blog | twitter), Peg, and Dorothy for attending and keeping the heckling down.  Thanks also to Aaron (blog | twitter) for providing room and board and also not heckling.  Thanks to Julie (blog | twitter) for coming up with the title for the presentation.  (boo to Julie for getting sick and bailing out on us)  And thanks to all of them for listening to a preview and offering their suggestions and advice! Cross your fingers that I get accepted at SQL Saturday 81 in Birmingham, SQL Saturday 85 in Orlando, or SQL Saturday 89 in Atlanta, or just attend them anyway!

    Read the article

  • T-SQL Tuesday #21 - Crap!

    - by Most Valuable Yak (Rob Volk)
    Adam Machanic's (blog | twitter) ever popular T-SQL Tuesday series is being held on Wednesday this time, and the topic is… SHIT CRAP. No, not fecal material.  But crap code.  Crap SQL.  Crap ideas that you thought were good at the time, or were forced to do due (doo-doo?) to lack of time. The challenge for me is to look back on my SQL Server career and find something that WASN'T crap.  Well, there's a lot that wasn't, but for some reason I don't remember those that well.  So the additional challenge is to pick one particular turd that I really wish I hadn't squeezed out.  Let's see if this outline fits the bill: An ETL process on text files; That had to interface between SQL Server and an AS/400 system; That didn't use SSIS (should have) or BizTalk (ummm, no) but command-line scripting, using Unix utilities(!) via: xp_cmdshell; That had to email reports and financial data, some of it sensitive Yep, the stench smell is coming back to me now, as if it was yesterday… As to why SSIS and BizTalk were not options, basically I didn't know either of them well enough to get the job done (and I still don't).  I also had a strict deadline of 3 days, in addition to all the other responsibilities I had, so no time to learn them.  And seeing how screwed up the rest of the process was: Payment files from multiple vendors in multiple formats; Sent via FTP, PGP encrypted email, or some other wizardry; Manually opened/downloaded and saved to a particular set of folders (couldn't change this); Once processed, had to be placed BACK in the same folders with the original archived; x2 divisions that had to run separately; Plus an additional vendor file in another format on a completely different schedule; So that they could be MANUALLY uploaded into the AS/400 system (couldn't change this either, even if it was technically possible) I didn't feel so bad about the solution I came up with, which was naturally: Copy the payment files to the local SQL Server drives, using xp_cmdshell Run batch files (via xp_cmdshell) to parse the different formats using sed, a Unix utility (this was before Powershell) Use other Unix utilities (join, split, grep, wc) to process parsed files and generate metadata (size, date, checksum, line count) Run sqlcmd to execute a stored procedure that passed the parsed file names so it would bulk load the data to do a comparison bcp the compared data out to ANOTHER text file so that I could grep that data out of the original file Run another stored procedure to import the matched data into SQL Server so it could process the payments, including file metadata Process payment batches and log which division and vendor they belong to Email the payment details to the finance group (since it was too hard for them to run a web report with the same data…which they ran anyway to compare the emailed file against…which always matched, surprisingly) Email another report showing unmatched payments so they could manually void them…about 3 months afterward All in "Excel" format, using xp_sendmail (SQL 2000 system) Copy the unmatched data back to the original folder locations, making sure to match the file format exactly (if you've ever worked with ACH files, you'll understand why this sucked) If you're one of the 10 people who have read my blog before, you know that I love the DOS "for" command.  Like passionately.  Like fairy-tale love.  So my batch files were riddled with for loops, nested within other for loops, that called other batch files containing for loops.  I think there was one section that had 4 or 5 nested for commands.  It was wrong, disturbed, and completely un-maintainable by anyone, even myself.  Months, even a year, after I left the company I got calls from someone who had to make a minor change to it, and they called me to talk them out of spraying the office with an AK-47 after looking at this code.  (for you Star Trek TOS fans) The funniest part of this, well, one of the funniest, is that I made the deadline…sort of, I was only a day late…and the DAMN THING WORKED practically unchanged for 3 years.  Most of the problems came from the manual parts of the overall process, like forgetting to decrypt the files, or missing/late files, or saved to the wrong folders.  I'm definitely not trying to toot my own horn here, because this was truly one of the dumbest, crappiest solutions I ever came up with.  Fortunately as far as I know it's no longer in use and someone has written a proper replacement.  Today I would knuckle down and do it in SSIS or Powershell, even if it took me weeks to get it right. The real lesson from this crap code is to make things MAINTAINABLE and UNDERSTANDABLE.  sed scripting regular expressions doesn't fit that criteria in any way.  If you ever find yourself under pressure to do something fast at all costs, DON'T DO IT.  Stop and consider long-term maintainability, not just for yourself but for others on your team.  If you can't explain the basic approach in under 5 minutes, it ultimately won't succeed.  And while you may love to leave all that crap behind, it may follow you anyway, and you'll step in it again.   P.S. - if you're wondering about all the manual stuff that couldn't be changed, it was because the entire process had gone through Six Sigma, and was deemed the best possible way.  Phew!  Talk about stink!

    Read the article

  • New Replication, Optimizer and High Availability features in MySQL 5.6.5!

    - by Rob Young
    As the Product Manager for the MySQL database it is always great to announce when the MySQL Engineering team delivers another great product release.  As a field DBA and developer it is even better when that release contains improvements and innovation that I know will help those currently using MySQL for apps that range from modest intranet sites to the most highly trafficked web sites on the web.  That said, it is my pleasure to take my hat off to MySQL Engineering for today's release of the MySQL 5.6.5 Development Milestone Release ("DMR"). The new highlighted features in MySQL 5.6.5 are discussed here: New Self-Healing Replication ClustersThe 5.6.5 DMR improves MySQL Replication by adding Global Transaction Ids and automated utilities for self-healing Replication clusters.  Prior to 5.6.5 this has been somewhat of a pain point for MySQL users with most developing custom solutions or looking to costly, complex third-party solutions for these capabilities.  With 5.6.5 these shackles are all but removed by a solution that is included with the GPL version of the database and supporting GPL tools.  You can learn all about the details of the great, problem solving Replication features in MySQL 5.6 in Mat Keep's Developer Zone article.  New Replication Administration and Failover UtilitiesAs mentioned above, the new Replication features, Global Transaction Ids specifically, are now supported by a set of automated GPL utilities that leverage the new GTIDs to provide administration and manual or auto failover to the most up to date slave (that is the default, but user configurable if needed) in the event of a master failure. The new utilities, along with links to Engineering related blogs, are discussed in detail in the DevZone Article noted above. Better Query Optimization and ThroughputThe MySQL Optimizer team continues to amaze with the latest round of improvements in 5.6.5. Along with much refactoring of the legacy code base, the Optimizer team has improved complex query optimization and throughput by adding these functional improvements: Subquery Optimizations - Subqueries are now included in the Optimizer path for runtime optimization.  Better throughput of nested queries enables application developers to simplify and consolidate multiple queries and result sets into a single unit or work. Optimizer now uses CURRENT_TIMESTAMP as default for DATETIME columns - For simplification, this eliminates the need for application developers to assign this value when a column of this type is blank by default. Optimizations for Range based queries - Optimizer now uses ready statistics vs Index based scans for queries with multiple range values. Optimizations for queries using filesort and ORDER BY.  Optimization criteria/decision on execution method is done now at optimization vs parsing stage. Print EXPLAIN in JSON format for hierarchical readability and Enterprise tool consumption. You can learn the details about these new features as well all of the Optimizer based improvements in MySQL 5.6 by following the Optimizer team blog. You can download and try the MySQL 5.6.5 DMR here. (look under "Development Releases")  Please let us know what you think!  The new HA utilities for Replication Administration and Failover are available as part of the MySQL Workbench Community Edition, which you can download here .Also New in MySQL LabsAs has become our tradition when announcing DMRs we also like to provide "Early Access" development features to the MySQL Community via the MySQL Labs.  Today is no exception as we are also releasing the following to Labs for you to download, try and let us know your thoughts on where we need to improve:InnoDB Online OperationsMySQL 5.6 now provides Online ADD Index, FK Drop and Online Column RENAME.  These operations are non-blocking and will continue to evolve in future DMRs.  You can learn the grainy details by following John Russell's blog.InnoDB data access via Memcached API ("NotOnlySQL") - Improved refresh of an earlier feature releaseSimilar to Cluster 7.2, MySQL 5.6 provides direct NotOnlySQL access to InnoDB data via the familiar Memcached API. This provides the ultimate in flexibility for developers who need fast, simple key/value access and complex query support commingled within their applications.Improved Transactional Performance, ScaleThe InnoDB Engineering team has once again under promised and over delivered in the area of improved performance and scale.  These improvements are also included in the aggregated Spring 2012 labs release:InnoDB CPU cache performance improvements for modern, multi-core/CPU systems show great promise with internal tests showing:    2x throughput improvement for read only activity 6x throughput improvement for SELECT range Read/Write benchmarks are in progress More details on the above are available here. You can download all of the above in an aggregated "InnoDB 2012 Spring Labs Release" binary from the MySQL Labs. You can also learn more about these improvements and about related fixes to mysys mutex and hash sort by checking out the InnoDB team blog.MySQL 5.6.5 is another installment in what we believe will be the best release of the MySQL database ever.  It also serves as a shining example of how the MySQL Engineering team at Oracle leads in MySQL innovation.You can get the overall Oracle message on the MySQL 5.6.5 DMR and Early Access labs features here. As always, thanks for your continued support of MySQL, the #1 open source database on the planet!

    Read the article

  • DotNetNuke Boston User Group

    - by Rob Chartier
    Eric, over at the Boston DNN User Group has graciously invited me to give a presentation to his User Group on May 17th.  Come join me for an open discussion on “DotNetNuke – A look inside”.  I will cover topics like how we are adopting the Agile methodologies at a corporate level, how we are best utilizing Scrum, a sneak peek at the roadmap for 2010, and how YOU can participate with the future direction of the product. If you are currently a partner or a customer of DotNetNuke please feel free to attend and reach out, I’m sure Eric would love the extra attendance!  I would love to start putting faces to the names of so many of you.

    Read the article

  • Top 5 Developer Enabling Nuggets in MySQL 5.6

    - by Rob Young
    MySQL 5.6 is truly a better MySQL and reflects Oracle's commitment to the evolution of the most popular and widelyused open source database on the planet.  The feature-complete 5.6 release candidate was announced at MySQL Connect in late September and the production-ready, generally available ("GA") product should be available in early 2013.  While the message around 5.6 has been focused mainly on mass appeal, advanced topics like performance/scale, high availability, and self-healing replication clusters, MySQL 5.6 also provides many developer-friendly nuggets that are designed to enable those who are building the next generation of web-based and embedded applications and services. Boiling down the 5.6 feature set into a smaller set, of simple, easy to use goodies designed with developer agility in mind, these things deserve a quick look:Subquery Optimizations Using semi-JOINs and late materialization, the MySQL 5.6 Optimizer delivers greatly improved subquery performance. Specifically, the optimizer is now more efficient in handling subqueries in the FROM clause; materialization of subqueries in the FROM clause is now postponed until their contents are needed during execution. Additionally, the optimizer may add an index to derived tables during execution to speed up row retrieval. Internal tests run using the DBT-3 benchmark Query #13, shown below, demonstrate an order of magnitude improvement in execution times (from days to seconds) over previous versions. select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)from customer, orders, lineitemwhere o_orderkey in (                select l_orderkey                from lineitem                group by l_orderkey                having sum(l_quantity) > 313  )  and c_custkey = o_custkey  and o_orderkey = l_orderkeygroup by c_name, c_custkey, o_orderkey, o_orderdate, o_totalpriceorder by o_totalprice desc, o_orderdateLIMIT 100;What does this mean for developers?  For starters, simplified subqueries can now be coded instead of complex joins for cross table lookups: SELECT title FROM film WHERE film_id IN (SELECT film_id FROM film_actor GROUP BY film_id HAVING count(*) > 12); And even more importantly subqueries embedded in packaged applications no longer need to be re-written into joins.  This is good news for both ISVs and their customers who have access to the underlying queries and who have spent development cycles writing, testing and maintaining their own versions of re-written queries across updated versions of a packaged app.The details are in the MySQL 5.6 docs. Online DDL OperationsToday's web-based applications are designed to rapidly evolve and adapt to meet business and revenue-generationrequirements. As a result, development SLAs are now most often measured in minutes vs days or weeks. For example, when an application must quickly support new product lines or new products within existing product lines, the backend database schema must adapt in kind, and most commonly while the application remains available for normal business operations.  MySQL 5.6 supports this level of online schema flexibility and agility by providing the following new ALTER TABLE online DDL syntax additions:  CREATE INDEX DROP INDEX Change AUTO_INCREMENT value for a column ADD/DROP FOREIGN KEY Rename COLUMN Change ROW FORMAT, KEY_BLOCK_SIZE for a table Change COLUMN NULL, NOT_NULL Add, drop, reorder COLUMN Again, the details are in the MySQL 5.6 docs. Key-value access to InnoDB via Memcached APIMany of the next generation of web, cloud, social and mobile applications require fast operations against simple Key/Value pairs. At the same time, they must retain the ability to run complex queries against the same data, as well as ensure the data is protected with ACID guarantees. With the new NoSQL API for InnoDB, developers have allthe benefits of a transactional RDBMS, coupled with the performance capabilities of Key/Value store.MySQL 5.6 provides simple, key-value interaction with InnoDB data via the familiar Memcached API.  Implemented via a new Memcached daemon plug-in to mysqld, the new Memcached protocol is mapped directly to the native InnoDB API and enables developers to use existing Memcached clients to bypass the expense of query parsing and go directly to InnoDB data for lookups and transactional compliant updates.  The API makes it possible to re-use standard Memcached libraries and clients, while extending Memcached functionality by integrating a persistent, crash-safe, transactional database back-end.  The implementation is shown here:So does this option provide a performance benefit over SQL?  Internal performance benchmarks using a customized Java application and test harness show some very promising results with a 9X improvement in overall throughput for SET/INSERT operations:You can follow the InnoDB team blog for the methodology, implementation and internal test cases that generated these results here. How to get started with Memcached API to InnoDB is here. New Instrumentation in Performance SchemaThe MySQL Performance Schema was introduced in MySQL 5.5 and is designed to provide point in time metrics for key performance indicators.  MySQL 5.6 improves the Performance Schema in answer to the most common DBA and Developer problems.  New instrumentations include: Statements/Stages What are my most resource intensive queries? Where do they spend time? Table/Index I/O, Table Locks Which application tables/indexes cause the most load or contention? Users/Hosts/Accounts Which application users, hosts, accounts are consuming the most resources? Network I/O What is the network load like? How long do sessions idle? Summaries Aggregated statistics grouped by statement, thread, user, host, account or object. The MySQL 5.6 Performance Schema is now enabled by default in the my.cnf file with optimized and auto-tune settings that minimize overhead (< 5%, but mileage will vary), so using the Performance Schema ona production server to monitor the most common application use cases is less of an issue.  In addition, new atomic levels of instrumentation enable the capture of granular levels of resource consumption by users, hosts, accounts, applications, etc. for billing and chargeback purposes in cloud computing environments.The MySQL docs are an excellent resource for all that is available and that can be done with the 5.6 Performance Schema. Better Condition Handling - GET DIAGNOSTICSMySQL 5.6 enables developers to easily check for error conditions and code for exceptions by introducing the new MySQL Diagnostics Area and corresponding GET DIAGNOSTICS interface command. The Diagnostic Area can be populated via multiple options and provides 2 kinds of information:Statement - which provides affected row count and number of conditions that occurredCondition - which provides error codes and messages for all conditions that were returned by a previous operation The addressable items for each are: The new GET DIAGNOSTICS command provides a standard interface into the Diagnostics Area and can be used via the CLI or from within application code to easily retrieve and handle the results of the most recent statement execution.  An example of how it is used might be:mysql> DROP TABLE test.no_such_table; ERROR 1051 (42S02): Unknown table 'test.no_such_table' mysql> GET DIAGNOSTICS CONDITION 1 -> @p1 = RETURNED_SQLSTATE, @p2 = MESSAGE_TEXT; mysql> SELECT @p1, @p2; +-------+------------------------------------+| @p1   | @p2                                | +-------+------------------------------------+| 42S02 | Unknown table 'test.no_such_table' | +-------+------------------------------------+ Options for leveraging the MySQL Diagnotics Area and GET DIAGNOSTICS are detailed in the MySQL Docs.While the above is a summary of some of the key developer enabling 5.6 features, it is by no means exhaustive. You can dig deeper into what MySQL 5.6 has to offer by reading this developer zone article or checking out "What's New in MySQL 5.6" in the MySQL docs.BONUS ALERT!  If you are developing on Windows or are considering MySQL as an alternative to SQL Server for your next project, application or shipping product, you should check out the MySQL Installer for Windows.  The installer includes the MySQL 5.6 RC database, all drivers, Visual Studio and Excel plugins, tray monitor and development tools all a single download and GUI installer.   So what are your next steps? Register for Dec. 13 "MySQL 5.6: Building the Next Generation of Web-Based Applications and Services" live web event.  Hurry!  Seats are limited. Download the MySQL 5.6 Release Candidate (look under the Development Releases tab) Provide Feedback <link to http://bugs.mysql.com/> Join the Developer discussion on the MySQL Forums Explore all MySQL Products and Developer Tools As always, thanks for your continued support of MySQL!

    Read the article

  • Database Mirroring on SQL Server Express Edition

    - by Most Valuable Yak (Rob Volk)
    Like most SQL Server users I'm rather frustrated by Microsoft's insistence on making the really cool features only available in Enterprise Edition.  And it really doesn't help that they changed the licensing for SQL 2012 to be core-based, so now it's like 4 times as expensive!  It almost makes you want to go with Oracle.  That, and a desire to have Larry Ellison do things to your orifices. And since they've introduced Availability Groups, and marked database mirroring as deprecated, you'd think they'd make make mirroring available in all editions.  Alas…they don't…officially anyway.  Thanks to my constant poking around in places I'm not "supposed" to, I've discovered the low-level code that implements database mirroring, and found that it's available in all editions! It turns out that the query processor in all SQL Server editions prepends a simple check before every edition-specific DDL statement: IF CAST(SERVERPROPERTY('Edition') as nvarchar(max)) NOT LIKE '%e%e%e% Edition%' print 'Lame' else print 'Cool' If that statement returns true, it fails. (the print statements are just placeholders)  Go ahead and test it on Standard, Workgroup, and Express editions compared to an Enterprise or Developer edition instance (which support everything). Once again thanks to Argenis Fernandez (b | t) and his awesome sessions on using Sysinternals, I was able to watch the exact process SQL Server performs when setting up a mirror.  Surprisingly, it's not actually implemented in SQL Server!  Some of it is, but that's something of a smokescreen, the real meat of it is simple filesystem primitives. The NTFS filesystem supports links, both hard links and symbolic, so that you can create two entries for the same file in different directories and/or different names.  You can create them using the MKLINK command in a command prompt: mklink /D D:\SkyDrive\Data D:\Data mklink /D D:\SkyDrive\Log D:\Log This creates a symbolic link from my data and log folders to my Skydrive folder.  Any file saved in either location will instantly appear in the other.  And since my Skydrive will be automatically synchronized with the cloud, any changes I make will be copied instantly (depending on my internet bandwidth of course). So what does this have to do with database mirroring?  Well, it seems that the mirroring endpoint that you have to create between mirror and principal servers is really nothing more than a Skydrive link.  Although it doesn't actually use Skydrive, it performs the same function.  So in effect, the following statement: ALTER DATABASE Mir SET PARTNER='TCP://MyOtherServer.domain.com:5022' Is turned into: mklink /D "D:\Data" "\\MyOtherServer.domain.com\5022$" The 5022$ "port" is actually a hidden system directory on the principal and mirror servers. I haven't quite figured out how the log files are included in this, or why you have to SET PARTNER on both principal and mirror servers, except maybe that mklink has to do something special when linking across servers.  I couldn't get the above statement to work correctly, but found that doing mklink to a local Skydrive folder gave me similar functionality. To wrap this up, all you have to do is the following: Install Skydrive on both SQL Servers (principal and mirror) and set the local Skydrive folder (D:\SkyDrive in these examples) On the principal server, run mklink /D on the data and log folders to point to SkyDrive: mklink /D D:\SkyDrive\Data D:\Data On the mirror server, run the complementary linking: mklink /D D:\Data D:\SkyDrive\Data Create your database and make sure the files map to the principal data and log folders (D:\Data and D:\Log) Viola! Your databases are kept in sync on multiple servers! One wrinkle you will encounter is that the mirror server will show the data and log files, but you won't be able to attach them to the mirror SQL instance while they are attached to the principal. I think this is a bug in the Skydrive, but as it turns out that's fine: you can't access a mirror while it's hosted on the principal either.  So you don't quite get automatic failover, but you can attach the files to the mirror if the principal goes offline.  It's also not exactly synchronous, but it's better than nothing, and easier than either replication or log shipping with a lot less latency. I will end this with the obvious "not supported by Microsoft" and "Don't do this in production without an updated resume" spiel that you should by now assume with every one of my blog posts, especially considering the date.

    Read the article

  • MySQL Policy-Based Auditing Webinar Recording Now Availabile

    - by Rob Young
    For those who missed the live event, the recording of the "How to Add Policy-Based Auditing to your MySQL Applications" webinar is now available.  You can view it here. This presentation builds on my earlier blog post on MySQL Enterprise Audit that was announced at MySQL Connect in late September.  The web presentation expands on the introductory blog and covers: The regulatory problem to be solved (internal audit, PCI, Sarbanes-Oxley, HIPAA, others) MySQL Audit solutions for both Community and Enterprise users: General Log - use the basic features of the MySQL server MySQL 5.5 open audit API - or use your time and talent to build your own solution MySQL Enterprise Audit - or use the out of the box, ready for production solution from MySQL Simple, step-by-step process for installing, enabling and configuring the MySQL Enterprise Audit plugin for use with existing apps New variables and options for tuning the MySQL Enterprise Audit plugin for your specific use case Best practices for securing and managing audit log files and archived images Roadmap for adding an integrated solution around MySQL Enterprise Audit for MySQL only and Oracle/MySQL shops You can learn all the technical details on MySQL Enterprise Audit in the MySQL docs and learn all about MySQL Enterprise Edition and Auditing here. As always, thanks for your support of MySQL!

    Read the article

  • Google Fonts API JSON Data in WordPress Options-Framework-Theme

    - by Rob
    I'm developing a child-theme off of the new Twenty Twelve theme using Wordpress 3.4.2 and the development version of the Options Theme Framework by Devin Price. In Devin's tutorial, it shows of a way to implement 15 Google Web Fonts into the Theme Options page, but not all of them (roughly 560). I know I can create a "manual list", like in the tutorial that states each one with fallbacks, but this is time consuming and unproductive as Google may or may not add to, update, change or remove some of these fonts from their list. The list I've created above will ultimately store unavailable fonts the user thinks is there because of what they can see in the drop-down menu and it won't have any new ones - making the list and some selections obsolete. On the Google Developer API Web Fonts page, it talks briefly on retrieving a "dynamic list" using JSON/JavaScript. I was wondering how would I be able to pull the Google Web Fonts API into my Wordpress Theme Options page so I'm not creating my own list or have to constantly release an update to solve this issue. Could someone please walk me through what I would need to paste into my options.php, functions.php, /inc/options-framework.php file etc. or even in a new one to implement this? I've also had a look into some screencasts, plugins and tutorials on how it works, but none of them are specific enough for people just starting out. Please keep in mind I'm not the best coder... Thank you.

    Read the article

  • More BI Showcase Events - Greensboro, NC & Tampa, FL

    - by Rob Reynolds
    As the momentum around OBIEE 11g continues, we are providing more opportunities to get a hands on view of the new technology via our Oracle Business Intelligence Showcases. Next week we will have Showcases in Greensboro, NC and Tampa, FL. I will be presenting at both, so please stop by and say hello, while learning about the latest in Oracle BI & DW technology. Pre-registration is required. You can register for the events at the links below: Greensboro, NC - Tuesday December 7, 2011 Tampa, FL - Wednesday, December 8, 2011 Session Agenda: Agenda 9:00 a.m. – 10:00 a.m. Registration and Welcome 10:00 a.m. – 11:00 a.m. Session Keynote: Oracle’s New Generation of Business Intelligence Solutions and Innovations 11:00 a.m. – 12:00 noon Session 1 Track 1Oracle Business Intelligence Enterprise Edition 11g: End User Experience Track 2Management Reporting with Oracle Essbase 12:00 noon – 1:00 p.m. Networking Lunch 1:00 p.m. – 2:00 p.m. Session 2 Track 1Oracle Business Intelligence Enterprise Edition 11g for Power Users, Developers, and Administrators Track 2Oracle BI Applications: The Value of Cross-Functional BI Break to change rooms 2:00 p.m.– 3:00 p.m. Session 3 Track 1 Extreme Performance Data Warehousing Track 2Master Data Management: The Single Source of Truth for Real Time Decisions 3:15 p.m. Wrap-Up and Raffle Prize

    Read the article

  • CSS variable height columns [migrated]

    - by Rob
    I have created a website, www.unionfamilies.com. I have a header, a main section consisting of two columns, and a footer. Currently, I have specified heights for header, main, footer, etc. I would like to make the site so the header stays on top, the columns adjust to match the height of the longest column, and the footer stays at the bottom. Can someone help me with this? I am new to CSS, so please be patient. Thank you.

    Read the article

  • 24 Hours of PASS: Whine, Whine, Whine

    - by Most Valuable Yak (Rob Volk)
    24 Hours of Pass (or 24HOP) is a great program offered by PASS to provide free, online training for anyone who wants to learn more about SQL Server.  They routinely have the best SQL Server presenters available for these sessions, and attract hundreds, perhaps even a thousand attendees from around the world.  This is definitely one of the best things they've started doing in the past few years, and every session I've attended has been excellent. So why am I so grumpy about it? I'm not really, pretty much everything here is a minor annoyance that I can deal with.  However since they're so minor they seem to be things that can be easily corrected and would make the process much better. First off, this is my biggest gripe, the registration page: https://www323.livemeeting.com/lrs/8000181573/Registration.aspx?pageName=lj6378f4fhf5hpdm What grinds my gears about this?  I have to scroll alllllllllllllllllllllllllllll the way to bottom to actually register for the sessions.  This wouldn't be so bad except all the details of the session, including the presenter, is in a separate list at the top.  Both lists contain info the other does not, and scrolling between them to determine "Should I make time to listen to this?  Who is speaking at this time anyway?" is really unnecessary. My preference would be to keep the top list and add the checkboxes and schedule info in separate columns.  This is a full-width design, so there's plenty of space for this data, which is pretty small anyway.  The other huge benefit is halving the size of the page, which improves performance and lowers bandwidth usage considerably.  And if you know HTML/ASP.Net, and you view the page source, you can find PLENTY of other things that can be reduced even further.  (not just viewstate) One nice thing that PASS does is send iCal reminders to your email address so you can accept them to your calendar.  Again, they leave off the presenter in the appointment details, while still duplicating the meeting title in the body.  Sometimes I make decisions based on speaker rather than content (Natalie Portman is reading the Yellow Pages??? I'M THERE!) and having the speaker in the iCal is helpful. Next minor annoyances are the necessity for providing a company name, and the survey questions.  I know PASS needs to market themselves effectively, and they need information to do that, and since this is a free event it's really not worth complaining about, but why ask the survey question twice? (once at registration, once again when joining the LiveMeeting)  Same thing for the company name.  All of this should be tied to email address, so that's all I should need to enter when joining the LiveMeeting. The last one is also minor, but it irks me in this day and age of multiple browsers and the decline of Internet Explorer as a dominant platform.  The registration page was originally created in Visual Studio 2003, and has a lot of IE-specific crud representative of the browser situation of 2003. (IE5 references? really? and is the aforementioned viewstate big enough?)  This causes some grief with other browsers like Firefox, Chrome, and sometimes IE8 or 9.  And don't get me started on using the page on a Mac or in Safari. My main point is that PASS is an international organization, welcoming everyone from all levels of SQL Server proficiency, and in that spirit I think it would help to accommodate a wider range of browser software, especially since the registration page is extremely simple.  I recognize that this page is not hosted on the PASS website and may be maintained by some division of Microsoft, but to me that's even worse if MS can't update their own pages.  They've deprecated IE6, so they don't need to maintain support on their own websites anymore. OK, I'll shut up now. #sqlpass #24HOP

    Read the article

  • Thank You MySQL Community! MySQL 5.6.9 Release Candidate Available Now!

    - by Rob Young
    The MySQL Community continues its good work in testing and refining MySQL 5.6, and as such the next iteration of the 5.6 Release Candidate is now available for download.  You can get MySQL 5.6.9 here (look under the "Development Releases" tab).  This version is the result of feedback we have gotten since MySQL 5.6.7 was announced at MySQL Connect in late September. As iron sharpens iron, Community feedback sharpens the quality and performance of MySQL so please download 5.6.9 and let us know how we can improve it as we move toward the production-ready product release in early 2013. MySQL 5.6 is designed to meet the agility demands of the next generation of web apps and services and includes across the board improvements to the Optimizer, InnoDB performance/scale and online DDL operations, self-healing Replication, Performance Schema Instrumentation, Security and developer enabling NoSQL functionality.  You can learn all the details and follow MySQL Engineering blogs on all of the key features in this MySQL DevZone article. On a related note, plan to join this week's live webinars to learn more about MySQL 5.6 Self-Healing Replication Clusters and Building the Next Generation of Web, Cloud, SaaS, Embedded Application and Services with MySQL 5.6.  Hurry!  Seating is limited!  As always, thanks for your continued support of MySQL!

    Read the article

< Previous Page | 5 6 7 8 9 10 11 12 13 14 15 16  | Next Page >