Search Results

Search found 637 results on 26 pages for 'tony wolfram'.

Page 3/26 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

Monitoring the Application alongside SQL Server

- by Tony Davis

Sometimes, on Simple-Talk, it takes a while to spot strange and unexpected patterns of user activity, or small bugs. For example, one morning we spotted that an article’s comment count had leapt to 1485, but that only four were displayed. With some rooting around in Google Analytics, and the endlessly annoying Community Server admin-interface, we were able to work out that a few days previously the article had been subject to a spam attack and that the comment count was for some reason including both accepted and unaccepted comments (which in turn uncovered a bug in the SQL). This sort of incident made us a lot keener on monitoring Simple-talk website usage more effectively. However, the metrics we wanted are troublesome, because they are far too specific for Google Analytics to measure, and the SQL Server backend doesn’t keep sufficient information to enable us to plot trends. The latter could provide, for example, the total number of comments made on, or votes cast for, articles, over all time, but not the number that occur by hour over a set time. We lacked a baseline, in other words. We couldn’t alter the database, as it is a bought-in package. We had neither the resources nor inclination to build-in dedicated application monitoring. Possibly, we could investigate a third-party tool to do the job; but then it occurred to us that we were already using a monitoring tool (SQL Monitor) to keep an eye on the database. It stored data, made graphs and sent alerts. Could we get it to monitor some aspects of the application as well? Of course, SQL Monitor’s single purpose is to check and monitor SQL Server, over time, rather than to monitor applications that use SQL Server. However, how different is the business of gathering and plotting SQL Server Wait Stats, from gathering and plotting various aspects of user activity on the site? Not a lot, it turns out. The latest version allows us to write our own custom monitoring scripts, meaning that we could now monitor any metric in the application that returns an integer. It took little time to write a simple SQL Query that collects basic metrics of the total number of subscribers, votes cast, comments made, or views of articles, over time. The SQL Monitor database polls Simple-Talk every second or so in order to get the latest totals, and can then store and plot this information, or even correlate SQL Server usage to application usage. You can see the live data by visiting monitor.red-gate.com. Click the "Analysis" tab, and select one of the "Simple-talk:" entries in the "Show" box and an appropriate data range (e.g. last 30 days). It’s nascent, and we’re still working on it, but it’s already given us more confidence that we’ll spot quickly trends, bugs, or bursts of ‘abnormal’ activity. If there is a sudden rise in comments, we get an alert, and if it’s due to a spam attack, we can moderate or ban the perpetrator very quickly. We’ve often argued that a tool should perform a single job well rather than turn into a Swiss-army knife, but ironically we’ve rather appreciated being able to make best use of what’s there anyway for a slightly different purpose. Is this a good or common practice? What do you think? Cheers, Tony.

Read the article
From NaN to Infinity...and Beyond!

- by Tony Davis

It is hard to believe that it was once possible to corrupt a SQL Server Database by storing perfectly normal data values into a table; but it is true. In SQL Server 2000 and before, one could inadvertently load invalid data values into certain data types via RPC calls or bulk insert methods rather than DML. In the particular case of the FLOAT data type, this meant that common 'special values' for this type, namely NaN (not-a-number) and +/- infinity, could be quite happily plugged into the database from an application and stored as 'out-of-range' values. This was like a time-bomb. When one then tried to query this data; the values were unsupported and so data pages containing them were flagged as being corrupt. Any query that needed to read a column containing the special value could fail or return unpredictable results. Microsoft even had to issue a hotfix to deal with failures in the automatic recovery process, caused by the presence of these NaN values, which rendered the whole database inaccessible! This problem is history for those of us on more current versions of SQL Server, but its ghost still haunts us. Recently, for example, a developer on Red Gate’s SQL Response team reported a strange problem when attempting to load historical monitoring data into a SQL Server 2005 database via the C# ADO.NET provider. The ratios used in some of their reporting calculations occasionally threw out NaN or infinity values, and the subsequent attempts to load these values resulted in a nasty error. It turns out to be a different manifestation of the same problem. SQL Server 2005 still does not fully support the IEEE 754 standard for floating point numbers, in that the FLOAT data type still cannot handle NaN or infinity values. Instead, they just added validation checks that prevent the 'invalid' values from being loaded in the first place. For people migrating from SQL Server 2000 databases that contained out-of-range FLOAT (or DATETIME etc.) data, to SQL Server 2005, Microsoft have added to the latter's version of the DBCC CHECKDB (or CHECKTABLE) command a DATA_PURITY clause. When enabled, this will seek out the corrupt data, but won’t fix it. You have to do this yourself in what can often be a slow, painful manual process. Our development team, after a quizzical shrug of the shoulders, simply decided to represent NaN and infinity values as NULL, and move on, accepting the minor inconvenience of not being able to tell them apart. However, what of scientific, engineering and other applications that really would like the luxury of being able to both store and access these perfectly-reasonable floating point data values? The sticking point seems to be the stipulation in the IEEE 754 standard that, when NaN is compared to any other value including itself, the answer is "unequal" (i.e. FALSE). This is clearly different from normal number comparisons and has repercussions for such things as indexing operations. Even so, this hardly applies to infinity values, which are single definite values. In fact, there is some encouraging talk in the Connect note on this issue that they might be supported 'in the SQL Server 2008 timeframe'. If didn't happen; SQL 2008 doesn't support NaN or infinity values, though one could be forgiven for thinking otherwise, based on the MSDN documentation for the FLOAT type, which states that "The behavior of float and real follows the IEEE 754 specification on approximate numeric data types". However, the truth is revealed in the XPath documentation, which states that "…float (53) is not exactly IEEE 754. For example, neither NaN (Not-a-Number) nor infinity is used…". Is it really so hard to fix this problem the right way, and properly support in SQL Server the IEEE 754 standard for the floating point data type, NaNs, infinities and all? Oracle seems to have managed it quite nicely with its BINARY_FLOAT and BINARY_DOUBLE types, so it is technically possible. We have an enterprise-class database that is marketed as being part of an 'integrated' Windows platform. Absurdly, we have .NET and XPath libraries that fully support the standard for floating point numbers, and we can't even properly store these values, let alone query them, in the SQL Server database! Cheers, Tony.

Read the article
A temporary disagreement

- by Tony Davis

Last month, Phil Factor caused a furore amongst some MVPs with an article that attempted to offer simple advice to developers regarding the use of table variables, versus local and global temporary tables, in their code. Phil makes clear that the table variables do come with some fairly major limitations.no distribution statistics, no parallel query plans for queries that modify table variables.but goes on to suggest that for reasonably small-scale strategic uses, and with a bit of due care and testing, table variables are a "good thing". Not everyone shares his opinion; in fact, I imagine he was rather aghast to learn that there were those felt his article was akin to pulling the pin out of a grenade and tossing it into the database; table variables should be avoided in almost all cases, according to their advice, in favour of temp tables. In other words, a fairly major feature of SQL Server should be more-or-less 'off limits' to developers. The problem with temp tables is that, because they are scoped either in the procedure or the connection, it is easy to allow them to hang around for too long, eating up precious memory and bulking up the shared tempdb database. Unless they are explicitly dropped, global temporary tables, and local temporary tables created within a connection rather than within a stored procedure, will persist until the connection is closed or, with connection pooling, until the connection is reused. It's also quite common with ASP.NET applications to have connection leaks, as Bill Vaughn explains in his chapter in the "SQL Server Deep Dives" book, meaning that the web page exits without closing the connection object, maybe due to an error condition. This will then hang around in the heap for what might be hours before picked up by the garbage collector. Table variables are much safer in this regard, since they are batch-scoped and so are cleaned up automatically once the batch is complete, which also means that they are intuitive to use for the developer because they conform to scoping rules that are closer to those in procedural code. On the surface then, an ideal way to deal with issues related to tempdb memory hogging. So why did Phil qualify his recommendation to use Table Variables? This is another of those cases where, like scalar UDFs and table-valued multi-statement UDFs, developers can sometimes get into trouble with a relatively benign-looking feature, due to way it's been implemented in SQL Server. Once again the biggest problem is how they are handled internally, by the SQL Server query optimizer, which can make very poor choices for JOIN orders and so on, in the absence of statistics, especially when joining to tables with highly-skewed data. The resulting execution plans can be horrible, as will be the resulting performance. If the JOIN is to a large table, that will hurt. Ideally, Microsoft would simply fix this issue so that developers can't get burned in this way; they've been around since SQL Server 2000, so Microsoft has had a bit of time to get it right. As I commented in regard to UDFs, when developers discover issues like with such standard features, the database becomes an alien planet to them, where death lurks around each corner, and they continue to avoid these "killer" features years after the problems have been eventually resolved. In the meantime, what is the right approach? Is it to say "hammers can kill, don't ever use hammers", or is it to try to explain, as Phil's article and follow-up blog post have tried to do, what the feature was intended for, why care must be applied in its use, and so enable developers to make properly-informed decisions, without requiring them to delve deep into the inner workings of SQL Server? Cheers, Tony.

Read the article
Inappropriate Updates?

- by Tony Davis

A recent Simple-talk article by Kathi Kellenberger dissected the fastest SQL solution, submitted by Peter Larsson as part of Phil Factor's SQL Speed Phreak challenge, to the classic "running total" problem. In its analysis of the code, the article re-ignited a heated debate regarding the techniques that should, and should not, be deemed acceptable in your search for fast SQL code. Peter's code for running total calculation uses a variation of a somewhat contentious technique, sometimes referred to as a "quirky update": SET @Subscribers = Subscribers = @Subscribers + PeopleJoined - PeopleLeft This form of the UPDATE statement, @variable = column = expression, is documented and it allows you to set a variable to the value returned by the expression. Microsoft does not guarantee the order in which rows are updated in this technique because, in relational theory, a table doesn’t have a natural order to its rows and the UPDATE statement has no means of specifying the order. Traditionally, in cases where a specific order is requires, such as for running aggregate calculations, programmers who used the technique have relied on the fact that the UPDATE statement, without the WHERE clause, is executed in the order imposed by the clustered index, or in heap order, if there isn’t one. Peter wasn’t satisfied with this, and so used the ingenious device of assuring the order of the UPDATE by the use of an "ordered CTE", based on an underlying temporary staging table (a heap). However, in either case, the ordering is still not guaranteed and, in addition, would be broken under conditions of parallelism, or partitioning. Many argue, with validity, that this reliance on a given order where none can ever be guaranteed is an abuse of basic relational principles, and so is a bad practice; perhaps even irresponsible. More importantly, Microsoft doesn't wish to support the technique and offers no guarantee that it will always work. If you put it into production and it breaks in a later version, you can't file a bug. As such, many believe that the technique should never be tolerated in a production system, under any circumstances. Is this attitude justified? After all, both forms of the technique, using a clustered index to guarantee the order or using an ordered CTE, have been tested rigorously and are proven to be robust; although not guaranteed by Microsoft, the ordering is reliable, provided none of the conditions that are known to break it are violated. In Peter's particular case, the technique is being applied to a temporary table, where the developer has full control of the data ordering, and indexing, and knows that the table will never be subject to parallelism or partitioning. It might be argued that, in such circumstances, the technique is not really "quirky" at all and to ban it from your systems would server no real purpose other than to deprive yourself of a reliable technique that has uses that extend well beyond the running total calculations. Of course, it is doubly important that such a technique, including its unsupported status and the assumptions that underpin its success, is fully and clearly documented, preferably even when posting it online in a competition or forum post. Ultimately, however, this technique has been available to programmers throughout the time Sybase and SQL Server has existed, and so cannot be lightly cast aside, even if one sympathises with Microsoft for the awkwardness of maintaining an archaic way of doing updates. After all, a Table hint could easily be devised that, if specified in the WITH (<Table_Hint_Limited>) clause, could be used to request the database engine to do the update in the conventional order. Then perhaps everyone would be satisfied. Cheers, Tony.

Read the article
Access Denied

- by Tony Davis

When Microsoft executives wake up in the night screaming, I suspect they are having a nightmare about their own version of Frankenstein's monster. Created with the best of intentions, without thinking too hard of the long-term strategy, and having long outlived its usefulness, the monster still lives on, occasionally wreaking vengeance on the innocent. Its name is Access; a living synthesis of disparate body parts that is resistant to all attempts at a mercy-killing. In 1986, Microsoft had no database products, and needed one for their new OS/2 operating system, the successor to MSDOS. In 1986, they bought exclusive rights to Sybase DataServer, and were also intent on developing a desktop database to capture Ashton-Tate's dominance of that market, with dbase. This project, first called 'Omega' and later 'Cirrus', eventually spawned two products: Visual Basic in 1991 and Access in late 1992. Whereas Visual Basic battled with PowerBuilder for dominance in the client-server market, Access easily won the desktop database battle, with Dbase III and DataEase falling away. Access did an excellent job of abstracting and simplifying the task of building small database applications in a short amount of time, for a small number of departmental users, and often for a transient requirement. There is an excellent front end and forms generator. We not only see it in Access but parts of it also reappear in SSMS. It's good. A business user can pull together useful reports, without relying on extensive technical support. A skilled Access programmer can deliver a fairly sophisticated application, whilst the traditional client-server programmer is still sharpening his pencil. Even for the SQL Server programmer, the forms generator of Access is useful for sketching out application designs. So far, so good, but here's where the problems start; Access ties together two different products and the backend of Access is the bugbear. The limitations of Jet/ACE are well-known and documented. They range from MDB files that are prone to corruption, especially as they grow in size, pathetic security, and "copy and paste" Backups. The biggest problem though, was an infamous lack of scalability. Because Microsoft never realized how long the product would last, they put little energy into improving the beast. Microsoft 'ate their own dog food' by using Access for Microsoft Exchange and Outlook. They choked on it. For years, scalability and performance problems with Exchange Server have been laid at the door of the Jet Blue engine on which it relies. Substantial development work in Exchange 2010 was required, just in order to improve the engine and storage schema so that it more efficiently handled the reading and writing of mails. The alternative of using SQL Server just never panned out. The Jet engine was designed to limit concurrent users to a small number (10-20). When Access applications outgrew this, bitter experience proved that there really is no easy upgrade path from Access to SQL Server, beyond rewriting the whole lot from scratch. The various initiatives to do this never quite bridged the cultural gulf between Access and a true relational database So, what are the obvious alternatives for small, strategic database applications? I know many users who, for simple 'list maintenance' requirements are very happy using Excel databases. Surely, now that PowerPivot has led the way, it is time for Microsoft to offer a new RAD package for database application development; namely an Excel-based front end for SQL Server Express. In that way, we'll have a powerful and familiar front end, to a scalable database, and a clear upgrade path when an app takes off and needs to go enterprise. Cheers, Tony.

Read the article
DAC pack up all your troubles

- by Tony Davis

Visual Studio 2010, or perhaps its apparently-forthcoming sister, "SQL Studio", is being geared up to become the natural way for developers to create databases. Central to this drive is the introduction of 'data-tier application components', or DACs. Applications are developed as normal but when it comes to deployment, instead of supplying the DBA with a bunch of scripts to create the required database objects, the developer creates a single DAC Package ("DAC Pack"); a zipped XML file containing all the database objects needed by the application, along with versioning information, policies for deployment, and so on. It's an intriguing prospect. Developers can work on their development database using their existing tools and source control, and then package up the changes into a single DACPAC for deployment and management. DBAs get an "application level view" of how their instances are being used and the ability to collectively, rather than individually, manage the objects. The DBA needing to manage a large number of relatively small databases can use "DAC snapshots" to get a quick overview of what has changed across all the databases they manage. The reason that DAC packs haven't caused more excitement is that they can only be pushed to SQL Server 2008 R2, and they must be developed or inspected using Visual Studio 2010. Furthermore, what we see right now in VS2010 is more of a 'work-in-progress' or 'vision of the future', with serious shortcomings and restrictions that render it unsuitable for anything but small 'non-critical' departmental databases. The first problem is that DAC packs support a limited set of schema objects (corresponding closely to the features available on 'Azure'). This means that Service Broker queues, CLR Objects, and perhaps most critically security (permissions, certificates etc.), are off-limits. Applications that require these objects will need to add them via a post-deployment TSQL script, rather defeating the whole idea. More worrying still is the process for altering a database with a DAC pack. The grand 'collective' philosophy, whereby a single XML file can be used for deploying and managing builds and changes, extends, unfortunately, to database upgrades. Any change to a database object will result in the creation of a new database, copying the data from the old version, nuking the previous one, and then renaming the new one. Simple eh? The problem is that even something as trivial as adding a comment to a stored procedure in a 5GB database will require the server to find at least twice as much space, as well sufficient elbow-room in the transaction log for copying the largest table. Of course, you'll need to take the database offline for the full course of the deployment, which is likely to take a long time if there is a lot of data. This upgrade/rename process breaks the log chain, makes any subsequent full restore operation highly complicated, and will also break log shipping. As with any grand vision, the devil is always in the detail. It's hard to fathom why Microsoft hasn't used a SQL Compare-style approach to the upgrade process, altering a database with a change script, and this will surely be adopted in the near future. Something had to be in place for VS2010, but right now DAC packs only make sense for Azure. For this, they're cute, but hardly compelling. Nevertheless, DBAs would do well to get familiar with VS 2010 and DAC packs. Like it or not, they're both coming. Cheers, Tony.

Read the article
Sweet and Sour Source Control

- by Tony Davis

Most database developers don't use Source Control. A recent anonymous poll on SQL Server Central asked its readers "Which Version Control system do you currently use to store you database scripts?" The winner, with almost 30% of the vote was...none: "We don't use source control for database scripts". In second place with almost 28% of the vote was Microsoft's VSS. VSS? Given its reputation for being buggy, unstable and lacking most of the basic features required of a proper source control system, answering VSS is really just another way of saying "I don't use Source Control". At first glance, it's a surprising thought. You wonder how database developers can work in a team and find out what changed, when the system worked before but is now broken; to work out what happened to their changes that now seem to have vanished; to roll-back a mistake quickly so that the rest of the team have a functioning build; to find instantly whether a suspect change has been deployed to production. Unfortunately, the survey didn't ask about the scale of the database development, and correlate the two questions. If there is only one database developer within a schema, who has an automated approach to regular generation of build scripts, then the need for a formal source control system is questionable. After all, a database stores far more about its metadata than a traditional compiled application. However, what is meat for a small development is poison for a team-based development. Here, we need a form of Source Control that can reconcile simultaneous changes, store the history of changes, derive versions and builds and that can cope with forks and merges. The problem comes when one borrows a solution that was designed for conventional programming. A database is not thought of as a "file", but a vast, interdependent and intricate matrix of tables, indexes, constraints, triggers, enumerations, static data and so on, all subtly interconnected. It is an awkward fit. Subversion with its support for merges and forks, and the tolerance of different work practices, can be made to work well, if used carefully. It has a standards-based architecture that allows it to be used on all platforms such as Windows Mac, and Linux. In the words of Erland Sommerskog, developers should "just do it". What's in a database is akin to a "binary file", and the developer must work only from the file. You check out the file, edit it, and save it to disk to compile it. Dependencies are validated at this point and if you've broken anything (e.g. you renamed a column and broke all the objects that reference the column), you'll find out about it right away, and you'll be forced to fix it. Nevertheless, for many this is an alien way of working with SQL Server. Subversion is the powerhouse, not the GUI. It doesn't work seamlessly with your existing IDE, and that usually means SSMS. So the question then becomes more subtle. Would developers be less reluctant to use a fully-featured source (revision) control system for a team database development if they had a turn-key, reliable system that fitted in with their existing work-practices? I'd love to hear what you think. Cheers, Tony.

Read the article
Cloud Backup: Getting the Users' Backs Up

- by Tony Davis

On Wednesday last week, Microsoft announced that as of July 1, all data transfers into its Microsoft Azure cloud will be free (though you have to pay for transferring data out). On Thursday last week, SQL Azure in Western Europe went down. It was a relatively short outage, but since SQL Azure currently provides no easy way to take a standard backup of a database and store it locally, many people had no recourse but to wait patiently for their cloud-based app to resume. It seems that Microsoft are very keen encourage developers to move their data onto their cloud, but are developers ready to do it, given that such basic backup capabilities are lacking? Recently on Simple-Talk, Mike Mooney described a perfect use case for the Microsoft Cloud. They had a simple web-based application with a SQL Server backend; they could move the application to Windows Azure, and the data into SQL Azure and in the process free themselves from much of the hassle surrounding management and scaling of the hardware, network and so on. It was a great fit and yet it nearly didn't happen; lack of support for the BACKUP command almost proved a show-stopper. Of course, backups of Azure databases are always and have always been taken automatically, for disaster recovery purposes, but these are strictly on-cloud copies and as of now it is not possible to use them to them to restore a database to a particular point in time. It seems that none of those clever Microsoft people managed to predict the need to perform basic backups of Azure databases so that copies could be stored locally, outside the Azure universe. At the very least, as Mike points out, performing a local backup before a new deployment is more or less mandatory. Microsoft did at least note the sound of gnashing teeth and, as a stop-gap measure, offered SQL Azure Database Copy which basically allows you to create an online clone of your database, but this doesn't allow for storing local archives of the data. To that end MS has provided SQL Azure Import/Export, to package up and export a database and its data, using BACPACs. These BACPACs do not guarantee transactional consistency; for example, if a child table is modified after the parent is copied, then the copied database will be in inconsistent state (meaning, to add to the fun, BACPACs need to be created from a database copy). In any event, widespread problems with BACPAC's evil cousin, the DACPAC have been well-documented, and it seems likely that many will also give BACPAC the bum's rush. Finally, in a TechEd 2011 presentation tagged "SQL Azure Advanced Administration", it was announced that "backup and restore" were coming in the next SQL Azure CTP. And yet this still doesn't mean that we'll get simple backups as DBAs know and love them. What it does mean, at least, is the ability to restore any given database to a point in time within a 2-week window. For the time being, if you want a local copy of your data and don't want to brave the BACPAC, one is left with SSIS or BCP, creative use of schema and data comparison tools, or use of SQL Azure Backup (currently in beta) in order to perform this simple but vital task. Cheers, Tony.

Read the article
SQL Server Optimizer Malfunction?

- by Tony Davis

There was a sharp intake of breath from the audience when Adam Machanic declared the SQL Server optimizer to be essentially "stuck in 1997". It was during his fascinating "Query Tuning Mastery: Manhandling Parallelism" session at the recent PASS SQL Summit. Paraphrasing somewhat, Adam (blog | @AdamMachanic) offered a convincing argument that the optimizer often delivers flawed plans based on assumptions that are no longer valid with today’s hardware. In 1997, when Microsoft engineers re-designed the database engine for SQL Server 7.0, SQL Server got its initial implementation of a cost-based optimizer. Up to SQL Server 2000, the developer often had to deploy a steady stream of hints in SQL statements to combat the occasionally wilful plan choices made by the optimizer. However, with each successive release, the optimizer has evolved and improved in its decision-making. It is still prone to the occasional stumble when we tackle difficult problems, join large numbers of tables, perform complex aggregations, and so on, but for most of us, most of the time, the optimizer purrs along efficiently in the background. Adam, however, challenged further any assumption that the current optimizer is competent at providing the most efficient plans for our more complex analytical queries, and in particular of offering up correctly parallelized plans. He painted a picture of a present where complex analytical queries have become ever more prevalent; where disk IO is ever faster so that reads from disk come into buffer cache faster than ever; where the improving RAM-to-data ratio means that we have a better chance of finding our data in cache. Most importantly, we have more CPUs at our disposal than ever before. To get these queries to perform, we not only need to have the right indexes, but also to be able to split the data up into subsets and spread its processing evenly across all these available CPUs. Improvements such as support for ColumnStore indexes are taking things in the right direction, but, unfortunately, deficiencies in the current Optimizer mean that SQL Server is yet to be able to exploit properly all those extra CPUs. Adam’s contention was that the current optimizer uses essentially the same costing model for many of its core operations as it did back in the days of SQL Server 7, based on assumptions that are no longer valid. One example he gave was a "slow disk" bias that may have been valid back in 1997 but certainly is not on modern disk systems. Essentially, the optimizer assesses the relative cost of serial versus parallel plans based on the assumption that there is no IO cost benefit from parallelization, only CPU. It assumes that a single request will saturate the IO channel, and so a query would not run any faster if we parallelized IO because the disk system simply wouldn’t be able to handle the extra pressure. As such, the optimizer often decides that a serial plan is lower cost, often in cases where a parallel plan would improve performance dramatically. It was challenging and thought provoking stuff, as were his techniques for driving parallelism through query logic based on subsets of rows that define the "grain" of the query. I highly recommend you catch the session if you missed it. I’m interested to hear though, when and how often people feel the force of the optimizer’s shortcomings. Barring mistakes, such as stale statistics, how often do you feel the Optimizer fails to find the plan you think it should, and what are the most common causes? Is it fighting to induce it toward parallelism? Combating unexpected plans, arising from table partitioning? Something altogether more prosaic? Cheers, Tony.

Read the article
Head in the Clouds

- by Tony Davis

We're just past the second anniversary of the launch of Windows Azure. A couple of years' experience with Azure in the industry has provided some obvious success stories, but has deflated some of the initial marketing hyperbole. As a general principle, Azure seems to work well in providing a Service-Oriented Architecture for services in enterprises that suffer wide fluctuations in demand. Instead of being obliged to provide hardware sufficient for the occasional peaks in demand, one can hire capacity only when it is needed, and the cost of hosting an application is no longer a capital cost. It enables companies to avoid having to scale out hardware for peak periods only to see it underused for the rest of the time. A customer-facing application such as a concert ticketing system, which suffers high demand in short, predictable bursts of activity, is a great example of an application that would work well in Azure. However, moving existing applications to Azure isn't something to be done on impulse. Unless your application is .NET-based, and consists of 'stateless' components that communicate via queues, you are probably in for a lot of redevelopment work. It makes most sense for IT departments who are already deep in this .NET mindset, and who also want 'grown-up' methods of staging, testing, and deployment. Azure fits well with this culture and offers, as a bonus, good Visual Studio integration. The most-commonly stated barrier to porting these applications to Azure is the problem of reconciling the use of the cloud with legislation for data privacy and security. Putting databases in the cloud is a sticky issue for many and impossible for some due to compliance and security issues, the need for direct control over data, and so on. In the face of feedback from the early adopters of Azure, Microsoft has broadened the architectural choices to cater for a wide range of requirements. As well as SQL Azure Database (SAD) and Azure storage, the unstructured 'BLOB and Entity-Attribute-Value' NoSQL storage alternative (which equates more closely with folders and files than a database), Windows Azure offers a wide range of storage options including use of services such as oData: developers who are programming for Windows Azure can simply choose the one most appropriate for their needs. Secondly, and crucially, the Windows Azure architecture allows you the freedom to produce hybrid applications, where only those parts that need cloud-based hosting are deployed to Azure, whereas those parts that must unavoidably be hosted in a corporate datacenter can stay there. By using a hybrid architecture, it will seldom, if ever, be necessary to move an entire application to the cloud, along with personal and financial data. For example that we could port to Azure only put those parts of our ticketing application that capture and process tickets orders. Once an order is captured, the financial side can be processed in our own data center. In short, Windows Azure seems to be a very effective way of providing services that are subject to wide but predictable fluctuations in demand. Have you come to the same conclusions, or do you think I've got it wrong? If you've had experience with Azure, would you recommend it? It would be great to hear from you. Cheers, Tony.

Read the article
Going by the eBook

- by Tony Davis

The book and magazine publishing world is rapidly going digital, and the industry is faced with making drastic changes to their ways of doing business. The sudden take-up of digital readers by the book-buying public has surprised even the most technological-savvy of the industry. Printed books just aren't selling like they did. In contrast, eBooks are doing well. The ePub file format is the standard around which all publishers are converging. ePub is a standard for formatting book content, so that it can be reflowed for various devices, with their widely differing screen-sizes, and can be read offline. If you unzip an ePub file, you'll find familiar formats such as XML, XHTML and CSS. This is both a blessing and a curse. Whilst it is good to be able to use familiar technologies that have been developed to a level of considerable sophistication, it doesn't get us all the way to producing a viable publication. XHTML is a page-description language, not a book-description language, as we soon found out during our initial experiments, when trying to specify headers, footers, indexes and chaptering. As a result, it is difficult to predict how any particular eBook application will decide to render a book. There isn't even a consensus as to how the cover image is specified. All of this is awkward for the publisher. Each book must be created and revised in a form from which can be generated a whole range of 'printed media', from print books, to Mobi for kindles, ePub for most Tablets and SmartPhones, HTML for excerpted chapters on websites, and a plethora of other formats for other eBook readers, each with its own idiosyncrasies. In theory, if we can get our content into a clean, semantic XML form, such as DOCBOOKS, we can, from there, after every revision, perform a series of relatively simple XSLT transformations to output anything from a HTML article, to an ePub file for reading on an iPad, to an ICML file (an XML-based file format supported by the InDesign tool), ready for print publication. As always, however, the task looks bigger the closer you get to the detail. On the way to the utopian world of an XML-based book format that encompasses all the diverse requirements of the different publication media, ePub looks like a reasonable format to adopt. Its forthcoming support for HTML 5 and CSS 3, with ePub 3.0, means that features, such as widow-and-orphan controls, multi-column flow and multi-media graphics can be incorporated into eBooks. This starts to make it possible to build an "app-like" experience into the eBook and to free publishers to think of putting context before container; to think of what content is required, be it graphical, textual or audio, from the point of view of the user, rather than what's possible in a given, traditional book "Container". In the meantime, there is a gap between what publishers require and what current technology can provide and, of course building this app-like experience is far from plain sailing. Real portability between devices is still a big challenge, and achieving the sort of wizardry seen in the likes of Theodore Grey's "Elements" eBook will require some serious device-specific programming skills. Cheers, Tony.

Read the article
The long road to bug-free software

- by Tony Davis

The past decade has seen a burgeoning interest in functional programming languages such as Haskell or, in the Microsoft world, F#. Though still on the periphery of mainstream programming, functional programming concepts are gradually seeping into the imperative C# language (for example, Lambda expressions have their root in functional programming). One of the more interesting concepts from functional programming languages is the use of formal methods, the lofty ideal behind which is bug-free software. The idea is that we write a specification that describes exactly how our function (say) should behave. We then prove that our function conforms to it, and in doing so have proved beyond any doubt that it is free from bugs. All programmers already use one form of specification, specifically their programming language's type system. If a value has a specific type then, in a type-safe language, the compiler guarantees that value cannot be an instance of a different type. Many extensions to existing type systems, such as generics in Java and .NET, extend the range of programs that can be type-checked. Unfortunately, type systems can only prevent some bugs. To take a classic problem of retrieving an index value from an array, since the type system doesn't specify the length of the array, the compiler has no way of knowing that a request for the "value of index 4" from an array of only two elements is "unsafe". We restore safety via exception handling, but the ideal type system will prevent us from doing anything that is unsafe in the first place and this is where we start to borrow ideas from a language such as Haskell, with its concept of "dependent types". If the type of an array includes its length, we can ensure that any index accesses into the array are valid. The problem is that we now need to carry around the length of arrays and the values of indices throughout our code so that it can be type-checked. In general, writing the specification to prove a positive property, even for a problem very amenable to specification, such as a simple sorting algorithm, turns out to be very hard and the specification will be different for every program. Extend this to writing a specification for, say, Microsoft Word and we can see that the specification would end up being no simpler, and therefore no less buggy, than the implementation. Fortunately, it is easier to write a specification that proves that a program doesn't have certain, specific and undesirable properties, such as infinite loops or accesses to the wrong bit of memory. If we can write the specifications to prove that a program is immune to such problems, we could reuse them in many places. The problem is the lack of specification "provers" that can do this without a lot of manual intervention (i.e. hints from the programmer). All this might feel a very long way off, but computing power and our understanding of the theory of "provers" advances quickly, and Microsoft is doing some of it already. Via their Terminator research project they have started to prove that their device drivers will always terminate, and in so doing have suddenly eliminated a vast range of possible bugs. This is a huge step forward from saying, "we've tested it lots and it seems fine". What do you think? What might be good targets for specification and verification? SQL could be one: the cost of a bug in SQL Server is quite high given how many important systems rely on it, so there's a good incentive to eliminate bugs, even at high initial cost. [Many thanks to Mike Williamson for guidance and useful conversations during the writing of this piece] Cheers, Tony.

Read the article
Concurrent Affairs

- by Tony Davis

I once wrote an editorial, multi-core mania, on the conundrum of ever-increasing numbers of processor cores, but without the concurrent programming techniques to get anywhere near exploiting their performance potential. I came to the.controversial.conclusion that, while the problem loomed for all procedural languages, it was not a big issue for the vast majority of programmers. Two years later, I still think most programmers don't concern themselves overly with this issue, but I do think that's a bigger problem than I originally implied. Firstly, is the performance boost from writing code that can fully exploit all available cores worth the cost of the additional programming complexity? Right now, with quad-core processors that, at best, can make our programs four times faster, the answer is still no for many applications. But what happens in a few years, as the number of cores grows to 100 or even 1000? At this point, it becomes very hard to ignore the potential gains from exploiting concurrency. Possibly, I was optimistic to assume that, by the time we have 100-core processors, and most applications really needed to exploit them, some technology would be around to allow us to do so with relative ease. The ideal solution would be one that allows programmers to forget about the problem, in much the same way that garbage collection removed the need to worry too much about memory allocation. From all I can find on the topic, though, there is only a remote likelihood that we'll ever have a compiler that takes a program written in a single-threaded style and "auto-magically" converts it into an efficient, correct, multi-threaded program. At the same time, it seems clear that what is currently the most common solution, multi-threaded programming with shared memory, is unsustainable. As soon as a piece of state can be changed by a different thread of execution, the potential number of execution paths through your program grows exponentially with the number of threads. If you have two threads, each executing n instructions, then there are 2^n possible "interleavings" of those instructions. Of course, many of those interleavings will have identical behavior, but several won't. Not only does this make understanding how a program works an order of magnitude harder, but it will also result in irreproducible, non-deterministic, bugs. And of course, the problem will be many times worse when you have a hundred or a thousand threads. So what is the answer? All of the possible alternatives require a change in the way we write programs and, currently, seem to be plagued by performance issues. Software transactional memory (STM) applies the ideas of database transactions, and optimistic concurrency control, to memory. However, working out how to break down your program into sufficiently small transactions, so as to avoid contention issues, isn't easy. Another approach is concurrency with actors, where instead of having threads share memory, each thread runs in complete isolation, and communicates with others by passing messages. It simplifies concurrent programs but still has performance issues, if the threads need to operate on the same large piece of data. There are doubtless other possible solutions that I haven't mentioned, and I would love to know to what extent you, as a developer, are considering the problem of multi-core concurrency, what solution you currently favor, and why. Cheers, Tony.

Read the article
The long road to bug-free software

- by Tony Davis

The past decade has seen a burgeoning interest in functional programming languages such as Haskell or, in the Microsoft world, F#. Though still on the periphery of mainstream programming, functional programming concepts are gradually seeping into the imperative C# language (for example, Lambda expressions have their root in functional programming). One of the more interesting concepts from functional programming languages is the use of formal methods, the lofty ideal behind which is bug-free software. The idea is that we write a specification that describes exactly how our function (say) should behave. We then prove that our function conforms to it, and in doing so have proved beyond any doubt that it is free from bugs. All programmers already use one form of specification, specifically their programming language's type system. If a value has a specific type then, in a type-safe language, the compiler guarantees that value cannot be an instance of a different type. Many extensions to existing type systems, such as generics in Java and .NET, extend the range of programs that can be type-checked. Unfortunately, type systems can only prevent some bugs. To take a classic problem of retrieving an index value from an array, since the type system doesn't specify the length of the array, the compiler has no way of knowing that a request for the "value of index 4" from an array of only two elements is "unsafe". We restore safety via exception handling, but the ideal type system will prevent us from doing anything that is unsafe in the first place and this is where we start to borrow ideas from a language such as Haskell, with its concept of "dependent types". If the type of an array includes its length, we can ensure that any index accesses into the array are valid. The problem is that we now need to carry around the length of arrays and the values of indices throughout our code so that it can be type-checked. In general, writing the specification to prove a positive property, even for a problem very amenable to specification, such as a simple sorting algorithm, turns out to be very hard and the specification will be different for every program. Extend this to writing a specification for, say, Microsoft Word and we can see that the specification would end up being no simpler, and therefore no less buggy, than the implementation. Fortunately, it is easier to write a specification that proves that a program doesn't have certain, specific and undesirable properties, such as infinite loops or accesses to the wrong bit of memory. If we can write the specifications to prove that a program is immune to such problems, we could reuse them in many places. The problem is the lack of specification "provers" that can do this without a lot of manual intervention (i.e. hints from the programmer). All this might feel a very long way off, but computing power and our understanding of the theory of "provers" advances quickly, and Microsoft is doing some of it already. Via their Terminator research project they have started to prove that their device drivers will always terminate, and in so doing have suddenly eliminated a vast range of possible bugs. This is a huge step forward from saying, "we've tested it lots and it seems fine". What do you think? What might be good targets for specification and verification? SQL could be one: the cost of a bug in SQL Server is quite high given how many important systems rely on it, so there's a good incentive to eliminate bugs, even at high initial cost. [Many thanks to Mike Williamson for guidance and useful conversations during the writing of this piece] Cheers, Tony.

Read the article
The clock hands of the buffer cache

- by Tony Davis

Over a leisurely beer at our local pub, the Waggon and Horses, Phil Factor was holding forth on the esoteric, but strangely poetic, language of SQL Server internals, riddled as it is with 'sleeping threads', 'stolen pages', and 'memory sweeps'. Generally, I remain immune to any twinge of interest in the bowels of SQL Server, reasoning that there are certain things that I don't and shouldn't need to know about SQL Server in order to use it successfully. Suddenly, however, my attention was grabbed by his mention of the 'clock hands of the buffer cache'. Back at the office, I succumbed to a moment of weakness and opened up Google. He wasn't lying. SQL Server maintains various memory buffers, or caches. For example, the plan cache stores recently-used execution plans. The data cache in the buffer pool stores frequently-used pages, ensuring that they may be read from memory rather than via expensive physical disk reads. These memory stores are classic LRU (Least Recently Updated) buffers, meaning that, for example, the least frequently used pages in the data cache become candidates for eviction (after first writing the page to disk if it has changed since being read into the cache). SQL Server clearly needs some mechanism to track which pages are candidates for being cleared out of a given cache, when it is getting too large, and it is this mechanism that is somewhat more labyrinthine than I previously imagined. Each page that is loaded into the cache has a counter, a miniature "wristwatch", which records how recently it was last used. This wristwatch gets reset to "present time", each time a page gets updated and then as the page 'ages' it clicks down towards zero, at which point the page can be removed from the cache. But what is SQL Server is suffering memory pressure and urgently needs to free up more space than is represented by zero-counter pages (or plans etc.)? This is where our 'clock hands' come in. Each cache has associated with it a "memory clock". Like most conventional clocks, it has two hands; one "external" clock hand, and one "internal". Slava Oks is very particular in stressing that these names have "nothing to do with the equivalent types of memory pressure". He's right, but the names do, in that peculiar Microsoft tradition, seem designed to confuse. The hands do relate to memory pressure; the cache "eviction policy" is determined by both global and local memory pressures on SQL Server. The "external" clock hand responds to global memory pressure, in other words pressure on SQL Server to reduce the size of its memory caches as a whole. Global memory pressure – which just to confuse things further seems sometimes to be referred to as physical memory pressure – can be either external (from the OS) or internal (from the process itself, e.g. due to limited virtual address space). The internal clock hand responds to local memory pressure, in other words the need to reduce the size of a single, specific cache. So, for example, if a particular cache, such as the plan cache, reaches a defined "pressure limit" the internal clock hand will start to turn and a memory sweep will be performed on that cache in order to remove plans from the memory store. During each sweep of the hands, the usage counter on the cache entry is reduced in value, effectively moving its "last used" time to further in the past (in effect, setting back the wrist watch on the page a couple of hours) and increasing the likelihood that it can be aged out of the cache. There is even a special Dynamic Management View, sys.dm_os_memory_cache_clock_hands, which allows you to interrogate the passage of the clock hands. Frequently turning hands equates to excessive memory pressure, which will lead to performance problems. Two hours later, I emerged from this rather frightening journey into the heart of SQL Server memory management, fascinated but still unsure if I'd learned anything that I'd put to any practical use. However, I certainly began to agree that there is something almost Tolkeinian in the language of the deep recesses of SQL Server. Cheers, Tony.

Read the article
Dell Inspirion 1525 64bit driver compatibility

- by Tony

Hello, Can anyone tell me what the compatibility of drivers on 64bit vista are like on the Dell Inspiron 1525. Mine is just a year old now and I want to upgrade to 64bit vista. I wondered if there's known issues with drivers that I should be aware of before upgrading? Tony PS: is the upgrade worth it? My machine has 3GB of RAM and runs a T5800 2.0GHz DuoCore Processor. I use Visual Studio often and have loads of things open at the same time a lot. That's why I'm considering an upgrade.

Read the article
Problem installing a w2k DC on Hyper-V?

- by Tony

Hi, We have a cluster with four node windows 2008 r2 and hyper-v installed. We would like to install 2 VM with role domain controller w2k (the domain is different from the domain of the hyper-v cluster). Do you know if there are any restriction on doing it? Some collegues say that we risk data corruption if we do live migrations. Others speak about the fact that Microsoft don't support w2k any more. And others have doubts because the global catalog server installed on these DC could have loss of performance. Any idea? Thanks Tony

Read the article
Solution to: Hotmail Senders receiving NDR : “550-Please turn on SMTP Authentication in your mail client…”

- by Tony Yustein

Original question is here original question I can not answer to that question because the system requires me to have 10 credits, very nice.... This error is based mostly on mobile devices, mostly on iPhones and mostly on mobile networks. This is how much I have narrowed it to. I believe: Hotmail checks where your are connecting from If it is a mobile network it requires additional security for sending messages but the default iPhone config does not have this option for hotmail if the user creates the hotmail account on the iPhone with SMTP AUTH enabled manually it might solve the situation Cheers, Tony

Read the article
Where do all the old programmers go?

- by Tony Lambert

I know some people move over to management and some die... but where do the rest go and why? One reason people change to management is that in some companies the "Programmer" career path is very short - you can get to be a senior programmer within a few years. Leaving no way to get more money but to become a manager. In other companies project managers and programmers are parallel career paths so your project manager can be your junior. Tony

Read the article
2 Parent BLOGs in DNN 5.x?

- by Tony

Hi there, I have created 2 parent BLOGs in DNN. BLOG x has 3 children BLOGs and BLOG y has 5 children. When I click Add BLOG entry, only BLOG x & is children come up in the dropdown list, BLOG y is missing. Anyone know why? Even if I go to the BLOG y and click Add BLOG entry, its missing. Many thanks, Tony.

Read the article
How to implment SSO with Joomla!?

- by Tony Woo

Hi Friends, I would like to use the Jommla! as our applications' portal, when one user logined joomla!, he/she don't need to enter password in our another application? anyone has any idear on how to implement this? or you have any web site which I can reference. Thanks in advance. Tony

Read the article
Anybody knows any knowledge base open source?

- by Tony

I want to build a web-based knowledge base system for our call center. To save some development time, I am looking for a open source. Does anybody know any good one out there? Thanks in advance, Tony

Read the article
Java Server Page Tutorials in pdf form

- by tony-kurshingal

Hi all, I am new to jsp.I have a basic knowledge about Core java and i need to get expertise in jsp too.So please give me a jsp tutorial for beginers Thanks Tony

Read the article
MSI creation with Visual Studio 2008 - how to create a subdirectory?

- by tony

Hello, I am using Visual Studio 2008 to create an MSI installer. I currently have all the files installed in the main directory. Is there a way to select some files to go in a subdirectory? I can find an option in Visual Studio to do that. Thanks Tony

Read the article
Updating version of multiple components using 1 file in c#

- by tony

Hello, I have a solution with many components and each component has its own version file (AssemblyInfo.cs). Currently each file as the standard structure: [assembly: AssemblyVersion("1.0.0.0")] [assembly: AssemblyFileVersion("1.0.0.0")] I would like to use the file version as the component version and keep that independent but I would like to have the "Product Version" be the same for all the components. I would like to do that by having a file at the solution level that is used by all the projects to get that version embedded. Any suggestions Thanks Tony

Read the article

Search Results

Search found 637 results on 26 pages for 'tony wolfram'.

Page 3/26 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by Tony Davis

- by Tony Davis

- by Tony Davis

- by Tony Davis

- by Tony Davis

- by Tony Davis

- by Tony Davis

- by Tony Davis

- by Tony Davis

- by Tony Davis

- by Tony Davis

- by Tony Davis

- by Tony Davis

- by Tony Davis

- by Tony Davis

- by Tony

- by Tony

- by Tony Yustein

- by Tony Lambert

- by Tony

- by Tony Woo

- by Tony

- by tony-kurshingal

- by tony

- by tony

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >