Search Results

Search found 11913 results on 477 pages for 'fail fast fail early'.

  • Using FlashCache on /var on startup

    - by WinkyWolly
    I would like to have FlashCache cache my /var partition, however I can't seem to get it to play nice at bootup (i.e., I'm not really sure how to do it). I'm not sure whether I need to modify the initramfs / use DKMS, or whether I can do it in user-land during bootup. The issue I'm running into is that /var mounts early and therefore the device is busy (generally held open by syslogd). I'm positive this can be resolved by modifying the initramfs, although I simply haven't fiddled with it enough to get it working. There are instructions on how to set this up for the root partition, however I'm not sure whether those instructions apply to my use case. Any help / pointers in the right direction would be absolutely splendid.

    Read the article

  • Mac PPC_7100 ROM for SheepShaver?

    - by Good_quess
    Hello everybody, I still own a PPC 7100/80av that I bought in early 1995. The Mac is still running fine, but I am not using it often these days. I recently came across SheepShaver and would like to bring my old games and apps over to the emulator. For SheepShaver I would like to use a copy of my 7100's ROM. Did anybody succeed in saving a working copy of this machine's ROM? I tried CopyRom and GetRom, but all I ever saved was a 4MB file good for nothing. I am just wondering what I am missing. Any ideas would be very welcome. Thank you!

    Read the article

  • Using TrueCrypt (software encryption) with an SSD

    - by Shackrock
    I use full drive encryption (FDE) with TrueCrypt on my laptop. I have a 2nd-gen i7 with AES instruction support, so honestly I can't even notice a speed change on the system with it on. My question is for those who know a lot about SSDs. I previously (early 2011) read articles claiming that software encryption negates the speed benefits an SSD provides, because the SSD has to issue a delete command and then a write command for every encrypted write, instead of just overwriting data the way a regular HDD would (or something like that... honestly I can't remember... ha!). Anyway, have there been any improvements in this area? Is it pointless for me to grab an SSD if I'm using FDE? Thanks all.

    Read the article

  • Git Clone from SSH Repository

    - by Mike Silvis
    I used to be able to clone from my personal git repository, but now I seem to be running into an error:

        user:dev.site.com mikesilvis$ git clone { my ssh directory }
        server@ipaddress's password:
        remote: Counting objects: 3622, done.
        remote: Compressing objects: 100% (2718/2718), done.
        error: git upload-pack: git-pack-objects died with error.
        fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
        remote: aborting due to possible repository corruption on the remote side.
        fatal: early EOF
        fatal: index-pack failed

    Pushing files to the repository still seems to work, however.
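
    A minimal sketch of the kind of checks worth running on the remote side when git reports possible repository corruption, assuming shell access to the bare repository (the path below is hypothetical):

        # on the remote server, inside the bare repository (hypothetical path)
        cd /home/server/repos/site.git
        git fsck --full          # verify object connectivity and validity
        git count-objects -v     # report loose/packed object counts and any garbage
        git gc                   # repack objects into a fresh pack

    If git fsck does report broken objects, the usual fallback is to re-create the bare repository and push again from a known-good local clone.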

    Read the article

  • Full computer freeze with audio stuttering after playing games for a period of time

    - by Wes
    I've been having a problem with my computer freezing completely when playing games like LA Noire or SW:TOR (yay early access!). Basically, what happens is I will play for around an hour or so (depending on the game) and when the freeze happens, the entire computer locks up and any audio that was being played glitches out and stutters broken-record style (only much shorter. Very techno). I think it might be heat related and thought it might be my video card overheating, so I have been setting my video card (Nvidia Geforce 260GTX 216-core) fan to highest setting, but that has little to no effect. Now I'm beginning to think it's either my FSB or CPU overheating. Can anyone provide some insight or similar experiences? I'm really at a loss and don't wanna damage my rig beyond repair.

    Read the article

  • Experience with Intel X25-M 160GB and Oracle

    - by derobert
    We're considering building an Oracle database with 12 Intel X25-M G2 160GB drives in software RAID10. It'd be running Linux. Database gets some very heavy write activity during the early morning data load, other than that it is mostly read-only (and the read load is fairly minimal). We're currently running on 11 150GB Velociraptors (also Linux software RAID10), and are hoping the X25-M will speed up the data load. We currently have redo on different disks than the rest of the data. I'm wondering a few things: Any experience with using X25-M drives for databases? The X25-E are unfortunately beyond our budget. Would it hurt to separate redo off to some magnetic (non-SSD) drives, say 2 (raid1) or 4 (raid10) Seagate Constellations?

    Read the article

  • Debian Squeeze - Monitor outgoing traffic

    - by Sam W.
    I have a small webserver running Lighttpd 1.4 which has steadily used 250GB or less of bandwidth for the past couple of months. But since May the traffic has spiked to more than triple what it was, and nothing special happened on my site to make it spike like that. When I checked with vnstat I found that 70% of the bandwidth is tx. I suspect I've been hacked and my webserver is becoming some sort of bot. ClamAV comes up with nothing, and I already replaced the Joomla installation with a fresh one early in June, but the traffic has stayed the same. My question: how can I monitor my server and see what is transmitting all that data out? What needs to be done to pinpoint the culprit? Can someone please point me in the right direction? Thank you.
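
    A quick sketch of the usual tools for seeing which connections and processes are pushing data out, assuming eth0 is the public interface (iftop and nethogs are in the Debian repositories):

        apt-get install iftop nethogs
        iftop -i eth0        # live view of connections, ranked by bandwidth
        nethogs eth0         # per-process breakdown of traffic on the interface
        ss -tnp              # established TCP sockets with the owning process
        lsof -i -n -P        # open network sockets per process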

    Read the article

  • Random TCP Resets

    - by allenwei
    We randomly get TCP "reset" errors when we send requests to a remote server. Log from the remote server (Cisco):

        TCP Connection Terminated, Nov 05 14:43:39 EST: %ASA-session-6-302014: Teardown TCP connection 640068283 for Outside:xxxx to xxxx duration 0:00:00 bytes 4160 TCP Reset-O

    On my local machine, netstat shows:

        100703 connections reset due to unexpected data
        324186 connections reset due to early user close

    I also used tcpdump to see what's wrong, and saw:

        xxxx.https: Flags [R.], seq 290, ack 1369, win 136, options [nop,nop,TS val 2871790533 ecr 1897173283], length 0

    The problem only started happening today; we didn't change anything on our server. Does anyone know what's wrong? Is it related to the code we wrote to send out the requests, or to the Linux configuration?
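
    One way to narrow this down is to capture only the RST packets and see which side sends them first. A minimal tcpdump sketch, where eth0 and the capture file name are placeholders:

        # capture only packets with the RST flag set, for later inspection
        tcpdump -i eth0 -nn -s0 'tcp[tcpflags] & tcp-rst != 0' -w resets.pcap
        # read the capture back with timestamps and sequence numbers
        tcpdump -nn -r resets.pcap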

    Read the article

  • How does hadoop decide what its nodes hostnames are?

    - by Dan R
    Currently the URLs generated by the JobTracker & NameNode contain either hostnames like bubbles.local or just bubbles. These end up not resolving unless the client machine has them in its /etc/hosts file. When I run the hostname command on these machines it returns a hostname complete with the domain (e.g. bubbles.example.com). Running a small Java test on these machines:

        InetAddress addr = InetAddress.getLocalHost();
        byte[] ipAddr = addr.getAddress();
        String hostname = addr.getHostName();
        System.out.println(hostname);

    produces output just like the hostname command. Where else could Hadoop be grabbing a hostname to use in its JobTracker / NameNode UI? This is occurring in clusters with Hadoop 1.0.3 and a 1.0.4-SNAPSHOT from early August. The machines are running CentOS release 5.8 (Final). The generated URLs I'm referring to look like this: http://example:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=/ or http://example.local:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=/
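
    Discrepancies like bubbles vs. bubbles.example.com often come down to the difference between forward and reverse name resolution on the nodes, so it is worth comparing what the OS, /etc/hosts and DNS each report. A quick sketch (the IP address below is a placeholder):

        hostname                      # name as the OS reports it
        hostname -f                   # fully qualified name
        getent hosts $(hostname)      # what the resolver (hosts file / DNS) returns for that name
        dig +short -x 10.0.0.12       # reverse lookup of the node's IP address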

    Read the article

  • PHP/Oracle Connectivity randomly "drops out"

    - by user20555
    Hi! Here's the current situation: I have two web servers (for now named A and B) and two database servers (named C and D). The web servers are quite old and are running an early version of Apache 2 + PHP4, while the DB servers are running Oracle 9i and 10g respectively. We're experiencing a strange problem connecting (via PHP code) to one of the databases, but only from web server B. Web server A has no issues at all... Randomly, web server B will report a "Not connected to Oracle" error (3114). I can't see a real pattern to it, but refreshing a few times seems to fix the issue. Apparently there are no drop-outs on the network interface, which leads me to believe there's some misconfiguration between PHP/Apache and Oracle (which uses connection pooling). We're running SunOS 5.8... Any ideas?

    Read the article

  • The best dvd ripper software in 2014 review

    - by user328170
    The Top 3 DVD Ripping Tools in 2014

    Nowadays everyone may have several smart mobile devices, such as an iPhone, iPad Air, iPad mini, Samsung Galaxy or Sony Xperia. If you want to take your movies with you on your mobile devices, or sometimes just want to back up those classic physical discs on your notebook or workstation at high quality, you need fast and stable software to rip the discs and convert them to the format you like. Fortunately, there are plenty of products designed to make the process easy and transform a DVD into files that are playable on any mobile device you choose. We have done a full review of dozens of products; here are three of the best, based on that review. We tested each product on its ripping speed, ease of use, reliability and ripping capability.

    The top one is still WinX DVD Ripper Platinum. We tested its 6.1 version two years ago for its ability to quickly and easily rip DVDs and Blu-ray discs to high-quality MKV files with a single click, and it made a deep impression in that test. This time we tested its latest 7.3.5 version. Besides ease of use and speed, we tested its ability to decrypt discs using all kinds of protection schemes, for example Disney X-project DRM, Sony ArccOS, RCE and region codes. The results show that WinX DVD Ripper Platinum still keeps its advantages in all these areas. It is a focused DVD ripper whose basic job is to rip and convert DVDs. The UI has a modern, technical feel: all the main functions are shown prominently, while specialist options are tucked away for advanced users, making it clear and convenient to use. Two companies provide the software, Weisoft Limited and Digiarty; Weisoft Limited focuses on the USA, UK and Australia markets, Digiarty on the others. Ripping speed: 5/5. Ease of use: 5/5. Reliability: 5/5. Ripping capability: 5/5.

    The second one is DVDFab. DVDFab is also very robust when ripping discs and can decrypt most of the discs on the market; its weaker points remain ease of use and speed. We'd note that the app is frequently updated to cut through the copy protection on even the latest DVDs and Blu-ray discs. The app is shareware, meaning most features are free, including decrypting and ripping to your hard drive. Many of you note that you use another app for compression and authoring, but many of you also say: hey, storage is cheap, and the rips from DVDFab are easy, one-click, and they work. Ripping speed: 3/5. Ease of use: 4/5. Reliability: 5/5. Ripping capability: 5/5.

    The third one is HandBrake. HandBrake is our favorite video encoder for a reason: it's simple, easy to use, easy to install, and offers lots of options for getting a high-quality file as a result. If you're scared by the options, you don't even have to use them; the app will compensate and pick settings it thinks you'll like based on your destination device. So many of you like HandBrake that many of you use it in conjunction with another app (like VLC, which makes ripping easy): you let the other app do the rip and crack the DRM on your discs, and then process the file through HandBrake for encoding. The app is fast, can make the most of multi-core processors to speed up the process, and is completely open source. Ripping speed: 3/5. Ease of use: 4/5. Reliability: 4/5. Ripping capability: 4/5.
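
    HandBrake also ships a command-line build; a minimal sketch of an invocation, assuming HandBrakeCLI is installed (the file names and preset are placeholders):

        # encode an already-decrypted source to an H.264 MP4 using a built-in preset
        HandBrakeCLI -i /path/to/source.mkv -o movie.mp4 --preset "Normal"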

    Read the article

  • Control Windows 8 with a tablet

    - by Frantumn
    It seems much of Windows 8 is based on the idea that it will be running on tablets and touch-screen PCs. I like this, but I don't have a W8 tablet or a touch PC yet. I'm running W8 on a laptop, and am wondering if there's any way of using my iPad 2 as a touch interface for Metro? Another option that would be nice is an input device similar to Apple's "Magic Trackpad" that would let me use hand gestures instead of a mouse cursor in Metro apps. I've seen some cool videos of MS Smart Glass, and it would seem that the capabilities are there, but it may just be too early on to do this? I'm not sure.

    Read the article

  • Long term system health monitor

    - by user30336
    As an experienced user, I sometimes notice that things are not going well with my computer. For example, my backup drive recently started cycling up and down, so I guessed it was probably dying, and replaced it. I detected this with my ears. Windows did not seem to notice or care. There ought to be software that monitors overall system health by keeping track of things like this, so that unusual events or increasing error rates will not be shrugged off. Among other things: disk errors that are recovered, corrupt network packets (at above the baseline expected rate) and crashes of trusted programs are early warnings. Is there any software that tries to use this kind of monitoring to warn of impending trouble?
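
    One concrete piece of this puzzle is drive health: smartmontools (available for Windows and Linux) can surface recovered disk errors long before a drive fails outright. A minimal sketch; the Linux-style device name is a placeholder, adjust it for your platform:

        smartctl -H /dev/sda    # overall health self-assessment reported by the drive
        smartctl -A /dev/sda    # attribute table: reallocated sectors, pending sectors, etc.
        # smartd, from the same package, can run in the background and send
        # warnings when attributes start degrading (see smartd.conf)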

    Read the article

  • Administrators: People interaction and Working hours?

    - by sanksjaya
    Do admins get to meet a lot of new people on a daily basis at work? And what kinds of people do they interact with? Secondly, I've long held this image that, unlike programmers, network/system/security admins get locked up in a den and juiced up late nights and early mornings, and most of the time have to slip out of work without being noticed. How true is that, and how often does it happen to you? I understand there is no single answer to this question as it depends on the type of organization; I'm just looking for answers from your experience. Thanks :) Clarification: I had posted this question on Stack Overflow and they closed it, so I posted it here.

    Read the article

  • SCCM? Overkill?

    - by Le_Quack
    I.T. technician in a high school with around 1600 students, 250 staff and 800+ client computers, mostly running W7. I'm looking for a better way to manage clients (deploy software, track changes, inventory, etc.). I like the look of the SCCM 2012 features, but the case studies seem to be aimed at large multi-site infrastructures rather than a single mid-sized site. Is SCCM suitable for a mid-sized single site, or is it aimed at much larger corporations? If the latter, what would be more suitable? Just a note about me and my situation: I work as a technician in a school as part of a team of 3. My boss seems content with a network that works (just about), not a productive, well-maintained network that is easy to run and maintain. I'm still fairly early on in my I.T. career, so sorry if I'm not up to speed on all products. EDIT: Thanks for all the help. I'll take a look at SCE and SCCM and get some proposals drawn up to take to my boss/deputy head.

    Read the article

  • Windows API Error 2 when installing MikTex (Win7)

    - by GreenAsJade
    I am trying to install MikTex 2.9.3927 on Windows 7 x64. Very early in the installation process, I get a MikTex setup wizard error saying: "Windows API Error 2: The system cannot find the file specified Details: C:...\somefile.tpm" The file that results in the error seems to be different every time I try. I have tried many different installation paths, with many different setup options. The same error occurs if I download the ~138MB "Basic MiKTeX 2.9" Installer or if I use the Net Installer to download the entire setup (~1GB). Note that this is a duplicate question - the other copy of this question is closed to users of under 10 rep. I have asked it so that I can provide an updated answer...

    Read the article

  • How Social Is Your Contact Center?

    - by Charles Knapp
    More than 75% of consumers have complained on a social site after a poor customer experience. Yet, 70% of companies have little understanding of the social media conversations about their brand. To deliver upon your brand promise, retain customers, and increase their lifetime value, you must deliver great customer experiences across social, mobile, phone, and chat channels. Siloed channels produce poor customer experiences. Social channels must integrate with the people, processes, technology, and traditional channels used to satisfy customers. The more effective a company’s social marketing, the greater the demand for effective social service. However, service is not a job for social marketers. It is a job for service specialists, focused on KPIs such as response time, first contact resolution, satisfaction, churn, retention, and customer lifetime value. Most social-enabled contact centers are at the early adopter stage, attempting to “bolt on” social media as a side process. Many are experiencing inconsistent customer experiences, higher costs, and negligible return on investments. Service leaders should consider carefully how to integrate social channels with their current customer service and support people, processes, technology, and channels. Here is one company realizing success: the pre-integrated Oracle RightNow Social Experience “empowers our contact center operations by enabling our agents to join customer conversations that are happening on social sites like Twitter and Facebook and integrate those conversations into our overall multichannel customer engagement processes.” — Lisa Larson, Drugstore.com

    Read the article

  • Full & Last Lunar Eclipse Of 2010 On December 21st

    - by Kavitha
    A total eclipse of the Moon, the last of 2010, occurs during the early morning hours of December 21, 2010 (for observers in western North America and Hawaii, the eclipse actually begins on the evening of December 20). The total lunar eclipse is fully visible from North America, Greenland and Iceland. The beginning stages of the eclipse are visible in Western Europe, and the later stages, after moonrise, are visible in Western Asia.

    Total Lunar Eclipse Timings

    According to NASA, “The Moon’s orbital trajectory takes it through the northern half of Earth’s umbral shadow. Although the eclipse is not central, the total phase still lasts 72 minutes”. The timings of the major eclipse phases are listed below.

        Penumbral Eclipse Begins   05:29:17 UT
        Partial Eclipse Begins     06:32:37 UT
        Total Eclipse Begins       07:40:47 UT
        Greatest Eclipse           08:16:57 UT
        Total Eclipse Ends         08:53:08 UT
        Partial Eclipse Ends       10:01:20 UT
        Penumbral Eclipse Ends     11:04:31 UT

    Local times for the total lunar eclipse of December 21, 2010 (GMT for Europe, AST through PST for North America, AKST and HST for the Pacific; an asterisk means the event occurs on the evening of December 20, 2010):

        Event                    GMT       AST       EST       CST       MST       PST       AKST      HST
        Partial Eclipse Begins   06:33 am  02:33 am  01:33 am  12:33 am  11:33 pm* 10:33 pm* 09:33 pm* 08:33 pm*
        Total Eclipse Begins     07:41 am  03:41 am  02:41 am  01:41 am  12:41 am  11:41 pm* 10:41 pm* 09:41 pm*
        Mid-Eclipse              08:17 am  04:17 am  03:17 am  02:17 am  01:17 am  12:17 am  11:17 pm* 10:17 pm*
        Total Eclipse Ends       08:53 am  04:53 am  03:53 am  02:53 am  01:53 am  12:53 am  11:53 pm* 10:53 pm*
        Partial Eclipse Ends     10:01 am  06:01 am  05:01 am  04:01 am  03:01 am  02:01 am  01:01 am  12:01 am

    In India the total lunar eclipse takes place between 13:10:47 and 14:23:08 on December 21st, 2010.

    Animation of the Total Lunar Eclipse: the ShadowsAndSubstance website has a beautiful Flash animation of the December 21, 2010 total lunar eclipse, which you can view on its site.

    Image Credit: flickr/stargazer95050

    Read the article

  • Microsoft launches IE9 preview – No support for XP

    - by samsudeen
    Microsoft launched the developer preview version of Internet Explorer 9 (IE9) at the MIX 10 web conference yesterday. This release is aimed at getting feedback from website designers, developers and the wider community, to make IE9's development better than that of its previous versions. Microsoft will update the developer preview every eight weeks, and the next update is expected in mid-March. So what is new and interesting about IE9?

    Chakra: Chakra, the new scripting engine of IE9, runs JavaScript much faster than IE8 and other browsers, improving performance significantly. According to Microsoft, Chakra compiles JavaScript in the background on a separate thread, parallel to the main engine, which is a completely new approach compared to current browser technologies.

    Standards: Microsoft is determined (surprisingly!!!) to make IE9 compliant with web standards by supporting open standards such as accelerated HTML5 video and new web technologies such as CSS3 and SVG.

    ACID3 Test: IE9 scores 55/100 in the latest ACID3 test, which is much better than IE8's score (22/100) but nowhere near its rivals Chrome, Opera and Safari, which score 100/100.

    I am a little disappointed that I am not able to download the developer preview on my XP machine. The early comments look very positive for IE9. If you want to explore IE9, check Microsoft's test drive site at Microsoft IE9 Test-drive. You can also download the IE9 developer preview at Download Preview.

    Read the article

  • A Week of DNN – March 19, 2010

    - by Rob Chartier
    DotNetNuke 5.3.0 Released! New Features:

    - Templated User Profiles - User profile pages are now publicly viewable, and layout is controlled by the Admin.
    - Photo field in User Profile - Users can upload a photo to their profile. We also added support for User Specific data storage.
    - User Messaging - Users can send direct messages to other system users. This also includes an out-of-the-box asynchronous, provider based, message platform. You will see more of this in future releases.
    - Search Engine Sitemap Provider - The sitemap now allows module admins to plug in sitemap logic for individual modules.
    - Taxonomy Manager - Administrators can create flat or hierarchical taxonomies that can be shared and used across modules. Supporting SEO and Social features at the core is an important piece for DotNetNuke moving forward.

    (Last Minute Update: 5.3.1 will be released with some last minute updates early next week)

    - DotNetNuke as a Scalable Content management System (CMS)
    - Power, Reliability & Feature Richness – DotNetNuke an Open Source Framework
    - How to Search Engine Optimize dotnetnuke
    - dotnetnuke Training Video – Setting DNN Security
    - DotNetNuke Module Template [CS] (Free)
    - XsltDb - DotNetNuke XSLT module with database and ajax support (Free)
    - Create a non-Award Winning DotNetNuke Skin (part 1, part 2, part 3)
    - Test Driven example module nearly refactored to Web Forms MVP
    - Ajax Search v1.0.0 Released! (Live Demo)
    - Tutorials: Backup DNN, Restore DNN, Move DNN from Backup (By Mitchel Sellers)
    - A tag cloud based on the new 5.3 Taxonomy
    - Engage: Tell-a-Friend 1.1 released (FREE module)
    - 549 DotNetNuke Videos: DNN Creative Magazine Issue 54 Out Now
    - http://www.dotnetnuke.com/Community/Forums/tabid/795/forumid/112/threadid/355615/scope/posts/Default.aspx

    Read the article

  • Smithsonian Showcases Video Game History with The Art of Video Games [Video]

    - by Jason Fitzpatrick
    The Art of Video Games is the Smithsonian’s look at the history of video games; check out this video trailer to see what the exhibition is all about and hear from some notable folks. From the Smithsonian listing for the exhibition: The Art of Video Games is one of the first exhibitions to explore the forty-year evolution of video games as an artistic medium, with a focus on striking visual effects and the creative use of new technologies. It features some of the most influential artists and designers during five eras of game technology, from early pioneers to contemporary designers. The exhibition focuses on the interplay of graphics, technology and storytelling through some of the best games for twenty gaming systems ranging from the Atari VCS to the PlayStation 3. The exhibit will be at the Smithsonian until the end of September and will then begin touring the country. Hit up the link below for more information. The Art of Video Games Tour [via Neatorama]

    Read the article

  • Introducing SSIS Reporting Pack for SQL Server code-named Denali

    - by jamiet
    In recent blog posts I have introduced the new SSIS Catalog that is forthcoming in SQL Server Code-named Denali: What's new in SSIS in Denali Introduction to SSIS Projects in Denali Parameters in SSIS In Denali SSIS Server, Catalogs, Environments and Environment Variables in SSIS in Denali The SSIS Catalog is responsible for executing SSIS packages and also for capturing the metadata from those executions. However, at the time of writing there is no mechanism provided to view analyse and drill into that metadata and that is the reason that I am, in this blog post, introducing a suite of SSIS Catalog reports called the SSIS Reporting Pack which you can download from my SkyDrive at http://cid-550f681dad532637.office.live.com/self.aspx/Public/SSIS%20Reporting%20Pack/SSISReportingPack%20v0.1.zip. In this first release the SSIS Reporting Pack includes five reports: Catalog – A high-level summary of all activity in the Catalog Folders – A summary of activity in each Catalog Folder Folder – Project-level activity per single Folder Executions – A visualisation of all executions per Folder/Project/Package/Environment or subset thereof Execution – Information about an individual execution Here is a screenshot of the Executions report: Notice that the SSIS Reporting Pack provides a visual overview of all executions in the Catalog. Each execution is represented as a bar on the bar chart, the success or otherwise of each execution is indicated by the colour of the bar and the execution time is indicated by the bar height. I have recorded a video that gives an overview of the SSIS Reporting which I have embedded below. If you are having any trouble viewing the video go see it at http://vimeo.com/17617974 I must stress that this is a very early version of the SSIS Reporting Pack and I am expecting it to change a lot over the coming year. I am very keen to get some feedback about this, specifically: let me know if anything does not work as you expect give me your feature requests The easiest way to get hold of of me for now is within the comments section of this blog post. That’s all for now. I hope the SSIS Reporting Pack proves useful and I look forward to hearing your feedback. Lastly, that download link again: http://cid-550f681dad532637.office.live.com/self.aspx/Public/SSIS%20Reporting%20Pack/SSISReportingPack%20v0.1.zip. @jamiet

    Read the article

  • A way of doing real-world test-driven development (and some thoughts about it)

    - by Thomas Weller
    Lately, I exchanged some arguments with Derick Bailey about some details of the red-green-refactor cycle of the Test-driven development process. In short, the issue revolved around the fact that it’s not enough to have a test red or green, but it’s also important to have it red or green for the right reasons. While for me, it’s sufficient to initially have a NotImplementedException in place, Derick argues that this is not totally correct (see these two posts: Red/Green/Refactor, For The Right Reasons and Red For The Right Reason: Fail By Assertion, Not By Anything Else). And he’s right. But on the other hand, I had no idea how his insights could have any practical consequence for my own individual interpretation of the red-green-refactor cycle (which is not really red-green-refactor, at least not in its pure sense, see the rest of this article). This made me think deeply for some days now. In the end I found out that the ‘right reason’ changes in my understanding depending on what development phase I’m in. To make this clear (at least I hope it becomes clear…) I started to describe my way of working in some detail, and then something strange happened: The scope of the article slightly shifted from focusing ‘only’ on the ‘right reason’ issue to something more general, which you might describe as something like  'Doing real-world TDD in .NET , with massive use of third-party add-ins’. This is because I feel that there is a more general statement about Test-driven development to make:  It’s high time to speak about the ‘How’ of TDD, not always only the ‘Why’. Much has been said about this, and me myself also contributed to that (see here: TDD is not about testing, it's about how we develop software). But always justifying what you do is very unsatisfying in the long run, it is inherently defensive, and it costs time and effort that could be used for better and more important things. And frankly: I’m somewhat sick and tired of repeating time and again that the test-driven way of software development is highly preferable for many reasons - I don’t want to spent my time exclusively on stating the obvious… So, again, let’s say it clearly: TDD is programming, and programming is TDD. Other ways of programming (code-first, sometimes called cowboy-coding) are exceptional and need justification. – I know that there are many people out there who will disagree with this radical statement, and I also know that it’s not a description of the real world but more of a mission statement or something. But nevertheless I’m absolutely sure that in some years this statement will be nothing but a platitude. Side note: Some parts of this post read as if I were paid by Jetbrains (the manufacturer of the ReSharper add-in – R#), but I swear I’m not. Rather I think that Visual Studio is just not production-complete without it, and I wouldn’t even consider to do professional work without having this add-in installed... The three parts of a software component Before I go into some details, I first should describe my understanding of what belongs to a software component (assembly, type, or method) during the production process (i.e. the coding phase). Roughly, I come up with the three parts shown below:   First, we need to have some initial sort of requirement. This can be a multi-page formal document, a vague idea in some programmer’s brain of what might be needed, or anything in between. In either way, there has to be some sort of requirement, be it explicit or not. 
– At the C# micro-level, the best way that I found to formulate that is to define interfaces for just about everything, even for internal classes, and to provide them with exhaustive xml comments. The next step then is to re-formulate these requirements in an executable form. This is specific to the respective programming language. - For C#/.NET, the Gallio framework (which includes MbUnit) in conjunction with the ReSharper add-in for Visual Studio is my toolset of choice. The third part then finally is the production code itself. It’s development is entirely driven by the requirements and their executable formulation. This is the delivery, the two other parts are ‘only’ there to make its production possible, to give it a decent quality and reliability, and to significantly reduce related costs down the maintenance timeline. So while the first two parts are not really relevant for the customer, they are very important for the developer. The customer (or in Scrum terms: the Product Owner) is not interested at all in how  the product is developed, he is only interested in the fact that it is developed as cost-effective as possible, and that it meets his functional and non-functional requirements. The rest is solely a matter of the developer’s craftsmanship, and this is what I want to talk about during the remainder of this article… An example To demonstrate my way of doing real-world TDD, I decided to show the development of a (very) simple Calculator component. The example is deliberately trivial and silly, as examples always are. I am totally aware of the fact that real life is never that simple, but I only want to show some development principles here… The requirement As already said above, I start with writing down some words on the initial requirement, and I normally use interfaces for that, even for internal classes - the typical question “intf or not” doesn’t even come to mind. I need them for my usual workflow and using them automatically produces high componentized and testable code anyway. To think about their usage in every single situation would slow down the production process unnecessarily. So this is what I begin with: namespace Calculator {     /// <summary>     /// Defines a very simple calculator component for demo purposes.     /// </summary>     public interface ICalculator     {         /// <summary>         /// Gets the result of the last successful operation.         /// </summary>         /// <value>The last result.</value>         /// <remarks>         /// Will be <see langword="null" /> before the first successful operation.         /// </remarks>         double? LastResult { get; }       } // interface ICalculator   } // namespace Calculator So, I’m not beginning with a test, but with a sort of code declaration - and still I insist on being 100% test-driven. There are three important things here: Starting this way gives me a method signature, which allows to use IntelliSense and AutoCompletion and thus eliminates the danger of typos - one of the most regular, annoying, time-consuming, and therefore expensive sources of error in the development process. In my understanding, the interface definition as a whole is more of a readable requirement document and technical documentation than anything else. So this is at least as much about documentation than about coding. The documentation must completely describe the behavior of the documented element. I normally use an IoC container or some sort of self-written provider-like model in my architecture. 
In either case, I need my components defined via service interfaces anyway. - I will use the LinFu IoC framework here, for no other reason as that is is very simple to use. The ‘Red’ (pt. 1)   First I create a folder for the project’s third-party libraries and put the LinFu.Core dll there. Then I set up a test project (via a Gallio project template), and add references to the Calculator project and the LinFu dll. Finally I’m ready to write the first test, which will look like the following: namespace Calculator.Test {     [TestFixture]     public class CalculatorTest     {         private readonly ServiceContainer container = new ServiceContainer();           [Test]         public void CalculatorLastResultIsInitiallyNull()         {             ICalculator calculator = container.GetService<ICalculator>();               Assert.IsNull(calculator.LastResult);         }       } // class CalculatorTest   } // namespace Calculator.Test       This is basically the executable formulation of what the interface definition states (part of). Side note: There’s one principle of TDD that is just plain wrong in my eyes: I’m talking about the Red is 'does not compile' thing. How could a compiler error ever be interpreted as a valid test outcome? I never understood that, it just makes no sense to me. (Or, in Derick’s terms: this reason is as wrong as a reason ever could be…) A compiler error tells me: Your code is incorrect, but nothing more.  Instead, the ‘Red’ part of the red-green-refactor cycle has a clearly defined meaning to me: It means that the test works as intended and fails only if its assumptions are not met for some reason. Back to our Calculator. When I execute the above test with R#, the Gallio plugin will give me this output: So this tells me that the test is red for the wrong reason: There’s no implementation that the IoC-container could load, of course. So let’s fix that. With R#, this is very easy: First, create an ICalculator - derived type:        Next, implement the interface members: And finally, move the new class to its own file: So far my ‘work’ was six mouse clicks long, the only thing that’s left to do manually here, is to add the Ioc-specific wiring-declaration and also to make the respective class non-public, which I regularly do to force my components to communicate exclusively via interfaces: This is what my Calculator class looks like as of now: using System; using LinFu.IoC.Configuration;   namespace Calculator {     [Implements(typeof(ICalculator))]     internal class Calculator : ICalculator     {         public double? LastResult         {             get             {                 throw new NotImplementedException();             }         }     } } Back to the test fixture, we have to put our IoC container to work: [TestFixture] public class CalculatorTest {     #region Fields       private readonly ServiceContainer container = new ServiceContainer();       #endregion // Fields       #region Setup/TearDown       [FixtureSetUp]     public void FixtureSetUp()     {        container.LoadFrom(AppDomain.CurrentDomain.BaseDirectory, "Calculator.dll");     }       ... Because I have a R# live template defined for the setup/teardown method skeleton as well, the only manual coding here again is the IoC-specific stuff: two lines, not more… The ‘Red’ (pt. 2) Now, the execution of the above test gives the following result: This time, the test outcome tells me that the method under test is called. 
And this is the point, where Derick and I seem to have somewhat different views on the subject: Of course, the test still is worthless regarding the red/green outcome (or: it’s still red for the wrong reasons, in that it gives a false negative). But as far as I am concerned, I’m not really interested in the test outcome at this point of the red-green-refactor cycle. Rather, I only want to assert that my test actually calls the right method. If that’s the case, I will happily go on to the ‘Green’ part… The ‘Green’ Making the test green is quite trivial. Just make LastResult an automatic property:     [Implements(typeof(ICalculator))]     internal class Calculator : ICalculator     {         public double? LastResult { get; private set; }     }         One more round… Now on to something slightly more demanding (cough…). Let’s state that our Calculator exposes an Add() method:         ...   /// <summary>         /// Adds the specified operands.         /// </summary>         /// <param name="operand1">The operand1.</param>         /// <param name="operand2">The operand2.</param>         /// <returns>The result of the additon.</returns>         /// <exception cref="ArgumentException">         /// Argument <paramref name="operand1"/> is &lt; 0.<br/>         /// -- or --<br/>         /// Argument <paramref name="operand2"/> is &lt; 0.         /// </exception>         double Add(double operand1, double operand2);       } // interface ICalculator A remark: I sometimes hear the complaint that xml comment stuff like the above is hard to read. That’s certainly true, but irrelevant to me, because I read xml code comments with the CR_Documentor tool window. And using that, it looks like this:   Apart from that, I’m heavily using xml code comments (see e.g. here for a detailed guide) because there is the possibility of automating help generation with nightly CI builds (using MS Sandcastle and the Sandcastle Help File Builder), and then publishing the results to some intranet location.  This way, a team always has first class, up-to-date technical documentation at hand about the current codebase. (And, also very important for speeding up things and avoiding typos: You have IntelliSense/AutoCompletion and R# support, and the comments are subject to compiler checking…).     Back to our Calculator again: Two more R# – clicks implement the Add() skeleton:         ...           public double Add(double operand1, double operand2)         {             throw new NotImplementedException();         }       } // class Calculator As we have stated in the interface definition (which actually serves as our requirement document!), the operands are not allowed to be negative. So let’s start implementing that. Here’s the test: [Test] [Row(-0.5, 2)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) {     ICalculator calculator = container.GetService<ICalculator>();       Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); } As you can see, I’m using a data-driven unit test method here, mainly for these two reasons: Because I know that I will have to do the same test for the second operand in a few seconds, I save myself from implementing another test method for this purpose. Rather, I only will have to add another Row attribute to the existing one. From the test report below, you can see that the argument values are explicitly printed out. 
This can be a valuable documentation feature even when everything is green: One can quickly review what values were tested exactly - the complete Gallio HTML-report (as it will be produced by the Continuous Integration runs) shows these values in a quite clear format (see below for an example). Back to our Calculator development again, this is what the test result tells us at the moment: So we’re red again, because there is not yet an implementation… Next we go on and implement the necessary parameter verification to become green again, and then we do the same thing for the second operand. To make a long story short, here’s the test and the method implementation at the end of the second cycle: // in CalculatorTest:   [Test] [Row(-0.5, 2)] [Row(295, -123)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) {     ICalculator calculator = container.GetService<ICalculator>();       Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); }   // in Calculator: public double Add(double operand1, double operand2) {     if (operand1 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand1");     }     if (operand2 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand2");     }     throw new NotImplementedException(); } So far, we have sheltered our method from unwanted input, and now we can safely operate on the parameters without further caring about their validity (this is my interpretation of the Fail Fast principle, which is regarded here in more detail). Now we can think about the method’s successful outcomes. First let’s write another test for that: [Test] [Row(1, 1, 2)] public void TestAdd(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       double result = calculator.Add(operand1, operand2);       Assert.AreEqual(expectedResult, result); } Again, I’m regularly using row based test methods for these kinds of unit tests. The above shown pattern proved to be extremely helpful for my development work, I call it the Defined-Input/Expected-Output test idiom: You define your input arguments together with the expected method result. There are two major benefits from that way of testing: In the course of refining a method, it’s very likely to come up with additional test cases. In our case, we might add tests for some edge cases like ‘one of the operands is zero’ or ‘the sum of the two operands causes an overflow’, or maybe there’s an external test protocol that has to be fulfilled (e.g. an ISO norm for medical software), and this results in the need of testing against additional values. In all these scenarios we only have to add another Row attribute to the test. Remember that the argument values are written to the test report, so as a side-effect this produces valuable documentation. (This can become especially important if the fulfillment of some sort of external requirements has to be proven). 
So your test method might look something like that in the end: [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 2)] [Row(0, 999999999, 999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, double.MaxValue)] [Row(4, double.MaxValue - 2.5, double.MaxValue)] public void TestAdd(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       double result = calculator.Add(operand1, operand2);       Assert.AreEqual(expectedResult, result); } And this will produce the following HTML report (with Gallio):   Not bad for the amount of work we invested in it, huh? - There might be scenarios where reports like that can be useful for demonstration purposes during a Scrum sprint review… The last requirement to fulfill is that the LastResult property is expected to store the result of the last operation. I don’t show this here, it’s trivial enough and brings nothing new… And finally: Refactor (for the right reasons) To demonstrate my way of going through the refactoring portion of the red-green-refactor cycle, I added another method to our Calculator component, namely Subtract(). Here’s the code (tests and production): // CalculatorTest.cs:   [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtract(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       double result = calculator.Subtract(operand1, operand2);       Assert.AreEqual(expectedResult, result); }   [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtractGivesExpectedLastResult(double operand1, double operand2, double expectedResult) {     ICalculator calculator = container.GetService<ICalculator>();       calculator.Subtract(operand1, operand2);       Assert.AreEqual(expectedResult, calculator.LastResult); }   ...   // ICalculator.cs: /// <summary> /// Subtracts the specified operands. /// </summary> /// <param name="operand1">The operand1.</param> /// <param name="operand2">The operand2.</param> /// <returns>The result of the subtraction.</returns> /// <exception cref="ArgumentException"> /// Argument <paramref name="operand1"/> is &lt; 0.<br/> /// -- or --<br/> /// Argument <paramref name="operand2"/> is &lt; 0. /// </exception> double Subtract(double operand1, double operand2);   ...   // Calculator.cs:   public double Subtract(double operand1, double operand2) {     if (operand1 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand1");     }       if (operand2 < 0.0)     {         throw new ArgumentException("Value must not be negative.", "operand2");     }       return (this.LastResult = operand1 - operand2).Value; }   Obviously, the argument validation stuff that was produced during the red-green part of our cycle duplicates the code from the previous Add() method. So, to avoid code duplication and minimize the number of code lines of the production code, we do an Extract Method refactoring. 
One more time, this is only a matter of a few mouse clicks (and giving the new method a name) with R#: Having done that, our production code finally looks like that: using System; using LinFu.IoC.Configuration;   namespace Calculator {     [Implements(typeof(ICalculator))]     internal class Calculator : ICalculator     {         #region ICalculator           public double? LastResult { get; private set; }           public double Add(double operand1, double operand2)         {             ThrowIfOneOperandIsInvalid(operand1, operand2);               return (this.LastResult = operand1 + operand2).Value;         }           public double Subtract(double operand1, double operand2)         {             ThrowIfOneOperandIsInvalid(operand1, operand2);               return (this.LastResult = operand1 - operand2).Value;         }           #endregion // ICalculator           #region Implementation (Helper)           private static void ThrowIfOneOperandIsInvalid(double operand1, double operand2)         {             if (operand1 < 0.0)             {                 throw new ArgumentException("Value must not be negative.", "operand1");             }               if (operand2 < 0.0)             {                 throw new ArgumentException("Value must not be negative.", "operand2");             }         }           #endregion // Implementation (Helper)       } // class Calculator   } // namespace Calculator But is the above worth the effort at all? It’s obviously trivial and not very impressive. All our tests were green (for the right reasons), and refactoring the code did not change anything. It’s not immediately clear how this refactoring work adds value to the project. Derick puts it like this: STOP! Hold on a second… before you go any further and before you even think about refactoring what you just wrote to make your test pass, you need to understand something: if your done with your requirements after making the test green, you are not required to refactor the code. I know… I’m speaking heresy, here. Toss me to the wolves, I’ve gone over to the dark side! Seriously, though… if your test is passing for the right reasons, and you do not need to write any test or any more code for you class at this point, what value does refactoring add? Derick immediately answers his own question: So why should you follow the refactor portion of red/green/refactor? When you have added code that makes the system less readable, less understandable, less expressive of the domain or concern’s intentions, less architecturally sound, less DRY, etc, then you should refactor it. I couldn’t state it more precise. From my personal perspective, I’d add the following: You have to keep in mind that real-world software systems are usually quite large and there are dozens or even hundreds of occasions where micro-refactorings like the above can be applied. It’s the sum of them all that counts. And to have a good overall quality of the system (e.g. in terms of the Code Duplication Percentage metric) you have to be pedantic on the individual, seemingly trivial cases. My job regularly requires the reading and understanding of ‘foreign’ code. 
So code quality/readability really makes a HUGE difference for me – sometimes it can be even the difference between project success and failure… Conclusions The above described development process emerged over the years, and there were mainly two things that guided its evolution (you might call it eternal principles, personal beliefs, or anything in between): Test-driven development is the normal, natural way of writing software, code-first is exceptional. So ‘doing TDD or not’ is not a question. And good, stable code can only reliably be produced by doing TDD (yes, I know: many will strongly disagree here again, but I’ve never seen high-quality code – and high-quality code is code that stood the test of time and causes low maintenance costs – that was produced code-first…) It’s the production code that pays our bills in the end. (Though I have seen customers these days who demand an acceptance test battery as part of the final delivery. Things seem to go into the right direction…). The test code serves ‘only’ to make the production code work. But it’s the number of delivered features which solely counts at the end of the day - no matter how much test code you wrote or how good it is. With these two things in mind, I tried to optimize my coding process for coding speed – or, in business terms: productivity - without sacrificing the principles of TDD (more than I’d do either way…).  As a result, I consider a ratio of about 3-5/1 for test code vs. production code as normal and desirable. In other words: roughly 60-80% of my code is test code (This might sound heavy, but that is mainly due to the fact that software development standards only begin to evolve. The entire software development profession is very young, historically seen; only at the very beginning, and there are no viable standards yet. If you think about software development as a kind of casting process, where the test code is the mold and the resulting production code is the final product, then the above ratio sounds no longer extraordinary…) Although the above might look like very much unnecessary work at first sight, it’s not. With the aid of the mentioned add-ins, doing all the above is a matter of minutes, sometimes seconds (while writing this post took hours and days…). The most important thing is to have the right tools at hand. Slow developer machines or the lack of a tool or something like that - for ‘saving’ a few 100 bucks -  is just not acceptable and a very bad decision in business terms (though I quite some times have seen and heard that…). Production of high-quality products needs the usage of high-quality tools. This is a platitude that every craftsman knows… The here described round-trip will take me about five to ten minutes in my real-world development practice. I guess it’s about 30% more time compared to developing the ‘traditional’ (code-first) way. But the so manufactured ‘product’ is of much higher quality and massively reduces maintenance costs, which is by far the single biggest cost factor, as I showed in this previous post: It's the maintenance, stupid! (or: Something is rotten in developerland.). In the end, this is a highly cost-effective way of software development… But on the other hand, there clearly is a trade-off here: coding speed vs. code quality/later maintenance costs. The here described development method might be a perfect fit for the overwhelming majority of software projects, but there certainly are some scenarios where it’s not - e.g. 
if time-to-market is crucial for a software project. So this is a business decision in the end. It’s just that you have to know what you’re doing and what consequences this might have… Some last words First, I’d like to thank Derick Bailey again. His two aforementioned posts (which I strongly recommend for reading) inspired me to think deeply about my own personal way of doing TDD and to clarify my thoughts about it. I wouldn’t have done that without this inspiration. I really enjoy that kind of discussions… I agree with him in all respects. But I don’t know (yet?) how to bring his insights into the described production process without slowing things down. The above described method proved to be very “good enough” in my practical experience. But of course, I’m open to suggestions here… My rationale for now is: If the test is initially red during the red-green-refactor cycle, the ‘right reason’ is: it actually calls the right method, but this method is not yet operational. Later on, when the cycle is finished and the tests become part of the regular, automated Continuous Integration process, ‘red’ certainly must occur for the ‘right reason’: in this phase, ‘red’ MUST mean nothing but an unfulfilled assertion - Fail By Assertion, Not By Anything Else!

    Read the article

  • Should a newcomer to Perl learn both Perl 5 and 6?

    - by M. Joanis
    Hello! I have started playing around with Perl 5 lately, and it seems very interesting. I would like to spend some time learning it more in depth when I can. My question, since Perl 6 is slowly taking shape (I believe...) and is said to break backward compatibility, is this: am I better off learning Perl 5 and then Perl 6, or is learning Perl 6 directly a better time investment, in your opinion? If the changes from Perl 5 to 6 would make it hard to go back and understand Perl 5, I should certainly start with Perl 5 to be able to read "old" scripts, and then check out Perl 6. There is also the "Perl 6 is not yet completely released" problem. I know there's an early-adopter implementation of Perl 6, but if Perl 6 isn't officially released for some more years, I'll stick with 5 for now. I would certainly like some insight on this. Feel free to discuss related stuff. My interest in scripting languages is fairly new. Thanks!

    Read the article

  • Heaps of Trouble?

    - by Paul White NZ
    If you’re not already a regular reader of Brad Schulz’s blog, you’re missing out on some great material.  In his latest entry, he is tasked with optimizing a query run against tables that have no indexes at all.  The problem is, predictably, that performance is not very good.  The catch is that we are not allowed to create any indexes (or even new statistics) as part of our optimization efforts. In this post, I’m going to look at the problem from a slightly different angle, and present an alternative solution to the one Brad found.  Inevitably, there’s going to be some overlap between our entries, and while you don’t necessarily need to read Brad’s post before this one, I do strongly recommend that you read it at some stage; he covers some important points that I won’t cover again here. The Example We’ll use data from the AdventureWorks database, copied to temporary unindexed tables.  A script to create these structures is shown below: CREATE TABLE #Custs ( CustomerID INTEGER NOT NULL, TerritoryID INTEGER NULL, CustomerType NCHAR(1) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #Prods ( ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, Name NVARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #OrdHeader ( SalesOrderID INTEGER NOT NULL, OrderDate DATETIME NOT NULL, SalesOrderNumber NVARCHAR(25) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, CustomerID INTEGER NOT NULL, ); GO CREATE TABLE #OrdDetail ( SalesOrderID INTEGER NOT NULL, OrderQty SMALLINT NOT NULL, LineTotal NUMERIC(38,6) NOT NULL, ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, ); GO INSERT #Custs ( CustomerID, TerritoryID, CustomerType ) SELECT C.CustomerID, C.TerritoryID, C.CustomerType FROM AdventureWorks.Sales.Customer C WITH (TABLOCK); GO INSERT #Prods ( ProductMainID, ProductSubID, ProductSubSubID, Name ) SELECT P.ProductID, P.ProductID, P.ProductID, P.Name FROM AdventureWorks.Production.Product P WITH (TABLOCK); GO INSERT #OrdHeader ( SalesOrderID, OrderDate, SalesOrderNumber, CustomerID ) SELECT H.SalesOrderID, H.OrderDate, H.SalesOrderNumber, H.CustomerID FROM AdventureWorks.Sales.SalesOrderHeader H WITH (TABLOCK); GO INSERT #OrdDetail ( SalesOrderID, OrderQty, LineTotal, ProductMainID, ProductSubID, ProductSubSubID ) SELECT D.SalesOrderID, D.OrderQty, D.LineTotal, D.ProductID, D.ProductID, D.ProductID FROM AdventureWorks.Sales.SalesOrderDetail D WITH (TABLOCK); The query itself is a simple join of the four tables: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #OrdDetail D ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID JOIN #OrdHeader H ON D.SalesOrderID = H.SalesOrderID JOIN #Custs C ON H.CustomerID = C.CustomerID ORDER BY P.ProductMainID ASC OPTION (RECOMPILE, MAXDOP 1); Remember that these tables have no indexes at all, and only the single-column sampled statistics SQL Server automatically creates (assuming default settings).  The estimated query plan produced for the test query looks like this (click to enlarge): The Problem The problem here is one of cardinality estimation – the number of rows SQL Server expects to find at each step of the plan.  The lack of indexes and useful statistical information means that SQL Server does not have the information it needs to make a good estimate.  
    Every join in the plan shown above estimates that it will produce just a single row as output.  Brad covers the factors that lead to the low estimates in his post.

    In reality, the join between the #Prods and #OrdDetail tables will produce 121,317 rows.  It should not surprise you that this has rather dire consequences for the remainder of the query plan.  In particular, it makes a nonsense of the optimizer’s decision to use Nested Loops to join to the two remaining tables.  Instead of scanning the #OrdHeader and #Custs tables once (as it expected), it has to perform 121,317 full scans of each.  The query takes somewhere in the region of twenty minutes to run to completion on my development machine.

    A Solution

    At this point, you may be thinking the same thing I was: if we really are stuck with no indexes, the best we can do is to use hash joins everywhere.

    We can force the exclusive use of hash joins in several ways, the two most common being join and query hints.  A join hint means writing the query using the INNER HASH JOIN syntax; using a query hint involves adding OPTION (HASH JOIN) at the bottom of the query.  The difference is that using join hints also forces the order of the joins, whereas the query hint gives the optimizer freedom to reorder the joins at its discretion.

    Adding the OPTION (HASH JOIN) hint results in this estimated plan [plan image omitted].  That produces the correct output in around seven seconds, which is quite an improvement!  As a purely practical matter, and given the rigid rules of the environment we find ourselves in, we might leave things there.  (We can improve the hashing solution a bit – I’ll come back to that later on).

    Faster Nested Loops

    It might surprise you to hear that we can beat the performance of the hash join solution shown above using nested loops joins exclusively, and without breaking the rules we have been set.

    The key to this part is to realize that a condition like (A = B) can be expressed as (A <= B) AND (A >= B).  Armed with this tremendous new insight, we can rewrite the join predicates like so:

        SELECT P.ProductMainID AS PID, P.Name, D.OrderQty,
               H.SalesOrderNumber, H.OrderDate, C.TerritoryID
        FROM #OrdDetail D
        JOIN #OrdHeader H
            ON D.SalesOrderID >= H.SalesOrderID
            AND D.SalesOrderID <= H.SalesOrderID
        JOIN #Custs C
            ON H.CustomerID >= C.CustomerID
            AND H.CustomerID <= C.CustomerID
        JOIN #Prods P
            ON P.ProductMainID >= D.ProductMainID
            AND P.ProductMainID <= D.ProductMainID
            AND P.ProductSubID = D.ProductSubID
            AND P.ProductSubSubID = D.ProductSubSubID
        ORDER BY D.ProductMainID
        OPTION (RECOMPILE, LOOP JOIN, MAXDOP 1, FORCE ORDER);

    I’ve also added LOOP JOIN and FORCE ORDER query hints to ensure that only nested loops joins are used, and that the tables are joined in the order they appear.  The new estimated execution plan is shown in the original post [plan image omitted].  This new query runs in under 2 seconds.

    Why Is It Faster?

    The main reason for the improvement is the appearance of the eager Index Spools, which are also known as index-on-the-fly spools.  If you read my Inside The Optimiser series you might be interested to know that the rule responsible is called JoinToIndexOnTheFly.

    An eager index spool consumes all rows from the table it sits above, and builds an index suitable for the join to seek on.  Taking the index spool above the #Custs table as an example, it reads all the CustomerID and TerritoryID values with a single scan of the table, and builds an index keyed on CustomerID.  The term ‘eager’ means that the spool consumes all of its input rows when it starts up.
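    To make that behaviour a little more concrete, here is a rough manual equivalent of what the eager index spool above #Custs is doing.  This is purely an illustration I am adding, not T-SQL that the plan actually runs; the work table and index names are invented for the sketch:

        -- Illustration only: approximately what the eager index spool above #Custs does.
        -- It scans the heap once, keeping only the columns the join needs,
        -- and builds a structure keyed on CustomerID for the join to seek into.
        SELECT C.CustomerID, C.TerritoryID
        INTO #CustsSpool                          -- hypothetical work table
        FROM #Custs C;

        CREATE CLUSTERED INDEX cx_CustsSpool      -- hypothetical index name
        ON #CustsSpool (CustomerID);

        -- Each inner-side execution of the join can now seek on CustomerID
        -- against #CustsSpool instead of scanning the unindexed #Custs heap.

    The real spool builds its index internally as part of query execution, which is why the no-new-indexes rule is not broken.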
    The spool’s index is built in a work table in tempdb, has no associated statistics, and only exists until the query finishes executing.  The result is that each unindexed table is only scanned once, and just for the columns necessary to build the temporary index.  From that point on, every execution of the inner side of the join is answered by a seek on the temporary index – not the base table.

    A second optimization is that the sort on ProductMainID (required by the ORDER BY clause) is performed early, on just the rows coming from the #OrdDetail table.  The optimizer has a good estimate for the number of rows it needs to sort at that stage – it is just the cardinality of the table itself.  The accuracy of the estimate there is important because it helps determine the memory grant given to the sort operation.  Nested loops join preserves the order of rows on its outer input, so sorting early is safe.  (Hash joins do not preserve order in this way, of course).

    The extra lazy spool on the #Prods branch is a further optimization that avoids executing the seek on the temporary index if the value being joined (the ‘outer reference’) hasn’t changed from the last row received on the outer input.  It takes advantage of the fact that rows are still sorted on ProductMainID, so if duplicates exist, they will arrive at the join operator one after the other.

    The optimizer is quite conservative about introducing index spools into a plan, because creating and dropping a temporary index is a relatively expensive operation.  Its presence in a plan is often an indication that a useful index is missing.

    I want to stress that I rewrote the query in this way primarily as an educational exercise – I can’t imagine having to do something so horrible to a production system.

    Improving the Hash Join

    I promised I would return to the solution that uses hash joins.  You might be puzzled that SQL Server can create three new indexes (and perform all those nested loops iterations) faster than it can perform three hash joins.  The answer, again, is down to the poor information available to the optimizer.  Let’s look at the hash join plan again [plan image omitted].

    Two of the hash joins have single-row estimates on their build inputs.  SQL Server fixes the amount of memory available for the hash table based on this cardinality estimate, so at run time the hash join very quickly runs out of memory.

    This results in the join spilling hash buckets to disk, and any rows from the probe input that hash to the spilled buckets also get written to disk.  The join process then continues, and may again run out of memory.  This is a recursive process that may eventually force SQL Server to resort to a bailout join algorithm, which is guaranteed to complete, but may be very slow.  The data sizes in the example tables are not large enough to force a hash bailout, but it does result in multiple levels of hash recursion.  You can see this for yourself by tracing the Hash Warning event using the Profiler tool.

    The final sort in the plan also suffers from a similar problem: it receives very little memory and has to perform multiple sort passes, saving intermediate runs to disk (the Sort Warnings Profiler event can be used to confirm this).  Notice also that because hash joins don’t preserve sort order, the sort cannot be pushed down the plan toward the #OrdDetail table, as in the nested loops plan.
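    If Profiler is not convenient, the same warnings can be watched with an Extended Events session along these lines.  This is a sketch I am adding rather than part of the original article; it assumes a SQL Server version that exposes the hash_warning and sort_warning events, and the session name is invented:

        -- Sketch: capture hash and sort spill warnings while the hash join query runs.
        CREATE EVENT SESSION SpillWatch ON SERVER
        ADD EVENT sqlserver.hash_warning,
        ADD EVENT sqlserver.sort_warning
        ADD TARGET package0.ring_buffer;
        GO
        ALTER EVENT SESSION SpillWatch ON SERVER STATE = START;
        GO
        -- Run the OPTION (HASH JOIN) query here, then inspect the ring_buffer target.
        ALTER EVENT SESSION SpillWatch ON SERVER STATE = STOP;
        GO
        DROP EVENT SESSION SpillWatch ON SERVER;
        GO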
    Ok, so now that we understand the problems, what can we do to fix them?  We can address the hash spilling by forcing a different order for the joins:

        SELECT P.ProductMainID AS PID, P.Name, D.OrderQty,
               H.SalesOrderNumber, H.OrderDate, C.TerritoryID
        FROM #Prods P
        JOIN #Custs C
            JOIN #OrdHeader H
                ON H.CustomerID = C.CustomerID
            JOIN #OrdDetail D
                ON D.SalesOrderID = H.SalesOrderID
            ON P.ProductMainID = D.ProductMainID
            AND P.ProductSubID = D.ProductSubID
            AND P.ProductSubSubID = D.ProductSubSubID
        ORDER BY D.ProductMainID
        OPTION (MAXDOP 1, HASH JOIN, FORCE ORDER);

    With this plan, each of the inputs to the hash joins has a good estimate, and no hash recursion occurs.  The final sort still suffers from the one-row estimate problem, and we get a single-pass sort warning as it writes rows to disk.  Even so, the query runs to completion in three or four seconds.  That’s around half the time of the previous hashing solution, but still not as fast as the nested loops trickery.

    Final Thoughts

    SQL Server’s optimizer makes cost-based decisions, so it is vital to provide it with accurate information.  We can’t really blame the performance problems highlighted here on anything other than the decision to use completely unindexed tables, and not to allow the creation of additional statistics.

    I should probably stress that the nested loops solution shown above is not one I would normally contemplate in the real world.  It’s there primarily for its educational and entertainment value.  I might perhaps use it to demonstrate to the sceptical that SQL Server itself is crying out for an index.

    Be sure to read Brad’s original post for more details.  My grateful thanks to him for granting permission to reuse some of his material.

    Paul White
    Email: [email protected]
    Twitter: @PaulWhiteNZ

