Search Results

Search found 1124 results on 45 pages for 'anthony clever'.

Page 42/45 | < Previous Page | 38 39 40 41 42 43 44 45 | Next Page >

Talking JavaOne with Rock Star Kirk Pepperdine

- by Janice J. Heiss

Kirk Pepperdine is not only a JavaOne Rock Star but a Java Champion and a highly regarded expert in Java performance tuning who works as a consultant, educator, and author. He is the principal consultant at Kodewerk Ltd. He speaks frequently at conferences and co-authored the Ant Developer's Handbook. In the rapidly shifting world of information technology, Pepperdine, as much as anyone, keeps up with what's happening with Java performance tuning. Pepperdine will participate in the following sessions: CON5405 - Are Your Garbage Collection Logs Speaking to You? BOF6540 - Java Champions and JUG Leaders Meet Oracle Executives (with Jeff Genender, Mattias Karlsson, Henrik Stahl, Georges Saab) HOL6500 - Finding and Solving Java Deadlocks (with Heinz Kabutz, Ellen Kraffmiller Martijn Verburg, Jeff Genender, and Henri Tremblay) I asked him what technological changes need to be taken into account in performance tuning. “The volume of data we're dealing with just seems to be getting bigger and bigger all the time,” observed Pepperdine. “A couple of years ago you'd never think of needing a heap that was 64g, but today there are deployments where the heap has grown to 256g and tomorrow there are plans for heaps that are even larger. Dealing with all that data simply requires more horse power and some very specialized techniques. In some cases, teams are trying to push hardware to the breaking point. Under those conditions, you need to be very clever just to get things to work -- let alone to get them to be fast. We are very quickly moving from a world where everything happens in a transaction to one where if you were to even consider using a transaction, you've lost." When asked about the greatest misconceptions about performance tuning that he currently encounters, he said, “If you have a performance problem, you should start looking at code at the very least and for that extra step, whip out an execution profiler. I'm not going to say that I never use execution profilers or look at code. What I will say is that execution profilers are effective for a small subset of performance problems and code is literally the last thing you should look at.And what is the most exciting thing happening in the world of Java today? “Interesting question because so many people would say that nothing exciting is happening in Java. Some might be disappointed that a few features have slipped in terms of scheduling. But I'd disagree with the first group and I'm not so concerned about the slippage because I still see a lot of exciting things happening. First, lambda will finally be with us and with lambda will come better ways.” For JavaOne, he is proctoring for Heinz Kabutz's lab. “I'm actually looking forward to that more than I am to my own talk,” he remarked. “Heinz will be the third non-Sun/Oracle employee to present a lab and the first since Oracle began hosting JavaOne. He's got a great message. He's spent a ton of time making sure things are going to work, and we've got a great team of proctors to help out. After that, getting my talk done, the Java Champion's panel session and then kicking back and just meeting up and talking to some Java heads."Finally, what should Java developers know that they currently do not know? “’Write Once, Run Everywhere’ is a great slogan and Java has come closer to that dream than any other technology stack that I've used. That said, different hardware bits work differently and as hard as we try, the JVM can't hide all the differences. Plus, if we are to get good performance we need to work with our hardware and not against it. All this implies that Java developers need to know more about the hardware they are deploying to.” Originally published on blogs.oracle.com/javaone.

Read the article
How Can I Start an Incognito/Private Browsing Window from a Shortcut?

- by Jason Fitzpatrick

Sometimes you just want to pop the browser open for a quick web search without reloading all your saved tabs; read on as we show a fellow reader how to make a quick private-browsing shortcut. Dear How-To Geek, I came up with a solution to my problem, but I need your help implementing it. I typically have a ton of tabs open in my web browser and, when I need to free up system resources when gaming or using a resource intense application, I shut down the web browser. The problem arises when I find myself needing to do quick web search while the browser is shut down. I don’t want to open it up, load all the tabs, and waste the resources in doing so all for a quick Google search. The perfect solution, it would seem, is to open up one of Chrome’s Incognito windows: it loads separate, it won’t open up all the old tabs, and it’s perfect for a quick Google search. Is there a way to launch Chrome with a single Incognito window open without having to open the browser in the normal mode (and load the bazillion tabs I have sitting there)? Sincerely, Tab Crazy That’s a rather clever work around to your problem. Since you’ve already done the hard work of figuring out the solution you need, we’re more than happy to help you across the finish line. The magic you seek is available via what are known as “command line options” which allow you to add additional parameters and switches onto a command. By appending the command the Chrome shortcut uses, we can easily tell it to launch in Incognito mode. (And, for other readers following along at home, we can do the same thing with other browsers like Firefox). First, let’s look at Chrome’s default shortcut: If you right click on it and select the properties menu, you’ll see where the shortcut points: "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" If you run that shortcut, you’ll open up normal browsing mode in Chrome and your saved tabs will all load. What we need to do is use the command line switches available for Chrome and tell it that we want it to launch an Incognito window instead. Doing so is as simple as appending the end of the “Target” box’s command line entry with -incognito, like so: "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" -incognito We’d also recommend changing the icon to it’s easy to tell the default Chrome shortcut apart from your new Incognito shortcut. When you’re done, make sure to hit OK/Apply at the button to save the changes. You can recreate the same private-browsing-shortcut effect with other major web browsers too. Repeat shortcut editing steps we highlighted above, but change out the -incognito with -private (for Firefox and Internet Explorer) and -newprivatetab (for Opera). With just a simple command line switch applied, you can now launch a lightweight single browser window for those quick web searches without having to stop your game and load up all your saved tabs. Have a pressing tech question? Email us at [email protected] and we’ll do our best to answer it.

Read the article
Cloud Backup: Getting the Users' Backs Up

- by Tony Davis

On Wednesday last week, Microsoft announced that as of July 1, all data transfers into its Microsoft Azure cloud will be free (though you have to pay for transferring data out). On Thursday last week, SQL Azure in Western Europe went down. It was a relatively short outage, but since SQL Azure currently provides no easy way to take a standard backup of a database and store it locally, many people had no recourse but to wait patiently for their cloud-based app to resume. It seems that Microsoft are very keen encourage developers to move their data onto their cloud, but are developers ready to do it, given that such basic backup capabilities are lacking? Recently on Simple-Talk, Mike Mooney described a perfect use case for the Microsoft Cloud. They had a simple web-based application with a SQL Server backend; they could move the application to Windows Azure, and the data into SQL Azure and in the process free themselves from much of the hassle surrounding management and scaling of the hardware, network and so on. It was a great fit and yet it nearly didn't happen; lack of support for the BACKUP command almost proved a show-stopper. Of course, backups of Azure databases are always and have always been taken automatically, for disaster recovery purposes, but these are strictly on-cloud copies and as of now it is not possible to use them to them to restore a database to a particular point in time. It seems that none of those clever Microsoft people managed to predict the need to perform basic backups of Azure databases so that copies could be stored locally, outside the Azure universe. At the very least, as Mike points out, performing a local backup before a new deployment is more or less mandatory. Microsoft did at least note the sound of gnashing teeth and, as a stop-gap measure, offered SQL Azure Database Copy which basically allows you to create an online clone of your database, but this doesn't allow for storing local archives of the data. To that end MS has provided SQL Azure Import/Export, to package up and export a database and its data, using BACPACs. These BACPACs do not guarantee transactional consistency; for example, if a child table is modified after the parent is copied, then the copied database will be in inconsistent state (meaning, to add to the fun, BACPACs need to be created from a database copy). In any event, widespread problems with BACPAC's evil cousin, the DACPAC have been well-documented, and it seems likely that many will also give BACPAC the bum's rush. Finally, in a TechEd 2011 presentation tagged "SQL Azure Advanced Administration", it was announced that "backup and restore" were coming in the next SQL Azure CTP. And yet this still doesn't mean that we'll get simple backups as DBAs know and love them. What it does mean, at least, is the ability to restore any given database to a point in time within a 2-week window. For the time being, if you want a local copy of your data and don't want to brave the BACPAC, one is left with SSIS or BCP, creative use of schema and data comparison tools, or use of SQL Azure Backup (currently in beta) in order to perform this simple but vital task. Cheers, Tony.

Read the article
Sunshine after the iCloud release?

- by Laila

"Why should I believe them? They're the ones that brought us MobileMe? It was not our finest hour, but we learned a lot." Steve Jobs June 6th 2011 Apple's new cloud service has been met with uncritical excitement by industry commentators. It is wonderful what a rename can do. Apple has had a 'cloud' offering for three years called MobileMe, successor to .MAC and iTools, so iCloud is now the fourth internet service Apple have attempted. If this had been Microsoft, there would have been catcalls all around the blogosphere. I'll admit that there is a lot more functionality announced for iCloud than MobileMe has ever managed to achieve, but then almost anything has more functionality than MobileMe. It's an expensive service (£120 a year in the UK, $90 in the states), launched as far back as June 9, 2008, that has delivered very little and suffered a string of technical problems; the documentation was mainly a community effort, built up gradually by the frustrated and angry users. It was supposed to synchronise PC Outlook calendars but couldn't manage Microsoft Exchange (Google could, of course). It used WebDAV to allow Windows users to attach to the filestore, but didn't document how to do it. The method for downloading and uploading files to the cloud-based filestore was ridiculously clunky. It allowed you to post photos on a public site, but forgot to include a way of deleting photos. I could go on with the list, but you can explore the many sites that have flourished to inhabit the support-vacuum left by Apple. MobileMe should have had all the bright new clever things announced for iCloud. Apple dropped the ball, and allowed services such as Flickr to fill the void. However, their PR skills are such that, a name-change later (the .ME.com email address remains), it has turned a rout into a victory, and hundreds of earnest bloggers have been extolling Apple's expertise in cloud matters. This must be frustrating for the other cloud providers who have quietly got the technology working right. I wish iCloud well, even though I resent the expensive mess they made of MobileMe. Apple promise that iCloud will sync files, apps, app data, and media across all the different iOS5 devices, Macs, and PCs. It also hopes to sync music across devices, but not video content. They've offered existing MobileMe users free use of the MobileMe service for a year as the product is morphed, and they will be able to transfer to iCloud when it is launched in the autumn. On June 30, 2012, MobileMe will die, and Apple's iWeb is also soon to join iTools and .MAC in the hereafter. So why get excited about iCloud? That all depends on the level of PC integration. Whereas iOS5 machines will be full participants in the new world of data-sharing (Sorry iPod Touch users) what about .NET libraries? There is talk of synchronising 'My Pictures' libraries with iOS5 and iMac machines, but little more detail as yet. Apple has a lot to prove with iCloud and anyone with actual experience of their past attempts to get into cloud services will be wary.

Read the article
Talking JavaOne with Rock Star Kirk Pepperdine

- by Janice J. Heiss

Kirk Pepperdine is not only a JavaOne Rock Star but a Java Champion and a highly regarded expert in Java performance tuning who works as a consultant, educator, and author. He is the principal consultant at Kodewerk Ltd. He speaks frequently at conferences and co-authored the Ant Developer's Handbook. In the rapidly shifting world of information technology, Pepperdine, as much as anyone, keeps up with what's happening with Java performance tuning. Pepperdine will participate in the following sessions: CON5405 - Are Your Garbage Collection Logs Speaking to You? BOF6540 - Java Champions and JUG Leaders Meet Oracle Executives (with Jeff Genender, Mattias Karlsson, Henrik Stahl, Georges Saab) HOL6500 - Finding and Solving Java Deadlocks (with Heinz Kabutz, Ellen Kraffmiller Martijn Verburg, Jeff Genender, and Henri Tremblay) I asked him what technological changes need to be taken into account in performance tuning. “The volume of data we're dealing with just seems to be getting bigger and bigger all the time,” observed Pepperdine. “A couple of years ago you'd never think of needing a heap that was 64g, but today there are deployments where the heap has grown to 256g and tomorrow there are plans for heaps that are even larger. Dealing with all that data simply requires more horse power and some very specialized techniques. In some cases, teams are trying to push hardware to the breaking point. Under those conditions, you need to be very clever just to get things to work -- let alone to get them to be fast. We are very quickly moving from a world where everything happens in a transaction to one where if you were to even consider using a transaction, you've lost." When asked about the greatest misconceptions about performance tuning that he currently encounters, he said, “If you have a performance problem, you should start looking at code at the very least and for that extra step, whip out an execution profiler. I'm not going to say that I never use execution profilers or look at code. What I will say is that execution profilers are effective for a small subset of performance problems and code is literally the last thing you should look at.And what is the most exciting thing happening in the world of Java today? “Interesting question because so many people would say that nothing exciting is happening in Java. Some might be disappointed that a few features have slipped in terms of scheduling. But I'd disagree with the first group and I'm not so concerned about the slippage because I still see a lot of exciting things happening. First, lambda will finally be with us and with lambda will come better ways.” For JavaOne, he is proctoring for Heinz Kabutz's lab. “I'm actually looking forward to that more than I am to my own talk,” he remarked. “Heinz will be the third non-Sun/Oracle employee to present a lab and the first since Oracle began hosting JavaOne. He's got a great message. He's spent a ton of time making sure things are going to work, and we've got a great team of proctors to help out. After that, getting my talk done, the Java Champion's panel session and then kicking back and just meeting up and talking to some Java heads."Finally, what should Java developers know that they currently do not know? “’Write Once, Run Everywhere’ is a great slogan and Java has come closer to that dream than any other technology stack that I've used. That said, different hardware bits work differently and as hard as we try, the JVM can't hide all the differences. Plus, if we are to get good performance we need to work with our hardware and not against it. All this implies that Java developers need to know more about the hardware they are deploying to.”

Read the article
ConcurrentDictionary<TKey,TValue> used with Lazy<T>

- by Reed

In a recent thread on the MSDN forum for the TPL, Stephen Toub suggested mixing ConcurrentDictionary<T,U> with Lazy<T>. This provides a fantastic model for creating a thread safe dictionary of values where the construction of the value type is expensive. This is an incredibly useful pattern for many operations, such as value caches. The ConcurrentDictionary<TKey, TValue> class was added in .NET 4, and provides a thread-safe, lock free collection of key value pairs. While this is a fantastic replacement for Dictionary<TKey, TValue>, it has a potential flaw when used with values where construction of the value class is expensive. The typical way this is used is to call a method such as GetOrAdd to fetch or add a value to the dictionary. It handles all of the thread safety for you, but as a result, if two threads call this simultaneously, two instances of TValue can easily be constructed. If TValue is very expensive to construct, or worse, has side effects if constructed too often, this is less than desirable. While you can easily work around this with locking, Stephen Toub provided a very clever alternative – using Lazy<TValue> as the value in the dictionary instead. This looks like the following. Instead of calling: MyValue value = dictionary.GetOrAdd( key, () => new MyValue(key)); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } We would instead use a ConcurrentDictionary<TKey, Lazy<TValue>>, and write: MyValue value = dictionary.GetOrAdd( key, () => new Lazy<MyValue>( () => new MyValue(key))) .Value; This simple change dramatically changes how the operation works. Now, if two threads call this simultaneously, instead of constructing two MyValue instances, we construct two Lazy<MyValue> instances. However, the Lazy<T> class is very cheap to construct. Unlike “MyValue”, we can safely afford to construct this twice and “throw away” one of the instances. We then call Lazy<T>.Value at the end to fetch our “MyValue” instance. At this point, GetOrAdd will always return the same instance of Lazy<MyValue>. Since Lazy<T> doesn’t construct the MyValue instance until requested, the actual MyClass instance returned is only constructed once.

Read the article
Bidirectional real-time sync of large file tree between two distant linux servers

- by dlo

By large file tree I mean about 200k files, and growing all the time. A relatively small number of files are being changed in any given hour though. By bidirectional I mean that changes may occur on either server and need to be pushed to the other, so rsync doesn't seem appropriate. By distant I mean that the servers are both in data centers, but geographically remote from each other. Currently there are only 2 servers, but that may expand over time. By real-time, it's ok for there to be a little latency between syncing, but running a cron every 1-2 minutes doesn't seem right, since a very small fraction of files may change in any given hour, let alone minute. EDIT: This is running on VPS's so I might be limited on the kinds of kernel-level stuff I can do. Also, the VPS's are not resource-rich, so I'd shy away from solutions that require lots of ram (like Gluster?). What's the best / most "accepted" approach to get this done? This seems like it would be a common need, but I haven't been able to find a generally accepted approach yet, which was surprising. (I'm seeking the safety of the masses. :) I've come across lsyncd to trigger a sync at the filesystem change level. That seems clever though not super common, and I'm a bit confused by the various lsyncd approaches. There's just using lsyncd with rsync, but it seems this could be fragile for bidirectionality since rsync doesn't have a notion of memory (eg- to know whether a deleted file on A should be deleted on B or whether it's a new file on B that should be copied to A). lipsync appears to be just a lsyncd+rsync implementation, right? Then there's using lsyncd with csync2, like this: http://www.axivo.com/community/threads/lightning-fast-synchronization-with-csync2-and-lsyncd.121/ ... I'm leaning towards this approach, but csync2 is a little quirky, though I did do a successful test of it. I'm mostly concerned that I haven't been able to find a lot of community confirmation of this method. People on here seem to like Unison a lot, but it seems that it is no longer under active development and it's not clear that it has an automatic trigger like lsyncd. I've seen Gluster mentioned, but maybe overkill for what I need? UPDATE: fyi- I ended up going with the original solution I mentioned: lsyncd+csync2. It seems to work quite well, and I like the architectural approach of having the servers be very loosely joined, so that each server can operate indefinitely on its own regardless of the link quality between them.

Read the article
Echo 404 directly from nginx to improve performance

- by user64204

I am in charge of production servers serving static content for a website. Those servers are constantly being crawled by bots looking for potential exploits (which isn't that much of a problem security-wise because no application can be reached behind the web server) but generates thousands of 404 per day, sometimes per hour. I am looking into ways of blocking those requests but it's tricky (you want to make sure you don't block legitimate traffic and these bots are becoming more and more clever at looking like they're legit) and is going to take me a while to find an acceptable solution. In the meantime I would like to reduce the performance impact of serving those 404 pages. Indeed we're using nginx which by default is configured to serve it's 404 page from the disk (This can be changed using the error_page directive but in the end the 404 will either have to be served from disk or from another external source (e.g. upstream application which would be worst)) which isn't ideal. I ran a test with ab on my local machine with a basic configuration: in one case I echo a message directly from nginx so the disk isn't touched at all, in the other case I hit a missing page and nginx serves its 404 from disk. server { # [...] the default nginx stuff location / { } location /this_page_exists { echo "this page was found"; } } Here are the test results (my laptop has Intel(R) Core(TM) i7-2670QM + SSD in case you're wondering why they are so high): $ ab -n 500000 -c 1000 http://localhost/this_page_exists Requests per second: 25609.16 [#/sec] (mean) $ ab -n 500000 -c 1000 http://localhost/this_page_doesnt_exists Requests per second: 22905.72 [#/sec] (mean) As you can see, returning a value with echo is 11% ((25609-22905)÷22905×100) faster than serving the 404 page from disk. Accordingly I would like to echo a simple 404 Page not Found string from nginx. I tried many things so far but they all failed, essentially the idea was this: location / { try_files $uri @not_found; } location @not_found { echo "404 - Page not found"; } The problem is that as soon as the echo directive is used, the http response code is set to 200. I tried changing that by doing error_page 200 = 400 but that breaks the configuration. How can I serve a 404 page directly from nginx? (without hacking the source which may be might next step)

Read the article
Zen and the Art of File and Folder Organization

- by Mark Virtue

Is your desk a paragon of neatness, or does it look like a paper-bomb has gone off? If you’ve been putting off getting organized because the task is too huge or daunting, or you don’t know where to start, we’ve got 40 tips to get you on the path to zen mastery of your filing system. For all those readers who would like to get their files and folders organized, or, if they’re already organized, better organized—we have compiled a complete guide to getting organized and staying organized, a comprehensive article that will hopefully cover every possible tip you could want. Signs that Your Computer is Poorly Organized If your computer is a mess, you’re probably already aware of it. But just in case you’re not, here are some tell-tale signs: Your Desktop has over 40 icons on it “My Documents” contains over 300 files and 60 folders, including MP3s and digital photos You use the Windows’ built-in search facility whenever you need to find a file You can’t find programs in the out-of-control list of programs in your Start Menu You save all your Word documents in one folder, all your spreadsheets in a second folder, etc Any given file that you’re looking for may be in any one of four different sets of folders But before we start, here are some quick notes: We’re going to assume you know what files and folders are, and how to create, save, rename, copy and delete them The organization principles described in this article apply equally to all computer systems. However, the screenshots here will reflect how things look on Windows (usually Windows 7). We will also mention some useful features of Windows that can help you get organized. Everyone has their own favorite methodology of organizing and filing, and it’s all too easy to get into “My Way is Better than Your Way” arguments. The reality is that there is no perfect way of getting things organized. When I wrote this article, I tried to keep a generalist and objective viewpoint. I consider myself to be unusually well organized (to the point of obsession, truth be told), and I’ve had 25 years experience in collecting and organizing files on computers. So I’ve got a lot to say on the subject. But the tips I have described here are only one way of doing it. Hopefully some of these tips will work for you too, but please don’t read this as any sort of “right” way to do it. At the end of the article we’ll be asking you, the reader, for your own organization tips. Why Bother Organizing At All? For some, the answer to this question is self-evident. And yet, in this era of powerful desktop search software (the search capabilities built into the Windows Vista and Windows 7 Start Menus, and third-party programs like Google Desktop Search), the question does need to be asked, and answered. I have a friend who puts every file he ever creates, receives or downloads into his My Documents folder and doesn’t bother filing them into subfolders at all. He relies on the search functionality built into his Windows operating system to help him find whatever he’s looking for. And he always finds it. He’s a Search Samurai. For him, filing is a waste of valuable time that could be spent enjoying life! It’s tempting to follow suit. On the face of it, why would anyone bother to take the time to organize their hard disk when such excellent search software is available? Well, if all you ever want to do with the files you own is to locate and open them individually (for listening, editing, etc), then there’s no reason to ever bother doing one scrap of organization. But consider these common tasks that are not achievable with desktop search software: Find files manually. Often it’s not convenient, speedy or even possible to utilize your desktop search software to find what you want. It doesn’t work 100% of the time, or you may not even have it installed. Sometimes its just plain faster to go straight to the file you want, if you know it’s in a particular sub-folder, rather than trawling through hundreds of search results. Find groups of similar files (e.g. all your “work” files, all the photos of your Europe holiday in 2008, all your music videos, all the MP3s from Dark Side of the Moon, all your letters you wrote to your wife, all your tax returns). Clever naming of the files will only get you so far. Sometimes it’s the date the file was created that’s important, other times it’s the file format, and other times it’s the purpose of the file. How do you name a collection of files so that they’re easy to isolate based on any of the above criteria? Short answer, you can’t. Move files to a new computer. It’s time to upgrade your computer. How do you quickly grab all the files that are important to you? Or you decide to have two computers now – one for home and one for work. How do you quickly isolate only the work-related files to move them to the work computer? Synchronize files to other computers. If you have more than one computer, and you need to mirror some of your files onto the other computer (e.g. your music collection), then you need a way to quickly determine which files are to be synced and which are not. Surely you don’t want to synchronize everything? Choose which files to back up. If your backup regime calls for multiple backups, or requires speedy backups, then you’ll need to be able to specify which files are to be backed up, and which are not. This is not possible if they’re all in the same folder. Finally, if you’re simply someone who takes pleasure in being organized, tidy and ordered (me! me!), then you don’t even need a reason. Being disorganized is simply unthinkable. Tips on Getting Organized Here we present our 40 best tips on how to get organized. Or, if you’re already organized, to get better organized. Tip #1. Choose Your Organization System Carefully The reason that most people are not organized is that it takes time. And the first thing that takes time is deciding upon a system of organization. This is always a matter of personal preference, and is not something that a geek on a website can tell you. You should always choose your own system, based on how your own brain is organized (which makes the assumption that your brain is, in fact, organized). We can’t instruct you, but we can make suggestions: You may want to start off with a system based on the users of the computer. i.e. “My Files”, “My Wife’s Files”, My Son’s Files”, etc. Inside “My Files”, you might then break it down into “Personal” and “Business”. You may then realize that there are overlaps. For example, everyone may want to share access to the music library, or the photos from the school play. So you may create another folder called “Family”, for the “common” files. You may decide that the highest-level breakdown of your files is based on the “source” of each file. In other words, who created the files. You could have “Files created by ME (business or personal)”, “Files created by people I know (family, friends, etc)”, and finally “Files created by the rest of the world (MP3 music files, downloaded or ripped movies or TV shows, software installation files, gorgeous desktop wallpaper images you’ve collected, etc).” This system happens to be the one I use myself. See below: Mark is for files created by meVC is for files created by my company (Virtual Creations)Others is for files created by my friends and familyData is the rest of the worldAlso, Settings is where I store the configuration files and other program data files for my installed software (more on this in tip #34, below). Each folder will present its own particular set of requirements for further sub-organization. For example, you may decide to organize your music collection into sub-folders based on the artist’s name, while your digital photos might get organized based on the date they were taken. It can be different for every sub-folder! Another strategy would be based on “currentness”. Files you have yet to open and look at live in one folder. Ones that have been looked at but not yet filed live in another place. Current, active projects live in yet another place. All other files (your “archive”, if you like) would live in a fourth folder. (And of course, within that last folder you’d need to create a further sub-system based on one of the previous bullet points). Put some thought into this – changing it when it proves incomplete can be a big hassle! Before you go to the trouble of implementing any system you come up with, examine a wide cross-section of the files you own and see if they will all be able to find a nice logical place to sit within your system. Tip #2. When You Decide on Your System, Stick to It! There’s nothing more pointless than going to all the trouble of creating a system and filing all your files, and then whenever you create, receive or download a new file, you simply dump it onto your Desktop. You need to be disciplined – forever! Every new file you get, spend those extra few seconds to file it where it belongs! Otherwise, in just a month or two, you’ll be worse off than before – half your files will be organized and half will be disorganized – and you won’t know which is which! Tip #3. Choose the Root Folder of Your Structure Carefully Every data file (document, photo, music file, etc) that you create, own or is important to you, no matter where it came from, should be found within one single folder, and that one single folder should be located at the root of your C: drive (as a sub-folder of C:\). In other words, do not base your folder structure in standard folders like “My Documents”. If you do, then you’re leaving it up to the operating system engineers to decide what folder structure is best for you. And every operating system has a different system! In Windows 7 your files are found in C:\Users\YourName, whilst on Windows XP it was C:\Documents and Settings\YourName\My Documents. In UNIX systems it’s often /home/YourName. These standard default folders tend to fill up with junk files and folders that are not at all important to you. “My Documents” is the worst offender. Every second piece of software you install, it seems, likes to create its own folder in the “My Documents” folder. These folders usually don’t fit within your organizational structure, so don’t use them! In fact, don’t even use the “My Documents” folder at all. Allow it to fill up with junk, and then simply ignore it. It sounds heretical, but: Don’t ever visit your “My Documents” folder! Remove your icons/links to “My Documents” and replace them with links to the folders you created and you care about! Create your own file system from scratch! Probably the best place to put it would be on your D: drive – if you have one. This way, all your files live on one drive, while all the operating system and software component files live on the C: drive – simply and elegantly separated. The benefits of that are profound. Not only are there obvious organizational benefits (see tip #10, below), but when it comes to migrate your data to a new computer, you can (sometimes) simply unplug your D: drive and plug it in as the D: drive of your new computer (this implies that the D: drive is actually a separate physical disk, and not a partition on the same disk as C:). You also get a slight speed improvement (again, only if your C: and D: drives are on separate physical disks). Warning: From tip #12, below, you will see that it’s actually a good idea to have exactly the same file system structure – including the drive it’s filed on – on all of the computers you own. So if you decide to use the D: drive as the storage system for your own files, make sure you are able to use the D: drive on all the computers you own. If you can’t ensure that, then you can still use a clever geeky trick to store your files on the D: drive, but still access them all via the C: drive (see tip #17, below). If you only have one hard disk (C:), then create a dedicated folder that will contain all your files – something like C:\Files. The name of the folder is not important, but make it a single, brief word. There are several reasons for this: When creating a backup regime, it’s easy to decide what files should be backed up – they’re all in the one folder! If you ever decide to trade in your computer for a new one, you know exactly which files to migrate You will always know where to begin a search for any file If you synchronize files with other computers, it makes your synchronization routines very simple. It also causes all your shortcuts to continue to work on the other machines (more about this in tip #24, below). Once you’ve decided where your files should go, then put all your files in there – Everything! Completely disregard the standard, default folders that are created for you by the operating system (“My Music”, “My Pictures”, etc). In fact, you can actually relocate many of those folders into your own structure (more about that below, in tip #6). The more completely you get all your data files (documents, photos, music, etc) and all your configuration settings into that one folder, then the easier it will be to perform all of the above tasks. Once this has been done, and all your files live in one folder, all the other folders in C:\ can be thought of as “operating system” folders, and therefore of little day-to-day interest for us. Here’s a screenshot of a nicely organized C: drive, where all user files are located within the \Files folder: Tip #4. Use Sub-Folders This would be our simplest and most obvious tip. It almost goes without saying. Any organizational system you decide upon (see tip #1) will require that you create sub-folders for your files. Get used to creating folders on a regular basis. Tip #5. Don’t be Shy About Depth Create as many levels of sub-folders as you need. Don’t be scared to do so. Every time you notice an opportunity to group a set of related files into a sub-folder, do so. Examples might include: All the MP3s from one music CD, all the photos from one holiday, or all the documents from one client. It’s perfectly okay to put files into a folder called C:\Files\Me\From Others\Services\WestCo Bank\Statements\2009. That’s only seven levels deep. Ten levels is not uncommon. Of course, it’s possible to take this too far. If you notice yourself creating a sub-folder to hold only one file, then you’ve probably become a little over-zealous. On the other hand, if you simply create a structure with only two levels (for example C:\Files\Work) then you really haven’t achieved any level of organization at all (unless you own only six files!). Your “Work” folder will have become a dumping ground, just like your Desktop was, with most likely hundreds of files in it. Tip #6. Move the Standard User Folders into Your Own Folder Structure Most operating systems, including Windows, create a set of standard folders for each of its users. These folders then become the default location for files such as documents, music files, digital photos and downloaded Internet files. In Windows 7, the full list is shown below: Some of these folders you may never use nor care about (for example, the Favorites folder, if you’re not using Internet Explorer as your browser). Those ones you can leave where they are. But you may be using some of the other folders to store files that are important to you. Even if you’re not using them, Windows will still often treat them as the default storage location for many types of files. When you go to save a standard file type, it can become annoying to be automatically prompted to save it in a folder that’s not part of your own file structure. But there’s a simple solution: Move the folders you care about into your own folder structure! If you do, then the next time you go to save a file of the corresponding type, Windows will prompt you to save it in the new, moved location. Moving the folders is easy. Simply drag-and-drop them to the new location. Here’s a screenshot of the default My Music folder being moved to my custom personal folder (Mark): Tip #7. Name Files and Folders Intelligently This is another one that almost goes without saying, but we’ll say it anyway: Do not allow files to be created that have meaningless names like Document1.doc, or folders called New Folder (2). Take that extra 20 seconds and come up with a meaningful name for the file/folder – one that accurately divulges its contents without repeating the entire contents in the name. Tip #8. Watch Out for Long Filenames Another way to tell if you have not yet created enough depth to your folder hierarchy is that your files often require really long names. If you need to call a file Johnson Sales Figures March 2009.xls (which might happen to live in the same folder as Abercrombie Budget Report 2008.xls), then you might want to create some sub-folders so that the first file could be simply called March.xls, and living in the Clients\Johnson\Sales Figures\2009 folder. A well-placed file needs only a brief filename! Tip #9. Use Shortcuts! Everywhere! This is probably the single most useful and important tip we can offer. A shortcut allows a file to be in two places at once. Why would you want that? Well, the file and folder structure of every popular operating system on the market today is hierarchical. This means that all objects (files and folders) always live within exactly one parent folder. It’s a bit like a tree. A tree has branches (folders) and leaves (files). Each leaf, and each branch, is supported by exactly one parent branch, all the way back to the root of the tree (which, incidentally, is exactly why C:\ is called the “root folder” of the C: drive). That hard disks are structured this way may seem obvious and even necessary, but it’s only one way of organizing data. There are others: Relational databases, for example, organize structured data entirely differently. The main limitation of hierarchical filing structures is that a file can only ever be in one branch of the tree – in only one folder – at a time. Why is this a problem? Well, there are two main reasons why this limitation is a problem for computer users: The “correct” place for a file, according to our organizational rationale, is very often a very inconvenient place for that file to be located. Just because it’s correctly filed doesn’t mean it’s easy to get to. Your file may be “correctly” buried six levels deep in your sub-folder structure, but you may need regular and speedy access to this file every day. You could always move it to a more convenient location, but that would mean that you would need to re-file back to its “correct” location it every time you’d finished working on it. Most unsatisfactory. A file may simply “belong” in two or more different locations within your file structure. For example, say you’re an accountant and you have just completed the 2009 tax return for John Smith. It might make sense to you to call this file 2009 Tax Return.doc and file it under Clients\John Smith. But it may also be important to you to have the 2009 tax returns from all your clients together in the one place. So you might also want to call the file John Smith.doc and file it under Tax Returns\2009. The problem is, in a purely hierarchical filing system, you can’t put it in both places. Grrrrr! Fortunately, Windows (and most other operating systems) offers a way for you to do exactly that: It’s called a “shortcut” (also known as an “alias” on Macs and a “symbolic link” on UNIX systems). Shortcuts allow a file to exist in one place, and an icon that represents the file to be created and put anywhere else you please. In fact, you can create a dozen such icons and scatter them all over your hard disk. Double-clicking on one of these icons/shortcuts opens up the original file, just as if you had double-clicked on the original file itself. Consider the following two icons: The one on the left is the actual Word document, while the one on the right is a shortcut that represents the Word document. Double-clicking on either icon will open the same file. There are two main visual differences between the icons: The shortcut will have a small arrow in the lower-left-hand corner (on Windows, anyway) The shortcut is allowed to have a name that does not include the file extension (the “.docx” part, in this case) You can delete the shortcut at any time without losing any actual data. The original is still intact. All you lose is the ability to get to that data from wherever the shortcut was. So why are shortcuts so great? Because they allow us to easily overcome the main limitation of hierarchical file systems, and put a file in two (or more) places at the same time. You will always have files that don’t play nice with your organizational rationale, and can’t be filed in only one place. They demand to exist in two places. Shortcuts allow this! Furthermore, they allow you to collect your most often-opened files and folders together in one spot for convenient access. The cool part is that the original files stay where they are, safe forever in their perfectly organized location. So your collection of most often-opened files can – and should – become a collection of shortcuts! If you’re still not convinced of the utility of shortcuts, consider the following well-known areas of a typical Windows computer: The Start Menu (and all the programs that live within it) The Quick Launch bar (or the Superbar in Windows 7) The “Favorite folders” area in the top-left corner of the Windows Explorer window (in Windows Vista or Windows 7) Your Internet Explorer Favorites or Firefox Bookmarks Each item in each of these areas is a shortcut! Each of those areas exist for one purpose only: For convenience – to provide you with a collection of the files and folders you access most often. It should be easy to see by now that shortcuts are designed for one single purpose: To make accessing your files more convenient. Each time you double-click on a shortcut, you are saved the hassle of locating the file (or folder, or program, or drive, or control panel icon) that it represents. Shortcuts allow us to invent a golden rule of file and folder organization: “Only ever have one copy of a file – never have two copies of the same file. Use a shortcut instead” (this rule doesn’t apply to copies created for backup purposes, of course!) There are also lesser rules, like “don’t move a file into your work area – create a shortcut there instead”, and “any time you find yourself frustrated with how long it takes to locate a file, create a shortcut to it and place that shortcut in a convenient location.” So how to we create these massively useful shortcuts? There are two main ways: “Copy” the original file or folder (click on it and type Ctrl-C, or right-click on it and select Copy): Then right-click in an empty area of the destination folder (the place where you want the shortcut to go) and select Paste shortcut: Right-drag (drag with the right mouse button) the file from the source folder to the destination folder. When you let go of the mouse button at the destination folder, a menu pops up: Select Create shortcuts here. Note that when shortcuts are created, they are often named something like Shortcut to Budget Detail.doc (windows XP) or Budget Detail – Shortcut.doc (Windows 7). If you don’t like those extra words, you can easily rename the shortcuts after they’re created, or you can configure Windows to never insert the extra words in the first place (see our article on how to do this). And of course, you can create shortcuts to folders too, not just to files! Bottom line: Whenever you have a file that you’d like to access from somewhere else (whether it’s convenience you’re after, or because the file simply belongs in two places), create a shortcut to the original file in the new location. Tip #10. Separate Application Files from Data Files Any digital organization guru will drum this rule into you. Application files are the components of the software you’ve installed (e.g. Microsoft Word, Adobe Photoshop or Internet Explorer). Data files are the files that you’ve created for yourself using that software (e.g. Word Documents, digital photos, emails or playlists). Software gets installed, uninstalled and upgraded all the time. Hopefully you always have the original installation media (or downloaded set-up file) kept somewhere safe, and can thus reinstall your software at any time. This means that the software component files are of little importance. Whereas the files you have created with that software is, by definition, important. It’s a good rule to always separate unimportant files from important files. So when your software prompts you to save a file you’ve just created, take a moment and check out where it’s suggesting that you save the file. If it’s suggesting that you save the file into the same folder as the software itself, then definitely don’t follow that suggestion. File it in your own folder! In fact, see if you can find the program’s configuration option that determines where files are saved by default (if it has one), and change it. Tip #11. Organize Files Based on Purpose, Not on File Type If you have, for example a folder called Work\Clients\Johnson, and within that folder you have two sub-folders, Word Documents and Spreadsheets (in other words, you’re separating “.doc” files from “.xls” files), then chances are that you’re not optimally organized. It makes little sense to organize your files based on the program that created them. Instead, create your sub-folders based on the purpose of the file. For example, it would make more sense to create sub-folders called Correspondence and Financials. It may well be that all the files in a given sub-folder are of the same file-type, but this should be more of a coincidence and less of a design feature of your organization system. Tip #12. Maintain the Same Folder Structure on All Your Computers In other words, whatever organizational system you create, apply it to every computer that you can. There are several benefits to this: There’s less to remember. No matter where you are, you always know where to look for your files If you copy or synchronize files from one computer to another, then setting up the synchronization job becomes very simple Shortcuts can be copied or moved from one computer to another with ease (assuming the original files are also copied/moved). There’s no need to find the target of the shortcut all over again on the second computer Ditto for linked files (e.g Word documents that link to data in a separate Excel file), playlists, and any files that reference the exact file locations of other files. This applies even to the drive that your files are stored on. If your files are stored on C: on one computer, make sure they’re stored on C: on all your computers. Otherwise all your shortcuts, playlists and linked files will stop working! Tip #13. Create an “Inbox” Folder Create yourself a folder where you store all files that you’re currently working on, or that you haven’t gotten around to filing yet. You can think of this folder as your “to-do” list. You can call it “Inbox” (making it the same metaphor as your email system), or “Work”, or “To-Do”, or “Scratch”, or whatever name makes sense to you. It doesn’t matter what you call it – just make sure you have one! Once you have finished working on a file, you then move it from the “Inbox” to its correct location within your organizational structure. You may want to use your Desktop as this “Inbox” folder. Rightly or wrongly, most people do. It’s not a bad place to put such files, but be careful: If you do decide that your Desktop represents your “to-do” list, then make sure that no other files find their way there. In other words, make sure that your “Inbox”, wherever it is, Desktop or otherwise, is kept free of junk – stray files that don’t belong there. So where should you put this folder, which, almost by definition, lives outside the structure of the rest of your filing system? Well, first and foremost, it has to be somewhere handy. This will be one of your most-visited folders, so convenience is key. Putting it on the Desktop is a great option – especially if you don’t have any other folders on your Desktop: the folder then becomes supremely easy to find in Windows Explorer: You would then create shortcuts to this folder in convenient spots all over your computer (“Favorite Links”, “Quick Launch”, etc). Tip #14. Ensure You have Only One “Inbox” Folder Once you’ve created your “Inbox” folder, don’t use any other folder location as your “to-do list”. Throw every incoming or created file into the Inbox folder as you create/receive it. This keeps the rest of your computer pristine and free of randomly created or downloaded junk. The last thing you want to be doing is checking multiple folders to see all your current tasks and projects. Gather them all together into one folder. Here are some tips to help ensure you only have one Inbox: Set the default “save” location of all your programs to this folder. Set the default “download” location for your browser to this folder. If this folder is not your desktop (recommended) then also see if you can make a point of not putting “to-do” files on your desktop. This keeps your desktop uncluttered and Zen-like: (the Inbox folder is in the bottom-right corner) Tip #15. Be Vigilant about Clearing Your “Inbox” Folder This is one of the keys to staying organized. If you let your “Inbox” overflow (i.e. allow there to be more than, say, 30 files or folders in there), then you’re probably going to start feeling like you’re overwhelmed: You’re not keeping up with your to-do list. Once your Inbox gets beyond a certain point (around 30 files, studies have shown), then you’ll simply start to avoid it. You may continue to put files in there, but you’ll be scared to look at it, fearing the “out of control” feeling that all overworked, chaotic or just plain disorganized people regularly feel. So, here’s what you can do: Visit your Inbox/to-do folder regularly (at least five times per day). Scan the folder regularly for files that you have completed working on and are ready for filing. File them immediately. Make it a source of pride to keep the number of files in this folder as small as possible. If you value peace of mind, then make the emptiness of this folder one of your highest (computer) priorities If you know that a particular file has been in the folder for more than, say, six weeks, then admit that you’re not actually going to get around to processing it, and move it to its final resting place. Tip #16. File Everything Immediately, and Use Shortcuts for Your Active Projects As soon as you create, receive or download a new file, store it away in its “correct” folder immediately. Then, whenever you need to work on it (possibly straight away), create a shortcut to it in your “Inbox” (“to-do”) folder or your desktop. That way, all your files are always in their “correct” locations, yet you still have immediate, convenient access to your current, active files. When you finish working on a file, simply delete the shortcut. Ideally, your “Inbox” folder – and your Desktop – should contain no actual files or folders. They should simply contain shortcuts. Tip #17. Use Directory Symbolic Links (or Junctions) to Maintain One Unified Folder Structure Using this tip, we can get around a potential hiccup that we can run into when creating our organizational structure – the issue of having more than one drive on our computer (C:, D:, etc). We might have files we need to store on the D: drive for space reasons, and yet want to base our organized folder structure on the C: drive (or vice-versa). Your chosen organizational structure may dictate that all your files must be accessed from the C: drive (for example, the root folder of all your files may be something like C:\Files). And yet you may still have a D: drive and wish to take advantage of the hundreds of spare Gigabytes that it offers. Did you know that it’s actually possible to store your files on the D: drive and yet access them as if they were on the C: drive? And no, we’re not talking about shortcuts here (although the concept is very similar). By using the shell command mklink, you can essentially take a folder that lives on one drive and create an alias for it on a different drive (you can do lots more than that with mklink – for a full rundown on this programs capabilities, see our dedicated article). These aliases are called directory symbolic links (and used to be known as junctions). You can think of them as “virtual” folders. They function exactly like regular folders, except they’re physically located somewhere else. For example, you may decide that your entire D: drive contains your complete organizational file structure, but that you need to reference all those files as if they were on the C: drive, under C:\Files. If that was the case you could create C:\Files as a directory symbolic link – a link to D:, as follows: mklink /d c:\files d:\ Or it may be that the only files you wish to store on the D: drive are your movie collection. You could locate all your movie files in the root of your D: drive, and then link it to C:\Files\Media\Movies, as follows: mklink /d c:\files\media\movies d:\ (Needless to say, you must run these commands from a command prompt – click the Start button, type cmd and press Enter) Tip #18. Customize Your Folder Icons This is not strictly speaking an organizational tip, but having unique icons for each folder does allow you to more quickly visually identify which folder is which, and thus saves you time when you’re finding files. An example is below (from my folder that contains all files downloaded from the Internet): To learn how to change your folder icons, please refer to our dedicated article on the subject. Tip #19. Tidy Your Start Menu The Windows Start Menu is usually one of the messiest parts of any Windows computer. Every program you install seems to adopt a completely different approach to placing icons in this menu. Some simply put a single program icon. Others create a folder based on the name of the software. And others create a folder based on the name of the software manufacturer. It’s chaos, and can make it hard to find the software you want to run. Thankfully we can avoid this chaos with useful operating system features like Quick Launch, the Superbar or pinned start menu items. Even so, it would make a lot of sense to get into the guts of the Start Menu itself and give it a good once-over. All you really need to decide is how you’re going to organize your applications. A structure based on the purpose of the application is an obvious candidate. Below is an example of one such structure: In this structure, Utilities means software whose job it is to keep the computer itself running smoothly (configuration tools, backup software, Zip programs, etc). Applications refers to any productivity software that doesn’t fit under the headings Multimedia, Graphics, Internet, etc. In case you’re not aware, every icon in your Start Menu is a shortcut and can be manipulated like any other shortcut (copied, moved, deleted, etc). With the Windows Start Menu (all version of Windows), Microsoft has decided that there be two parallel folder structures to store your Start Menu shortcuts. One for you (the logged-in user of the computer) and one for all users of the computer. Having two parallel structures can often be redundant: If you are the only user of the computer, then having two parallel structures is totally redundant. Even if you have several users that regularly log into the computer, most of your installed software will need to be made available to all users, and should thus be moved out of the “just you” version of the Start Menu and into the “all users” area. To take control of your Start Menu, so you can start organizing it, you’ll need to know how to access the actual folders and shortcut files that make up the Start Menu (both versions of it). To find these folders and files, click the Start button and then right-click on the All Programs text (Windows XP users should right-click on the Start button itself): The Open option refers to the “just you” version of the Start Menu, while the Open All Users option refers to the “all users” version. Click on the one you want to organize. A Windows Explorer window then opens with your chosen version of the Start Menu selected. From there it’s easy. Double-click on the Programs folder and you’ll see all your folders and shortcuts. Now you can delete/rename/move until it’s just the way you want it. Note: When you’re reorganizing your Start Menu, you may want to have two Explorer windows open at the same time – one showing the “just you” version and one showing the “all users” version. You can drag-and-drop between the windows. Tip #20. Keep Your Start Menu Tidy Once you have a perfectly organized Start Menu, try to be a little vigilant about keeping it that way. Every time you install a new piece of software, the icons that get created will almost certainly violate your organizational structure. So to keep your Start Menu pristine and organized, make sure you do the following whenever you install a new piece of software: Check whether the software was installed into the “just you” area of the Start Menu, or the “all users” area, and then move it to the correct area. Remove all the unnecessary icons (like the “Read me” icon, the “Help” icon (you can always open the help from within the software itself when it’s running), the “Uninstall” icon, the link(s)to the manufacturer’s website, etc) Rename the main icon(s) of the software to something brief that makes sense to you. For example, you might like to rename Microsoft Office Word 2010 to simply Word Move the icon(s) into the correct folder based on your Start Menu organizational structure And don’t forget: when you uninstall a piece of software, the software’s uninstall routine is no longer going to be able to remove the software’s icon from the Start Menu (because you moved and/or renamed it), so you’ll need to remove that icon manually. Tip #21. Tidy C:\ The root of your C: drive (C:\) is a common dumping ground for files and folders – both by the users of your computer and by the software that you install on your computer. It can become a mess. There’s almost no software these days that requires itself to be installed in C:\. 99% of the time it can and should be installed into C:\Program Files. And as for your own files, well, it’s clear that they can (and almost always should) be stored somewhere else. In an ideal world, your C:\ folder should look like this (on Windows 7): Note that there are some system files and folders in C:\ that are usually and deliberately “hidden” (such as the Windows virtual memory file pagefile.sys, the boot loader file bootmgr, and the System Volume Information folder). Hiding these files and folders is a good idea, as they need to stay where they are and are almost never needed to be opened or even seen by you, the user. Hiding them prevents you from accidentally messing with them, and enhances your sense of order and well-being when you look at your C: drive folder. Tip #22. Tidy Your Desktop The Desktop is probably the most abused part of a Windows computer (from an organization point of view). It usually serves as a dumping ground for all incoming files, as well as holding icons to oft-used applications, plus some regularly opened files and folders. It often ends up becoming an uncontrolled mess. See if you can avoid this. Here’s why… Application icons (Word, Internet Explorer, etc) are often found on the Desktop, but it’s unlikely that this is the optimum place for them. The “Quick Launch” bar (or the Superbar in Windows 7) is always visible and so represents a perfect location to put your icons. You’ll only be able to see the icons on your Desktop when all your programs are minimized. It might be time to get your application icons off your desktop… You may have decided that the Inbox/To-do folder on your computer (see tip #13, above) should be your Desktop. If so, then enough said. Simply be vigilant about clearing it and preventing it from being polluted by junk files (see tip #15, above). On the other hand, if your Desktop is not acting as your “Inbox” folder, then there’s no reason for it to have any data files or folders on it at all, except perhaps a couple of shortcuts to often-opened files and folders (either ongoing or current projects). Everything else should be moved to your “Inbox” folder. In an ideal world, it might look like this: Tip #23. Move Permanent Items on Your Desktop Away from the Top-Left Corner When files/folders are dragged onto your desktop in a Windows Explorer window, or when shortcuts are created on your Desktop from Internet Explorer, those icons are always placed in the top-left corner – or as close as they can get. If you have other files, folders or shortcuts that you keep on the Desktop permanently, then it’s a good idea to separate these permanent icons from the transient ones, so that you can quickly identify which ones the transients are. An easy way to do this is to move all your permanent icons to the right-hand side of your Desktop. That should keep them separated from incoming items. Tip #24. Synchronize If you have more than one computer, you’ll almost certainly want to share files between them. If the computers are permanently attached to the same local network, then there’s no need to store multiple copies of any one file or folder – shortcuts will suffice. However, if the computers are not always on the same network, then you will at some point need to copy files between them. For files that need to permanently live on both computers, the ideal way to do this is to synchronize the files, as opposed to simply copying them. We only have room here to write a brief summary of synchronization, not a full article. In short, there are several different types of synchronization: Where the contents of one folder are accessible anywhere, such as with Dropbox Where the contents of any number of folders are accessible anywhere, such as with Windows Live Mesh Where any files or folders from anywhere on your computer are synchronized with exactly one other computer, such as with the Windows “Briefcase”, Microsoft SyncToy, or (much more powerful, yet still free) SyncBack from 2BrightSparks. This only works when both computers are on the same local network, at least temporarily. A great advantage of synchronization solutions is that once you’ve got it configured the way you want it, then the sync process happens automatically, every time. Click a button (or schedule it to happen automatically) and all your files are automagically put where they’re supposed to be. If you maintain the same file and folder structure on both computers, then you can also sync files depend upon the correct location of other files, like shortcuts, playlists and office documents that link to other office documents, and the synchronized files still work on the other computer! Tip #25. Hide Files You Never Need to See If you have your files well organized, you will often be able to tell if a file is out of place just by glancing at the contents of a folder (for example, it should be pretty obvious if you look in a folder that contains all the MP3s from one music CD and see a Word document in there). This is a good thing – it allows you to determine if there are files out of place with a quick glance. Yet sometimes there are files in a folder that seem out of place but actually need to be there, such as the “folder art” JPEGs in music folders, and various files in the root of the C: drive. If such files never need to be opened by you, then a good idea is to simply hide them. Then, the next time you glance at the folder, you won’t have to remember whether that file was supposed to be there or not, because you won’t see it at all! To hide a file, simply right-click on it and choose Properties: Then simply tick the Hidden tick-box: Tip #26. Keep Every Setup File These days most software is downloaded from the Internet. Whenever you download a piece of software, keep it. You’ll never know when you need to reinstall the software. Further, keep with it an Internet shortcut that links back to the website where you originally downloaded it, in case you ever need to check for updates. See tip #33 below for a full description of the excellence of organizing your setup files. Tip #27. Try to Minimize the Number of Folders that Contain Both Files and Sub-folders Some of the folders in your organizational structure will contain only files. Others will contain only sub-folders. And you will also have some folders that contain both files and sub-folders. You will notice slight improvements in how long it takes you to locate a file if you try to avoid this third type of folder. It’s not always possible, of course – you’ll always have some of these folders, but see if you can avoid it. One way of doing this is to take all the leftover files that didn’t end up getting stored in a sub-folder and create a special “Miscellaneous” or “Other” folder for them. Tip #28. Starting a Filename with an Underscore Brings it to the Top of a List Further to the previous tip, if you name that “Miscellaneous” or “Other” folder in such a way that its name begins with an underscore “_”, then it will appear at the top of the list of files/folders. The screenshot below is an example of this. Each folder in the list contains a set of digital photos. The folder at the top of the list, _Misc, contains random photos that didn’t deserve their own dedicated folder: Tip #29. Clean Up those CD-ROMs and (shudder!) Floppy Disks Have you got a pile of CD-ROMs stacked on a shelf of your office? Old photos, or files you archived off onto CD-ROM (or even worse, floppy disks!) because you didn’t have enough disk space at the time? In the meantime have you upgraded your computer and now have 500 Gigabytes of space you don’t know what to do with? If so, isn’t it time you tidied up that stack of disks and filed them into your gorgeous new folder structure? So what are you waiting for? Bite the bullet, copy them all back onto your computer, file them in their appropriate folders, and then back the whole lot up onto a shiny new 1000Gig external hard drive! Useful Folders to Create This next section suggests some useful folders that you might want to create within your folder structure. I’ve personally found them to be indispensable. The first three are all about convenience – handy folders to create and then put somewhere that you can always access instantly. For each one, it’s not so important where the actual folder is located, but it’s very important where you put the shortcut(s) to the folder. You might want to locate the shortcuts: On your Desktop In your “Quick Launch” area (or pinned to your Windows 7 Superbar) In your Windows Explorer “Favorite Links” area Tip #30. Create an “Inbox” (“To-Do”) Folder This has already been mentioned in depth (see tip #13), but we wanted to reiterate its importance here. This folder contains all the recently created, received or downloaded files that you have not yet had a chance to file away properly, and it also may contain files that you have yet to process. In effect, it becomes a sort of “to-do list”. It doesn’t have to be called “Inbox” – you can call it whatever you want. Tip #31. Create a Folder where Your Current Projects are Collected Rather than going hunting for them all the time, or dumping them all on your desktop, create a special folder where you put links (or work folders) for each of the projects you’re currently working on. You can locate this folder in your “Inbox” folder, on your desktop, or anywhere at all – just so long as there’s a way of getting to it quickly, such as putting a link to it in Windows Explorer’s “Favorite Links” area: Tip #32. Create a Folder for Files and Folders that You Regularly Open You will always have a few files that you open regularly, whether it be a spreadsheet of your current accounts, or a favorite playlist. These are not necessarily “current projects”, rather they’re simply files that you always find yourself opening. Typically such files would be located on your desktop (or even better, shortcuts to those files). Why not collect all such shortcuts together and put them in their own special folder? As with the “Current Projects” folder (above), you would want to locate that folder somewhere convenient. Below is an example of a folder called “Quick links”, with about seven files (shortcuts) in it, that is accessible through the Windows Quick Launch bar: See tip #37 below for a full explanation of the power of the Quick Launch bar. Tip #33. Create a “Set-ups” Folder A typical computer has dozens of applications installed on it. For each piece of software, there are often many different pieces of information you need to keep track of, including: The original installation setup file(s). This can be anything from a simple 100Kb setup.exe file you downloaded from a website, all the way up to a 4Gig ISO file that you copied from a DVD-ROM that you purchased. The home page of the software manufacturer (in case you need to look up something on their support pages, their forum or their online help) The page containing the download link for your actual file (in case you need to re-download it, or download an upgraded version) The serial number Your proof-of-purchase documentation Any other template files, plug-ins, themes, etc that also need to get installed For each piece of software, it’s a great idea to gather all of these files together and put them in a single folder. The folder can be the name of the software (plus possibly a very brief description of what it’s for – in case you can’t remember what the software does based in its name). Then you would gather all of these folders together into one place, and call it something like “Software” or “Setups”. If you have enough of these folders (I have several hundred, being a geek, collected over 20 years), then you may want to further categorize them. My own categorization structure is based on “platform” (operating system): The last seven folders each represents one platform/operating system, while _Operating Systems contains set-up files for installing the operating systems themselves. _Hardware contains ROMs for hardware I own, such as routers. Within the Windows folder (above), you can see the beginnings of the vast library of software I’ve compiled over the years: An example of a typical application folder looks like this: Tip #34. Have a “Settings” Folder We all know that our documents are important. So are our photos and music files. We save all of these files into folders, and then locate them afterwards and double-click on them to open them. But there are many files that are important to us that can’t be saved into folders, and then searched for and double-clicked later on. These files certainly contain important information that we need, but are often created internally by an application, and saved wherever that application feels is appropriate. A good example of this is the “PST” file that Outlook creates for us and uses to store all our emails, contacts, appointments and so forth. Another example would be the collection of Bookmarks that Firefox stores on your behalf. And yet another example would be the customized settings and configuration files of our all our software. Granted, most Windows programs store their configuration in the Registry, but there are still many programs that use configuration files to store their settings. Imagine if you lost all of the above files! And yet, when people are backing up their computers, they typically only back up the files they know about – those that are stored in the “My Documents” folder, etc. If they had a hard disk failure or their computer was lost or stolen, their backup files would not include some of the most vital files they owned. Also, when migrating to a new computer, it’s vital to ensure that these files make the journey. It can be a very useful idea to create yourself a folder to store all your “settings” – files that are important to you but which you never actually search for by name and double-click on to open them. Otherwise, next time you go to set up a new computer just the way you want it, you’ll need to spend hours recreating the configuration of your previous computer! So how to we get our important files into this folder? Well, we have a few options: Some programs (such as Outlook and its PST files) allow you to place these files wherever you want. If you delve into the program’s options, you will find a setting somewhere that controls the location of the important settings files (or “personal storage” – PST – when it comes to Outlook) Some programs do not allow you to change such locations in any easy way, but if you get into the Registry, you can sometimes find a registry key that refers to the location of the file(s). Simply move the file into your Settings folder and adjust the registry key to refer to the new location. Some programs stubbornly refuse to allow their settings files to be placed anywhere other then where they stipulate. When faced with programs like these, you have three choices: (1) You can ignore those files, (2) You can copy the files into your Settings folder (let’s face it – settings don’t change very often), or (3) you can use synchronization software, such as the Windows Briefcase, to make synchronized copies of all your files in your Settings folder. All you then have to do is to remember to run your sync software periodically (perhaps just before you run your backup software!). There are some other things you may decide to locate inside this new “Settings” folder: Exports of registry keys (from the many applications that store their configurations in the Registry). This is useful for backup purposes or for migrating to a new computer Notes you’ve made about all the specific customizations you have made to a particular piece of software (so that you’ll know how to do it all again on your next computer) Shortcuts to webpages that detail how to tweak certain aspects of your operating system or applications so they are just the way you like them (such as how to remove the words “Shortcut to” from the beginning of newly created shortcuts). In other words, you’d want to create shortcuts to half the pages on the How-To Geek website! Here’s an example of a “Settings” folder: Windows Features that Help with Organization This section details some of the features of Microsoft Windows that are a boon to anyone hoping to stay optimally organized. Tip #35. Use the “Favorite Links” Area to Access Oft-Used Folders Once you’ve created your great new filing system, work out which folders you access most regularly, or which serve as great starting points for locating the rest of the files in your folder structure, and then put links to those folders in your “Favorite Links” area of the left-hand side of the Windows Explorer window (simply called “Favorites” in Windows 7): Some ideas for folders you might want to add there include: Your “Inbox” folder (or whatever you’ve called it) – most important! The base of your filing structure (e.g. C:\Files) A folder containing shortcuts to often-accessed folders on other computers around the network (shown above as Network Folders) A folder containing shortcuts to your current projects (unless that folder is in your “Inbox” folder) Getting folders into this area is very simple – just locate the folder you’re interested in and drag it there! Tip #36. Customize the Places Bar in the File/Open and File/Save Boxes Consider the screenshot below: The highlighted icons (collectively known as the “Places Bar”) can be customized to refer to any folder location you want, allowing instant access to any part of your organizational structure. Note: These File/Open and File/Save boxes have been superseded by new versions that use the Windows Vista/Windows 7 “Favorite Links”, but the older versions (shown above) are still used by a surprisingly large number of applications. The easiest way to customize these icons is to use the Group Policy Editor, but not everyone has access to this program. If you do, open it up and navigate to: User Configuration > Administrative Templates > Windows Components > Windows Explorer > Common Open File Dialog If you don’t have access to the Group Policy Editor, then you’ll need to get into the Registry. Navigate to: HKEY_CURRENT_USER \ Software \ Microsoft \ Windows \ CurrentVersion \ Policies \ comdlg32 \ Placesbar It should then be easy to make the desired changes. Log off and log on again to allow the changes to take effect. Tip #37. Use the Quick Launch Bar as a Application and File Launcher That Quick Launch bar (to the right of the Start button) is a lot more useful than people give it credit for. Most people simply have half a dozen icons in it, and use it to start just those programs. But it can actually be used to instantly access just about anything in your filing system: For complete instructions on how to set this up, visit our dedicated article on this topic. Tip #38. Put a Shortcut to Windows Explorer into Your Quick Launch Bar This is only necessary in Windows Vista and Windows XP. The Microsoft boffins finally got wise and added it to the Windows 7 Superbar by default. Windows Explorer – the program used for managing your files and folders – is one of the most useful programs in Windows. Anyone who considers themselves serious about being organized needs instant access to this program at any time. A great place to create a shortcut to this program is in the Windows XP and Windows Vista “Quick Launch” bar: To get it there, locate it in your Start Menu (usually under “Accessories”) and then right-drag it down into your Quick Launch bar (and create a copy). Tip #39. Customize the Starting Folder for Your Windows 7 Explorer Superbar Icon If you’re on Windows 7, your Superbar will include a Windows Explorer icon. Clicking on the icon will launch Windows Explorer (of course), and will start you off in your “Libraries” folder. Libraries may be fine as a starting point, but if you have created yourself an “Inbox” folder, then it would probably make more sense to start off in this folder every time you launch Windows Explorer. To change this default/starting folder location, then first right-click the Explorer icon in the Superbar, and then right-click Properties:Then, in Target field of the Windows Explorer Properties box that appears, type %windir%\explorer.exe followed by the path of the folder you wish to start in. For example: %windir%\explorer.exe C:\Files If that folder happened to be on the Desktop (and called, say, “Inbox”), then you would use the following cleverness: %windir%\explorer.exe shell:desktop\Inbox Then click OK and test it out. Tip #40. Ummmmm…. No, that’s it. I can’t think of another one. That’s all of the tips I can come up with. I only created this one because 40 is such a nice round number… Case Study – An Organized PC To finish off the article, I have included a few screenshots of my (main) computer (running Vista). The aim here is twofold: To give you a sense of what it looks like when the above, sometimes abstract, tips are applied to a real-life computer, and To offer some ideas about folders and structure that you may want to steal to use on your own PC. Let’s start with the C: drive itself. Very minimal. All my files are contained within C:\Files. I’ll confine the rest of the case study to this folder: That folder contains the following: Mark: My personal files VC: My business (Virtual Creations, Australia) Others contains files created by friends and family Data contains files from the rest of the world (can be thought of as “public” files, usually downloaded from the Net) Settings is described above in tip #34 The Data folder contains the following sub-folders: Audio: Radio plays, audio books, podcasts, etc Development: Programmer and developer resources, sample source code, etc (see below) Humour: Jokes, funnies (those emails that we all receive) Movies: Downloaded and ripped movies (all legal, of course!), their scripts, DVD covers, etc. Music: (see below) Setups: Installation files for software (explained in full in tip #33) System: (see below) TV: Downloaded TV shows Writings: Books, instruction manuals, etc (see below) The Music folder contains the following sub-folders: Album covers: JPEG scans Guitar tabs: Text files of guitar sheet music Lists: e.g. “Top 1000 songs of all time” Lyrics: Text files MIDI: Electronic music files MP3 (representing 99% of the Music folder): MP3s, either ripped from CDs or downloaded, sorted by artist/album name Music Video: Video clips Sheet Music: usually PDFs The Data\Writings folder contains the following sub-folders: (all pretty self-explanatory) The Data\Development folder contains the following sub-folders: Again, all pretty self-explanatory (if you’re a geek) The Data\System folder contains the following sub-folders: These are usually themes, plug-ins and other downloadable program-specific resources. The Mark folder contains the following sub-folders: From Others: Usually letters that other people (friends, family, etc) have written to me For Others: Letters and other things I have created for other people Green Book: None of your business Playlists: M3U files that I have compiled of my favorite songs (plus one M3U playlist file for every album I own) Writing: Fiction, philosophy and other musings of mine Mark Docs: Shortcut to C:\Users\Mark Settings: Shortcut to C:\Files\Settings\Mark The Others folder contains the following sub-folders: The VC (Virtual Creations, my business – I develop websites) folder contains the following sub-folders: And again, all of those are pretty self-explanatory. Conclusion These tips have saved my sanity and helped keep me a productive geek, but what about you? What tips and tricks do you have to keep your files organized? Please share them with us in the comments. Come on, don’t be shy… Similar Articles Productive Geek Tips Fix For When Windows Explorer in Vista Stops Showing File NamesWhy Did Windows Vista’s Music Folder Icon Turn Yellow?Print or Create a Text File List of the Contents in a Directory the Easy WayCustomize the Windows 7 or Vista Send To MenuAdd Copy To / Move To on Windows 7 or Vista Right-Click Menu TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Acronis Online Backup DVDFab 6 Revo Uninstaller Pro Registry Mechanic 9 for Windows Track Daily Goals With 42Goals Video Toolbox is a Superb Online Video Editor Fun with 47 charts and graphs Tomorrow is Mother’s Day Check the Average Speed of YouTube Videos You’ve Watched OutlookStatView Scans and Displays General Usage Statistics

Read the article
The ASP.NET Daily Community Spotlight - How posts get there, and how to make it your Visual Studio Start Page

- by Jon Galloway

One really cool part of my job is selecting the articles for the Daily Community Spotlight, on the home page of the ASP.NET website. The spotlight highlights a new post about ASP.NET development every day from a member of the ASP.NET community. You can find it on the home page of the ASP.NET site, at http://asp.net These posts aren't automatically drawn from a pool of RSS feeds or anything - I pick a new post for each day of the year. How I pick the posts I have a few important selection criteria: Interesting to well rounded ASP.NET developers The ASP.NET website has a lot of material for all skill and experience levels, from download / get started to advanced. I try to select community spotlight posts to round that out with fresh and timely information that working ASP.NET developers can really use. Posts highlight solutions to common problems, clever projects and code that helps you leverage ASP.NET, and important announcements about things you can use today. As part of that, I try to mix between ASP.NET MVC, Web Forms, and Web Pages (a.k.a. WebMatrix). As a professional developer, I want to keep on top of all of my options for ASP.NET development, and the common platform base they all share generally means that good ASP.NET code is good ASP.NET code. Exposing new and non-Microsoft community members as much as possible The exercise of selecting good ASP.NET community posts every day of the year has made me think about what the community is. Given the choice, I'll always favor non-Microsoft employees, but since Microsoft often hires ASP.NET community members and MVP's (myself included), I really think that the ASP.NET community includes developers who are using and writing about ASP.NET, both inside and outside of Microsoft. I'm especially excited about the opportunity to highlight new and lesser known bloggers. Usually being featured on the ASP.NET Community Spotlight gives a pretty good traffic bump, and I love being able to both provide great content to the community and encourage lesser known community members by giving them some (much deserved) attention. Announcements only when they're useful to working developers - not marketing Some of the posts are announcements about new releases, such as Scott Hanselman's post on ASP.NET Universal Providers for Session, Memebership, and Roles. I include those when I think they're interesting and of immediate use to you on projects. I occasionally get asked to link to new content from a team at Microsoft; if it's useful and timely content I'll ask them to point me to a blog post by an actual person rather than a faceless team. How the posts are managed This feed used to be managed by an internal spreadsheet on a Sharepoint site, which was painful for a lot of reasons. I took a cue from Jon Udell, who uses of a public Delicious feed feed for his Elm City project, and we moved the management of these posts over to a Delicious feed as well. You can hear more about Jon's use of Delicious in Elm City in our Herding Code interview - still one of my favorite interviews. We ended up with a simpler scenario, but Note: I watched the Yahoo/Delicious news over the past year and was happy to see that Delicious was recently acquired by the founders of YouTube. I investigated several other Delicious competitors, but am happy with Delicious for now. My Delicious feed here: http://www.delicious.com/jon_galloway You can also browse through this past year's ASP.NET Community Spotlight posts using the (pretty cool) Delicious Browse Bar Submitting articles I'm always on the lookout for new articles to feature. The best way to get them to me is to share them via Delicious. It's pretty easy - sign up for an account, then you can add a post and share it to me. Alternatively, you can send them to me via Twitter (@jongalloway) or e-mail (). If you do e-mail me, it helps to include a short description and your full name so I can credit you. Way too many developer blogs don't include names and pictures; if I can't find them I can't feature the post. Subscribing to the Community Spotlight feed The Community Spotlight is available as an RSS feed, so you might want to subscribe to it: http://www.asp.net/rss/spotlight Setting the ASP.NET Community Spotlight feed as your Visual Studio start page If you're an ASP.NET developer, you might consider setting the ASP.NET Community Spotlight as the content for your Visual Studio Start Page. It's really easy - here's how to do it in Visual Studio 2010: Display the Visual Studio Start Page if it's not already showing (View / Start Page) Click on the Latest News tab and enter the following RSS URL: http://www.asp.net/rss/spotlight If you didn't previously have RSS feeds enabled for your start page, click the Enable RSS Feed button Now, every time you start up Visual Studio you'll see great content from members of the ASP.NET community: You can also configure - and disable, if you'd like - the Visual Studio start page in the Tools / Options / Environment / Startup dialog. Credits I'll do a follow-up highlighting some places I commonly find great content for the feed, but I'd like to specifically point out two of them: Elijah Manor posts a lot of great content, which is available in his Twitter feed at @elijahmanor, on his Delicious feed, and on a dedicated website - Web Dev Tweets Chris Alcock's The Morning Brew is a must-read blog which highlights each day's best blog posts across the .NET community. He's an absolute machine, and no matter how obscure the post I find, I can guarantee he'll find it as well if he hasn't already. Did I say must read?

Read the article
The ugly evolution of running a background operation in the context of an ASP.NET app

- by Jeff

If you’re one of the two people who has followed my blog for many years, you know that I’ve been going at POP Forums now for over almost 15 years. Publishing it as an open source app has been a big help because it helps me understand how people want to use it, and having it translated to six languages is pretty sweet. Despite this warm and fuzzy group hug, there has been an ugly hack hiding in there for years. One of the things we find ourselves wanting to do is hide some kind of regular process inside of an ASP.NET application that runs periodically. The motivation for this has always been that a lot of people simply don’t have a choice, because they’re running the app on shared hosting, or don’t otherwise have access to a box that can run some kind of regular background service. In POP Forums, I “solved” this problem years ago by hiding some static timers in an HttpModule. Truthfully, this works well as long as you don’t run multiple instances of the app, which in the cloud world, is always a possibility. With the arrival of WebJobs in Azure, I’m going to solve this problem. This post isn’t about that. The other little hacky problem that I “solved” was spawning a background thread to queue emails to subscribed users of the forum. This evolved quite a bit over the years, starting with a long running page to mail users in real-time, when I had only a few hundred. By the time it got into the thousands, or tens of thousands, I needed a better way. What I did is launched a new thread that read all of the user data in, then wrote a queued email to the database (as in, the entire body of the email, every time), with the properly formatted opt-out link. It was super inefficient, but it worked. Then I moved my biggest site using it, CoasterBuzz, to an Azure Website, and it stopped working. So let’s start with the first stupid thing I was doing. The new thread was simply created with delegate code inline. As best I can tell, Azure Websites are more aggressive about garbage collection, because that thread didn’t queue even one message. When the calling server response went out of scope, so went the magic background thread. Duh, all I had to do was move the thread to a private static variable in the class. That’s the way I was able to keep stuff running from the HttpModule. (And yes, I know this is still prone to failure, particularly if the app recycles. For as infrequently as it’s used, I have not, however, experienced this.) It was still failing, but this time I wasn’t sure why. It would queue a few dozen messages, then die. Running in Azure, I had to turn on the application logging and FTP in to see what was going on. That led me to a helper method I was using as delegate to build the unsubscribe links. The idea here is that I didn’t want yet another config entry to describe the base URL, appended with the right path that would match the routing table. No, I wanted the app to figure it out for you, so I came up with this little thing: public static string FullUrlHelper(this Controller controller, string actionName, string controllerName, object routeValues = null) { var helper = new UrlHelper(controller.Request.RequestContext); var requestUrl = controller.Request.Url; if (requestUrl == null) return String.Empty; var url = requestUrl.Scheme + "://"; url += requestUrl.Host; url += (requestUrl.Port != 80 ? ":" + requestUrl.Port : ""); url += helper.Action(actionName, controllerName, routeValues); return url; } And yes, that should have been done with a string builder. This is useful for sending out the email verification messages, too. As clever as I thought I was with this, I was using a delegate in the admin controller to format these unsubscribe links for tens of thousands of users. I passed that delegate into a service class that did the email work: Func<User, string> unsubscribeLinkGenerator = user => this.FullUrlHelper("Unsubscribe", AccountController.Name, new { id = user.UserID, key = _profileService.GetUnsubscribeHash(user) }); _mailingListService.MailUsers(subject, body, htmlBody, unsubscribeLinkGenerator); Cool, right? Actually, not so much. If you look back at the helper, this delegate then will depend on the controller context to learn the routing and format for the URL. As you might have guessed, those things were turning null after a few dozen formatted links, when the original request to the admin controller went away. That this wasn’t already happening on my dedicated server is surprising, but again, I understand why the Azure environment might be eager to reclaim a thread after servicing the request. It’s already inefficient that I’m building the entire email for every user, but going back to check the routing table for the right link every time isn’t a win either. I put together a little hack to look up one generic URL, and use that as the basis for a string format. If you’re wondering why I didn’t just use the curly braces up front, it’s because they get URL formatted: var baseString = this.FullUrlHelper("Unsubscribe", AccountController.Name, new { id = "--id--", key = "--key--" }); baseString = baseString.Replace("--id--", "{0}").Replace("--key--", "{1}"); Func unsubscribeLinkGenerator = user => String.Format(baseString, user.UserID, _profileService.GetUnsubscribeHash(user)); _mailingListService.MailUsers(subject, body, htmlBody, unsubscribeLinkGenerator); And wouldn’t you know it, the new solution works just fine. It’s still kind of hacky and inefficient, but it will work until this somehow breaks too.

Read the article
How to restore your production database without needing additional storage

- by David Atkinson

Production databases can get very large. This in itself is to be expected, but when a copy of the database is needed the database must be restored, requiring additional and costly storage. For example, if you want to give each developer a full copy of your production server, you’ll need n times the storage cost for your n-developer team. The same is true for any test databases that are created during the course of your project lifecycle. If you’ve read my previous blog posts, you’ll be aware that I’ve been focusing on the database continuous integration theme. In my CI setup I create a “production”-equivalent database directly from its source control representation, and use this to test my upgrade scripts. Despite this being a perfectly valid and practical thing to do as part of a CI setup, it’s not the exact equivalent to running the upgrade script on a copy of the actual production database. So why shouldn’t I instead simply restore the most recent production backup as part of my CI process? There are two reasons why this would be impractical. 1. My CI environment isn’t an exact copy of my production environment. Indeed, this would be the case in a perfect world, and it is strongly recommended as a good practice if you follow Jez Humble and David Farley’s “Continuous Delivery” teachings, but in practical terms this might not always be possible, especially where storage is concerned. It may just not be possible to restore a huge production database on the environment you’ve been allotted. 2. It’s not just about the storage requirements, it’s also the time it takes to do the restore. The whole point of continuous integration is that you are alerted as early as possible whether the build (yes, the database upgrade script counts!) is broken. If I have to run an hour-long restore each time I commit a change to source control I’m just not going to get the feedback quickly enough to react. So what’s the solution? Red Gate has a technology, SQL Virtual Restore, that is able to restore a database without using up additional storage. Although this sounds too good to be true, the explanation is quite simple (although I’m sure the technical implementation details under the hood are quite complex!) Instead of restoring the backup in the conventional sense, SQL Virtual Restore will effectively mount the backup using its HyperBac technology. It creates a data and log file, .vmdf, and .vldf, that becomes the delta between the .bak file and the virtual database. This means that both read and write operations are permitted on a virtual database as from SQL Server’s point of view it is no different from a conventional database. Instead of doubling the storage requirements upon a restore, there is no ‘duplicate’ storage requirements, other than the trivially small virtual log and data files (see illustration below). The benefit is magnified the more databases you mount to the same backup file. This technique could be used to provide a large development team a full development instance of a large production database. It is also incredibly easy to set up. Once SQL Virtual Restore is installed, you simply run a conventional RESTORE command to create the virtual database. This is what I have running as part of a nightly “release test” process triggered by my CI tool. RESTORE DATABASE WidgetProduction_Virtual FROM DISK=N'D:\VirtualDatabase\WidgetProduction.bak' WITH MOVE N'WidgetProduction' TO N'C:\WidgetWF\ProdBackup\WidgetProduction_WidgetProduction_Virtual.vmdf', MOVE N'WidgetProduction_log' TO N'C:\WidgetWF\ProdBackup\WidgetProduction_log_WidgetProduction_Virtual.vldf', NORECOVERY, STATS=1, REPLACE GO RESTORE DATABASE WidgetProduction_Virtual WITH RECOVERY Note the only change from what you would do normally is the naming of the .vmdf and .vldf files. SQL Virtual Restore intercepts this by monitoring the extension and applies its magic, ensuring the ‘virtual’ restore happens rather than the conventional storage-heavy restore. My automated release test then applies the upgrade scripts to the virtual production database and runs some validation tests, giving me confidence that were I to run this on production for real, all would go smoothly. For illustration, here is my 8Gb production database: And its corresponding backup file: Here are the .vldf and .vmdf files, which represent the only additional used storage for the new database following the virtual restore. The beauty of this product is its simplicity. Once it is installed, the interaction with the backup and virtual database is exactly the same as before, as the clever stuff is being done at a lower level. SQL Virtual Restore can be downloaded as a fully functional 14-day trial. Technorati Tags: SQL Server

Read the article
How to restore your production database without needing additional storage

- by David Atkinson

Production databases can get very large. This in itself is to be expected, but when a copy of the database is needed the database must be restored, requiring additional and costly storage. For example, if you want to give each developer a full copy of your production server, you'll need n times the storage cost for your n-developer team. The same is true for any test databases that are created during the course of your project lifecycle. If you've read my previous blog posts, you'll be aware that I've been focusing on the database continuous integration theme. In my CI setup I create a "production"-equivalent database directly from its source control representation, and use this to test my upgrade scripts. Despite this being a perfectly valid and practical thing to do as part of a CI setup, it's not the exact equivalent to running the upgrade script on a copy of the actual production database. So why shouldn't I instead simply restore the most recent production backup as part of my CI process? There are two reasons why this would be impractical. 1. My CI environment isn't an exact copy of my production environment. Indeed, this would be the case in a perfect world, and it is strongly recommended as a good practice if you follow Jez Humble and David Farley's "Continuous Delivery" teachings, but in practical terms this might not always be possible, especially where storage is concerned. It may just not be possible to restore a huge production database on the environment you've been allotted. 2. It's not just about the storage requirements, it's also the time it takes to do the restore. The whole point of continuous integration is that you are alerted as early as possible whether the build (yes, the database upgrade script counts!) is broken. If I have to run an hour-long restore each time I commit a change to source control I'm just not going to get the feedback quickly enough to react. So what's the solution? Red Gate has a technology, SQL Virtual Restore, that is able to restore a database without using up additional storage. Although this sounds too good to be true, the explanation is quite simple (although I'm sure the technical implementation details under the hood are quite complex!) Instead of restoring the backup in the conventional sense, SQL Virtual Restore will effectively mount the backup using its HyperBac technology. It creates a data and log file, .vmdf, and .vldf, that becomes the delta between the .bak file and the virtual database. This means that both read and write operations are permitted on a virtual database as from SQL Server's point of view it is no different from a conventional database. Instead of doubling the storage requirements upon a restore, there is no 'duplicate' storage requirements, other than the trivially small virtual log and data files (see illustration below). The benefit is magnified the more databases you mount to the same backup file. This technique could be used to provide a large development team a full development instance of a large production database. It is also incredibly easy to set up. Once SQL Virtual Restore is installed, you simply run a conventional RESTORE command to create the virtual database. This is what I have running as part of a nightly "release test" process triggered by my CI tool. RESTORE DATABASE WidgetProduction_virtual FROM DISK=N'C:\WidgetWF\ProdBackup\WidgetProduction.bak' WITH MOVE N'WidgetProduction' TO N'C:\WidgetWF\ProdBackup\WidgetProduction_WidgetProduction_Virtual.vmdf', MOVE N'WidgetProduction_log' TO N'C:\WidgetWF\ProdBackup\WidgetProduction_log_WidgetProduction_Virtual.vldf', NORECOVERY, STATS=1, REPLACE GO RESTORE DATABASE mydatabase WITH RECOVERY Note the only change from what you would do normally is the naming of the .vmdf and .vldf files. SQL Virtual Restore intercepts this by monitoring the extension and applies its magic, ensuring the 'virtual' restore happens rather than the conventional storage-heavy restore. My automated release test then applies the upgrade scripts to the virtual production database and runs some validation tests, giving me confidence that were I to run this on production for real, all would go smoothly. For illustration, here is my 8Gb production database: And its corresponding backup file: Here are the .vldf and .vmdf files, which represent the only additional used storage for the new database following the virtual restore. The beauty of this product is its simplicity. Once it is installed, the interaction with the backup and virtual database is exactly the same as before, as the clever stuff is being done at a lower level. SQL Virtual Restore can be downloaded as a fully functional 14-day trial. Technorati Tags: SQL Server

Read the article
How to restore your production database without needing additional storage

- by David Atkinson

Production databases can get very large. This in itself is to be expected, but when a copy of the database is needed the database must be restored, requiring additional and costly storage. For example, if you want to give each developer a full copy of your production server, you'll need n times the storage cost for your n-developer team. The same is true for any test databases that are created during the course of your project lifecycle. If you've read my previous blog posts, you'll be aware that I've been focusing on the database continuous integration theme. In my CI setup I create a "production"-equivalent database directly from its source control representation, and use this to test my upgrade scripts. Despite this being a perfectly valid and practical thing to do as part of a CI setup, it's not the exact equivalent to running the upgrade script on a copy of the actual production database. So why shouldn't I instead simply restore the most recent production backup as part of my CI process? There are two reasons why this would be impractical. 1. My CI environment isn't an exact copy of my production environment. Indeed, this would be the case in a perfect world, and it is strongly recommended as a good practice if you follow Jez Humble and David Farley's "Continuous Delivery" teachings, but in practical terms this might not always be possible, especially where storage is concerned. It may just not be possible to restore a huge production database on the environment you've been allotted. 2. It's not just about the storage requirements, it's also the time it takes to do the restore. The whole point of continuous integration is that you are alerted as early as possible whether the build (yes, the database upgrade script counts!) is broken. If I have to run an hour-long restore each time I commit a change to source control I'm just not going to get the feedback quickly enough to react. So what's the solution? Red Gate has a technology, SQL Virtual Restore, that is able to restore a database without using up additional storage. Although this sounds too good to be true, the explanation is quite simple (although I'm sure the technical implementation details under the hood are quite complex!) Instead of restoring the backup in the conventional sense, SQL Virtual Restore will effectively mount the backup using its HyperBac technology. It creates a data and log file, .vmdf, and .vldf, that becomes the delta between the .bak file and the virtual database. This means that both read and write operations are permitted on a virtual database as from SQL Server's point of view it is no different from a conventional database. Instead of doubling the storage requirements upon a restore, there is no 'duplicate' storage requirements, other than the trivially small virtual log and data files (see illustration below). The benefit is magnified the more databases you mount to the same backup file. This technique could be used to provide a large development team a full development instance of a large production database. It is also incredibly easy to set up. Once SQL Virtual Restore is installed, you simply run a conventional RESTORE command to create the virtual database. This is what I have running as part of a nightly "release test" process triggered by my CI tool. RESTORE DATABASE WidgetProduction_virtual FROM DISK=N'C:\WidgetWF\ProdBackup\WidgetProduction.bak' WITH MOVE N'WidgetProduction' TO N'C:\WidgetWF\ProdBackup\WidgetProduction_WidgetProduction_Virtual.vmdf', MOVE N'WidgetProduction_log' TO N'C:\WidgetWF\ProdBackup\WidgetProduction_log_WidgetProduction_Virtual.vldf', NORECOVERY, STATS=1, REPLACE GO RESTORE DATABASE mydatabase WITH RECOVERY Note the only change from what you would do normally is the naming of the .vmdf and .vldf files. SQL Virtual Restore intercepts this by monitoring the extension and applies its magic, ensuring the 'virtual' restore happens rather than the conventional storage-heavy restore. My automated release test then applies the upgrade scripts to the virtual production database and runs some validation tests, giving me confidence that were I to run this on production for real, all would go smoothly. For illustration, here is my 8Gb production database: And its corresponding backup file: Here are the .vldf and .vmdf files, which represent the only additional used storage for the new database following the virtual restore. The beauty of this product is its simplicity. Once it is installed, the interaction with the backup and virtual database is exactly the same as before, as the clever stuff is being done at a lower level. SQL Virtual Restore can be downloaded as a fully functional 14-day trial. Technorati Tags: SQL Server

Read the article
Subterranean IL: Constructor constraints

- by Simon Cooper

The constructor generic constraint is a slightly wierd one. The ECMA specification simply states that it: constrains [the type] to being a concrete reference type (i.e., not abstract) that has a public constructor taking no arguments (the default constructor), or to being a value type. There seems to be no reference within the spec to how you actually create an instance of a generic type with such a constraint. In non-generic methods, the normal way of creating an instance of a class is quite different to initializing an instance of a value type. For a reference type, you use newobj: newobj instance void IncrementableClass::.ctor() and for value types, you need to use initobj: .locals init ( valuetype IncrementableStruct s1 ) ldloca 0 initobj IncrementableStruct But, for a generic method, we need a consistent method that would work equally well for reference or value types. Activator.CreateInstance<T> To solve this problem the CLR designers could have chosen to create something similar to the constrained. prefix; if T is a value type, call initobj, and if it is a reference type, call newobj instance void !!0::.ctor(). However, this solution is much more heavyweight than constrained callvirt. The newobj call is encoded in the assembly using a simple reference to a row in a metadata table. This encoding is no longer valid for a call to !!0::.ctor(), as different constructor methods occupy different rows in the metadata tables. Furthermore, constructors aren't virtual, so we would have to somehow do a dynamic lookup to the correct method at runtime without using a MethodTable, something which is completely new to the CLR. Trying to do this in IL results in the following verification error: newobj instance void !!0::.ctor() [IL]: Error: Unable to resolve token. This is where Activator.CreateInstance<T> comes in. We can call this method to return us a new T, and make the whole issue Somebody Else's Problem. CreateInstance does all the dynamic method lookup for us, and returns us a new instance of the correct reference or value type (strangely enough, Activator.CreateInstance<T> does not itself have a .ctor constraint on its generic parameter): .method private static !!0 CreateInstance<.ctor T>() { call !!0 [mscorlib]System.Activator::CreateInstance<!!0>() ret } Going further: compiler enhancements Although this method works perfectly well for solving the problem, the C# compiler goes one step further. If you decompile the C# version of the CreateInstance method above: private static T CreateInstance() where T : new() { return new T(); } what you actually get is this (edited slightly for space & clarity): .method private static !!T CreateInstance<.ctor T>() { .locals init ( [0] !!T CS$0$0000, [1] !!T CS$0$0001 ) DetectValueType: ldloca.s 0 initobj !!T ldloc.0 box !!T brfalse.s CreateInstance CreateValueType: ldloca.s 1 initobj !!T ldloc.1 ret CreateInstance: call !!0 [mscorlib]System.Activator::CreateInstance<T>() ret } What on earth is going on here? Looking closer, it's actually quite a clever performance optimization around value types. So, lets dissect this code to see what it does. The CreateValueType and CreateInstance sections should be fairly self-explanatory; using initobj for value types, and Activator.CreateInstance for reference types. How does the DetectValueType section work? First, the stack transition for value types: ldloca.s 0 // &[!!T(uninitialized)] initobj !!T // ldloc.0 // !!T box !!T // O[!!T] brfalse.s // branch not taken When the brfalse.s is hit, the top stack entry is a non-null reference to a boxed !!T, so execution continues to to the CreateValueType section. What about when !!T is a reference type? Remember, the 'default' value of an object reference (type O) is zero, or null. ldloca.s 0 // &[!!T(null)] initobj !!T // ldloc.0 // null box !!T // null brfalse.s // branch taken Because box on a reference type is a no-op, the top of the stack at the brfalse.s is null, and so the branch to CreateInstance is taken. For reference types, Activator.CreateInstance is called which does the full dynamic lookup using reflection. For value types, a simple initobj is called, which is far faster, and also eliminates the unboxing that Activator.CreateInstance has to perform for value types. However, this is strictly a performance optimization; Activator.CreateInstance<T> works for value types as well as reference types. Next... That concludes the initial premise of the Subterranean IL series; to cover the details of generic methods and generic code in IL. I've got a few other ideas about where to go next; however, if anyone has any itching questions, suggestions, or things you've always wondered about IL, do let me know.

Read the article
The standards that fail us and the intellectual bubble

- by Jeff

There has been a great deal of noise in the techie community about standards, and a sudden and unexplainable hate for Flash. This noise isn't coming from consumers... the countless soccer moms, teens and your weird uncle Bob, it's coming from the people who build (or at least claim to build) the stuff those consumers consume. If you could survey the position of consumers on the topic, they'd likely tell you that they just want stuff on the Web to work.The noise goes something like this: Web standards are the correct and right thing to use across the Intertubes, and anything not a part of those standards (Flash) is bad. Furthermore, the more recent noise is centered around the idea that HTML 5, along with Javascript, is the right thing to use. The arguments against Flash are, well, the truth is I haven't seen a good argument. I see anecdotal nonsense about high CPU usage and things I'd never think to check when I'm watching Piano Cat on YouTube, but these aren't arguments to me. Sure, I've seen it crash a browser a few times, but it's totally rare.But let's go back to standards. Yes, standards have played an important role in establishing the ubiquity of the Web. The protocols themselves, TCP/IP and HTTP, have been critical. HTML, which has served us well for a very long time, established an incredible foundation. Javascript did an OK job, and thanks to clever programmers writing great frameworks like JQuery, is becoming more and more useful. CSS is awful (there, I said it, I feel SO much better), and I'll never understand why it's so disconnected and different from anything else. It doesn't help that it's so widely misinterpreted by different browsers. Still, there's no question that standards are a good thing, and they've been good for the Web, consumers and publishers alike.HTML 4 has been with us for more than a decade. In Web years, that might as well be 80. HTML 5, contrary to popular belief, is not a standard, and likely won't be for many years to come. In fact, the Web hasn't really evolved at all in terms of its standards. The tools that generate the standard markup and script have, but at the end of the day, we're still living with standards that are more than ten years old. The "official" standards process has failed us.The Web evolved anyway, and did not wait for standards bodies to decide what to do next. It evolved in part because Macromedia, then Adobe, kept evolving Flash. In the earlier days, it mostly just did obnoxious splash pages, but then it started doing animation, and then rich apps as they added form input. Eventually it found its killer app: video. Now more than 95% of browsers have Flash installed. Consumers are better for it.But I'll do it one better... I'll go out on a limb and say that Flash is a standard. If it's that pervasive, I don't care what you tell me, it's a standard. Just because a company owns it doesn't mean that it's evil or not a standard. And hey, it pains me to say that as a developer, because I think the dev tools are the suck (more on that in a minute). But again, consumers don't care. They don't even pay for Flash. The bottom line is that if I put something Flash based on the Internet, it's likely that my audience will see it.And what about the speed of standards owned by a company? Look no further than Silverlight. Silverlight 2 (which I consider the "real" start to the story) came out about a year and a half ago. Now version 4 is out, and it has come a very long way in its capabilities. If you believe Riastats.com, more than half of browsers have it now. It didn't have to wait for standards bodies and nerds drafting documents, it's out today. At this rate, Silverlight will be on version 6 or 7 by the time HTML 5 is a ratified standard.Back to the noise, one of the things that has continually disappointed me about this profession is the number of people who get stuck in an intellectual bubble, color it with dogmatic principles, and completely ignore the actual marketplace where this stuff all has to live. We aren't machines; Binary thinking that forces us to choose between "open standards" and "proprietary lock-in" (the most loaded b.s. FUD term evar) isn't smart at all. The truth is that the <object> tag has allowed us to build incredible stuff on top of the old standards, and consumers have benefitted greatly. Consumer desire, capitalism, and yes, standards ratified by nerds who think about this stuff for years have all played a role in the broad adoption of the Interwebs.We could all do without the noise. At the end of the day, I'm going to build stuff for the Web that's good for my users, and I'm not going to base my decisions on a techie bubble religion. Imagine what the brilliant minds behind the noise could do for the Web if they joined me in that pursuit.

Read the article
SQL SERVER – Weekly Series – Memory Lane – #053 – Final Post in Series

- by Pinal Dave

It has been a fantastic journey to write memory lane series for an entire year. This series gave me the opportunity to go back and see what I have contributed to this blog throughout the last 7 years. This was indeed fantastic series as this provided me the opportunity to witness how technology has grown throughout the year and how I have progressed in my career while writing this blog post. This series was indeed fantastic experience readers as many joined during the last few years and were not sure what they have missed in recent years. Let us continue with the final episode of the Memory Lane Series. Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Get Current User – Get Logged In User Here is the straight script which list logged in SQL Server users. Disable All Triggers on a Database – Disable All Triggers on All Servers Question : How to disable all the triggers for a database? Additionally, how to disable all the triggers for all servers? For answer execute the script in the blog post. Importance of Master Database for SQL Server Startup I have received following questions many times. I will list all the questions here and answer them together. What is the purpose of Master database? Should our backup Master database? Which database is must have database for SQL Server for startup? Which are the default system database created when SQL Server 2005 is installed for the first time? What happens if Master database is corrupted? Answers to all of the questions are very much related. 2008 DECLARE Multiple Variables in One Statement SQL Server is a great product and it has many features which are very unique to SQL Server. Regarding feature of SQL Server where multiple variable can be declared in one statement, it is absolutely possible to do. 2009 How to Enable Index – How to Disable Index – Incorrect syntax near ‘ENABLE’ Many times I have seen that the index is disabled when there is a large update operation on the table. Bulk insert of very large file updates in any table using SSIS is usually preceded by disabling the index and followed by enabling the index. I have seen many developers running the following query to disable the index. 2010 List of all the Views from Database Many emails I received suggesting that they have hundreds of the view and now have no clue what is going on and how many of them have indexes and how many does not have an index. Some even asked me if there is any way they can get a list of the views with the property of Index along with it. Here is the quick script which does exactly the same. You can also include many other columns from the same view. Minimum Maximum Memory – Server Memory Options I was recently reading about SQL Server Memory Options over here. While reading this one line really caught my attention is minimum value allowed for maximum memory options. The default setting for min server memory is 0, and the default setting for max server memory is 2147483647. The minimum amount of memory you can specify for max server memory is 16 megabytes (MB). 2011 Fundamentals of Columnstore Index There are two kinds of storage in a database. Row Store and Column Store. Row store does exactly as the name suggests – stores rows of data on a page – and column store stores all the data in a column on the same page. These columns are much easier to search – instead of a query searching all the data in an entire row whether the data are relevant or not, column store queries need only to search a much lesser number of the columns. How to Ignore Columnstore Index Usage in Query In summary the question in simple words “How can we ignore using the column store index in selective queries?” Very interesting question – you can use I can understand there may be the cases when the column store index is not ideal and needs to be ignored the same. You can use the query hint IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX to ignore the column store index. The SQL Server Engine will use any other index which is best after ignoring the column store index. 2012 Storing Variable Values in Temporary Array or Temporary List SQL Server does not support arrays or a dynamic length storage mechanism like list. Absolutely there are some clever workarounds and few extra-ordinary solutions but everybody can;t come up with such solution. Additionally, sometime the requirements are very simple that doing extraordinary coding is not required. Here is the simple case. Move Database Files MDF and LDF to Another Location It is not common to keep the Database on the same location where OS is installed. Usually Database files are in SAN, Separate Disk Array or on SSDs. This is done usually for performance reason and manageability perspective. Now the challenges comes up when database which was installed at not preferred default location and needs to move to a different location. Here is the quick tutorial how you can do it. UNION ALL and ORDER BY – How to Order Table Separately While Using UNION ALL If your requirement is such that you want your top and bottom query of the UNION resultset independently sorted but in the same result set you can add an additional static column and order by that column. Let us re-create the same scenario. Copy Data from One Table to Another Table – SQL in Sixty Seconds #031 – Video http://www.youtube.com/watch?v=FVWIA-ACMNo Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Planning an Event–SPS NYC

- by MOSSLover

I bet some of you were wondering why I am not going to any events for the most part in June and July (aside from volunteering at SPS Chicago). Well I basically have no life for the next 2 months. We are approaching the 11th hour of SharePoint Saturday New York City. The event is slated to have 350 to 400 attendees. This is second year in a row I’ve helped run it with Jason Gallicchio. It’s amazingly crazy how much effort this event requires versus Kansas City. It’s literally 2x the volume of attendees, speakers, and sponsors plus don’t even get me started about volunteers. So here is a bit of the break down… We have 30 volunteers+ that Tasha Scott from the Hampton Roads Area will be managing the day of the event to do things like timing the speakers, handing out food, making sure people don’t walk into the event that did not sign up until we get a count for fire code, registering people, watching the sharpees, watching the prizes, making sure attendees get to the right place, opening and closing the partition in the big room, moving chairs, moving furniture, etc…Then there is Jason, Greg, and I who will be making sure that the speakers, sponsors, and everything is going smoothly in the background. We need to make sure that everything is setup properly and in the right spot. We also need to make sure signs are printed, schedules are created, bags are stuffed with sponsor material. Plus we need to send out emails to sponsors reminding them to send us the right information to post on the site for charity sessions, send us boxes with material to stuff bags, and we need to make sure that Michael Lotter gets there information for invoicing. We also need to check that Michael has invoiced everyone and who has paid, because we can’t order anything for the event without money. Once people have paid we need to setup food orders, speaker and volunteer dinners, buy prizes, buy bags, buy speakers/volunteer/organizer shirts, etc…During this process we need all the abstracts from speakers, all the bios, pictures, shirt sizes, and other items so we can create schedules and order items. We also need to keep track of who is attending the dinner the night before for volunteers and speakers and make sure we don’t hit capacity. Then there is attendee tracking and making sure that we don’t hit too many attendees. We need to make sure that attendees know where to go and what to do. We have to make all kinds of random supply lists during this process and keep on track with a variety of lists and emails plus conference calls. All in all it’s a lot of work and I am trying to keep track of it all the top so that we don’t duplicate anything or miss anything. So basically all in all if you don’t see me around for another month don’t worry I do exist. Right now if you look at what I’m doing I am traveling every Monday morning and Thursday night back and forth to Washington DC from New Jersey. Every night I am working on organizational stuff for SharePoint Saturday New York City. Every Tuesday night we are running an event conference call. Every weekend I am either with family or my boyfriend and cat trying hard not to touch the event. So all my time is pretty much work, event, and family/boyfriend. I have 0 bandwidth for anything in the community. If you compound that with my severe allergy problems in DC and a doctor’s appointment every month plus a new med once a week I’m lucky I am still standing and walking. So basically once July 30th hits hopefully Jason Gallicchio, Greg Hurlman, and myself will be able to breathe a little easier. If I forget to do this thank you Greg and Jason. Thank you Tom Daly. Thank you Michael Lotter. Thank you Tasha Scott. Thank you Kevin Griffin. Thank you all the volunteers, speakers, sponsors, and attendees who will and have made this event a success. Hopefully, we have enough time until next year to regroup, recharge, and make the event grow bigger in a different venue. Awesome job everyone we sole out within 3 days of registration and we still have several weeks to go. Right now the waitlist is at 49 people with room to grow. If you attend the event thank all these guys I mentioned above for making it possible. It’s going to be awesome I know it but I probably won’t remember half of it due to the blur of things that we will all be taking care of the day of the event. Catch you all in the end of July/Early August where I will attempt to post something useful and clever and possibly while wearing a fez. Technorati Tags: SPS NYC,SharePoint Saturday,SharePoint Saturday New York City

Read the article
Functional Adaptation

- by Charles Courchaine

In real life and OO programming we’re often faced with using adapters, DVI to VGA, 1/4” to 1/8” audio connections, 110V to 220V, wrapping an incompatible interface with a new one, and so on. Where the adapter pattern is generally considered for interfaces and classes a similar technique can be applied to method signatures. To be fair, this adaptation is generally used to reduce the number of parameters but I’m sure there are other clever possibilities to be had. As Jan questioned in the last post, how can we use a common method to execute an action if the action has a differing number of parameters, going back to the greeting example it was suggested having an AddName method that takes a first and last name as parameters. This is exactly what we’ll address in this post. Let’s set the stage with some review and some code changes. First, our method that handles the setup/tear-down infrastructure for our WCF service: 1: private static TResult ExecuteGreetingFunc<TResult>(Func<IGreeting, TResult> theGreetingFunc) 2: { 3: IGreeting aGreetingService = null; 4: try 5: { 6: aGreetingService = GetGreetingChannel(); 7: return theGreetingFunc(aGreetingService); 8: } 9: finally 10: { 11: CloseWCFChannel((IChannel)aGreetingService); 12: } 13: } Our original AddName method: 1: private static string AddName(string theName) 2: { 3: return ExecuteGreetingFunc<string>(theGreetingService => theGreetingService.AddName(theName)); 4: } Our new AddName method: 1: private static int AddName(string firstName, string lastName) 2: { 3: return ExecuteGreetingFunc<int>(theGreetingService => theGreetingService.AddName(firstName, lastName)); 4: } Let’s change the AddName method, just a little bit more for this example and have it take the greeting service as a parameter. 1: private static int AddName(IGreeting greetingService, string firstName, string lastName) 2: { 3: return greetingService.AddName(firstName, lastName); 4: } The new signature of AddName using the Func delegate is now Func<IGreeting, string, string, int>, which can’t be used with ExecuteGreetingFunc as is because it expects Func<IGreeting, TResult>. Somehow we have to eliminate the two string parameters before we can use this with our existing method. This is where we need to adapt AddName to match what ExecuteGreetingFunc expects, and we’ll do so in the following progression. 1: Func<IGreeting, string, string, int> -> Func<IGreeting, string, int> 2: Func<IGreeting, string, int> -> Func<IGreeting, int> For the first step, we’ll create a method using the lambda syntax that will “eliminate” the last name parameter: 1: string lastNameToAdd = "Smith"; 2: //Func<IGreeting, string, string, int> -> Func<IGreeting, string, int> 3: Func<IGreeting, string, int> addName = (greetingService, firstName) => AddName(greetingService, firstName, lastNameToAdd); The new addName method gets us one step close to the signature we need. Let’s say we’re going to call this in a loop to add several names, we’ll take the final step from Func<IGreeting, string, int> -> Func<IGreeting, int> in line as a lambda passed to ExecuteGreetingFunc like so: 1: List<string> firstNames = new List<string>() { "Bob", "John" }; 2: int aID; 3: foreach (string firstName in firstNames) 4: { 5: //Func<IGreeting, string, int> -> Func<IGreeting, int> 6: aID = ExecuteGreetingFunc<int>(greetingService => addName(greetingService, firstName)); 7: Console.WriteLine(GetGreeting(aID)); 8: } If for some reason you needed to break out the lambda on line 6 you could replace it with 1: aID = ExecuteGreetingFunc<int>(ApplyAddName(addName, firstName)); and use this method: 1: private static Func<IGreeting, int> ApplyAddName(Func<IGreeting, string, int> addName, string lastName) 2: { 3: return greetingService => addName(greetingService, lastName); 4: } Splitting out a lambda into its own method is useful both in this style of coding as well as LINQ queries to improve the debugging experience. It is not strictly necessary to break apart the steps & functions as was shown above; the lambda in line 6 (of the foreach example) could include both the last name and first name instead of being composed of two functions. The process demonstrated above is one of partially applying functions, this could have also been done with Currying (also see Dustin Campbell’s excellent post on Currying for the canonical curried add example). Matthew Podwysocki also has some good posts explaining both Currying and partial application and a follow up post that further clarifies the difference between Currying and partial application. In either technique the ultimate goal is to reduce the number of parameters passed to a function. Currying makes it a single parameter passed at each step, where partial application allows one to use multiple parameters at a time as we’ve done here. This technique isn’t for everyone or every problem, but can be extremely handy when you need to adapt a call to something you don’t control.

Read the article
A deadlock was detected while trying to lock variables in SSIS

Error: 0xC001405C at SQL Log Status: A deadlock was detected while trying to lock variables "User::RowCount" for read/write access. A lock cannot be acquired after 16 attempts. The locks timed out. Have you ever considered variable locking when building your SSIS packages? I expect many people haven’t just because most of the time you never see an error like the one above. I’ll try and explain a few key concepts about variable locking and hopefully you never will see that error. First of all, what is all this variable locking all about? Put simply SSIS variables have to be locked before they can be accessed, and then of course unlocked once you have finished with them. This is baked into SSIS, presumably to reduce the risk of race conditions, but with that comes some additional overhead in that you need to be careful to avoid lock conflicts in some scenarios. The most obvious place you will come across any hint of locking (no pun intended) is the Script Task or Script Component with their ReadOnlyVariables and ReadWriteVariables properties. These two properties allow you to enter lists of variables to be used within the task, or to put it another way, these lists of variables to be locked, so that they are available within the task. During the task pre-execute phase the variables and locked, you then use them during the execute phase when you code is run, and then unlocked for you during the post-execute phase. So by entering the variable names in one of the two list, the locking is taken care of for you, and you just read and write to the Dts.Variables collection that is exposed in the task for the purpose. As you can see in the image above, the variable PackageInt is specified, which means when I write the code inside that task I don’t have to worry about locking at all, as shown below. public void Main() { // Set the variable value to something new Dts.Variables["PackageInt"].Value = 199; // Raise an event so we can play in the event handler bool fireAgain = true; Dts.Events.FireInformation(0, "Script Task Code", "This is the script task raising an event.", null, 0, ref fireAgain); Dts.TaskResult = (int)ScriptResults.Success; } As you can see as well as accessing the variable, hassle free, I also raise an event. Now consider a scenario where I have an event hander as well as shown below. Now what if my event handler uses tries to use the same variable as well? Well obviously for the point of this post, it fails with the error quoted previously. The reason why is clearly illustrated if you consider the following sequence of events. Package execution starts Script Task in Control Flow starts Script Task in Control Flow locks the PackageInt variable as specified in the ReadWriteVariables property Script Task in Control Flow executes script, and the On Information event is raised The On Information event handler starts Script Task in On Information event handler starts Script Task in On Information event handler attempts to lock the PackageInt variable (for either read or write it doesn’t matter), but will fail because the variable is already locked. The problem is caused by the event handler task trying to use a variable that is already locked by the task in Control Flow. Events are always raised synchronously, therefore the task in Control Flow that is raising the event will not regain control until the event handler has completed, so we really do have un-resolvable locking conflict, better known as a deadlock. In this scenario we can easily resolve the problem by managing the variable locking explicitly in code, so no need to specify anything for the ReadOnlyVariables and ReadWriteVariables properties. public void Main() { // Set the variable value to something new, with explicit lock control Variables lockedVariables = null; Dts.VariableDispenser.LockOneForWrite("PackageInt", ref lockedVariables); lockedVariables["PackageInt"].Value = 199; lockedVariables.Unlock(); // Raise an event so we can play in the event handler bool fireAgain = true; Dts.Events.FireInformation(0, "Script Task Code", "This is the script task raising an event.", null, 0, ref fireAgain); Dts.TaskResult = (int)ScriptResults.Success; } Now the package will execute successfully because the variable lock has already been released by the time the event is raised, so no conflict occurs. For those of you with a SQL Engine background this should all sound strangely familiar, and boils down to getting in and out as fast as you can to reduce the risk of lock contention, be that SQL pages or SSIS variables. Unfortunately we cannot always manage the locking ourselves. The Execute SQL Task is very often used in conjunction with variables, either to pass in parameter values or get results out. Either way the task will manage the locking for you, and will fail when it cannot lock the variables it requires. The scenario outlined above is clear cut deadlock scenario, both parties are waiting on each other, so it is un-resolvable. The mechanism used within SSIS isn’t actually that clever, and whilst the message says it is a deadlock, it really just means it tried a few times, and then gave up. The last part of the error message is actually the most accurate in terms of the failure, A lock cannot be acquired after 16 attempts. The locks timed out. Now this may come across as a recommendation to always manage locking manually in the Script Task or Script Component yourself, but I think that would be an overreaction. It is more of a reminder to be aware that in high concurrency scenarios, especially when sharing variables across multiple objects, locking is important design consideration. Update – Make sure you don’t try and use explicit locking as well as leaving the variable names in the ReadOnlyVariables and ReadWriteVariables lock lists otherwise you’ll get the deadlock error, you cannot lock a variable twice!

Read the article
Myths about Coding Craftsmanship part 2

- by tom

Myth 3: The source of all bad code is inept developers and stupid people When you review code is this what you assume? Shame on you. You are probably making assumptions in your code if you are assuming so much already. Bad code can be the result of any number of causes including but not limited to using dated techniques (like boxing when generics are available), not following standards (“look how he does the spacing between arguments!” or “did he really just name that variable ‘bln_Hello_Cats’?”), being redundant, using properties, methods, or objects in a novel way (like switching on button.Text between “Hello World” and “Hello World “ //clever use of space character… sigh), not following the SOLID principals, hacking around assumptions made in earlier iterations / hacking in features that should be worked into the overall design. The first two issues, while annoying are pretty easy to spot and can be fixed so easily. If your coding team is made up of experienced professionals who are passionate about staying current then these shouldn’t be happening. If you work with a variety of skills, backgrounds, and experience then there will be some of this stuff going on. If you have an opportunity to mentor such a developer who is receptive to constructive criticism don’t be a jerk; help them and the codebase will improve. A little patience can improve the codebase, your work environment, and even your perspective. The novelty and redundancy I have encountered has often been the use of creativity when language knowledge was perceived as unavailable or too time consuming. When developers learn on the job you get a lot of this. Rather than going to MSDN developers will use what they know. Depending on the constraints of their assignment hacking together what they know may seem quite practical. This was not stupid though I often wonder how much time is actually “saved” by hacking. These issues are often harder to untangle if we ever do. They can also grow out of control as we write hack after hack to make it work and get back to some development that is satisfying. Hacking upon an existing hack is what I call “feeding the monster”. Code monsters are anti-patterns and hacks gone wild. The reason code monsters continue to get bigger is that they keep growing in scope, touching more and more of the application. This is not the result of dumb developers. It is probably the result of avoiding design, not taking the time to understand the problems or anticipate or communicate the vision of the product. If our developers don’t understand the purpose of a feature or product how do we expect potential customers to do so? Forethought and organization are often what is missing from bad code. Developers who do not use the SOLID principals should be encouraged to learn these principals and be given guidance on how to apply them. The time “saved” by giving hackers room to hack will be made up for and then some. Not as technical debt but as shoddy work that if not replaced will be struggled with again and again. Bad code is not the result of dumb developers (usually) it is the result of trying to do too much without the proper resources and neglecting the right thing that needs doing with the first thoughtless thing that comes into our heads. Object oriented code is all about relationships between objects. Coders who believe their coworkers are all fools tend to write objects that are difficult to work with, not eager to explain themselves, and perform erratically and irrationally. If you constantly find you are surrounded by idiots you may want to ask yourself if you are being unreasonable, if you are being closed minded, of if you have chosen the right profession. Opening your mind up to the idea that you probably work with rational, well-intentioned people will probably make you a better coder and it might even make you less grumpy. If you are surrounded by jerks who do not engage in the exchange of ideas who do not care about their customers or the durability of the code you are building together then I suggest you find a new place to work. Myth 4: Customers don’t care about “beautiful” code Craftsmanship is customer focused because it means that the job was done right, the product will withstand the abuse, modifications, and scrutiny of our customers. Users can appreciate a predictable timeline for a release, a product delivered on time and on budget, a feature set that does not interfere with the task(s) it is supporting, quick turnarounds on exception messages, self healing issues, and less issues. These are all hindered by skimping on craftsmanship. When we write data access and when we write reusable code. What do you think? Does bad code come primarily from low IQ individuals? Do customers care about beautiful code?

Read the article
Subterranean IL: The ThreadLocal type

- by Simon Cooper

I came across ThreadLocal<T> while I was researching ConcurrentBag. To look at it, it doesn't really make much sense. What's all those extra Cn classes doing in there? Why is there a GenericHolder<T,U,V,W> class? What's going on? However, digging deeper, it's a rather ingenious solution to a tricky problem. Thread statics Declaring that a variable is thread static, that is, values assigned and read from the field is specific to the thread doing the reading, is quite easy in .NET: [ThreadStatic] private static string s_ThreadStaticField; ThreadStaticAttribute is not a pseudo-custom attribute; it is compiled as a normal attribute, but the CLR has in-built magic, activated by that attribute, to redirect accesses to the field based on the executing thread's identity. TheadStaticAttribute provides a simple solution when you want to use a single field as thread-static. What if you want to create an arbitary number of thread static variables at runtime? Thread-static fields can only be declared, and are fixed, at compile time. Prior to .NET 4, you only had one solution - thread local data slots. This is a lesser-known function of Thread that has existed since .NET 1.1: LocalDataStoreSlot threadSlot = Thread.AllocateNamedDataSlot("slot1"); string value = "foo"; Thread.SetData(threadSlot, value); string gettedValue = (string)Thread.GetData(threadSlot); Each instance of LocalStoreDataSlot mediates access to a single slot, and each slot acts like a separate thread-static field. As you can see, using thread data slots is quite cumbersome. You need to keep track of LocalDataStoreSlot objects, it's not obvious how instances of LocalDataStoreSlot correspond to individual thread-static variables, and it's not type safe. It's also relatively slow and complicated; the internal implementation consists of a whole series of classes hanging off a single thread-static field in Thread itself, using various arrays, lists, and locks for synchronization. ThreadLocal<T> is far simpler and easier to use. ThreadLocal ThreadLocal provides an abstraction around thread-static fields that allows it to be used just like any other class; it can be used as a replacement for a thread-static field, it can be used in a List<ThreadLocal<T>>, you can create as many as you need at runtime. So what does it do? It can't just have an instance-specific thread-static field, because thread-static fields have to be declared as static, and so shared between all instances of the declaring type. There's something else going on here. The values stored in instances of ThreadLocal<T> are stored in instantiations of the GenericHolder<T,U,V,W> class, which contains a single ThreadStatic field (s_value) to store the actual value. This class is then instantiated with various combinations of the Cn types for generic arguments. In .NET, each separate instantiation of a generic type has its own static state. For example, GenericHolder<int,C0,C1,C2> has a completely separate s_value field to GenericHolder<int,C1,C14,C1>. This feature is (ab)used by ThreadLocal to emulate instance thread-static fields. Every time an instance of ThreadLocal is constructed, it is assigned a unique number from the static s_currentTypeId field using Interlocked.Increment, in the FindNextTypeIndex method. The hexadecimal representation of that number then defines the specific Cn types that instantiates the GenericHolder class. That instantiation is therefore 'owned' by that instance of ThreadLocal. This gives each instance of ThreadLocal its own ThreadStatic field through a specific unique instantiation of the GenericHolder class. Although GenericHolder has four type variables, the first one is always instantiated to the type stored in the ThreadLocal<T>. This gives three free type variables, each of which can be instantiated to one of 16 types (C0 to C15). This puts an upper limit of 4096 (163) on the number of ThreadLocal<T> instances that can be created for each value of T. That is, there can be a maximum of 4096 instances of ThreadLocal<string>, and separately a maximum of 4096 instances of ThreadLocal<object>, etc. However, there is an upper limit of 16384 enforced on the total number of ThreadLocal instances in the AppDomain. This is to stop too much memory being used by thousands of instantiations of GenericHolder<T,U,V,W>, as once a type is loaded into an AppDomain it cannot be unloaded, and will continue to sit there taking up memory until the AppDomain is unloaded. The total number of ThreadLocal instances created is tracked by the ThreadLocalGlobalCounter class. So what happens when either limit is reached? Firstly, to try and stop this limit being reached, it recycles GenericHolder type indexes of ThreadLocal instances that get disposed using the s_availableIndices concurrent stack. This allows GenericHolder instantiations of disposed ThreadLocal instances to be re-used. But if there aren't any available instantiations, then ThreadLocal falls back on a standard thread local slot using TLSHolder. This makes it very important to dispose of your ThreadLocal instances if you'll be using lots of them, so the type instantiations can be recycled. The previous way of creating arbitary thread-static variables, thread data slots, was slow, clunky, and hard to use. In comparison, ThreadLocal can be used just like any other type, and each instance appears from the outside to be a non-static thread-static variable. It does this by using the CLR type system to assign each instance of ThreadLocal its own instantiated type containing a thread-static field, and so delegating a lot of the bookkeeping that thread data slots had to do to the CLR type system itself! That's a very clever use of the CLR type system.

Read the article
"Guiding" a Domain Expert to Retire from Programming

- by James Kolpack

I've got a friend who does IT at a local non-profit where they're using a custom web application which is no longer supported by the company who built it. (out of business, support was too expensive, I'm not sure...) Development on this app started around 10+ years ago so the technologies being harnessed are pretty out of date now - classic asp using vbscript and SQL Server 2000. The application domain is in the realm of government bookkeeping - so even though the development team is long gone, there are often new requirements of this software. Enter the... The domain expert. This is an middle aged accounting whiz without much (or any?) prior development experience. He studied the pages, code and queries and learned how to ape the style of the original team which, believe me, is mediocre at best. He's very clever and very tenacious but has no experience in software beyond what he's picked up from this app. Otherwise, he's a pleasant guy to talk to and definitely knows his domain. My friend in IT, and probably his superiors in the company, want him out of the code. They view him as wasting his expertise on coding tasks he shouldn't be doing. My friend got me involved with a few small contracts which I handled without much problem - other than somewhat of a communication barrier with the domain expert. He explained the requirements very quickly, assuming prior knowledge of the domain which I do not have. This is partially his normal style, and I think maybe a bit of resentment from my involvement. So, I think he feels like the owner of the code and has entrenched himself in a development position. So... his coding technique. One of his latest endeavors was to make a page that only he could reach (theoretically - the security model for the system is wretched) where he can enter a raw SQL query, run it, and save the query to run again later. A report that I worked on had been originally implemented by him using 6 distinct queries, 3 or 4 temp tables to coordinate the data between the queries, and the final result obtained by importing the data from the final query into Access and doing a pivot and some formatting. It worked - well, some of the results were incorrect - but at what a cost! (I implemented the report in a single query with at least 1/10th the amount of code.) He edits code in notepad. He doesn't seem to know about online reference material for the languages. I recently read an article on Dr. Dobbs titled "What Makes Bad Programmers Different" - and instantly thought of our domain expert. From the article: Their code is large, messy, and bug laden. They have very superficial knowledge of their problem domain and their tools. Their code has a lot of copy/paste and they have very little interest in techniques that reduce it. The fail to account for edge cases, while inefficiently dealing with the general case. They never have time to comment their code or break it into smaller pieces. Empirical evidence plays no little role in their decisions. 5.5 out of 6. My friend is wanting me to argue the case to their management - specifically, I got this email from their manager to respond to: ...Also, I need to talk to you about what effect there is from Domain Expert continuing to make edits to the live environment. If that is a problem for you I need to know so I can have his access blocked. Some examples would help. In my opinion, from a technical standpoint, it's dangerous to have him making changes without any oversight. On the other hand, I'm just doing one-off contracts at this point and don't have much desire to get involved deeply enough that I'm essentially arguing as one of the Bobs from Office Space. I'd like to help my friend out - but I feel like I'm getting in the middle of a political battle. More importantly - if I do get involved and suggest that his editing privileges be removed, it needs to be handled carefully so that doesn't feel belittled. He is beyond a doubt the foremost expert on this system. I'm hoping this is familiar territory for some other stackechangers, because I'm feeling a little bewildered. How should I respond? Should I argue that he shouldn't be allowed to touch the code? Should I phrase it as "no single developer, no matter how experienced, should be working on production code unchecked"? Should I argue to keep him involved with the code, but with a review process? Should I say "glad I could help, but uh, I'm busy now!" Other options? Thanks a bunch!

Read the article
Joining on NULLs

- by Dave Ballantyne

A problem I see on a fairly regular basis is that of dealing with NULL values. Specifically here, where we are joining two tables on two columns, one of which is ‘optional’ ie is nullable. So something like this: i.e. Lookup where all the columns are equal, even when NULL. NULL’s are a tricky thing to initially wrap your mind around. Statements like “NULL is not equal to NULL and neither is it not not equal to NULL, it’s NULL” can cause a serious brain freeze and leave you a gibbering wreck and needing your mummy. Before we plod on, time to setup some data to demo against. Create table #SourceTable ( Id integer not null, SubId integer null, AnotherCol char(255) not null ) go create unique clustered index idxSourceTable on #SourceTable(id,subID) go with cteNums as ( select top(1000) number from master..spt_values where type ='P' ) insert into #SourceTable select Num1.number,nullif(Num2.number,0),'SomeJunk' from cteNums num1 cross join cteNums num2 go Create table #LookupTable ( Id integer not null, SubID integer null ) go insert into #LookupTable Select top(100) id,subid from #SourceTable where subid is not null order by newid() go insert into #LookupTable Select top(3) id,subid from #SourceTable where subid is null order by newid() If that has run correctly, you will have 1 million rows in #SourceTable and 103 rows in #LookupTable. We now want to join one to the other. First attempt – Lets just join select * from #SourceTable join #LookupTable on #LookupTable.id = #SourceTable.id and #LookupTable.SubID = #SourceTable.SubID OK, that’s a fail. We had 100 rows back, we didn’t correctly account for the 3 rows that have null values. Remember NULL <> NULL and the join clause specifies SUBID=SUBID, which for those rows is not true. Second attempt – Lets deal with those pesky NULLS select * from #SourceTable join #LookupTable on #LookupTable.id = #SourceTable.id and isnull(#LookupTable.SubID,0) = isnull(#SourceTable.SubID,0) OK, that’s the right result, well done and 99.9% of the time that is where its left. It is a relatively trivial CPU overhead to wrap ISNULL around both columns and compare that result, so no problems. But, although that’s true, this a relational database we are using here, not a procedural language. SQL is a declarative language, we are making a request to the engine to get the results we want. How we ask for them can make a ton of difference. Lets look at the plan for our second attempt, specifically the clustered index seek on the #SourceTable There are 2 predicates. The ‘seek predicate’ and ‘predicate’. The ‘seek predicate’ describes how SQLServer has been able to use an Index. Here, it has been able to navigate the index to resolve where ID=ID. So far so good, but what about the ‘predicate’ (aka residual probe) ? This is a row-by-row operation. For each row found in the index matching the Seek Predicate, the leaf level nodes have been scanned and tested using this logical condition. In this example [Expr1007] is the result of the IsNull operation on #LookupTable and that is tested for equality with the IsNull operation on #SourceTable. This residual probe is quite a high overhead, if we can express our statement slightly differently to take full advantage of the index and make the test part of the ‘Seek Predicate’. Third attempt – X is null and Y is null So, lets state the query in a slightly manner: select * from #SourceTable join #LookupTable on #LookupTable.id = #SourceTable.id and ( #LookupTable.SubID = #SourceTable.SubID or (#LookupTable.SubID is null and #SourceTable.SubId is null) ) So its slightly wordier and may not be as clear in its intent to the human reader, that is what comments are for, but the key point is that it is now clearer to the query optimizer what our intention is. Let look at the plan for that query, again specifically the index seek operation on #SourceTable No ‘predicate’, just a ‘Seek Predicate’ against the index to resolve both ID and SubID. A subtle difference that can be easily overlooked. But has it made a difference to the performance ? Well, yes , a perhaps surprisingly high one. Clever query optimizer well done. If you are using a scalar function on a column, you a pretty much guaranteeing that a residual probe will be used. By re-wording the query you may well be able to avoid this and use the index completely to resolve lookups. In-terms of performance and scalability your system will be in a much better position if you can.

Read the article
Deduping your redundancies

- by nospam(at)example.com (Joerg Moellenkamp)

Robin Harris of Storagemojo pointed to an interesting article about about deduplication and it's impact to the resiliency of your data against data corruption on ACM Queue. The problem in short: A considerable number of filesystems store important metadata at multiple locations. For example the ZFS rootblock is copied to three locations. Other filesystems have similar provisions to protect their metadata. However you can easily proof, that the rootblock pointer in the uberblock of ZFS for example is pointing to blocks with absolutely equal content in all three locatition (with zdb -uu and zdb -r). It has to be that way, because they are protected by the same checksum. A number of devices offer block level dedup, either as an option or as part of their inner workings. However when you store three identical blocks on them and the devices does block level dedup internally, the device may just deduplicated your redundant metadata to a block stored just once that is stored on the non-voilatile storage. When this block is corrupted, you have essentially three corrupted copies. Three hit with one bullet. This is indeed an interesting problem: A device doing deduplication doesn't know if a block is important or just a datablock. This is the reason why I like deduplication like it's done in ZFS. It's an integrated part and so important parts don't get deduplicated away. A disk accessed by a block level interface doesn't know anything about the importance of a block. A metadata block is nothing different to it's inner mechanism than a normal data block because there is no way to tell that this is important and that those redundancies aren't allowed to fall prey to some clever deduplication mechanism. Robin talks about this in regard of the Sandforce disk controllers who use a kind of dedup to reduce some of the nasty effects of writing data to flash, but the problem is much broader. However this is relevant whenever you are using a device with block level deduplication. It's just the point that you have to activate it for most implementation by command, whereas certain devices do this by default or by design and you don't know about it. However I'm not perfectly sure about that ? given that storage administration and server administration are often different groups with different business objectives I would ask your storage guys if they have activated dedup without telling somebody elase on their boxes in order to speak less often with the storage sales rep. The problem is even more interesting with ZFS. You may use ditto blocks to protect important data to store multiple copies of data in the pool to increase redundancy, even when your pool just consists out of one disk or just a striped set of disk. However when your device is doing dedup internally it may remove your redundancy before it hits the nonvolatile storage. You've won nothing. Just spend your disk quota on the the LUNs in the SAN and you make your disk admin happy because of the good dedup ratio However you can just fall in this specific "deduped ditto block"trap when your pool just consists out of a single device, because ZFS writes ditto blocks on different disks, when there is more than just one disk. Yet another reason why you should spend some extra-thought when putting your zpool on a single LUN, especially when the LUN is sliced and dices out of a large heap of storage devices by a storage controller. However I have one problem with the articles and their specific mention of ZFS: You can just hit by this problem when you are using the deduplicating device for the pool. However in the specifically mentioned case of SSD this isn't the usecase. Most implementations of SSD in conjunction with ZFS are hybrid storage pools and so rotating rust disk is used as pool and SSD are used as L2ARC/sZIL. And there it simply doesn't matter: When you really have to resort to the sZIL (your system went down, it doesn't matter of one block or several blocks are corrupt, you have to fail back to the last known good transaction group the device. On the other side, when a block in L2ARC is corrupt, you simply read it from the pool and in HSP implementations this is the already mentioned rust. In conjunction with ZFS this is more interesting when using a storage array, that is capable to do dedup and where you use LUNs for your pool. However as mentioned before, on those devices it's a user made decision to do so, and so it's less probable that you deduplicating your redundancies. Other filesystems lacking acapability similar to hybrid storage pools are more "haunted" by this problem of SSD using dedup-like mechanisms internally, because those filesystem really store the data on the the SSD instead of using it just as accelerating devices. However at the end Robin is correct: It's jet another point why protecting your data by creating redundancies by dispersing it several disks (by mirror or parity RAIDs) is really important. No dedup mechanism inside a device can dedup away your redundancy when you write it to a totally different and indepenent device.

Read the article

Search Results

Search found 1124 results on 45 pages for 'anthony clever'.

Page 42/45 | < Previous Page | 38 39 40 41 42 43 44 45 | Next Page >

- by Janice J. Heiss

- by Jason Fitzpatrick

- by Tony Davis

- by Laila

- by Janice J. Heiss

- by Reed

- by dlo

- by user64204

- by Mark Virtue

- by Jon Galloway

- by Jeff

- by David Atkinson

- by David Atkinson

- by David Atkinson

- by Simon Cooper

- by Jeff

- by Pinal Dave

- by MOSSLover

- by Charles Courchaine

- by tom

- by Simon Cooper

- by James Kolpack

- by Dave Ballantyne

- by nospam(at)example.com (Joerg Moellenkamp)

< Previous Page | 38 39 40 41 42 43 44 45 | Next Page >