Search Results

Search found 2347 results on 94 pages for 'slightly frustrated'.

Page 86/94 | < Previous Page | 82 83 84 85 86 87 88 89 90 91 92 93  | Next Page >

  • The DOS DEBUG Environment

    - by MarkPearl
    Today I thought I would go back in time and have a look at the DEBUG command that has been available since the beginning of dawn in DOS, MS-DOS and Microsoft Windows. up to today I always knew it was there, but had no clue on how to use it so for those that are interested this might be a great geek party trick to pull out when you want the awe the younger generation and want to show them what “real” programming is about. But wait, you will have to do it relatively quickly as it seems like DEBUG was finally dumped from the Windows group in Windows 7. Not to worry, pull out that Windows XP box which will get you even more geek points and you can still poke DEBUG a bit. So, for those that are interested and want to find out a bit about the history of DEBUG read the wiki link here. That all put aside, lets get our hands dirty.. How to Start DEBUG in Windows Make sure your version of Windows supports DEBUG. Open up a console window Make a directory where you want to play with debug – in my instance I called it C221 Enter the directory and type Debug You will get a response with a – as illustrated in the image below…   The commands available in DEBUG There are several commands available in DEBUG. The most common ones are A (Assemble) R (Register) T (Trace) G (Go) D (Dump or Display) U (Unassemble) E (Enter) P (Proceed) N (Name) L (Load) W (Write) H (Hexadecimal) I (Input) O (Output) Q (Quit) I am not going to cover all these commands, but what I will do is go through a few of them briefly. A is for Assemble Command (to write code) The A command translates assembly language statements into machine code. It is quite useful for writing small assembly programs. Below I have written a very basic assembly program. The code typed out is as follows mov ax,0015 mov cx,0023 sub cx,ax mov [120],al mov cl,[120]A nop R is for Register (to jump to a point in memory) The r command turns out to be one of the most frequent commands you will use in DEBUG. It allows you to view the contents of registers and to change their values. It can be used with the following combinations… R – Displays the contents of all the registers R f – Displays the flags register R register_name – Displays the contents of a specific register All three methods are illustrated in the image above T is for Trace (To execute a program step by step) The t command allows us to execute the program step by step. Before we can trace the program we need to point back to the beginning of the program. We do this by typing in r ip, which moves us back to memory point 100. We then type trace which executes the first line of code (line 100) (As shown in the image below starting from the red arrow). You can see from the above image that the register AX now contains 0015 as per our instruction mov ax,0015 You can also see that the IP points to line 0103 which has the MOV CX,0023 command If we type t again it will now execute the second line of the program which moves 23 in the cx register. Again, we can see that the line of code was executed and that the CX register now holds the value of 23. What I would like to highlight now is the section underlined in red. These are the status flags. The ones we are going to look at now are 1st (NV), 4th (PL), 5th (NZ) & 8th (NC) NV means no overflow, the alternate would be OV PL means that the sign of the previous arithmetic operation was Plus, the alternate would be NG (Negative) NZ means that the results of the previous arithmetic operation operation was Not Zero, the alternate would be ZR NC means that No final Carry resulted from the previous arithmetic operation. CY means that there was a final Carry. We could now follow this process of entering the t command until the entire program is executed line by line. G is for Go (To execute a program up to a certain line number) So we have looked at executing a program line by line, which is fine if your program is minuscule BUT totally unpractical if we have any decent sized program. A quicker way to run some lines of code is to use the G command. The ‘g’ command executes a program up to a certain specified point. It can be used in connection with the the reset IP command. You would set your initial point and then run the G command with the line you want to end on. P is for Proceed (Similar to trace but slightly more streamlined) Another command similar to trace is the proceed command. All that the p command does is if it is called and it encounters a CALL, INT or LOOP command it terminates the program execution. In the example below I modified our example program to include an int 20 at the end of it as illustrated in the image below… Then when executing the code when I encountered the int 20 command I typed the P command and the program terminated normally (illustrated below). D is for Dump (or for those more polite Display) So, we have all these assembly lines of code, but if you have ever opened up an exe or com file in a text/hex editor, it looks nothing like assembly code. The D command is a way that we can see what our code looks like in memory (or in a hex editor). If we examined the image above, we can see that Debug is storing our assembly code with each instruction following immediately after the previous one. For instance in memory address 110 we have int and 111 we have 20. If we examine the dump of memory we can see at memory point 110 CD is stored and at memory point 111 20 is stored. U is for Unassemble (or Convert Machine code to Assembly Code) So up to now we have gone through a bunch of commands, but probably one of the most useful is the U command. Let’s say we don’t understand machine code so well and so instead we want to see it in its equivalent assembly code. We can type the U command followed by the start memory point, followed by the end memory point and it will show us the assembly code equivalent of the machine code. E is for a bunch of things… The E command can be used for a bunch of things… One example is to enter data or machine code instructions directly into memory. It can also be used to display the contents of memory locations. I am not going to worry to much about it in this post. N / L / W is for Name, Load & Write So we have written out assembly code in debug, and now we want to save it to disk, or write it as a com file or load it. This is where the N, L & W command come in handy. The n command is used to give a name to the executable program file and is pretty simple to use. The w command is a bit trickier. It saves to disk all the memory between point bx and point cx so you need to specify the bx memory address and the cx memory address for it to write your code. Let’s look at an example illustrated below. You do this by calling the r command followed by the either bx or cx. We can then go to the directory where we were working and will see the new file with the name we specified. The L command is relatively simple. You would first specify the name of the file you would like to load using the N command, and then call the L command. Q is for Quit The last command that I am going to write about in this post is the Q command. Simply put, calling the Q command exits DEBUG. Commands we did not Cover Out of the standard DEBUG commands we covered A, T, G, D, U, E, P, R, N, L & W. The ones we did not cover were H, I & O – I might make mention of these in a later post, but for the basics they are not really needed. Some Useful Resources Please note this post is based on the COS2213 handouts for UNISA A Guide to DEBUG - http://mirror.href.com/thestarman/asm/debug/debug.htm#NT

    Read the article

  • Using SQL Execution Plans to discover the Swedish alphabet

    - by Rob Farley
    SQL Server is quite remarkable in a bunch of ways. In this post, I’m using the way that the Query Optimizer handles LIKE to keep it SARGable, the Execution Plans that result, Collations, and PowerShell to come up with the Swedish alphabet. SARGability is the ability to seek for items in an index according to a particular set of criteria. If you don’t have SARGability in play, you need to scan the whole index (or table if you don’t have an index). For example, I can find myself in the phonebook easily, because it’s sorted by LastName and I can find Farley in there by moving to the Fs, and so on. I can’t find everyone in my suburb easily, because the phonebook isn’t sorted that way. I can’t even find people who have six letters in their last name, because also the book is sorted by LastName, it’s not sorted by LEN(LastName). This is all stuff I’ve looked at before, including in the talk I gave at SQLBits in October 2010. If I try to find everyone who’s names start with F, I can do that using a query a bit like: SELECT LastName FROM dbo.PhoneBook WHERE LEFT(LastName,1) = 'F'; Unfortunately, the Query Optimizer doesn’t realise that all the entries that satisfy LEFT(LastName,1) = 'F' will be together, and it has to scan the whole table to find them. But if I write: SELECT LastName FROM dbo.PhoneBook WHERE LastName LIKE 'F%'; then SQL is smart enough to understand this, and performs an Index Seek instead. To see why, I look further into the plan, in particular, the properties of the Index Seek operator. The ToolTip shows me what I’m after: You’ll see that it does a Seek to find any entries that are at least F, but not yet G. There’s an extra Predicate in there (a Residual Predicate if you like), which checks that each LastName is really LIKE F% – I suppose it doesn’t consider that the Seek Predicate is quite enough – but most of the benefit is seen by its working out the Seek Predicate, filtering to just the “at least F but not yet G” section of the data. This got me curious though, particularly about where the G comes from, and whether I could leverage it to create the Swedish alphabet. I know that in the Swedish language, there are three extra letters that appear at the end of the alphabet. One of them is ä that appears in the word Västerås. It turns out that Västerås is quite hard to find in an index when you’re looking it up in a Swedish map. I talked about this briefly in my five-minute talk on Collation from SQLPASS (the one which was slightly less than serious). So by looking at the plan, I can work out what the next letter is in the alphabet of the collation used by the column. In other words, if my alphabet were Swedish, I’d be able to tell what the next letter after F is – just in case it’s not G. It turns out it is… Yes, the Swedish letter after F is G. But I worked this out by using a copy of my PhoneBook table that used the Finnish_Swedish_CI_AI collation. I couldn’t find how the Query Optimizer calculates the G, and my friend Paul White (@SQL_Kiwi) tells me that it’s frustratingly internal to the QO. He’s particularly smart, even if he is from New Zealand. To investigate further, I decided to do some PowerShell, leveraging the Get-SqlPlan function that I blogged about recently (make sure you also have the SqlServerCmdletSnapin100 snap-in added). I started by indicating that I was going to use Finnish_Swedish_CI_AI as my collation of choice, and that I’d start whichever letter cam straight after the number 9. I figure that this is a cheat’s way of guessing the first letter of the alphabet (but it doesn’t actually work in Unicode – luckily I’m using varchar not nvarchar. Actually, there are a few aspects of this code that only work using ASCII, so apologies if you were wanting to apply it to Greek, Japanese, etc). I also initialised my $alphabet variable. $collation = 'Finnish_Swedish_CI_AI'; $firstletter = '9'; $alphabet = ''; Now I created the table for my test. A single field would do, and putting a Clustered Index on it would suffice for the Seeks. Invoke-Sqlcmd -server . -data tempdb -query "create table dbo.collation_test (col varchar(10) collate $collation primary key);" Now I get into the looping. $c = $firstletter; $stillgoing = $true; while ($stillgoing) { I construct the query I want, seeking for entries which start with whatever $c has reached, and get the plan for it: $query = "select col from dbo.collation_test where col like '$($c)%';"; [xml] $pl = get-sqlplan $query "." "tempdb"; At this point, my $pl variable is a scary piece of XML, representing the execution plan. A bit of hunting through it showed me that the EndRange element contained what I was after, and that if it contained NULL, then I was done. $stillgoing = ($pl.ShowPlanXML.BatchSequence.Batch.Statements.StmtSimple.QueryPlan.RelOp.IndexScan.SeekPredicates.SeekPredicateNew.SeekKeys.EndRange -ne $null); Now I could grab the value out of it (which came with apostrophes that needed stripping), and append that to my $alphabet variable.   if ($stillgoing)   {  $c=$pl.ShowPlanXML.BatchSequence.Batch.Statements.StmtSimple.QueryPlan.RelOp.IndexScan.SeekPredicates.SeekPredicateNew.SeekKeys.EndRange.RangeExpressions.ScalarOperator.ScalarString.Replace("'","");     $alphabet += $c;   } Finally, finishing the loop, dropping the table, and showing my alphabet! } Invoke-Sqlcmd -server . -data tempdb -query "drop table dbo.collation_test;"; $alphabet; When I run all this, I see that the Swedish alphabet is ABCDEFGHIJKLMNOPQRSTUVXYZÅÄÖ, which matches what I see at Wikipedia. Interesting to see that the letters on the end are still there, even with Case Insensitivity. Turns out they’re not just “letters with accents”, they’re letters in their own right. I’m sure you gave up reading long ago, and really aren’t that fazed about the idea of doing this using PowerShell. I chose PowerShell because I’d already come up with an easy way of grabbing the estimated plan for a query, and PowerShell does allow for easy navigation of XML. I find the most interesting aspect of this as the fact that the Query Optimizer uses the next letter of the alphabet to maintain the SARGability of LIKE. I’m hoping they do something similar for a whole bunch of operations. Oh, and the fact that you know how to find stuff in the IKEA catalogue. Footnote: If you are interested in whether this works in other languages, you might want to consider the following screenshot, which shows that in principle, it should work with Japanese. It might be a bit harder to run this in PowerShell though, as I’m not sure how it translates. In Hiragana, the Japanese alphabet starts ?, ?, ?, ?, ?, ...

    Read the article

  • CodePlex Daily Summary for Sunday, May 16, 2010

    CodePlex Daily Summary for Sunday, May 16, 2010New Projects3D Calculator: 3D Calc is a simple calculator application for Windows Phone 7, the purpose of this project is to demo the 3D animations capabilities of WP7 and sh...azaleas: AzaleasBlueset Studio Opensource Projects: Only for Opensource projects form Blueset Studio.Breck: A Phoenix and Jumper Moneky Production: Breck is a first person non-violent shooter developed in C++ and Dark GDK. After the main game is developed we are looking into making a sequel or...Discuz! Forum SDK: This project is use to login in and post or reply topic on discuz forum.Dominion.NET: Evolving Dominion source code originally written in VB6 and posted by "jatill" on Collectible Card Game Headquarters. Migration of the design and s...EkspSys2010-ITR: A mini project for the course Experimental System devolopment in spring 2010Facebook Graph Toolkit: This project is a .Net implementation of the Facebook Graph API. The aim of this project is to be a replacement to the existing Facebook Toolkit (h...iFree: This is a solution for Vietnamese network socialInfoPath Editor for Developer: InfoPath Editor for developer allows user to modify the html text directly inside InfoPath designer or filler and push the change back to InfoPath ...iZeit: Run your own online calendar, with blog integration, recurrence, todo list and categories.machgos dotNet Tests: Just some little test-projects for learningmim: TBAMinePost: MinePost is a game made for the first 48 hour Reddit Game Jam.Mockina: Mockina is a mock framework. Expression tree syntax is used to specify which members to mock, both public and non-public. The code is easy to under...MSBuild Launch Pad (mPad): This is just another shell extension for MSBuild to enable quick execution of MSBuild scripts via Windows Explorer context menu. (C) 2010 Lex LiPeacock: A browser like tabbed applicationPrimeCalculation: PrimeCalculation is a .NET app to calculate primes in a given range. Speed on Core2Duo 2,4GHZ: Found all primes from 0 to 1 billion in 35 seconds (...Slightly Silverlight: A Framework that leverages Silverlight for processing, business logic but standard HTML for the presentation layer.Stopwatch: Stopwatch is a tool for measuring the time. To start and pause stopwatch you only need to press a key on the keyboard. An additional context menu a...YAXLib: Yet Another XML Serialization Library for the .NET Framework: YAXLib is an XML Serialization library which helps you structure freely the XML result, choose among private and public fields to be serialized, an...New ReleasesActivate Your Glutes: v1.0.3.0: This release is a migration to VS2010, .Net 4, MVC2 and Entity Framework 4. The code has also been considerably cleaned up - taking advantage of E...AnyCAD: AnyCAD.Free.ENU.v1.1: http://www.anycad.net Modeling •2D: Line, Rectangle, Arc, Arch, Circle, Spline, Polygon •Feature: Extrude, Loft, Chamfer, Sweep, Revol •Boolean: ...Blueset Studio Opensource Projects: 多功能计算器 3.5: 稳定版本。Code for Rapid C# Windows Development eBook: LLBLGen LINQPad Data Context Driver Ver 1.0.0.0: First release of a Static LLBLGen Pro Data Context Driver for LINQPad I recommend LINQPad 4 as it seems more stable with this driver than LINQPad 2.DSQLT - Dynamic SQL Templates: Release 1.2. Some behaviour has changed!!: Attention. Some behaviour has changed! Now its necessary to use WildCards in the pattern-parameter for DSQLT.AllSourceContains DSQLT.Databases DSQ...FDS AutoCAD plug-in: FDS to AutoCAD plug-in: Basic functionality was implemented. Some routines like setting fds executable location are still not automated.Feature Builder Guidance Extensions: FBGX 2 - Standalone FX: Background: The Feature Builder Guidance is extensible and displays guidance content supplied by all the Feature Builder Guidance Extensions (FBGX...Floe IRC Client: Floe IRC Client 2010-05 R3: - You can now right click on the input box to get options for toggling bold, underline, colors, etc. - The size of the nickname column is now saved...Floe IRC Client: Floe IRC Client 2010-05 R4: - A user's channel status now appears next to their nick when they talk (e.g. @Nick or +Nick) - Fixed an error where certain kinds of network probl...HD-Trailers.NET Downloader: HD-Trailers.NET Downloader v1.0: Version 1.0 Thanks to Wolfgang for all his help. I let this project languish for too long while focusing on other things, but his involvement has ...InfoPath Editor for Developer: InfoPath Editor Beta 1: Intial Release: Can load InfoPath inner html. Can edit InfoPath inner html. InfoPath 2007 only.LinkSharp: LinkSharp 0.1.0: First release of LinkSharp. Set up iis, and use the sql script to create a new database.PowerAuras: PowerAuras V3.0.0F: This version adds better integration with GTFO New Flags Added PvP flag In 5-Man Instance In Raid Instance In Battleground In ArenaRx Contrib: V1.4: Add the ability to catch internal exception and the ability to publish error by queue adaptersSEO SiteMap: SEO SiteMap RC1: -SevenZipLib Library: v9.13.2: Stable release associated with 7z.dll 9.13 beta. Ability to create and update archives not implemente yet.Silverlight / WPF Controls: Upload, FlipPanel, DeepZoom, Animation, Encryption: Code Camp Demonstration: This code example demonstrates MVVM/MEF with WPF with attached properties,security and custom ICommand class.SQL Data Capture - Black Box Application Testing: SQLDataCapture V1.2: Added Entity Framework Support to CRUD generator (Insert Stored Procedure) and switched to VS 2010 for development.Stopwatch: Stopwatch 0.1: Stopwatch Release 0.1VCC: Latest build, v2.1.30515.0: Automatic drop of latest buildYet another developer blog - Examples: Asynchronous TreeView in ASP.NET MVC: This sample application shows how to use jQuery TreeView plugin for creating an asynchronous TreeView in ASP.NET MVC. This application is accompani...Most Popular ProjectsRawrWBFS ManagerAJAX Control ToolkitMicrosoft SQL Server Product Samples: DatabaseSilverlight ToolkitWindows Presentation Foundation (WPF)patterns & practices – Enterprise LibraryMicrosoft SQL Server Community & SamplesPHPExcelASP.NETMost Active Projectspatterns & practices – Enterprise LibraryRawrPHPExcelBlogEngine.NETMicrosoft Biology FoundationCustomer Portal Accelerator for Microsoft Dynamics CRMWindows Azure Command-line Tools for PHP DevelopersMirror Testing SystemN2 CMSStyleCop

    Read the article

  • User Guide to Dropbox Shared Folders

    - by Matthew Guay
    Dropbox is an incredibly useful tool for keeping all your files synced between your computers and the cloud.  Here we’re going to look at how you can keep all of your team on the same page with Dropbox shared folders. Creating a Shared Folder Setting up a shared folder in Dropbox is easy.  Add the files you want to share to a folder in Dropbox on your computer, then right-click in the folder, select Dropbox, and then choose Share This Folder.   Alternately, log into your Dropbox account online, click the drop-down menu beside the folder you want to share, and click Share this folder. Now, enter the email addresses of the people you want to share the folder with, and optionally enter a message explaining why you’re sharing the folder. The people you invite will receive an email inviting them to view and join the shared folder.  If they haven’t signed up for Dropbox, they can directly signup; otherwise, they can simply log into their Dropbox account and start adding or editing files. Shared folders have a slightly different icon in your Dropbox.  Notice the shared folder on the left has an icon with 2 people, while the folder on the right that is not shared, shows previews of its contents. See Your Shared Folder’s History Whenever your collaborators with your shared folders add or change files, you will see a tooltip notification telling you what changed. You can also view the changes online.  Log into your Dropbox account in your browser and select the Events tab.  This shows all changes to your Dropbox, but you can view only the changes in your shared folder by selecting its name on the left sidebar. Now you can see all recent changes to your folder, and can also see who added or removed each file.   On the bottom of the page, you can even add a comment that all the collaborators will see. If someone deleted a file you still need, you can restore it by clicking its link in this online history.  Or, you can view any deleted files by right-clicking in your Dropbox folder in Explorer.  Select Dropbox, and then click Show Deleted Files.   Get Notified When a Change is Made You’re not always in front of your computer; you’ve got a life beyond your projects, after all (at least hopefully).  If you really want to stay connected to what’s happening with your project, though, you can easily do that no matter where you are. Your shared Dropbox folder’s history page offers an RSS feed of all changes to the folder.  Click  the Subscribe to this feed hyperlink. Now, in the popup that opens, click “Copy to clipboard” so you can use this RSS feed. You can subscribe to RSS feeds through many web browsers, email clients, dedicated feed readers, and more.  In Firefox, Internet Explorer 7/8, or Opera, you can paste the feed address into your address bar and subscribe to the feed directly in your browser.   However, subscribing to the feed in a desktop application won’t help you much when you’re away from your computer.  One great option is to subscribe in the popular Google Reader.  Then you can check your feed from any browser, on any computer or mobile device. To add your Dropbox feed to Google Reader, log into Google Reader (link below), click Add a subscription on the top left, paste your RSS feed from Dropbox, and click Add.   Now you can see any changes to files or folders in Google Reader. You can even add your feed to your iGoogle homepage.  Click the Add it Now button on the right in the front page of Google Reader to add your feeds to iGoogle.   Now you can see updates on your files from your homepage.  If you’re using a different computer, just login to your Google account to see what’s happening. You can also access your Google Reader feeds from many programs and apps for most major Smartphones including iPhone, Windows Phone, and Blackberry. Receive a Tweet or Text When Changes are Made If you’re a hyper-connected individual, chances are you send and receive tweets on the go.  If so, this might be the best way for you to get notified when changes are made to your Dropbox shared folder.  To do this, first create a new Twitter account to publish your changes through.  If you don’t want the whole world to see your updates, click Settings and set your new Twitter account to Private. Once the new account is created, follow it with your normal Twitter account so you’ll see updates. Now, let’s publish our Dropbox RSS feed to Twitter.  Create an account with Twitterfeed (link below). Once your account is setup, add your feed to it.  Name your feed, and enter your Feed address from Dropbox.  Click Advanced Settings to make your feed work just like you want. In Advanced Settings, change the frequency to “Every 30 mins” to make sure you’re updated on changes as quick as possible.  You can also change other settings if you like. Click “Continue to Step 2”, and then click Twitter under the available services to add your account. Make sure your signed into your new Twitter account, and then click Authenticate Twitter. Allow the application. Now, finally, click Create Service. Whenever a change is made, you will receive a tweet via your new Twitter account.  And since you can receive tweets via text message or many mobile applications, you’ll never be very far away from your Dropbox changes!   Conclusion Dropbox shared folders are a great way to keep your whole team working together on the same files in a project.  And with these handy tricks, you can keep up with your shared files wherever you are! There are a lot of cool things you can do with Dropbox make sure to check out our posts on adding Dropbox to the Windows 7 Start menu, Accessing Dropbox files from Chrome, and Syncing your Pidgin Profile Across Multiple PCs. Links Signup or access your Dropbox account Google Reader Tweet your feed with Twitterfeed Similar Articles Productive Geek Tips How to Add and Manage Shared Folders on Windows Home ServerManage User Accounts in Windows Home ServerAdd "My Dropbox" to Your Windows 7 Start MenuComplete Guide to Networking Windows 7 with XP and VistaMoving Your Personal Data Folders in Windows Vista the Easy Way TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips DVDFab 6 Revo Uninstaller Pro Registry Mechanic 9 for Windows PC Tools Internet Security Suite 2010 Office 2010 reviewed in depth by Ed Bott FoxClocks adds World Times in your Statusbar (Firefox) Have Fun Editing Photo Editing with Citrify Outlook Connector Upgrade Error Gadfly is a cool Twitter/Silverlight app Enable DreamScene in Windows 7

    Read the article

  • On StringComparison Values

    - by Jesse
    When you use the .NET Framework’s String.Equals and String.Compare methods do you use an overloStringComparison enumeration value? If not, you should be because the value provided for that StringComparison argument can have a big impact on the results of your string comparison. The StringComparison enumeration defines values that fall into three different major categories: Culture-sensitive comparison using a specific culture, defaulted to the Thread.CurrentThread.CurrentCulture value (StringComparison.CurrentCulture and StringComparison.CurrentCutlureIgnoreCase) Invariant culture comparison (StringComparison.InvariantCulture and StringComparison.InvariantCultureIgnoreCase) Ordinal (byte-by-byte) comparison of  (StringComparison.Ordinal and StringComparison.OrdinalIgnoreCase) There is a lot of great material available that detail the technical ins and outs of these different string comparison approaches. If you’re at all interested in the topic these two MSDN articles are worth a read: Best Practices For Using Strings in the .NET Framework: http://msdn.microsoft.com/en-us/library/dd465121.aspx How To Compare Strings: http://msdn.microsoft.com/en-us/library/cc165449.aspx Those articles cover the technical details of string comparison well enough that I’m not going to reiterate them here other than to say that the upshot is that you typically want to use the culture-sensitive comparison whenever you’re comparing strings that were entered by or will be displayed to users and the ordinal comparison in nearly all other cases. So where does that leave the invariant culture comparisons? The “Best Practices For Using Strings in the .NET Framework” article has the following to say: “On balance, the invariant culture has very few properties that make it useful for comparison. It does comparison in a linguistically relevant manner, which prevents it from guaranteeing full symbolic equivalence, but it is not the choice for display in any culture. One of the few reasons to use StringComparison.InvariantCulture for comparison is to persist ordered data for a cross-culturally identical display. For example, if a large data file that contains a list of sorted identifiers for display accompanies an application, adding to this list would require an insertion with invariant-style sorting.” I don’t know about you, but I feel like that paragraph is a bit lacking. Are there really any “real world” reasons to use the invariant culture comparison? I think the answer to this question is, “yes”, but in order to understand why we should first think about what the invariant culture comparison really does. The invariant culture comparison is really just a culture-sensitive comparison using a special invariant culture (Michael Kaplan has a great post on the history of the invariant culture on his blog: http://blogs.msdn.com/b/michkap/archive/2004/12/29/344136.aspx). This means that the invariant culture comparison will apply the linguistic customs defined by the invariant culture which are guaranteed not to differ between different machines or execution contexts. This sort of consistently does prove useful if you needed to maintain a list of strings that are sorted in a meaningful and consistent way regardless of the user viewing them or the machine on which they are being viewed. Example: Prototype Names Let’s say that you work for a large multi-national toy company with branch offices in 10 different countries. Each year the company would work on 15-25 new toy prototypes each of which is assigned a “code name” while it is under development. Coming up with fun new code names is a big part of the company culture that everyone really enjoys, so to be fair the CEO of the company spent a lot of time coming up with a prototype naming scheme that would be fun for everyone to participate in, fair to all of the different branch locations, and accessible to all members of the organization regardless of the country they were from and the language that they spoke. Each new prototype will get a code name that begins with a letter following the previously created name using the alphabetical order of the Latin/Roman alphabet. Each new year prototype names would start back at “A”. The country that leads the prototype development effort gets to choose the name in their native language. (An appropriate Romanization system will be used for countries where the primary language is not written in the Latin/Roman alphabet. For example, the Pinyin system could be used for Chinese). To avoid repeating names, a list of all current and past prototype names will be maintained on each branch location’s company intranet site. Assuming that maintaining a single pre-sorted list is not feasible among all of the highly distributed intranet implementations, what string comparison method would you use to sort each year’s list of prototype names so that the list is both meaningful and consistent regardless of the country within which the list is being viewed? Sorting the list with a culture-sensitive comparison using the default configured culture on each country’s intranet server the list would probably work most of the time, but subtle differences between cultures could mean that two different people would see a list that was sorted slightly differently. The CEO wants the prototype names to be a unifying aspect of company culture and is adamant that everyone see the the same list sorted in the same order and there’s no way to guarantee a consistent sort across different cultures using the culture-sensitive string comparison rules. The culture-sensitive sort would produce a meaningful list for the specific user viewing it, but it wouldn’t always be consistent between different users. Sorting with the ordinal comparison would certainly be consistent regardless of the user viewing it, but would it be meaningful? Let’s say that the current year’s prototype name list looks like this: Antílope (Spanish) Babouin (French) Cahoun (Czech) Diamond (English) Flosse (German) If you were to sort this list using ordinal rules you’d end up with: Antílope Babouin Diamond Flosse Cahoun This sort is no good because the entry for “C” appears the bottom of the list after “F”. This is because the Czech entry for the letter “C” makes use of a diacritic (accent mark). The ordinal string comparison does a byte-by-byte comparison of the code points that make up each character in the string and the code point for the “C” with the diacritic mark is higher than any letter without a diacritic mark, which pushes that entry to the bottom of the sorted list. The CEO wants each country to be able to create prototype names in their native language, which means we need to allow for names that might begin with letters that have diacritics, so ordinal sorting kills the meaningfulness of the list. As it turns out, this situation is actually well-suited for the invariant culture comparison. The invariant culture accounts for linguistically relevant factors like the use of diacritics but will provide a consistent sort across all machines that perform the sort. Now that we’ve walked through this example, the following line from the “Best Practices For Using Strings in the .NET Framework” makes a lot more sense: One of the few reasons to use StringComparison.InvariantCulture for comparison is to persist ordered data for a cross-culturally identical display That line describes the prototype name example perfectly: we need a way to persist ordered data for a cross-culturally identical display. While this example is 100% made-up, I think it illustrates that there are indeed real-world situations where the invariant culture comparison is useful.

    Read the article

  • Fun with Aggregates

    - by Paul White
    There are interesting things to be learned from even the simplest queries.  For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join.  The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units.  The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too?  To answer that, let’s look at some other ways to formulate this query.  This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones.  The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above.  It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join!  The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join.  In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting.  The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set.  In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL).  To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709;   -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case.  The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate.  This is something I would like to see exposed in show plan so I suggested it on Connect.  Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all.  Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans.  Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :)  The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier.  What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too).  Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places.  It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates.  As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

    Read the article

  • Event Handlers Not Getting Called? - wxWidgets & C++

    - by Alex
    Hello all, I'm working on a program for my C++ programming class, using wxWidgets. I'm having a huge problem in that my event handlers (I assume) are not getting called, because when I click on the button to trigger the event, nothing happens. My question is: Can you help me find the problem and explain why they would not be getting called? The event handlers OnAbout and OnQuit are working, just not OnCompute or OnClear. I'm really frustrated as I can't figure this out. Thanks a bunch in advance! #include "wx/wx.h" #include "time.h" #include <string> using std::string; // create object of Time class Time first; class App: public wxApp { virtual bool OnInit(); }; class MainPanel : public wxPanel { public: // Constructor for panel class // Constructs my panel class // Params - wxWindow pointer // no return type // pre-conditions: none // post-conditions: none MainPanel(wxWindow* parent); // OnCompute is the event handler for the Compute button // params - none // preconditions - none // postconditions - tasks will have been carried otu successfully // returns void void OnCompute(wxCommandEvent& WXUNUSED(event)); // OnClear is the event handler for the Clear button // params - none // preconditions - none // postconditions - all text areas will be cleared of data // returns void void OnClear(wxCommandEvent& WXUNUSED(event)); // Destructor for panel class // params none // preconditions - none // postconditions - none // no return type ~MainPanel( ); private: wxStaticText *startLabel; wxStaticText *endLabel; wxStaticText *pCLabel; wxStaticText *newEndLabel; wxTextCtrl *start; wxTextCtrl *end; wxTextCtrl *pC; wxTextCtrl *newEnd; wxButton *compute; wxButton *clear; DECLARE_EVENT_TABLE() }; class MainFrame: public wxFrame { private: wxPanel *mainPanel; public: MainFrame(const wxString& title, const wxPoint& pos, const wxSize& size); void OnQuit(wxCommandEvent& event); void OnAbout(wxCommandEvent& event); ~MainFrame(); DECLARE_EVENT_TABLE() }; enum { ID_Quit = 1, ID_About, BUTTON_COMPUTE = 100, BUTTON_CLEAR = 200 }; IMPLEMENT_APP(App) BEGIN_EVENT_TABLE(MainFrame, wxFrame) EVT_MENU(ID_Quit, MainFrame::OnQuit) EVT_MENU(ID_About, MainFrame::OnAbout) END_EVENT_TABLE() BEGIN_EVENT_TABLE(MainPanel, wxPanel) EVT_MENU(BUTTON_COMPUTE, MainPanel::OnCompute) EVT_MENU(BUTTON_CLEAR, MainPanel::OnClear) END_EVENT_TABLE() bool App::OnInit() { MainFrame *frame = new MainFrame( _("Good Guys Delivery Time Calculator"), wxPoint(50, 50), wxSize(450,340) ); frame->Show(true); SetTopWindow(frame); return true; } MainPanel::MainPanel(wxWindow* parent) : wxPanel(parent) { startLabel = new wxStaticText(this, -1, "Start Time:", wxPoint(75, 35)); start = new wxTextCtrl(this, -1, "", wxPoint(135, 35), wxSize(40, 21)); endLabel = new wxStaticText(this, -1, "End Time:", wxPoint(200, 35)); end = new wxTextCtrl(this, -1, "", wxPoint(260, 35), wxSize(40, 21)); pCLabel = new wxStaticText(this, -1, "Percent Change:", wxPoint(170, 85)); pC = new wxTextCtrl(this, -1, "", wxPoint(260, 85), wxSize(40, 21)); newEndLabel = new wxStaticText(this, -1, "New End Time:", wxPoint(180, 130)); newEnd = new wxTextCtrl(this, -1, "", wxPoint(260, 130), wxSize(40, 21)); compute = new wxButton(this, BUTTON_COMPUTE, "Compute", wxPoint(135, 185), wxSize(75, 35)); clear = new wxButton(this, BUTTON_CLEAR, "Clear", wxPoint(230, 185), wxSize(75, 35)); } MainPanel::~MainPanel() {} MainFrame::MainFrame(const wxString& title, const wxPoint& pos, const wxSize& size) : wxFrame( NULL, -1, title, pos, size ) { mainPanel = new MainPanel(this); wxMenu *menuFile = new wxMenu; menuFile->Append( ID_About, _("&About...") ); menuFile->AppendSeparator(); menuFile->Append( ID_Quit, _("E&xit") ); wxMenuBar *menuBar = new wxMenuBar; menuBar->Append( menuFile, _("&File") ); SetMenuBar( menuBar ); CreateStatusBar(); SetStatusText( _("Hi") ); } MainFrame::~MainFrame() {} void MainFrame::OnQuit(wxCommandEvent& WXUNUSED(event)) { Close(TRUE); } void MainFrame::OnAbout(wxCommandEvent& WXUNUSED(event)) { wxMessageBox( _("Alex Olson\nProject 11"), _("About"), wxOK | wxICON_INFORMATION, this); } void MainPanel::OnCompute(wxCommandEvent& WXUNUSED(event)) { int startT; int endT; int newEndT; double tD; wxString startTString = start->GetValue(); wxString endTString = end->GetValue(); startT = wxAtoi(startTString); endT = wxAtoi(endTString); pC->GetValue().ToDouble(&tD); first.SetStartTime(startT); first.SetEndTime(endT); first.SetTimeDiff(tD); try { first.ValidateData(); newEndT = first.ComputeEndTime(); *newEnd << newEndT; } catch (BaseException& e) { wxMessageBox(_(e.GetMessage()), _("Something Went Wrong!"), wxOK | wxICON_INFORMATION, this); } } void MainPanel::OnClear(wxCommandEvent& WXUNUSED(event)) { start->Clear(); end->Clear(); pC->Clear(); newEnd->Clear(); }

    Read the article

  • NerdDinner form validation DataAnnotations ERROR in MVC2 when a form field is left blank.

    - by Edward Burns
    Platform: Windows 7 Ultimate IDE: Visual Studio 2010 Ultimate Web Environment: ASP.NET MVC 2 Database: SQL Server 2008 R2 Express Data Access: Entity Framework 4 Form Validation: DataAnnotations Sample App: NerdDinner from Wrox Pro ASP.NET MVC 2 Book: Wrox Professional MVC 2 Problem with Chapter 1 - Section: "Integrating Validation and Business Rule Logic with Model Classes" (pages 33 to 35) ERROR Synopsis: NerdDinner form validation ERROR with DataAnnotations and db nulls. DataAnnotations in sample code does not work when the database fields are set to not allow nulls. ERROR occurs with the code from the book and with the sample code downloaded from codeplex. Help! I'm really frustrated by this!! I can't believe something so simple just doesn't work??? Steps to reproduce ERROR: Set Database fields to not allow NULLs (See Picture) Set NerdDinnerEntityModel Dinner class fields' Nullable property to false (See Picture) Add DataAnnotations for Dinner_Validation class (CODE A) Create Dinner repository class (CODE B) Add CREATE action to DinnerController (CODE C) This is blank form before posting (See Picture) This null ERROR occurs when posting a blank form which should be intercepted by the Dinner_Validation class DataAnnotations. Note ERROR message says that "This property cannot be set to a null value. WTH??? (See Picture) The next ERROR occurs during the edit process. Here is the Edit controller action (CODE D) This is the "Edit" form with intentionally wrong input to test Dinner Validation DataAnnotations (See Picture) The ERROR occurs again when posting the edit form with blank form fields. The post request should be intercepted by the Dinner_Validation class DataAnnotations. Same null entry error. WTH??? (See Picture) See screen shots at: http://www.intermedia4web.com/temp/nerdDinner/StackOverflowNerdDinnerQuestionshort.png CODE A: [MetadataType(typeof(Dinner_Validation))] public partial class Dinner { } [Bind(Include = "Title, EventDate, Description, Address, Country, ContactPhone, Latitude, Longitude")] public class Dinner_Validation { [Required(ErrorMessage = "Title is required")] [StringLength(50, ErrorMessage = "Title may not be longer than 50 characters")] public string Title { get; set; } [Required(ErrorMessage = "Description is required")] [StringLength(265, ErrorMessage = "Description must be 256 characters or less")] public string Description { get; set; } [Required(ErrorMessage="Event date is required")] public DateTime EventDate { get; set; } [Required(ErrorMessage = "Address is required")] public string Address { get; set; } [Required(ErrorMessage = "Country is required")] public string Country { get; set; } [Required(ErrorMessage = "Contact phone is required")] public string ContactPhone { get; set; } [Required(ErrorMessage = "Latitude is required")] public double Latitude { get; set; } [Required(ErrorMessage = "Longitude is required")] public double Longitude { get; set; } } CODE B: public class DinnerRepository { private NerdDinnerEntities _NerdDinnerEntity = new NerdDinnerEntities(); // Query Method public IQueryable<Dinner> FindAllDinners() { return _NerdDinnerEntity.Dinners; } // Query Method public IQueryable<Dinner> FindUpcomingDinners() { return from dinner in _NerdDinnerEntity.Dinners where dinner.EventDate > DateTime.Now orderby dinner.EventDate select dinner; } // Query Method public Dinner GetDinner(int id) { return _NerdDinnerEntity.Dinners.FirstOrDefault(d => d.DinnerID == id); } // Insert Method public void Add(Dinner dinner) { _NerdDinnerEntity.Dinners.AddObject(dinner); } // Delete Method public void Delete(Dinner dinner) { foreach (var rsvp in dinner.RSVPs) { _NerdDinnerEntity.RSVPs.DeleteObject(rsvp); } _NerdDinnerEntity.Dinners.DeleteObject(dinner); } // Persistence Method public void Save() { _NerdDinnerEntity.SaveChanges(); } } CODE C: // ************************************** // GET: /Dinners/Create/ // ************************************** public ActionResult Create() { Dinner dinner = new Dinner() { EventDate = DateTime.Now.AddDays(7) }; return View(dinner); } // ************************************** // POST: /Dinners/Create/ // ************************************** [HttpPost] public ActionResult Create(Dinner dinner) { if (ModelState.IsValid) { dinner.HostedBy = "The Code Dude"; _dinnerRepository.Add(dinner); _dinnerRepository.Save(); return RedirectToAction("Details", new { id = dinner.DinnerID }); } else { return View(dinner); } } CODE D: // ************************************** // GET: /Dinners/Edit/{id} // ************************************** public ActionResult Edit(int id) { Dinner dinner = _dinnerRepository.GetDinner(id); return View(dinner); } // ************************************** // POST: /Dinners/Edit/{id} // ************************************** [HttpPost] public ActionResult Edit(int id, FormCollection formValues) { Dinner dinner = _dinnerRepository.GetDinner(id); if (TryUpdateModel(dinner)){ _dinnerRepository.Save(); return RedirectToAction("Details", new { id=dinner.DinnerID }); } return View(dinner); } I have sent Wrox and one of the authors a request for help but have not heard back from anyone. Readers of the book cannot even continue to finish the rest of chapter 1 because of these errors. Even if I download the latest build from Codeplex, it still has the same errors. Can someone please help me and tell me what needs to be fixed? Thanks - Ed.

    Read the article

  • techniques for an AI for a highly cramped turn-based tactics game

    - by Adam M.
    I'm trying to write an AI for a tactics game in the vein of Final Fantasy Tactics or Vandal Hearts. I can't change the game rules in any way, only upgrade the AI. I have experience programming AI for classic board games (basically minimax and its variants), but I think the branching factor is too great for the approach to be reasonable here. I'll describe the game and some current AI flaws that I'd like to fix. I'd like to hear ideas for applicable techniques. I'm a decent enough programmer, so I only need the ideas, not an implementation (though that's always appreciated). I'd rather not expend effort chasing (too many) dead ends, so although speculation and brainstorming are good and probably helpful, I'd prefer to hear from somebody with actual experience solving this kind of problem. For those who know it, the game is the land battle mini-game in Sid Meier's Pirates! (2004) and you can skim/skip the next two paragraphs. For those who don't, here's briefly how it works. The battle is turn-based and takes place on a 16x16 grid. There are three terrain types: clear (no hindrance), forest (hinders movement, ranged attacks, and sight), and rock (impassible, but does not hinder attacks or sight). The map is randomly generated with roughly equal amounts of each type of terrain. Because there are many rock and forest tiles, movement is typically very cramped. This is tactically important. The terrain is not flat; higher terrain gives minor bonuses. The terrain is known to both sides. The player is always the attacker and the AI is always the defender, so it's perfectly valid for the AI to set up a defensive position and just wait. The player wins by killing all defenders or by getting a unit to the city gates (a tile on the other side of the map). There are very few units on each side, usually 4-8. Because of this, it's crucial not to take damage without gaining some advantage from it. Units can take multiple actions per turn. All units on one side move before any units on the other side. Order of execution is important, and interleaving of actions between units is often useful. Units have melee and ranged attacks. Melee attacks vary widely in strength; ranged attacks have the same strength but vary in range. The main challenges I face are these: Lots of useful move combinations start with a "useless" move that gains no immediate advantage, or even loses advantage, in order to set up a powerful flank attack in the future. And, since the player units are stronger and have longer range, the AI pretty much always has to take some losses before they can start to gain kills. The AI must be able to look ahead to distinguish between sacrificial actions that provide a future benefit and those that don't. Because the terrain is so cramped, most of the tactics come down to achieving good positioning with multiple units that work together to defend an area. For instance, two defenders can often dominate a narrow pass by positioning themselves so an enemy unit attempting to pass must expose itself to a flank attack. But one defender in the same pass would be useless, and three units can defend a slightly larger pass. Etc. The AI should be able to figure out where the player must go to reach the city gates and how to best position its few units to cover the approaches, shifting, splitting, or combining them appropriately as the player moves. Because flank attacks are extremely deadly (and engineering flank attacks is key to the player strategy), the AI should be competent at moving its units so that they cover each other's flanks unless the sacrifice of a unit would give a substantial benefit. They should also be able to force flank attacks on players, for instance by threatening a unit from two different directions such that responding to one threat exposes the flank to the other. The AI should attack if possible, but sometimes there are no good ways to approach the player's position. In that case, the AI should be able to recognize this and set up a defensive position of its own. But the AI shouldn't be vulnerable to a trivial exploit where the player repeatedly opens and closes a hole in his defense and shoots at the AI as it approaches and retreats. That is, the AI should ideally be able to recognize that the player is capable of establishing a solid defense of an area, even if the defense is not currently in place. (I suppose if a good unit allocation algorithm existed, as needed for the second bullet point, the AI could run it on the player units to see where they could defend.) Because it's important to choose a good order of action and interleave actions between units, it's not as simple as just finding the best move for each unit in turn. All of these can be accomplished with a minimax search in theory, but the search space is too large, so specialized techniques are needed. I thought about techniques such as influence mapping, but I don't see how to use the technique to great effect. I thought about assigning goals to the units. This can help them work together in some limited way, and the problem of "how do I accomplish this goal?" is easier to solve than "how do I win this battle?", but assigning good goals is a hard problem in itself, because it requires knowing whether the goal is achievable and whether it's a good use of resources. So, does anyone have specific ideas for techniques that can help cleverize this AI? Update: I found a related question on Stackoverflow: http://stackoverflow.com/questions/3133273/ai-for-a-final-fantasy-tactics-like-game The selected answer gives a decent approach to choosing between alternative actions, but it doesn't seem to have much ability to look into the future and discern beneficial sacrifices from wasteful ones. It also focuses on a single unit at a time and it's not clear how it could be extended to support cooperation between units in defending or attacking.

    Read the article

  • Accessing SharePoint 2010 Data with REST/OData on Windows Phone 7

    - by Jan Tielens
    Consuming SharePoint 2010 data in Windows Phone 7 applications using the CTP version of the developer tools is quite a challenge. The issue is that the SharePoint 2010 data is not anonymously available; users need to authenticate to be able to access the data. When I first tried to access SharePoint 2010 data from my first Hello-World-type Windows Phone 7 application I thought “Hey, this should be easy!” because Windows Phone 7 development based on Silverlight and SharePoint 2010 has a Client Object Model for Silverlight. Unfortunately you can’t use the Client Object Model of SharePoint 2010 on the Windows Phone platform; there’s a reference to an assembly that’s not available (System.Windows.Browser). My second thought was “OK, no problem!” because SharePoint 2010 also exposes a REST/OData API to access SharePoint data. Using the REST API in SharePoint 2010 is as easy as making a web request for a URL (in which you specify the data you’d like to retrieve), e.g. http://yoursiteurl/_vti_bin/listdata.svc/Announcements. This is very easy to accomplish in a Silverlight application that’s running in the context of a page in a SharePoint site, because the credentials of the currently logged on user are automatically picked up and passed to the WCF service. But a Windows Phone application is of course running outside of the SharePoint site’s page, so the application should build credentials that have to be passed to SharePoint’s WCF service. This turns out to be a small challenge in Silverlight 3, the WebClient doesn’t support authentication; there is a Credentials property but when you set it and make the request you get a NotImplementedException exception. Probably this issued will be solved in the very near future, since Silverlight 4 does support authentication, and there’s already a WCF Data Services download that uses this new platform feature of Silverlight 4. So when Windows Phone platform switches to Silverlight 4, you can just use the WebClient to get the data. Even more, if the OData Client Library for Windows Phone 7 gets updated after that, things should get even easier! By the way: the things I’m writing in this paragraph are just assumptions that I make which make a lot of sense IMHO, I don’t have any info all of this will happen, but I really hope so. So are SharePoint developers out of the Windows Phone development game until they get this fixed? Well luckily not, when the HttpWebRequest class is being used instead, you can pass credentials! Using the HttpWebRequest class is slightly more complex than using the WebClient class, but the end result is that you have access to your precious SharePoint 2010 data. The following code snippet is getting all the announcements of an Annoucements list in a SharePoint site: HttpWebRequest webReq =     (HttpWebRequest)HttpWebRequest.Create("http://yoursite/_vti_bin/listdata.svc/Announcements");webReq.Credentials = new NetworkCredential("username", "password"); webReq.BeginGetResponse(    (result) => {        HttpWebRequest asyncReq = (HttpWebRequest)result.AsyncState;         XDocument xdoc = XDocument.Load(            ((HttpWebResponse)asyncReq.EndGetResponse(result)).GetResponseStream());         XNamespace ns = "http://www.w3.org/2005/Atom";        var items = from item in xdoc.Root.Elements(ns + "entry")                    select new { Title = item.Element(ns + "title").Value };         this.Dispatcher.BeginInvoke(() =>        {            foreach (var item in items)                MessageBox.Show(item.Title);        });    }, webReq); When you try this in a Windows Phone 7 application, make sure you add a reference to the System.Xml.Linq assembly, because the code uses Linq to XML to parse the resulting Atom feed, so the Title of every announcement is being displayed in a MessageBox. Check out my previous post if you’d like to see a more polished sample Windows Phone 7 application that displays SharePoint 2010 data.When you plan to use this technique, it’s of course a good idea to encapsulate the code doing the request, so it becomes really easy to get the data that you need. In the following code snippet you can find the GetAtomFeed method that gets the contents of any Atom feed, even if you need to authenticate to get access to the feed. delegate void GetAtomFeedCallback(Stream responseStream); public MainPage(){    InitializeComponent();     SupportedOrientations = SupportedPageOrientation.Portrait |         SupportedPageOrientation.Landscape;     string url = "http://yoursite/_vti_bin/listdata.svc/Announcements";    string username = "username";    string password = "password";    string domain = "";     GetAtomFeed(url, username, password, domain, (s) =>    {        XNamespace ns = "http://www.w3.org/2005/Atom";        XDocument xdoc = XDocument.Load(s);         var items = from item in xdoc.Root.Elements(ns + "entry")                    select new { Title = item.Element(ns + "title").Value };         this.Dispatcher.BeginInvoke(() =>        {            foreach (var item in items)            {                MessageBox.Show(item.Title);            }        });    });} private static void GetAtomFeed(string url, string username,     string password, string domain, GetAtomFeedCallback cb){    HttpWebRequest webReq = (HttpWebRequest)HttpWebRequest.Create(url);    webReq.Credentials = new NetworkCredential(username, password, domain);     webReq.BeginGetResponse(        (result) =>        {            HttpWebRequest asyncReq = (HttpWebRequest)result.AsyncState;            HttpWebResponse resp = (HttpWebResponse)asyncReq.EndGetResponse(result);            cb(resp.GetResponseStream());        }, webReq);}

    Read the article

  • Charms and the App Bar

    - by Dennis Vroegop
    Ok. I admit. I made a mistake in the last post about our planespotter app. I have dedicated a full part of the hub to Social. I also had a section called Friends but that made sense since I said that “Friends” is a special group of people that connect to each other through our app and only our app. Social however is sharing our spots with Twitter, Facebook and so on. Now, we could write that functionality in our app in a different section but there is one small problem with that: users don’t expect that. Ok, I admit. The mistake was quite deliberate to give me an excuse to write this part. But still: the mistake is one I see a lot. People are trying to do stuff in their application that they shouldn’t be doing. This always strike me as slightly odd: why do some work when others have already done it for you and you can just use it? After all: good developers are lazy (lazy people will always try to find the easiest way to do something and in development land this usually means the cleanest and best to support way…) So. What is that part that Microsoft has done for us and we don’t have to do ourselves? The answer lies on the right hand of your Win8 screen: This is a screenshot of my tablet (as you can see I am writing this right now….) When I swipe my finger from out of the screen on the right inside the screen (or move the mouse to the upper right corner) this menu will appear. Next to settings and the start menu button we’ll find the Search and the Share charms. These are two ways that your app can share the information it contains with the rest of the world, or at least: the rest of your system. So don’t write a Search feature in your app. Don’t write a Share feature in your app. It’s here already. Users, once they are used to Windows 8, will use that feature and expect it to work. If it doesn’t, they won’t like your app and you can kiss you dreams of everlasting fame goodbye. So use these two. What are they? Well, simply they are parts of a contract. In your app you say somewhere in code that you are supporting Search and Share. So when the user selects Share the system will interrogate the current app in the foreground if it supports this feature. Your app will say “But why, yes, I do!” Then the system will ask the app “Ok then, wisecrack, then share!” and you will have to provide the system with some information about the format. Other applications have subscribed to be at the receiving end of the Share contract. They have told the system that they support Sharing (receiving) and which formats they understand. If one or more of them support the formats you specify, the user will see them. The user clicks / taps on the app of their choice and data is moved from your app to the new one. So if you say you support Facebook and Twitter users can post data from your app to these networks by selecting Share. The same applies to Search. Don’t make a “search” button in your app but use the contract to tell the system that you support search and use that instead. Users will be grateful (remember that bar with men/women/creatures that are waiting for you?) The more and more people get to know Windows 8, the more they will use this. And if you are one of the people who wrote an app that helped them learn the system, well, that’s even better. So. We don’t have a Share or a Search button. We do have other buttons. Most important: we probably need a “New Spot” button. And a “Filter” might be useful. Or someway to open the camera so you can add a picture to the spot. Where will be put those? The answer is the “Appbar” . This is a application / context aware menu that slides up from the bottom of the screen when you move your finger / mouse from below the screen into it. From above downwards works just as well. Here you see an example of the appbar from the People app. (click on it for a larger version). This appears whenever you slide your finger up from below of down from above. This is where you put your commands. Remember, this is context aware so this menu will change when you are in different parts of your app or when you have selected different items. There are a few conventions when you create this appbar. First, the items on the right are “General” items, meaning they have little to do with what is on the screen right now. I think this would be a great place to add our “New Spot” icon. On the far left are items associated with the current selected item or screen. So if you have a spot selected, the button for Add Photo should be visible here and on the left hand side. Not everything is as clear as this, but this is what you should strive for. Group items together. And please note: this is the only place in Metro design where we are allowed to use lines as separators. So when you want to separate a group of icons from another group, add a line. Also note the simplicity of the buttons. No colors, no lights or shadows, no 3D. After a couple of years of fancy almost realistic looking icons people have finally decided that hey, this is a virtual world: it’s ok to look virtual as well. So make things as readable and clear as possible and don’t try to duplicate nature. It’s all about the information, remember? (If you don’t remember I’d like to point you to a older blog post of mine about the what and why of Metro). So.. think about the buttons a bit and think about Share and Search. What will you put there? Remember: this is the way the users interact with your apps and while you shouldn’t judge a book by its covers when it comes to people, this isn’t entirely so when it comes to apps. People DO judge an app by its looks and the way it feels. Take advantage of that. History has learned that a crappy app with a GREAT user interface gets better reviews than a GREAT app with a lousy UI… I know: developers will find this extremely unfair but that’s the world we live in (No, I am not saying you should deliver rubbish apps). Next time: we’ll start by building the darn thing!

    Read the article

  • SortedDictionary and SortedList

    - by Simon Cooper
    Apart from Dictionary<TKey, TValue>, there's two other dictionaries in the BCL - SortedDictionary<TKey, TValue> and SortedList<TKey, TValue>. On the face of it, these two classes do the same thing - provide an IDictionary<TKey, TValue> interface where the iterator returns the items sorted by the key. So what's the difference between them, and when should you use one rather than the other? (as in my previous post, I'll assume you have some basic algorithm & datastructure knowledge) SortedDictionary We'll first cover SortedDictionary. This is implemented as a special sort of binary tree called a red-black tree. Essentially, it's a binary tree that uses various constraints on how the nodes of the tree can be arranged to ensure the tree is always roughly balanced (for more gory algorithmical details, see the wikipedia link above). What I'm concerned about in this post is how the .NET SortedDictionary is actually implemented. In .NET 4, behind the scenes, the actual implementation of the tree is delegated to a SortedSet<KeyValuePair<TKey, TValue>>. One example tree might look like this: Each node in the above tree is stored as a separate SortedSet<T>.Node object (remember, in a SortedDictionary, T is instantiated to KeyValuePair<TKey, TValue>): class Node { public bool IsRed; public T Item; public SortedSet<T>.Node Left; public SortedSet<T>.Node Right; } The SortedSet only stores a reference to the root node; all the data in the tree is accessed by traversing the Left and Right node references until you reach the node you're looking for. Each individual node can be physically stored anywhere in memory; what's important is the relationship between the nodes. This is also why there is no constructor to SortedDictionary or SortedSet that takes an integer representing the capacity; there are no internal arrays that need to be created and resized. This may seen trivial, but it's an important distinction between SortedDictionary and SortedList that I'll cover later on. And that's pretty much it; it's a standard red-black tree. Plenty of webpages and datastructure books cover the algorithms behind the tree itself far better than I could. What's interesting is the comparions between SortedDictionary and SortedList, which I'll cover at the end. As a side point, SortedDictionary has existed in the BCL ever since .NET 2. That means that, all through .NET 2, 3, and 3.5, there has been a bona-fide sorted set class in the BCL (called TreeSet). However, it was internal, so it couldn't be used outside System.dll. Only in .NET 4 was this class exposed as SortedSet. SortedList Whereas SortedDictionary didn't use any backing arrays, SortedList does. It is implemented just as the name suggests; two arrays, one containing the keys, and one the values (I've just used random letters for the values): The items in the keys array are always guarenteed to be stored in sorted order, and the value corresponding to each key is stored in the same index as the key in the values array. In this example, the value for key item 5 is 'z', and for key item 8 is 'm'. Whenever an item is inserted or removed from the SortedList, a binary search is run on the keys array to find the correct index, then all the items in the arrays are shifted to accomodate the new or removed item. For example, if the key 3 was removed, a binary search would be run to find the array index the item was at, then everything above that index would be moved down by one: and then if the key/value pair {7, 'f'} was added, a binary search would be run on the keys to find the index to insert the new item, and everything above that index would be moved up to accomodate the new item: If another item was then added, both arrays would be resized (to a length of 10) before the new item was added to the arrays. As you can see, any insertions or removals in the middle of the list require a proportion of the array contents to be moved; an O(n) operation. However, if the insertion or removal is at the end of the array (ie the largest key), then it's only O(log n); the cost of the binary search to determine it does actually need to be added to the end (excluding the occasional O(n) cost of resizing the arrays to fit more items). As a side effect of using backing arrays, SortedList offers IList Keys and Values views that simply use the backing keys or values arrays, as well as various methods utilising the array index of stored items, which SortedDictionary does not (and cannot) offer. The Comparison So, when should you use one and not the other? Well, here's the important differences: Memory usage SortedDictionary and SortedList have got very different memory profiles. SortedDictionary... has a memory overhead of one object instance, a bool, and two references per item. On 64-bit systems, this adds up to ~40 bytes, not including the stored item and the reference to it from the Node object. stores the items in separate objects that can be spread all over the heap. This helps to keep memory fragmentation low, as the individual node objects can be allocated wherever there's a spare 60 bytes. In contrast, SortedList... has no additional overhead per item (only the reference to it in the array entries), however the backing arrays can be significantly larger than you need; every time the arrays are resized they double in size. That means that if you add 513 items to a SortedList, the backing arrays will each have a length of 1024. To conteract this, the TrimExcess method resizes the arrays back down to the actual size needed, or you can simply assign list.Capacity = list.Count. stores its items in a continuous block in memory. If the list stores thousands of items, this can cause significant problems with Large Object Heap memory fragmentation as the array resizes, which SortedDictionary doesn't have. Performance Operations on a SortedDictionary always have O(log n) performance, regardless of where in the collection you're adding or removing items. In contrast, SortedList has O(n) performance when you're altering the middle of the collection. If you're adding or removing from the end (ie the largest item), then performance is O(log n), same as SortedDictionary (in practice, it will likely be slightly faster, due to the array items all being in the same area in memory, also called locality of reference). So, when should you use one and not the other? As always with these sort of things, there are no hard-and-fast rules. But generally, if you: need to access items using their index within the collection are populating the dictionary all at once from sorted data aren't adding or removing keys once it's populated then use a SortedList. But if you: don't know how many items are going to be in the dictionary are populating the dictionary from random, unsorted data are adding & removing items randomly then use a SortedDictionary. The default (again, there's no definite rules on these sort of things!) should be to use SortedDictionary, unless there's a good reason to use SortedList, due to the bad performance of SortedList when altering the middle of the collection.

    Read the article

  • SQL SERVER – Weekly Series – Memory Lane – #050

    - by Pinal Dave
    Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Executing Remote Stored Procedure – Calling Stored Procedure on Linked Server In this example we see two different methods of how to call Stored Procedures remotely.  Connection Property of SQL Server Management Studio SSMS A very simple example of the how to build connection properties for SQL Server with the help of SSMS. Sample Example of RANKING Functions – ROW_NUMBER, RANK, DENSE_RANK, NTILE SQL Server has a total of 4 ranking functions. Ranking functions return a ranking value for each row in a partition. All the ranking functions are non-deterministic. T-SQL Script to Add Clustered Primary Key Jr. DBA asked me three times in a day, how to create Clustered Primary Key. I gave him following sample example. That was the last time he asked “How to create Clustered Primary Key to table?” 2008 2008 – TRIM() Function – User Defined Function SQL Server does not have functions which can trim leading or trailing spaces of any string at the same time. SQL does have LTRIM() and RTRIM() which can trim leading and trailing spaces respectively. SQL Server 2008 also does not have TRIM() function. User can easily use LTRIM() and RTRIM() together and simulate TRIM() functionality. http://www.youtube.com/watch?v=1-hhApy6MHM 2009 Earlier I have written two different articles on the subject Remove Bookmark Lookup. This article is as part 3 of original article. Please read the first two articles here before continuing reading this article. Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup – Part 2 Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup – Part 3 Interesting Observation – Query Hint – FORCE ORDER SQL Server never stops to amaze me. As regular readers of this blog already know that besides conducting corporate training, I work on large-scale projects on query optimizations and server tuning projects. In one of the recent projects, I have noticed that a Junior Database Developer used the query hint Force Order; when I asked for details, I found out that the basic concept was not properly understood by him. Queries Waiting for Memory Allocation to Execute In one of the recent projects, I was asked to create a report of queries that are waiting for memory allocation. The reason was that we were doubtful regarding whether the memory was sufficient for the application. The following query can be useful in similar cases. Queries that do not have to wait on a memory grant will not appear in the result set of following query. 2010 Quickest Way to Identify Blocking Query and Resolution – Dirty Solution As the title suggests, this is quite a dirty solution; it’s not as elegant as you expect. However, it works totally fine. Simple Explanation of Data Type Precedence While I was working on creating a question for SQL SERVER – SQL Quiz – The View, The Table and The Clustered Index Confusion, I had actually created yet another question along with this question. However, I felt that the one which is posted on the SQL Quiz is much better than this one because what makes that more challenging question is that it has a multiple answer. Encrypted Stored Procedure and Activity Monitor I recently had received questionable if any stored procedure is encrypted can we see its definition in Activity Monitor.Answer is - No. Let us do a quick test. Let us create following Stored Procedure and then launch the Activity Monitor and check the text. Indexed View always Use Index on Table A single table can have maximum 249 non clustered indexes and 1 clustered index. In SQL Server 2008, a single table can have maximum 999 non clustered indexes and 1 clustered index. It is widely believed that a table can have only 1 clustered index, and this belief is true. I have some questions for all of you. Let us assume that I am creating view from the table itself and then create a clustered index on it. In my view, I am selecting the complete table itself. 2011 Detecting Database Case Sensitive Property using fn_helpcollations() I received a question on how to determine the case sensitivity of the database. The quick answer to this is to identify the collation of the database and check the properties of the collation. I have previously written how one can identify database collation. Once you have figured out the collation of the database, you can put that in the WHERE condition of the following T-SQL and then check the case sensitivity from the description. Server Side Paging in SQL Server CE (Compact Edition) SQL Server Denali is coming up with new T-SQL of Paging. I have written about the same earlier.SQL SERVER – Server Side Paging in SQL Server Denali – A Better Alternative,  SQL SERVER – Server Side Paging in SQL Server Denali Performance Comparison, SQL SERVER – Server Side Paging in SQL Server Denali – Part2 What is very interesting is that SQL Server CE 4.0 have the same feature introduced. Here is the quick example of the same. To run the script in the example, you will have to do installWebmatrix 4.0 and download sample database. Once done you can run following script. Why I am Going to Attend PASS Summit Unite 2011 The four-day event will be marked by a lot of learning, sharing, and networking, which will help me increase both my knowledge and contacts. Every year, PASS Summit provides me a golden opportunity to build my network as well as to identify and meet potential customers or employees. 2012 Manage Help Settings – CTRL + ALT + F1 This is very interesting read as my daughter once accidently came across a screen in SQL Server Management Studio. It took me 2-3 minutes to figure out how she has created the same screen. Recover the Accidentally Renamed Table “I accidentally renamed table in my SSMS. I was scrolling very fast and I made mistakes. It was either because I double clicked or clicked on F2 (shortcut key for renaming). However, I have made the mistake and now I have no idea how to fix this. If you have renamed the table, I think you pretty much is out of luck. Here are few things which you can do which can give you an idea about what your table name can be if you are lucky. Identify Numbers of Non Clustered Index on Tables for Entire Database Here is the script which will give you numbers of non clustered indexes on any table in entire database. Identify Most Resource Intensive Queries – SQL in Sixty Seconds #029 – Video Here is the complete complete script which I have used in the SQL in Sixty Seconds Video. Thanks Harsh for important Tip in the comment. http://www.youtube.com/watch?v=3kDHC_Tjrns Advanced Data Quality Services with Melissa Data – Azure Data Market For the purposes of the review, I used a database I had in an Excel spreadsheet with name and address information. Upon a cursory inspection, there are miscellaneous problems with these records; some addresses are missing ZIP codes, others missing a city, and some records are slightly misspelled or have unparsed suites. With DQS, I can easily add a knowledge base to help standardize my values, such as for state abbreviations. But how do I know that my address is correct? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Microsoft Business Intelligence Seminar 2011

    - by DavidWimbush
    I was lucky enough to attend the maiden presentation of this at Microsoft Reading yesterday. It was pretty gripping stuff not only because of what was said but also because of what could only be hinted at. Here's what I took away from the day. (Disclaimer: I'm not a BI guru, just a reasonably experienced BI developer, so I may have misunderstood or misinterpreted a few things. Particularly when so much of the talk was about the vision and subtle hints of what is coming. Please comment if you think I've got anything wrong. I'm also not going to even try to cover Master Data Services as I struggled to imagine how you would actually use it.) I was a bit worried when I learned that the whole day was going to be presented by one guy but Rafal Lukawiecki is a very engaging speaker. He's going to be presenting this about 20 times around the world over the coming months. If you get a chance to hear him speak, I say go for it. No doubt some of the hints will become clearer as Denali gets closer to RTM. Firstly, things are definitely happening in the SQL Server Reporting and BI world. Traditionally IT would build a data warehouse, then cubes on top of that, and then publish them in a structured and controlled way. But, just as with many IT projects in general, by the time it's finished the business has moved on and the system no longer meets their requirements. This not sustainable and something more agile is needed but there has to be some control. Apparently we're going to be hearing the catchphrase 'Balancing agility with control' a lot. More users want more access to more data. Can they define what they want? Of course not, but they'll recognise it when they see it. It's estimated that only 28% of potential BI users have meaningful access to the data they need, so there is a real pent-up demand. The answer looks like: give them some self-service tools so they can experiment and see what works, and then IT can help to support the results. It's estimated that 32% of Excel users are comfortable with its analysis tools such as pivot tables. It's the power user's preferred tool. Why fight it? That's why PowerPivot is an Excel add-in and that's why they released a Data Mining add-in for it as well. It does appear that the strategy is going to be to use Reporting Services (in SharePoint mode), PowerPivot, and possibly something new (smiles and hints but no details) to create reports and explore data. Everything will be published and managed in SharePoint which gives users the ability to mash-up, share and socialise what they've found out. SharePoint also gives IT tools to understand what people are looking at and where to concentrate effort. If PowerPivot report X becomes widely used, it's time to check that it shows what they think it does and perhaps get it a bit more under central control. There was more SharePoint detail that went slightly over my head regarding where Excel Services and Excel Web Application fit in, the differences between them, and the suggestion that it is likely they will one day become one (but not in the immediate future). That basic pattern is set to be expanded upon by further exploiting Vertipaq (the columnar indexing engine that enables PowerPivot to store and process a lot of data fast and in a small memory footprint) to provide scalability 'from the desktop to the data centre', and some yet to be detailed advances in 'frictionless deployment' (part of which is about making the difference between local and the cloud pretty much irrelevant). Excel looks like becoming Microsoft's primary BI client. It already has: the ability to consume cubes strong visualisation tools slicers (which are part of Excel not PowerPivot) a data mining add-in PowerPivot A major hurdle for self-service BI is presenting the data in a consumable format. You can't just give users PowerPivot and a server with a copy of the OLTP database(s). Building cubes is labour intensive and doesn't always give the user what they need. This is where the BI Semantic Model (BISM) comes in. I gather it's a layer of metadata you define that can combine multiple data sources (and types of data source) into a clear 'interface' that users can work with. It comes with a new query language called DAX. SSAS cubes are unlikely to go away overnight because, with their pre-calculated results, they are still the most efficient way to work with really big data sets. A few other random titbits that came up: Reporting Services is going to get some good new stuff in Denali. Keep an eye on www.projectbotticelli.com for the slides. You can also view last year's seminar sessions which covered a lot of the same ground as far as the overall strategy is concerned. They plan to add more material as Denali's features are publicly exposed. Check out the PASS keynote address for a showing of Yahoo's SQL BI servers. Apparently they wheeled the rack out on stage still plugged in and running! Check out the Excel 2010 Data Mining Add-Ins. 32 bit only at present but 64 bit is on the way. There are lots of data sets, many of them free, at the Windows Azure Marketplace Data Market (where you can also get ESRI shape files). If you haven't already seen it, have a look at the Silverlight Pivot Viewer (http://weblogs.asp.net/scottgu/archive/2010/06/29/silverlight-pivotviewer-now-available.aspx). The Bing Maps Data Connector is worth a look if you're into spatial stuff (http://www.bing.com/community/site_blogs/b/maps/archive/2010/07/13/data-connector-sql-server-2008-spatial-amp-bing-maps.aspx).  

    Read the article

  • Improving Manageability of Virtual Environments

    - by Jeff Victor
    Boot Environments for Solaris 10 Branded Zones Until recently, Solaris 10 Branded Zones on Solaris 11 suffered one notable regression: Live Upgrade did not work. The individual packaging and patching tools work correctly, but the ability to upgrade Solaris while the production workload continued running did not exist. A recent Solaris 11 SRU (Solaris 11.1 SRU 6.4) restored most of that functionality, although with a slightly different concept, different commands, and without all of the feature details. This new method gives you the ability to create and manage multiple boot environments (BEs) for a Solaris 10 Branded Zone, and modify the active or any inactive BE, and to do so while the production workload continues to run. Background In case you are new to Solaris: Solaris includes a set of features that enables you to create a bootable Solaris image, called a Boot Environment (BE). This newly created image can be modified while the original BE is still running your workload(s). There are many benefits, including improved uptime and the ability to reboot into (or downgrade to) an older BE if a newer one has a problem. In Solaris 10 this set of features was named Live Upgrade. Solaris 11 applies the same basic concepts to the new packaging system (IPS) but there isn't a specific name for the feature set. The features are simply part of IPS. Solaris 11 Boot Environments are not discussed in this blog entry. Although a Solaris 10 system can have multiple BEs, until recently a Solaris 10 Branded Zone (BZ) in a Solaris 11 system did not have this ability. This limitation was addressed recently, and that enhancement is the subject of this blog entry. This new implementation uses two concepts. The first is the use of a ZFS clone for each BE. This makes it very easy to create a BE, or many BEs. This is a distinct advantage over the Live Upgrade feature set in Solaris 10, which had a practical limitation of two BEs on a system, when using UFS. The second new concept is a very simple mechanism to indicate the BE that should be booted: a ZFS property. The new ZFS property is named com.oracle.zones.solaris10:activebe (isn't that creative? ). It's important to note that the property is inherited from the original BE's file system to any BEs you create. In other words, all BEs in one zone have the same value for that property. When the (Solaris 11) global zone boots the Solaris 10 BZ, it boots the BE that has the name that is stored in the activebe property. Here is a quick summary of the actions you can use to manage these BEs: To create a BE: Create a ZFS clone of the zone's root dataset To activate a BE: Set the ZFS property of the root dataset to indicate the BE To add a package or patch to an inactive BE: Mount the inactive BE Add packages or patches to it Unmount the inactive BE To list the available BEs: Use the "zfs list" command. To destroy a BE: Use the "zfs destroy" command. Preparation Before you can use the new features, you will need a Solaris 10 BZ on a Solaris 11 system. You can use these three steps - on a real Solaris 11.1 server or in a VirtualBox guest running Solaris 11.1 - to create a Solaris 10 BZ. The Solaris 11.1 environment must be at SRU 6.4 or newer. Create a flash archive on the Solaris 10 system s10# flarcreate -n s10-system /net/zones/archives/s10-system.flar Configure the Solaris 10 BZ on the Solaris 11 system s11# zonecfg -z s10z Use 'create' to begin configuring a new zone. zonecfg:s10z create -t SYSsolaris10 zonecfg:s10z set zonepath=/zones/s10z zonecfg:s10z exit s11# zoneadm list -cv ID NAME STATUS PATH BRAND IP 0 global running / solaris shared - s10z configured /zones/s10z solaris10 excl Install the zone from the flash archive s11# zoneadm -z s10z install -a /net/zones/archives/s10-system.flar -p You can find more information about the migration of Solaris 10 environments to Solaris 10 Branded Zones in the documentation. The rest of this blog entry demonstrates the commands you can use to accomplish the aforementioned actions related to BEs. New features in action Note that the demonstration of the commands occurs in the Solaris 10 BZ, as indicated by the shell prompt "s10z# ". Many of these commands can be performed in the global zone instead, if you prefer. If you perform them in the global zone, you must change the ZFS file system names. Create The only complicated action is the creation of a BE. In the Solaris 10 BZ, create a new "boot environment" - a ZFS clone. You can assign any name to the final portion of the clone's name, as long as it meets the requirements for a ZFS file system name. s10z# zfs snapshot rpool/ROOT/zbe-0@snap s10z# zfs clone -o mountpoint=/ -o canmount=noauto rpool/ROOT/zbe-0@snap rpool/ROOT/newBE cannot mount 'rpool/ROOT/newBE' on '/': directory is not empty filesystem successfully created, but not mounted You can safely ignore that message: we already know that / is not empty! We have merely told ZFS that the default mountpoint for the clone is the root directory. List the available BEs and active BE Because each BE is represented by a clone of the rpool/ROOT dataset, listing the BEs is as simple as listing the clones. s10z# zfs list -r rpool/ROOT NAME USED AVAIL REFER MOUNTPOINT rpool/ROOT 3.55G 42.9G 31K legacy rpool/ROOT/zbe-0 1K 42.9G 3.55G / rpool/ROOT/newBE 3.55G 42.9G 3.55G / The output shows that two BEs exist. Their names are "zbe-0" and "newBE". You can tell Solaris that one particular BE should be used when the zone next boots by using a ZFS property. Its name is com.oracle.zones.solaris10:activebe. The value of that property is the name of the clone that contains the BE that should be booted. s10z# zfs get com.oracle.zones.solaris10:activebe rpool/ROOT NAME PROPERTY VALUE SOURCE rpool/ROOT com.oracle.zones.solaris10:activebe zbe-0 local Change the active BE When you want to change the BE that will be booted next time, you can just change the activebe property on the rpool/ROOT dataset. s10z# zfs get com.oracle.zones.solaris10:activebe rpool/ROOT NAME PROPERTY VALUE SOURCE rpool/ROOT com.oracle.zones.solaris10:activebe zbe-0 local s10z# zfs set com.oracle.zones.solaris10:activebe=newBE rpool/ROOT s10z# zfs get com.oracle.zones.solaris10:activebe rpool/ROOT NAME PROPERTY VALUE SOURCE rpool/ROOT com.oracle.zones.solaris10:activebe newBE local s10z# shutdown -y -g0 -i6 After the zone has rebooted: s10z# zfs get com.oracle.zones.solaris10:activebe rpool/ROOT rpool/ROOT com.oracle.zones.solaris10:activebe newBE local s10z# zfs mount rpool/ROOT/newBE / rpool/export /export rpool/export/home /export/home rpool /rpool Mount the original BE to see that it's still there. s10z# zfs mount -o mountpoint=/mnt rpool/ROOT/zbe-0 s10z# ls /mnt Desktop export platform Documents export.backup.20130607T214951Z proc S10Flar home rpool TT_DB kernel sbin bin lib system boot lost+found tmp cdrom mnt usr dev net var etc opt Patch an inactive BE At this point, you can modify the original BE. If you would prefer to modify the new BE, you can restore the original value to the activebe property and reboot, and then mount the new BE to /mnt (or another empty directory) and modify it. Let's mount the original BE so we can modify it. (The first command is only needed if you haven't already mounted that BE.) s10z# zfs mount -o mountpoint=/mnt rpool/ROOT/zbe-0 s10z# patchadd -R /mnt -M /var/sadm/spool 104945-02 Note that the typical usage will be: Create a BE Mount the new (inactive) BE Use the package and patch tools to update the new BE Unmount the new BE Reboot Delete an inactive BE ZFS clones are children of their parent file systems. In order to destroy the parent, you must first "promote" the child. This reverses the parent-child relationship. (For more information on this, see the documentation.) The original rpool/ROOT file system is the parent of the clones that you create as BEs. In order to destroy an earlier BE that is that parent of other BEs, you must first promote one of the child BEs to be the ZFS parent. Only then can you destroy the original BE. Fortunately, this is easier to do than to explain: s10z# zfs promote rpool/ROOT/newBE s10z# zfs destroy rpool/ROOT/zbe-0 s10z# zfs list -r rpool/ROOT NAME USED AVAIL REFER MOUNTPOINT rpool/ROOT 3.56G 269G 31K legacy rpool/ROOT/newBE 3.56G 269G 3.55G / Documentation This feature is so new, it is not yet described in the Solaris 11 documentation. However, MOS note 1558773.1 offers some details. Conclusion With this new feature, you can add and patch packages to boot environments of a Solaris 10 Branded Zone. This ability improves the manageability of these zones, and makes their use more practical. It also means that you can use the existing P2V tools with earlier Solaris 10 updates, and modify the environments after they become Solaris 10 Branded Zones.

    Read the article

  • Anatomy of a .NET Assembly - CLR metadata 1

    - by Simon Cooper
    Before we look at the bytes comprising the CLR-specific data inside an assembly, we first need to understand the logical format of the metadata (For this post I only be looking at simple pure-IL assemblies; mixed-mode assemblies & other things complicates things quite a bit). Metadata streams Most of the CLR-specific data inside an assembly is inside one of 5 streams, which are analogous to the sections in a PE file. The name of each section in a PE file starts with a ., and the name of each stream in the CLR metadata starts with a #. All but one of the streams are heaps, which store unstructured binary data. The predefined streams are: #~ Also called the metadata stream, this stream stores all the information on the types, methods, fields, properties and events in the assembly. Unlike the other streams, the metadata stream has predefined contents & structure. #Strings This heap is where all the namespace, type & member names are stored. It is referenced extensively from the #~ stream, as we'll be looking at later. #US Also known as the user string heap, this stream stores all the strings used in code directly. All the strings you embed in your source code end up in here. This stream is only referenced from method bodies. #GUID This heap exclusively stores GUIDs used throughout the assembly. #Blob This heap is for storing pure binary data - method signatures, generic instantiations, that sort of thing. Items inside the heaps (#Strings, #US, #GUID and #Blob) are indexed using a simple binary offset from the start of the heap. At that offset is a coded integer giving the length of that item, then the item's bytes immediately follow. The #GUID stream is slightly different, in that GUIDs are all 16 bytes long, so a length isn't required. Metadata tables The #~ stream contains all the assembly metadata. The metadata is organised into 45 tables, which are binary arrays of predefined structures containing information on various aspects of the metadata. Each entry in a table is called a row, and the rows are simply concatentated together in the file on disk. For example, each row in the TypeRef table contains: A reference to where the type is defined (most of the time, a row in the AssemblyRef table). An offset into the #Strings heap with the name of the type An offset into the #Strings heap with the namespace of the type. in that order. The important tables are (with their table number in hex): 0x2: TypeDef 0x4: FieldDef 0x6: MethodDef 0x14: EventDef 0x17: PropertyDef Contains basic information on all the types, fields, methods, events and properties defined in the assembly. 0x1: TypeRef The details of all the referenced types defined in other assemblies. 0xa: MemberRef The details of all the referenced members of types defined in other assemblies. 0x9: InterfaceImpl Links the types defined in the assembly with the interfaces that type implements. 0xc: CustomAttribute Contains information on all the attributes applied to elements in this assembly, from method parameters to the assembly itself. 0x18: MethodSemantics Links properties and events with the methods that comprise the get/set or add/remove methods of the property or method. 0x1b: TypeSpec 0x2b: MethodSpec These tables provide instantiations of generic types and methods for each usage within the assembly. There are several ways to reference a single row within a table. The simplest is to simply specify the 1-based row index (RID). The indexes are 1-based so a value of 0 can represent 'null'. In this case, which table the row index refers to is inferred from the context. If the table can't be determined from the context, then a particular row is specified using a token. This is a 4-byte value with the most significant byte specifying the table, and the other 3 specifying the 1-based RID within that table. This is generally how a metadata table row is referenced from the instruction stream in method bodies. The third way is to use a coded token, which we will look at in the next post. So, back to the bytes Now we've got a rough idea of how the metadata is logically arranged, we can now look at the bytes comprising the start of the CLR data within an assembly: The first 8 bytes of the .text section are used by the CLR loader stub. After that, the CLR-specific data starts with the CLI header. I've highlighted the important bytes in the diagram. In order, they are: The size of the header. As the header is a fixed size, this is always 0x48. The CLR major version. This is always 2, even for .NET 4 assemblies. The CLR minor version. This is always 5, even for .NET 4 assemblies, and seems to be ignored by the runtime. The RVA and size of the metadata header. In the diagram, the RVA 0x20e4 corresponds to the file offset 0x2e4 Various flags specifying if this assembly is pure-IL, whether it is strong name signed, and whether it should be run as 32-bit (this is how the CLR differentiates between x86 and AnyCPU assemblies). A token pointing to the entrypoint of the assembly. In this case, 06 (the last byte) refers to the MethodDef table, and 01 00 00 refers to to the first row in that table. (after a gap) RVA of the strong name signature hash, which comes straight after the CLI header. The RVA 0x2050 corresponds to file offset 0x250. The rest of the CLI header is mainly used in mixed-mode assemblies, and so is zeroed in this pure-IL assembly. After the CLI header comes the strong name hash, which is a SHA-1 hash of the assembly using the strong name key. After that comes the bodies of all the methods in the assembly concatentated together. Each method body starts off with a header, which I'll be looking at later. As you can see, this is a very small assembly with only 2 methods (an instance constructor and a Main method). After that, near the end of the .text section, comes the metadata, containing a metadata header and the 5 streams discussed above. We'll be looking at this in the next post. Conclusion The CLI header data doesn't have much to it, but we've covered some concepts that will be important in later posts - the logical structure of the CLR metadata and the overall layout of CLR data within the .text section. Next, I'll have a look at the contents of the #~ stream, and how the table data is arranged on disk.

    Read the article

  • Developing a Cost Model for Cloud Applications

    - by BuckWoody
    Note - please pay attention to the date of this post. As much as I attempt to make the information below accurate, the nature of distributed computing means that components, units and pricing will change over time. The definitive costs for Microsoft Windows Azure and SQL Azure are located here, and are more accurate than anything you will see in this post: http://www.microsoft.com/windowsazure/offers/  When writing software that is run on a Platform-as-a-Service (PaaS) offering like Windows Azure / SQL Azure, one of the questions you must answer is how much the system will cost. I will not discuss the comparisons between on-premise costs (which are nigh impossible to calculate accurately) versus cloud costs, but instead focus on creating a general model for estimating costs for a given application. You should be aware that there are (at this writing) two billing mechanisms for Windows and SQL Azure: “Pay-as-you-go” or consumption, and “Subscription” or commitment. Conceptually, you can consider the former a pay-as-you-go cell phone plan, where you pay by the unit used (at a slightly higher rate) and the latter as a standard cell phone plan where you commit to a contract and thus pay lower rates. In this post I’ll stick with the pay-as-you-go mechanism for simplicity, which should be the maximum cost you would pay. From there you may be able to get a lower cost if you use the other mechanism. In any case, the model you create should hold. Developing a good cost model is essential. As a developer or architect, you’ll most certainly be asked how much something will cost, and you need to have a reliable way to estimate that. Businesses and Organizations have been used to paying for servers, software licenses, and other infrastructure as an up-front cost, and power, people to the systems and so on as an ongoing (and sometimes not factored) cost. When presented with a new paradigm like distributed computing, they may not understand the true cost/value proposition, and that’s where the architect and developer can guide the conversation to make a choice based on features of the application versus the true costs. The two big buckets of use-types for these applications are customer-based and steady-state. In the customer-based use type, each successful use of the program results in a sale or income for your organization. Perhaps you’ve written an application that provides the spot-price of foo, and your customer pays for the use of that application. In that case, once you’ve estimated your cost for a successful traversal of the application, you can build that into the price you charge the user. It’s a standard restaurant model, where the price of the meal is determined by the cost of making it, plus any profit you can make. In the second use-type, the application will be used by a more-or-less constant number of processes or users and no direct revenue is attached to the system. A typical example is a customer-tracking system used by the employees within your company. In this case, the cost model is often created “in reverse” - meaning that you pilot the application, monitor the use (and costs) and that cost is held steady. This is where the comparison with an on-premise system becomes necessary, even though it is more difficult to estimate those on-premise true costs. For instance, do you know exactly how much cost the air conditioning is because you have a team of system administrators? This may sound trivial, but that, along with the insurance for the building, the wiring, and every other part of the system is in fact a cost to the business. There are three primary methods that I’ve been successful with in estimating the cost. None are perfect, all are demand-driven. The general process is to lay out a matrix of: components units cost per unit and then multiply that times the usage of the system, based on which components you use in the program. That sounds a bit simplistic, but using those metrics in a calculation becomes more detailed. In all of the methods that follow, you need to know your application. The components for a PaaS include computing instances, storage, transactions, bandwidth and in the case of SQL Azure, database size. In most cases, architects start with the first model and progress through the other methods to gain accuracy. Simple Estimation The simplest way to calculate costs is to architect the application (even UML or on-paper, no coding involved) and then estimate which of the components you’ll use, and how much of each will be used. Microsoft provides two tools to do this - one is a simple slider-application located here: http://www.microsoft.com/windowsazure/pricing-calculator/  The other is a tool you download to create an “Return on Investment” (ROI) spreadsheet, which has the advantage of leading you through various questions to estimate what you plan to use, located here: https://roianalyst.alinean.com/msft/AutoLogin.do?d=176318219048082115  You can also just create a spreadsheet yourself with a structure like this: Program Element Azure Component Unit of Measure Cost Per Unit Estimated Use of Component Total Cost Per Component Cumulative Cost               Of course, the consideration with this model is that it is difficult to predict a system that is not running or hasn’t even been developed. Which brings us to the next model type. Measure and Project A more accurate model is to actually write the code for the application, using the Software Development Kit (SDK) which can run entirely disconnected from Azure. The code should be instrumented to estimate the use of the application components, logging to a local file on the development system. A series of unit and integration tests should be run, which will create load on the test system. You can use standard development concepts to track this usage, and even use Windows Performance Monitor counters. The best place to start with this method is to use the Windows Azure Diagnostics subsystem in your code, which you can read more about here: http://blogs.msdn.com/b/sumitm/archive/2009/11/18/introducing-windows-azure-diagnostics.aspx This set of API’s greatly simplifies tracking the application, and in fact you can use this information for more than just a cost model. After you have the tracking logs, you can plug the numbers into ay of the tools above, which should give a representative cost or in some cases a unit cost. The consideration with this model is that the SDK fabric is not a one-to-one comparison with performance on the actual Windows Azure fabric. Those differences are usually smaller, but they do need to be considered. Also, you may not be able to accurately predict the load on the system, which might lead to an architectural change, which changes the model. This leads us to the next, most accurate method for a cost model. Sample and Estimate Using standard statistical and other predictive math, once the application is deployed you will get a bill each month from Microsoft for your Azure usage. The bill is quite detailed, and you can export the data from it to do analysis, and using methods like regression and so on project out into the future what the costs will be. I normally advise that the architect also extrapolate a unit cost from those metrics as well. This is the information that should be reported back to the executives that pay the bills: the past cost, future projected costs, and unit cost “per click” or “per transaction”, as your case warrants. The challenge here is in the model itself - statistical methods are not foolproof, and the larger the sample (in this case I recommend the entire population, not a smaller sample) is key. References and Tools Articles: http://blogs.msdn.com/b/patrick_butler_monterde/archive/2010/02/10/windows-azure-billing-overview.aspx http://technet.microsoft.com/en-us/magazine/gg213848.aspx http://blog.codingoutloud.com/2011/06/05/azure-faq-how-much-will-it-cost-me-to-run-my-application-on-windows-azure/ http://blogs.msdn.com/b/johnalioto/archive/2010/08/25/10054193.aspx http://geekswithblogs.net/iupdateable/archive/2010/02/08/qampa-how-can-i-calculate-the-tco-and-roi-when.aspx   Other Tools: http://cloud-assessment.com/ http://communities.quest.com/community/cloud_tools

    Read the article

  • Source-control 'wet-work'?

    - by Phil Factor
    When a design or creative work is flawed beyond remedy, it is often best to destroy it and start again. The other day, I lost the code to a long and intricate SQL batch I was working on. I’d thought it was impossible, but it happened. With all the technology around that is designed to prevent this occurring, this sort of accident has become a rare event.  If it weren’t for a deranged laptop, and my distraction, the code wouldn’t have been lost this time.  As always, I sighed, had a soothing cup of tea, and typed it all in again.  The new code I hastily tapped in  was much better: I’d held in my head the essence of how the code should work rather than the details: I now knew for certain  the start point, the end, and how it should be achieved. Instantly the detritus of half-baked thoughts fell away and I was able to write logical code that performed better.  Because I could work so quickly, I was able to hold the details of all the columns and variables in my head, and the dynamics of the flow of data. It was, in fact, easier and quicker to start from scratch rather than tidy up and refactor the existing code with its inevitable fumbling and half-baked ideas. What a shame that technology is now so good that developers rarely experience the cleansing shock of losing one’s code and having to rewrite it from scratch.  If you’ve never accidentally lost  your code, then it is worth doing it deliberately once for the experience. Creative people have, until Technology mistakenly prevented it, torn up their drafts or sketches, threw them in the bin, and started again from scratch.  Leonardo’s obsessive reworking of the Mona Lisa was renowned because it was so unusual:  Most artists have been utterly ruthless in destroying work that didn’t quite make it. Authors are particularly keen on writing afresh, and the results are generally positive. Lawrence of Arabia actually lost the entire 250,000 word manuscript of ‘The Seven Pillars of Wisdom’ by accidentally leaving it on a train at Reading station, before rewriting a much better version.  Now, any writer or artist is seduced by technology into altering or refining their work rather than casting it dramatically in the bin or setting a light to it on a bonfire, and rewriting it from the blank page.  It is easy to pick away at a flawed work, but the real creative process is far more brutal. Once, many years ago whilst running a software house that supplied commercial software to local businesses, I’d been supervising an accounting system for a farming cooperative. No packaged system met their needs, and it was all hand-cut code.  For us, it represented a breakthrough as it was for a government organisation, and success would guarantee more contracts. As you’ve probably guessed, the code got mangled in a disk crash just a week before the deadline for delivery, and the many backups all proved to be entirely corrupted by a faulty tape drive.  There were some fragments left on individual machines, but they were all of different versions.  The developers were in despair.  Strangely, I managed to re-write the bulk of a three-month project in a manic and caffeine-soaked weekend.  Sure, that elegant universally-applicable input-form routine was‘nt quite so elegant, but it didn’t really need to be as we knew what forms it needed to support.  Yes, the code lacked architectural elegance and reusability. By dawn on Monday, the application passed its integration tests. The developers rose to the occasion after I’d collapsed, and tidied up what I’d done, though they were reproachful that some of the style and elegance had gone out of the application. By the delivery date, we were able to install it. It was a smaller, faster application than the beta they’d seen and the user-interface had a new, rather Spartan, appearance that we swore was done to conform to the latest in user-interface guidelines. (we switched to Helvetica font to look more ‘Bauhaus’ ). The client was so delighted that he forgave the new bugs that had crept in. I still have the disk that crashed, up in the attic. In IT, we have had mixed experiences from complete re-writes. Lotus 123 never really recovered from a complete rewrite from assembler into C, Borland made the mistake with Arago and Quattro Pro  and Netscape’s complete rewrite of their Navigator 4 browser was a white-knuckle ride. In all cases, the decision to rewrite was a result of extreme circumstances where no other course of action seemed possible.   The rewrite didn’t come out of the blue. I prefer to remember the rewrite of Minix by young Linus Torvalds, or the rewrite of Bitkeeper by a slightly older Linus.  The rewrite of CP/M didn’t do too badly either, did it? Come to think of it, the guy who decided to rewrite the windowing system of the Xerox Star never regretted the decision. I’ll agree that one should often resist calls for a rewrite. One of the worst habits of the more inexperienced programmer is to denigrate whatever code he or she inherits, and then call loudly for a complete rewrite. They are buoyed up by the mistaken belief that they can do better. This, however, is a different psychological phenomenon, more related to the idea of some motorcyclists that they are operating on infinite lives, or the occasional squaddies that if they charge the machine-guns determinedly enough all will be well. Grim experience brings out the humility in any experienced programmer.  I’m referring to quite different circumstances here. Where a team knows the requirements perfectly, are of one mind on methodology and coding standards, and they already have a solution, then what is wrong with considering  a complete rewrite? Rewrites are so painful in the early stages, until that point where one realises the payoff, that even I quail at the thought. One needs a natural disaster to push one over the edge. The trouble is that source-control systems, and disaster recovery systems, are just too good nowadays.   If I were to lose this draft of this very blog post, I know I’d rewrite it much better. However, if you read this, you’ll know I didn’t have the nerve to delete it and start again.  There was a time that one prayed that unreliable hardware would deliver you from an unmaintainable mess of a codebase, but now technology has made us almost entirely immune to such a merciful act of God. An old friend of mine with long experience in the software industry has long had the idea of the ‘source-control wet-work’,  where one hires a malicious hacker in some wild eastern country to hack into one’s own  source control system to destroy all trace of the source to an application. Alas, backup systems are just too good to make this any more than a pipedream. Somehow, it would be difficult to promote the idea. As an alternative, could one construct a source control system that, on doing all the code-quality metrics, would systematically destroy all trace of source code that failed the quality test? Alas, I can’t see many managers buying into the idea. In reading the full story of the near-loss of Toy Story 2, it set me thinking. It turned out that the lucky restoration of the code wasn’t the happy ending one first imagined it to be, because they eventually came to the conclusion that the plot was fundamentally flawed and it all had to be rewritten anyway.  Was this an early  case of the ‘source-control wet-job’?’ It is very hard nowadays to do a rapid U-turn in a development project because we are far too prone to cling to our existing source-code.

    Read the article

  • Finally, upgrade from Nokia X3 to Samsung Galaxy S III

    This time, something slightly different but nonetheless not less interesting, hopefully. Living on a remote island like Mauritius, ill-praised 'Cyber Island' in the Indian Ocean, has its advantages in life style and relaxed environment to life in but in terms of technological aspects it can be quite a nightmare. Well, I guess this might be different story to report about... one day. Cyber Island Mauritius Despite it's shiny advertisement as Cyber Island and business in ICT hub to Africa, Mauritius is not on the latest track of available models in computer hardware or, in the context of this article, cellulars or smart-phone, or communication technology in general. Okay, I have to admit that this statement is only partly true. Money can buy, even here in Mauritius. Luckily, there are ways and ways to deal with this outcry of modern, read: technological, civilisation issues. Online shopping you might think? Yes, for sure, until you discover in your checkout procedure that a small island in the Indian Ocean isn't a preferred destination for delivery and the precious time you spent on putting your items into your cart and feeding your personal level of anticipation gets ruined on the last stint. Ordering from abroad saves you money Anyway, I got in touch with my personal courier and luckily there were some extra-kilos left in the luggage. First obstacle sorted, we have a Transporter! Okay, on the next occasion off to Amazon online and using their Prime service for fast delivery. Actually, the order was placed on Saturday evening and everything got delivered on Tuesday morning - nice job in less than 72 hours. Okay, among the items of that shopping rush I ordered a shiny Samsung Galaxy S III 16GB in oceanic blue - did I mention, that you hardly get a blue model in Mauritius? - for my BWE. Interesting side-notes: First, Amazon Germany dropped the prices for roughly 30% on the S3, and we got the 16GB model for less than 500 Euro (or approx. Rs. 19.500,-) compared to the usual Rs. 27.000,- on the local market. It even varies whether the local price is inclusive or exclusive VAT (15%). Second, since a while she was bothering me to get an iPhone and an iPad for her, fair enough I thought, decent hardware, posh design and reliable services. Until we watched the 'magical' introduction of Samsung's new models at the IFA exhibition, she read the bashing comments on Google+ on the iPhone 5 and I gave her a brief summary on the law suit between Apple and Samsung in the USA. So, yes, Samsung USA is right, the next big thing is already here - literally. My BWE loves the look and touch of the Galaxy S3. And for me it was more cost-effective in terms of purchases done at the App Store, ups, Play Store. Transfer of contacts, text messages and media files Okay, now that the hardware is in place, how to transfer all those contacts, text messages, media files, etc. between those two devices? In the past, I used to use the Nokia Communication Suite between various models but now for Android? Well, as usual Google and Bing are reliable friends and among the first hits I came across an article about How to Transfer Contacts from Nokia to Android. Couldn't be easier, right? Well, sort of... my main Windows systems are already running on Windows 8, and this actually caused problems with the mobile/smart-phone device drivers. The article provides the download for an older version 1.10 which upgrades to 2.11 (as time of writing this entry) but both couldn't get the Galaxy S3 and the Nokia connected. Shame on me... the product page clearly doesn't mention Windows 8 (for now) and Windows 8 isn't available for the general audience at all... After I took a spare machine running on Windows Vista everything went smooth. Software installed, upgrade done, device drivers for Android automatically downloaded and installed, and the same painless routine for the Nokia part. I think, I rebooted the system twice during the whole setup procedure but hey, it was more or less a distraction while coding some stuff in ASP.NET MVC and Telerik Kendo UI. The transfer of contacts and text messages was done via Wondershare MobileGo for Android, and all media files by moving the additional microSD card from one device to the other. But even without an external SD card, it would have been very easy to copy the files via Windows Explorer directly. Little catch and excellent service Fine, we are almost done and the only step left is to shift the SIM card... Ouch, gotcha! The X3 uses a standard size SIM card while the S III only accepts microSIM form factor. What an irony, bigger smartphone needs smaller SIM card. Luckily, the next showroom of Emtel is just 5 mins away up the road, and the service staff over there know their job. Finally, after roughly 10 mins of paper work, activation and small chit-chat, the S3 came to life on the mobile network. Owning a smart-phone now and knowing that my BWE would like to interact more on social networks away from home, especially to upload pictures and provide local 'check-ins', I activated a data package for her in advance, too. Even that it is Saturday, everything was already done and ready to be used. Nice bonus: The Emtel clerk directly offered me to set up the configuration for the Emtel data services, yes sure, go ahead, this saves me to search for that in the settings. Okay, spoiler-alert here, setting a static APN to access the Emtel network and the internet wouldn't be a challenge. But hey, she already had the phone in her hands and I could keep my eyes on the children. Well done, Emtel! Resume Thanks to the useful software package by Wondershare is was a hands-free experience to transfer all the data from a Nokia mobile on Symbian S60 to a Samsung Galaxy S III on Android Ice Cream Sandwich (ICS). In the future, this wont be a serious issue at all anymore thanks to synchronisation services and cloud storage. And for now, I'm only waiting for the official upgrades for Jelly Bean.

    Read the article

  • DRY and SRP

    - by Timothy Klenke
    Originally posted on: http://geekswithblogs.net/TimothyK/archive/2014/06/11/dry-and-srp.aspxKent Beck’s XP Simplicity Rules (aka Four Rules of Simple Design) are a prioritized list of rules that when applied to your code generally yield a great design.  As you’ll see from the above link the list has slightly evolved over time.  I find today they are usually listed as: All Tests Pass Don’t Repeat Yourself (DRY) Express Intent Minimalistic These are prioritized.  If your code doesn’t work (rule 1) then everything else is forfeit.  Go back to rule one and get the code working before worrying about anything else. Over the years the community have debated whether the priority of rules 2 and 3 should be reversed.  Some say a little duplication in the code is OK as long as it helps express intent.  I’ve debated it myself.  This recent post got me thinking about this again, hence this post.   I don’t think it is fair to compare “Expressing Intent” against “DRY”.  This is a comparison of apples to oranges.  “Expressing Intent” is a principal of code quality.  “Repeating Yourself” is a code smell.  A code smell is merely an indicator that there might be something wrong with the code.  It takes further investigation to determine if a violation of an underlying principal of code quality has actually occurred. For example “using nouns for method names”, “using verbs for property names”, or “using Booleans for parameters” are all code smells that indicate that code probably isn’t doing a good job at expressing intent.  They are usually very good indicators.  But what principle is the code smell of Duplication pointing to and how good of an indicator is it? Duplication in the code base is bad for a couple reasons.  If you need to make a change and that needs to be made in a number of locations it is difficult to know if you have caught all of them.  This can lead to bugs if/when one of those locations is overlooked.  By refactoring the code to remove all duplication there will be left with only one place to change, thereby eliminating this problem. With most projects the code becomes the single source of truth for a project.  If a production code base is inconsistent with a five year old requirements or design document the production code that people are currently living with is usually declared as the current reality (or truth).  Requirement or design documents at this age in a project life cycle are usually of little value. Although comparing production code to external documentation is usually straight forward, duplication within the code base muddles this declaration of truth.  When code is duplicated small discrepancies will creep in between the two copies over time.  The question then becomes which copy is correct?  As different factions debate how the software should work, trust in the software and the team behind it erodes. The code smell of Duplication points to a violation of the “Single Source of Truth” principle.  Let me define that as: A stakeholder’s requirement for a software change should never cause more than one class to change. Violation of the Single Source of Truth principle will always result in duplication in the code.  However, the inverse is not always true.  Duplication in the code does not necessarily indicate that there is a violation of the Single Source of Truth principle. To illustrate this, let’s look at a retail system where the system will (1) send a transaction to a bank and (2) print a receipt for the customer.  Although these are two separate features of the system, they are closely related.  The reason for printing the receipt is usually to provide an audit trail back to the bank transaction.  Both features use the same data:  amount charged, account number, transaction date, customer name, retail store name, and etcetera.  Because both features use much of the same data, there is likely to be a lot of duplication between them.  This duplication can be removed by making both features use the same data access layer. Then start coming the divergent requirements.  The receipt stakeholder wants a change so that the account number has the last few digits masked out to protect the customer’s privacy.  That can be solve with a small IF statement whilst still eliminating all duplication in the system.  Then the bank wants to take a picture of the customer as well as capture their signature and/or PIN number for enhanced security.  Then the receipt owner wants to pull data from a completely different system to report the customer’s loyalty program point total. After a while you realize that the two stakeholders have somewhat similar, but ultimately different responsibilities.  They have their own reasons for pulling the data access layer in different directions.  Then it dawns on you, the Single Responsibility Principle: There should never be more than one reason for a class to change. In this example we have two stakeholders giving two separate reasons for the data access class to change.  It is clear violation of the Single Responsibility Principle.  That’s a problem because it can often lead the project owner pitting the two stakeholders against each other in a vein attempt to get them to work out a mutual single source of truth.  But that doesn’t exist.  There are two completely valid truths that the developers need to support.  How is this to be supported and honour the Single Responsibility Principle?  The solution is to duplicate the data access layer and let each stakeholder control their own copy. The Single Source of Truth and Single Responsibility Principles are very closely related.  SST tells you when to remove duplication; SRP tells you when to introduce it.  They may seem to be fighting each other, but really they are not.  The key is to clearly identify the different responsibilities (or sources of truth) over a system.  Sometimes there is a single person with that responsibility, other times there are many.  This can be especially difficult if the same person has dual responsibilities.  They might not even realize they are wearing multiple hats. In my opinion Single Source of Truth should be listed as the second rule of simple design with Express Intent at number three.  Investigation of the DRY code smell should yield to the proper application SST, without violating SRP.  When necessary leave duplication in the system and let the class names express the different people that are responsible for controlling them.  Knowing all the people with responsibilities over a system is the higher priority because you’ll need to know this before you can express it.  Although it may be a code smell when there is duplication in the code, it does not necessarily mean that the coder has chosen to be expressive over DRY or that the code is bad.

    Read the article

  • External File Upload Optimizations for Windows Azure

    - by rgillen
    [Cross posted from here: http://rob.gillenfamily.net/post/External-File-Upload-Optimizations-for-Windows-Azure.aspx] I’m wrapping up a bit of the work we’ve been doing on data movement optimizations for cloud computing and the latest set of data yielded some interesting points I thought I’d share. The work done here is not really rocket science but may, in some ways, be slightly counter-intuitive and therefore seemed worthy of posting. Summary: for those who don’t like to read detailed posts or don’t have time, the synopsis is that if you are uploading data to Azure, block your data (even down to 1MB) and upload in parallel. Set your block size based on your source file size, but if you must choose a fixed value, use 1MB. Following the above will result in significant performance gains… upwards of 10x-24x and a reduction in overall file transfer time of upwards of 90% (eg, uploading a 1GB file averaged 46.37 minutes prior to optimizations and averaged 1.86 minutes afterwards). Detail: For those of you who want more detail, or think that the claims at the end of the preceding paragraph are over-reaching, what follows is information and code supporting these claims. As the title would indicate, these tests were run from our research facility pointing to the Azure cloud (specifically US North Central as it is physically closest to us) and do not represent intra-cloud results… we have performed intra-cloud tests and the overall results are similar in notion but the data rates are significantly different as well as the tipping points for the various block sizes… this will be detailed separately). We started by building a very simple console application that would loop through a directory and upload each file to Azure storage. This application used the shipping storage client library from the 1.1 version of the azure tools. The only real variation from the client library is that we added code to collect and record the duration (in ms) and size (in bytes) for each file transferred. The code is available here. We then created a directory that had a collection of files for the following sizes: 2KB, 32KB, 64KB, 128KB, 512KB, 1MB, 5MB, 10MB, 25MB, 50MB, 100MB, 250MB, 500MB, 750MB, and 1GB (50 files for each size listed). These files contained randomly-generated binary data and do not benefit from compression (a separate discussion topic). Our file generation tool is available here. The baseline was established by running the application described above against the directory containing all of the data files. This application uploads the files in a random order so as to avoid transferring all of the files of a given size sequentially and thereby spreading the affects of periodic Internet delays across the collection of results.  We then ran some scripts to split the resulting data and generate some reports. The raw data collected for our non-optimized tests is available via the links in the Related Resources section at the bottom of this post. For each file size, we calculated the average upload time (and standard deviation) and the average transfer rate (and standard deviation). As you likely are aware, transferring data across the Internet is susceptible to many transient delays which can cause anomalies in the resulting data. It is for this reason that we randomized the order of source file processing as well as executed the tests 50x for each file size. We expect that these steps will yield a sufficiently balanced set of results. Once the baseline was collected and analyzed, we updated the test harness application with some methods to split the source file into user-defined block sizes and then to upload those blocks in parallel (using the PutBlock() method of Azure storage). The parallelization was handled by simply relying on the Parallel Extensions to .NET to provide a Parallel.For loop (see linked source for specific implementation details in Program.cs, line 173 and following… less than 100 lines total). Once all of the blocks were uploaded, we called PutBlockList() to assemble/commit the file in Azure storage. For each block transferred, the MD5 was calculated and sent ensuring that the bits that arrived matched was was intended. The timer for the blocked/parallelized transfer method wraps the entire process (source file splitting, block transfer, MD5 validation, file committal). A diagram of the process is as follows: We then tested the affects of blocking & parallelizing the transfers by running the updated application against the same source set and did a parameter sweep on the block size including 256KB, 512KB, 1MB, 2MB, and 4MB (our assumption was that anything lower than 256KB wasn’t worth the trouble and 4MB is the maximum size of a block supported by Azure). The raw data for the parallel tests is available via the links in the Related Resources section at the bottom of this post. This data was processed and then compared against the single-threaded / non-optimized transfer numbers and the results were encouraging. The Excel version of the results is available here. Two semi-obvious points need to be made prior to reviewing the data. The first is that if the block size is larger than the source file size you will end up with a “negative optimization” due to the overhead of attempting to block and parallelize. The second is that as the files get smaller, the clock-time cost of blocking and parallelizing (overhead) is more apparent and can tend towards negative optimizations. For this reason (and is supported in the raw data provided in the linked worksheet) the charts and dialog below ignore source file sizes less than 1MB. (click chart for full size image) The chart above illustrates some interesting points about the results: When the block size is smaller than the source file, performance increases but as the block size approaches and then passes the source file size, you see decreasing benefit to the point of negative gains (see the values for the 1MB file size) For some of the moderately-sized source files, small blocks (256KB) are best As the size of the source file gets larger (see values for 50MB and up), the smallest block size is not the most efficient (presumably due, at least in part, to the increased number of blocks, increased number of individual transfer requests, and reassembly/committal costs). Once you pass the 250MB source file size, the difference in rate for 1MB to 4MB blocks is more-or-less constant The 1MB block size gives the best average improvement (~16x) but the optimal approach would be to vary the block size based on the size of the source file.    (click chart for full size image) The above is another view of the same data as the prior chart just with the axis changed (x-axis represents file size and plotted data shows improvement by block size). It again highlights the fact that the 1MB block size is probably the best overall size but highlights the benefits of some of the other block sizes at different source file sizes. This last chart shows the change in total duration of the file uploads based on different block sizes for the source file sizes. Nothing really new here other than this view of the data highlights the negative affects of poorly choosing a block size for smaller files.   Summary What we have found so far is that blocking your file uploads and uploading them in parallel results in significant performance improvements. Further, utilizing extension methods and the Task Parallel Library (.NET 4.0) make short work of altering the shipping client library to provide this functionality while minimizing the amount of change to existing applications that might be using the client library for other interactions.   Related Resources Source code for upload test application Source code for random file generator ODatas feed of raw data from non-optimized transfer tests Experiment Metadata Experiment Datasets 2KB Uploads 32KB Uploads 64KB Uploads 128KB Uploads 256KB Uploads 512KB Uploads 1MB Uploads 5MB Uploads 10MB Uploads 25MB Uploads 50MB Uploads 100MB Uploads 250MB Uploads 500MB Uploads 750MB Uploads 1GB Uploads Raw Data OData feeds of raw data from blocked/parallelized transfer tests Experiment Metadata Experiment Datasets Raw Data 256KB Blocks 512KB Blocks 1MB Blocks 2MB Blocks 4MB Blocks Excel worksheet showing summarizations and comparisons

    Read the article

  • I see no LOBs!

    - by Paul White
    Is it possible to see LOB (large object) logical reads from STATISTICS IO output on a table with no LOB columns? I was asked this question today by someone who had spent a good fraction of their afternoon trying to work out why this was occurring – even going so far as to re-run DBCC CHECKDB to see if any corruption had taken place.  The table in question wasn’t particularly pretty – it had grown somewhat organically over time, with new columns being added every so often as the need arose.  Nevertheless, it remained a simple structure with no LOB columns – no TEXT or IMAGE, no XML, no MAX types – nothing aside from ordinary INT, MONEY, VARCHAR, and DATETIME types.  To add to the air of mystery, not every query that ran against the table would report LOB logical reads – just sometimes – but when it did, the query often took much longer to execute. Ok, enough of the pre-amble.  I can’t reproduce the exact structure here, but the following script creates a table that will serve to demonstrate the effect: IF OBJECT_ID(N'dbo.Test', N'U') IS NOT NULL DROP TABLE dbo.Test GO CREATE TABLE dbo.Test ( row_id NUMERIC IDENTITY NOT NULL,   col01 NVARCHAR(450) NOT NULL, col02 NVARCHAR(450) NOT NULL, col03 NVARCHAR(450) NOT NULL, col04 NVARCHAR(450) NOT NULL, col05 NVARCHAR(450) NOT NULL, col06 NVARCHAR(450) NOT NULL, col07 NVARCHAR(450) NOT NULL, col08 NVARCHAR(450) NOT NULL, col09 NVARCHAR(450) NOT NULL, col10 NVARCHAR(450) NOT NULL, CONSTRAINT [PK dbo.Test row_id] PRIMARY KEY CLUSTERED (row_id) ) ; The next script loads the ten variable-length character columns with one-character strings in the first row, two-character strings in the second row, and so on down to the 450th row: WITH Numbers AS ( -- Generates numbers 1 - 450 inclusive SELECT TOP (450) n = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM master.sys.columns C1, master.sys.columns C2, master.sys.columns C3 ORDER BY n ASC ) INSERT dbo.Test WITH (TABLOCKX) SELECT REPLICATE(N'A', N.n), REPLICATE(N'B', N.n), REPLICATE(N'C', N.n), REPLICATE(N'D', N.n), REPLICATE(N'E', N.n), REPLICATE(N'F', N.n), REPLICATE(N'G', N.n), REPLICATE(N'H', N.n), REPLICATE(N'I', N.n), REPLICATE(N'J', N.n) FROM Numbers AS N ORDER BY N.n ASC ; Once those two scripts have run, the table contains 450 rows and 10 columns of data like this: Most of the time, when we query data from this table, we don’t see any LOB logical reads, for example: -- Find the maximum length of the data in -- column 5 for a range of rows SELECT result = MAX(DATALENGTH(T.col05)) FROM dbo.Test AS T WHERE row_id BETWEEN 50 AND 100 ; But with a different query… -- Read all the data in column 1 SELECT result = MAX(DATALENGTH(T.col01)) FROM dbo.Test AS T ; …suddenly we have 49 LOB logical reads, as well as the ‘normal’ logical reads we would expect. The Explanation If we had tried to create this table in SQL Server 2000, we would have received a warning message to say that future INSERT or UPDATE operations on the table might fail if the resulting row exceeded the in-row storage limit of 8060 bytes.  If we needed to store more data than would fit in an 8060 byte row (including internal overhead) we had to use a LOB column – TEXT, NTEXT, or IMAGE.  These special data types store the large data values in a separate structure, with just a small pointer left in the original row. Row Overflow SQL Server 2005 introduced a feature called row overflow, which allows one or more variable-length columns in a row to move to off-row storage if the data in a particular row would otherwise exceed 8060 bytes.  You no longer receive a warning when creating (or altering) a table that might need more than 8060 bytes of in-row storage; if SQL Server finds that it can no longer fit a variable-length column in a particular row, it will silently move one or more of these columns off the row into a separate allocation unit. Only variable-length columns can be moved in this way (for example the (N)VARCHAR, VARBINARY, and SQL_VARIANT types).  Fixed-length columns (like INTEGER and DATETIME for example) never move into ‘row overflow’ storage.  The decision to move a column off-row is done on a row-by-row basis – so data in a particular column might be stored in-row for some table records, and off-row for others. In general, if SQL Server finds that it needs to move a column into row-overflow storage, it moves the largest variable-length column record for that row.  Note that in the case of an UPDATE statement that results in the 8060 byte limit being exceeded, it might not be the column that grew that is moved! Sneaky LOBs Anyway, that’s all very interesting but I don’t want to get too carried away with the intricacies of row-overflow storage internals.  The point is that it is now possible to define a table with non-LOB columns that will silently exceed the old row-size limit and result in ordinary variable-length columns being moved to off-row storage.  Adding new columns to a table, expanding an existing column definition, or simply storing more data in a column than you used to – all these things can result in one or more variable-length columns being moved off the row. Note that row-overflow storage is logically quite different from old-style LOB and new-style MAX data type storage – individual variable-length columns are still limited to 8000 bytes each – you can just have more of them now.  Having said that, the physical mechanisms involved are very similar to full LOB storage – a column moved to row-overflow leaves a 24-byte pointer record in the row, and the ‘separate storage’ I have been talking about is structured very similarly to both old-style LOBs and new-style MAX types.  The disadvantages are also the same: when SQL Server needs a row-overflow column value it needs to follow the in-row pointer a navigate another chain of pages, just like retrieving a traditional LOB. And Finally… In the example script presented above, the rows with row_id values from 402 to 450 inclusive all exceed the total in-row storage limit of 8060 bytes.  A SELECT that references a column in one of those rows that has moved to off-row storage will incur one or more lob logical reads as the storage engine locates the data.  The results on your system might vary slightly depending on your settings, of course; but in my tests only column 1 in rows 402-450 moved off-row.  You might like to play around with the script – updating columns, changing data type lengths, and so on – to see the effect on lob logical reads and which columns get moved when.  You might even see row-overflow columns moving back in-row if they are updated to be smaller (hint: reduce the size of a column entry by at least 1000 bytes if you hope to see this). Be aware that SQL Server will not warn you when it moves ‘ordinary’ variable-length columns into overflow storage, and it can have dramatic effects on performance.  It makes more sense than ever to choose column data types sensibly.  If you make every column a VARCHAR(8000) or NVARCHAR(4000), and someone stores data that results in a row needing more than 8060 bytes, SQL Server might turn some of your column data into pseudo-LOBs – all without saying a word. Finally, some people make a distinction between ordinary LOBs (those that can hold up to 2GB of data) and the LOB-like structures created by row-overflow (where columns are still limited to 8000 bytes) by referring to row-overflow LOBs as SLOBs.  I find that quite appealing, but the ‘S’ stands for ‘small’, which makes expanding the whole acronym a little daft-sounding…small large objects anyone? © Paul White 2011 email: [email protected] twitter: @SQL_Kiwi

    Read the article

  • Finally, upgrade from Nokia X3 to Samsung Galaxy S III

    This time, something slightly different but nonetheless not less interesting, hopefully. Living on a remote island like Mauritius, ill-praised 'Cyber Island' in the Indian Ocean, has its advantages in life style and relaxed environment to life in but in terms of technological aspects it can be quite a nightmare. Well, I guess this might be different story to report about... one day. Cyber Island Mauritius Despite it's shiny advertisement as Cyber Island and business in ICT hub to Africa, Mauritius is not on the latest track of available models in computer hardware or, in the context of this article, cellulars or smart-phone, or communication technology in general. Okay, I have to admit that this statement is only partly true. Money can buy, even here in Mauritius. Luckily, there are ways and ways to deal with this outcry of modern, read: technological, civilisation issues. Online shopping you might think? Yes, for sure, until you discover in your checkout procedure that a small island in the Indian Ocean isn't a preferred destination for delivery and the precious time you spent on putting your items into your cart and feeding your personal level of anticipation gets ruined on the last stint. Ordering from abroad saves you money Anyway, I got in touch with my personal courier and luckily there were some extra-kilos left in the luggage. First obstacle sorted, we have a Transporter! Okay, on the next occasion off to Amazon online and using their Prime service for fast delivery. Actually, the order was placed on Saturday evening and everything got delivered on Tuesday morning - nice job in less than 72 hours. Okay, among the items of that shopping rush I ordered a shiny Samsung Galaxy S III 16GB in oceanic blue - did I mention, that you hardly get a blue model in Mauritius? - for my BWE. Interesting side-notes: First, Amazon Germany dropped the prices for roughly 30% on the S3, and we got the 16GB model for less than 500 Euro (or approx. Rs. 19.500,-) compared to the usual Rs. 27.000,- on the local market. It even varies whether the local price is inclusive or exclusive VAT (15%). Second, since a while she was bothering me to get an iPhone and an iPad for her, fair enough I thought, decent hardware, posh design and reliable services. Until we watched the 'magical' introduction of Samsung's new models at the IFA exhibition, she read the bashing comments on Google+ on the iPhone 5 and I gave her a brief summary on the law suit between Apple and Samsung in the USA. So, yes, Samsung USA is right, the next big thing is already here - literally. My BWE loves the look and touch of the Galaxy S3. And for me it was more cost-effective in terms of purchases done at the App Store, ups, Play Store. Transfer of contacts, text messages and media files Okay, now that the hardware is in place, how to transfer all those contacts, text messages, media files, etc. between those two devices? In the past, I used to use the Nokia Communication Suite between various models but now for Android? Well, as usual Google and Bing are reliable friends and among the first hits I came across an article about How to Transfer Contacts from Nokia to Android. Couldn't be easier, right? Well, sort of... my main Windows systems are already running on Windows 8, and this actually caused problems with the mobile/smart-phone device drivers. The article provides the download for an older version 1.10 which upgrades to 2.11 (as time of writing this entry) but both couldn't get the Galaxy S3 and the Nokia connected. Shame on me... the product page clearly doesn't mention Windows 8 (for now) and Windows 8 isn't available for the general audience at all... After I took a spare machine running on Windows Vista everything went smooth. Software installed, upgrade done, device drivers for Android automatically downloaded and installed, and the same painless routine for the Nokia part. I think, I rebooted the system twice during the whole setup procedure but hey, it was more or less a distraction while coding some stuff in ASP.NET MVC and Telerik Kendo UI. The transfer of contacts and text messages was done via Wondershare MobileGo for Android, and all media files by moving the additional microSD card from one device to the other. But even without an external SD card, it would have been very easy to copy the files via Windows Explorer directly. Little catch and excellent service Fine, we are almost done and the only step left is to shift the SIM card... Ouch, gotcha! The X3 uses a standard size SIM card while the S III only accepts microSIM form factor. What an irony, bigger smartphone needs smaller SIM card. Luckily, the next showroom of Emtel is just 5 mins away up the road, and the service staff over there know their job. Finally, after roughly 10 mins of paper work, activation and small chit-chat, the S3 came to life on the mobile network. Owning a smart-phone now and knowing that my BWE would like to interact more on social networks away from home, especially to upload pictures and provide local 'check-ins', I activated a data package for her in advance, too. Even that it is Saturday, everything was already done and ready to be used. Nice bonus: The Emtel clerk directly offered me to set up the configuration for the Emtel data services, yes sure, go ahead, this saves me to search for that in the settings. Okay, spoiler-alert here, setting a static APN to access the Emtel network and the internet wouldn't be a challenge. But hey, she already had the phone in her hands and I could keep my eyes on the children. Well done, Emtel! Resume Thanks to the useful software package by Wondershare is was a hands-free experience to transfer all the data from a Nokia mobile on Symbian S60 to a Samsung Galaxy S III on Android Ice Cream Sandwich (ICS). In the future, this wont be a serious issue at all anymore thanks to synchronisation services and cloud storage. And for now, I'm only waiting for the official upgrades for Jelly Bean.

    Read the article

  • When is a Seek not a Seek?

    - by Paul White
    The following script creates a single-column clustered table containing the integers from 1 to 1,000 inclusive. IF OBJECT_ID(N'tempdb..#Test', N'U') IS NOT NULL DROP TABLE #Test ; GO CREATE TABLE #Test ( id INTEGER PRIMARY KEY CLUSTERED ); ; INSERT #Test (id) SELECT V.number FROM master.dbo.spt_values AS V WHERE V.[type] = N'P' AND V.number BETWEEN 1 AND 1000 ; Let’s say we need to find the rows with values from 100 to 170, excluding any values that divide exactly by 10.  One way to write that query would be: SELECT T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; That query produces a pretty efficient-looking query plan: Knowing that the source column is defined as an INTEGER, we could also express the query this way: SELECT T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; We get a similar-looking plan: If you look closely, you might notice that the line connecting the two icons is a little thinner than before.  The first query is estimated to produce 61.9167 rows – very close to the 63 rows we know the query will return.  The second query presents a tougher challenge for SQL Server because it doesn’t know how to predict the selectivity of the modulo expression (T.id % 10 > 0).  Without that last line, the second query is estimated to produce 68.1667 rows – a slight overestimate.  Adding the opaque modulo expression results in SQL Server guessing at the selectivity.  As you may know, the selectivity guess for a greater-than operation is 30%, so the final estimate is 30% of 68.1667, which comes to 20.45 rows. The second difference is that the Clustered Index Seek is costed at 99% of the estimated total for the statement.  For some reason, the final SELECT operator is assigned a small cost of 0.0000484 units; I have absolutely no idea why this is so, or what it models.  Nevertheless, we can compare the total cost for both queries: the first one comes in at 0.0033501 units, and the second at 0.0034054.  The important point is that the second query is costed very slightly higher than the first, even though it is expected to produce many fewer rows (20.45 versus 61.9167). If you run the two queries, they produce exactly the same results, and both complete so quickly that it is impossible to measure CPU usage for a single execution.  We can, however, compare the I/O statistics for a single run by running the queries with STATISTICS IO ON: Table '#Test'. Scan count 63, logical reads 126, physical reads 0. Table '#Test'. Scan count 01, logical reads 002, physical reads 0. The query with the IN list uses 126 logical reads (and has a ‘scan count’ of 63), while the second query form completes with just 2 logical reads (and a ‘scan count’ of 1).  It is no coincidence that 126 = 63 * 2, by the way.  It is almost as if the first query is doing 63 seeks, compared to one for the second query. In fact, that is exactly what it is doing.  There is no indication of this in the graphical plan, or the tool-tip that appears when you hover your mouse over the Clustered Index Seek icon.  To see the 63 seek operations, you have click on the Seek icon and look in the Properties window (press F4, or right-click and choose from the menu): The Seek Predicates list shows a total of 63 seek operations – one for each of the values from the IN list contained in the first query.  I have expanded the first seek node to show the details; it is seeking down the clustered index to find the entry with the value 101.  Each of the other 62 nodes expands similarly, and the same information is contained (even more verbosely) in the XML form of the plan. Each of the 63 seek operations starts at the root of the clustered index B-tree and navigates down to the leaf page that contains the sought key value.  Our table is just large enough to need a separate root page, so each seek incurs 2 logical reads (one for the root, and one for the leaf).  We can see the index depth using the INDEXPROPERTY function, or by using the a DMV: SELECT S.index_type_desc, S.index_depth FROM sys.dm_db_index_physical_stats ( DB_ID(N'tempdb'), OBJECT_ID(N'tempdb..#Test', N'U'), 1, 1, DEFAULT ) AS S ; Let’s look now at the Properties window when the Clustered Index Seek from the second query is selected: There is just one seek operation, which starts at the root of the index and navigates the B-tree looking for the first key that matches the Start range condition (id >= 101).  It then continues to read records at the leaf level of the index (following links between leaf-level pages if necessary) until it finds a row that does not meet the End range condition (id <= 169).  Every row that meets the seek range condition is also tested against the Residual Predicate highlighted above (id % 10 > 0), and is only returned if it matches that as well. You will not be surprised that the single seek (with a range scan and residual predicate) is much more efficient than 63 singleton seeks.  It is not 63 times more efficient (as the logical reads comparison would suggest), but it is around three times faster.  Let’s run both query forms 10,000 times and measure the elapsed time: DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON; SET STATISTICS XML OFF; ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; GO DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; On my laptop, running SQL Server 2008 build 4272 (SP2 CU2), the IN form of the query takes around 830ms and the range query about 300ms.  The main point of this post is not performance, however – it is meant as an introduction to the next few parts in this mini-series that will continue to explore scans and seeks in detail. When is a seek not a seek?  When it is 63 seeks © Paul White 2011 email: [email protected] twitter: @SQL_kiwi

    Read the article

  • DTracing TCP congestion control

    - by user12820842
    In a previous post, I showed how we can use DTrace to probe TCP receive and send window events. TCP receive and send windows are in effect both about flow-controlling how much data can be received - the receive window reflects how much data the local TCP is prepared to receive, while the send window simply reflects the size of the receive window of the peer TCP. Both then represent flow control as imposed by the receiver. However, consider that without the sender imposing flow control, and a slow link to a peer, TCP will simply fill up it's window with sent segments. Dealing with multiple TCP implementations filling their peer TCP's receive windows in this manner, busy intermediate routers may drop some of these segments, leading to timeout and retransmission, which may again lead to drops. This is termed congestion, and TCP has multiple congestion control strategies. We can see that in this example, we need to have some way of adjusting how much data we send depending on how quickly we receive acknowledgement - if we get ACKs quickly, we can safely send more segments, but if acknowledgements come slowly, we should proceed with more caution. More generally, we need to implement flow control on the send side also. Slow Start and Congestion Avoidance From RFC2581, let's examine the relevant variables: "The congestion window (cwnd) is a sender-side limit on the amount of data the sender can transmit into the network before receiving an acknowledgment (ACK). Another state variable, the slow start threshold (ssthresh), is used to determine whether the slow start or congestion avoidance algorithm is used to control data transmission" Slow start is used to probe the network's ability to handle transmission bursts both when a connection is first created and when retransmission timers fire. The latter case is important, as the fact that we have effectively lost TCP data acts as a motivator for re-probing how much data the network can handle from the sending TCP. The congestion window (cwnd) is initialized to a relatively small value, generally a low multiple of the sending maximum segment size. When slow start kicks in, we will only send that number of bytes before waiting for acknowledgement. When acknowledgements are received, the congestion window is increased in size until cwnd reaches the slow start threshold ssthresh value. For most congestion control algorithms the window increases exponentially under slow start, assuming we receive acknowledgements. We send 1 segment, receive an ACK, increase the cwnd by 1 MSS to 2*MSS, send 2 segments, receive 2 ACKs, increase the cwnd by 2*MSS to 4*MSS, send 4 segments etc. When the congestion window exceeds the slow start threshold, congestion avoidance is used instead of slow start. During congestion avoidance, the congestion window is generally updated by one MSS for each round-trip-time as opposed to each ACK, and so cwnd growth is linear instead of exponential (we may receive multiple ACKs within a single RTT). This continues until congestion is detected. If a retransmit timer fires, congestion is assumed and the ssthresh value is reset. It is reset to a fraction of the number of bytes outstanding (unacknowledged) in the network. At the same time the congestion window is reset to a single max segment size. Thus, we initiate slow start until we start receiving acknowledgements again, at which point we can eventually flip over to congestion avoidance when cwnd ssthresh. Congestion control algorithms differ most in how they handle the other indication of congestion - duplicate ACKs. A duplicate ACK is a strong indication that data has been lost, since they often come from a receiver explicitly asking for a retransmission. In some cases, a duplicate ACK may be generated at the receiver as a result of packets arriving out-of-order, so it is sensible to wait for multiple duplicate ACKs before assuming packet loss rather than out-of-order delivery. This is termed fast retransmit (i.e. retransmit without waiting for the retransmission timer to expire). Note that on Oracle Solaris 11, the congestion control method used can be customized. See here for more details. In general, 3 or more duplicate ACKs indicate packet loss and should trigger fast retransmit . It's best not to revert to slow start in this case, as the fact that the receiver knew it was missing data suggests it has received data with a higher sequence number, so we know traffic is still flowing. Falling back to slow start would be excessive therefore, so fast recovery is used instead. Observing slow start and congestion avoidance The following script counts TCP segments sent when under slow start (cwnd ssthresh). #!/usr/sbin/dtrace -s #pragma D option quiet tcp:::connect-request / start[args[1]-cs_cid] == 0/ { start[args[1]-cs_cid] = 1; } tcp:::send / start[args[1]-cs_cid] == 1 && args[3]-tcps_cwnd tcps_cwnd_ssthresh / { @c["Slow start", args[2]-ip_daddr, args[4]-tcp_dport] = count(); } tcp:::send / start[args[1]-cs_cid] == 1 && args[3]-tcps_cwnd args[3]-tcps_cwnd_ssthresh / { @c["Congestion avoidance", args[2]-ip_daddr, args[4]-tcp_dport] = count(); } As we can see the script only works on connections initiated since it is started (using the start[] associative array with the connection ID as index to set whether it's a new connection (start[cid] = 1). From there we simply differentiate send events where cwnd ssthresh (congestion avoidance). Here's the output taken when I accessed a YouTube video (where rport is 80) and from an FTP session where I put a large file onto a remote system. # dtrace -s tcp_slow_start.d ^C ALGORITHM RADDR RPORT #SEG Slow start 10.153.125.222 20 6 Slow start 138.3.237.7 80 14 Slow start 10.153.125.222 21 18 Congestion avoidance 10.153.125.222 20 1164 We see that in the case of the YouTube video, slow start was exclusively used. Most of the segments we sent in that case were likely ACKs. Compare this case - where 14 segments were sent using slow start - to the FTP case, where only 6 segments were sent before we switched to congestion avoidance for 1164 segments. In the case of the FTP session, the FTP data on port 20 was predominantly sent with congestion avoidance in operation, while the FTP session relied exclusively on slow start. For the default congestion control algorithm - "newreno" - on Solaris 11, slow start will increase the cwnd by 1 MSS for every acknowledgement received, and by 1 MSS for each RTT in congestion avoidance mode. Different pluggable congestion control algorithms operate slightly differently. For example "highspeed" will update the slow start cwnd by the number of bytes ACKed rather than the MSS. And to finish, here's a neat oneliner to visually display the distribution of congestion window values for all TCP connections to a given remote port using a quantization. In this example, only port 80 is in use and we see the majority of cwnd values for that port are in the 4096-8191 range. # dtrace -n 'tcp:::send { @q[args[4]-tcp_dport] = quantize(args[3]-tcps_cwnd); }' dtrace: description 'tcp:::send ' matched 10 probes ^C 80 value ------------- Distribution ------------- count -1 | 0 0 |@@@@@@ 5 1 | 0 2 | 0 4 | 0 8 | 0 16 | 0 32 | 0 64 | 0 128 | 0 256 | 0 512 | 0 1024 | 0 2048 |@@@@@@@@@ 8 4096 |@@@@@@@@@@@@@@@@@@@@@@@@@@ 23 8192 | 0

    Read the article

< Previous Page | 82 83 84 85 86 87 88 89 90 91 92 93  | Next Page >