Search Results

Search found 11587 results on 464 pages for 'pseudo random numbers'.

Page 233/464 | < Previous Page | 229 230 231 232 233 234 235 236 237 238 239 240  | Next Page >

  • The Partner Perspective from Oracle OpenWorld 2012 - IDC’s Darren Bibby report

    - by Richard Lefebvre
    Below is IDC’s Darren Bibby report on ‘The Partner Perspective from Oracle OpenWorld 2012’. If you missed the 2012 edition, I trust this will give you the willingness to attend next year one! October 26, 2012 I attended my fourth Oracle OpenWorld earlier in October. I always go in with the lens of, "What's in it for partners this year?" Although it's primarily thought of as a customer event - and yes, the bulk of the almost 50,000 attendees are customers - this year's conference was clearly the largest and most important partner event Oracle has ever run. Oracle PartnerNetwork (OPN) Exchange There were more partner attendees than ever, with Oracle citing somewhere around 5000. But the format for partners this year was different. And it was better. Traditionally, Oracle hosts a one-day only Partner Forum on the Sunday before the customer-focused conference begins. This year, the partner content still began on the Sunday, but the worldwide alliances and channels group created an exclusive track throughout the week, just for partners. It featured content specifically targeted towards partners, and was anchored at a nearby hotel. This was a great move for Oracle. The Oracle PartnerNetwork (OPN) team has been in a tricky position for years in that they have enough partners that they need a landmark event in the year, but perhaps not enough to justify a separate, worldwide, large, partner-only event. Coinciding a four day event with Oracle OpenWorld, where anybody who's anybody in the Oracle world attends anyway, is a good solution. The channels leadership team can build from this success for an even better conference next year. It's expected that they will follow a similar strategy. Cloud Announcements for Partners As for the content, it was primarily about the Cloud. For customers, for VARs, for ISVs, for everyone. There were five key Cloud related announcements for partners at the event: Cloud Builder Specialization. This is one of the first broader Specializations that isn't focused on one unique product. It is a designation for partners that offer design and implementation services for private cloud solutions. As such, it will surely be something that nearly every partner will consider, and many will pursue. New Specializations for Cloud Services. Unlike the broad, almost "strategy-level" Specialization above, there are a group of new product-based "merit badges" for many of the new Cloud offerings. Think about a Specialization for the Cloud version of HCM, for instance. Each of these particular specializations will also have Rapid Start implementation methodologies that allow a partner to offer a fixed scope and fixed price bid to customers. Based on the learnings from Oracle Consulting, this means a partner might be able to deliver Cloud HCM in six weeks for a fixed price. In the end, this means more consistent experiences for Oracle customers. Cloud Resale Program. For those partners who achieve one of these Cloud Specializations, it will mean they can actually resell the subscription-based Cloud product. This is important because it has been somewhat of a rarity in the emerging Cloud channel for partners to be able to "take the paper", take the revenue, do the billing, be first line of support etc. This is an important step for Oracle and one the partners will be happy to see. Cloud Referral Program. For those partners who are not as engaged with these specific Cloud products that the Specializations revolve around, there is a new referral program that provides an incentive to recommend Oracle Cloud products. This one-two punch of referral and resale programs is similar in many ways to other vendors who allow more committed partners to resell, while more casual partners can collect fees. It's the model that seems to work. The key to allow a company to resell a subscription product - something that is inherently delivered directly between the vendor and customer - is trust. Achieving a specialization is a good bar to have to meet. Platform as a Service for ISVs. Leveraging some of the overall announcements made by CEO Larry Ellison around a cloud version of its famous database, Oracle also outlined a new ability for ISVs to build cloud services on its new PaaS offering. Details were less available for this announcement, though it's an expected and fitting play for ISVs comfortable with Oracle technology who can now more easily build out cloud applications. There wasn't much talk of an app store to go along with this, but surely it's in the works. Specializations And "The Gap" Coming back to Specializations, Oracle PartnerNetwork (OPN) has 4600 partners worldwide that hold 20,000 Specializations. These are impressive numbers just three years into the new OPN framework. The actual number of Specializations has also grown significantly, up to 111 today and soon around 125 or so with the new Cloud designations. Oracle may need to look at grouping some of these and creating higher level, broader designations that partners could achieve by earning several Specializations in that group. At 125 and growing, this is a lot. On the top of the pyramid, Hitachi Ltd. successfully became the eleventh partner to make it to the highly prestigious Diamond level. Partner programs partially exist in order to recognize capable partners. And it's more than abundantly clear that the Diamond level does this. But I think Oracle has a gap. Specializations show capability in a very specific product area, and all sizes of partners can achieve these. The next level at which to show a level of expertise is the Advanced Specialization. However, this is a massive step up from the regular Specialization. The advanced level requires 50 people to have certification in that particular product area. Most other industry programs have similar higher level statuses, but none are even close to that number. Whereas a customer who sees an Oracle partner with an advanced specialization can be very sure of capability, there is a gap in that there are hundreds or even thousands of 20-50 person solution providers who are top notch in their area of expertise. They will never get to Advanced due to numbers alone. These boutique partners don't really have a way of showing off their talents in the current program. Advanced may not need to be so high to really show that a company has deep expertise. Overall it was a very successful Oracle OpenWorld for Oracle partners of all sizes. There was progress made on making it a bigger and more relevant event. And also on catching up and maybe even leading in some cases with cloud opportunities for partners.

    Read the article

  • Interface contracts – forcing code contracts through interfaces

    - by DigiMortal
    Sometimes we need a way to make different implementations of same interface follow same rules. One option is to duplicate contracts to all implementation but this is not good option because we have duplicated code then. The other option is to force contracts to all implementations at interface level. In this posting I will show you how to do it using interface contracts and contracts class. Using code from previous example about unit testing code with code contracts I will go further and force contracts at interface level. Here is the code from previous example. Take a careful look at it because I will talk about some modifications to this code soon. public interface IRandomGenerator {     int Next(int min, int max); }   public class RandomGenerator : IRandomGenerator {     private Random _random = new Random();       public int Next(int min, int max)     {         return _random.Next(min, max);     } }    public class Randomizer {     private IRandomGenerator _generator;       private Randomizer()     {         _generator = new RandomGenerator();     }       public Randomizer(IRandomGenerator generator)     {         _generator = generator;     }       public int GetRandomFromRangeContracted(int min, int max)     {         Contract.Requires<ArgumentOutOfRangeException>(             min < max,             "Min must be less than max"         );           Contract.Ensures(             Contract.Result<int>() >= min &&             Contract.Result<int>() <= max,             "Return value is out of range"         );           return _generator.Next(min, max);     } } If we look at the GetRandomFromRangeContracted() method we can see that contracts set in this method are applicable to all implementations of IRandomGenerator interface. Although we can write new implementations as we want these implementations need exactly the same contracts. If we are using generators somewhere else then code contracts are not with them anymore. To solve the problem we will force code contracts at interface level. NB! To make the following code work you must enable Contract Reference Assembly building from project settings. Interface contracts and contracts class Interface contains no code – only definitions of members that implementing type must have. But code contracts must be defined in body of member they are part of. To get over this limitation, code contracts are defined in separate contracts class. Interface is bound to this class by special attribute and contracts class refers to interface through special attribute. Here is the IRandomGenerator with contracts and contracts class. Also I write simple fake so we can test contracts easily based only on interface mock. [ContractClass(typeof(RandomGeneratorContracts))] public interface IRandomGenerator {     int Next(int min, int max); }   [ContractClassFor(typeof(IRandomGenerator))] internal sealed class RandomGeneratorContracts : IRandomGenerator {     int IRandomGenerator.Next(int min, int max)     {         Contract.Requires<ArgumentOutOfRangeException>(                 min < max,                 "Min must be less than max"             );           Contract.Ensures(             Contract.Result<int>() >= min &&             Contract.Result<int>() <= max,             "Return value is out of range"         );           return default(int);     } }   public class RandomFake : IRandomGenerator {     private int _testValue;       public RandomGen(int testValue)     {         _testValue = testValue;     }       public int Next(int min, int max)     {         return _testValue;     } } To try out these changes use the following code. var gen = new RandomFake(3);   try {     gen.Next(10, 1); } catch(Exception ex) {     Debug.WriteLine(ex.Message); }   try {     gen.Next(5, 10); } catch(Exception ex) {     Debug.WriteLine(ex.Message); } Now we can force code contracts to all types that implement our IRandomGenerator interface and we must test only the interface to make sure that contracts are defined correctly.

    Read the article

  • Video on Architecture and Code Quality using Visual Studio 2012&ndash;interview with Marcel de Vries and Terje Sandstrom by Adam Cogan

    - by terje
    Find the video HERE. Adam Cogan did a great Web TV interview with Marcel de Vries and myself on the topics of architecture and code quality.  It was real fun participating in this session.  Although we know each other from the MVP ALM community,  Marcel, Adam and I haven’t worked together before. It was very interesting to see how we agreed on so many terms, and how alike we where thinking.  The basics of ensuring you have a good architecture and how you could document it is one thing.  Also, the same agreement on the importance of having a high quality code base, and how we used the Visual Studio 2012 tools, and some others (NDepend for example)  to measure and ensure that the code quality was where it should be.  As the tools, methods and thinking popped up during the interview it was a lot of “Hey !  I do that too!”.  The tools are not only for “after the fact” work, but we use them during the coding.  That way the tools becomes an integrated part of our coding work, and helps us to find issues we may have overlooked.  The video has a bunch of call outs, pinpointing important things to remember. These are also listed on the corresponding web page. I haven’t seen that touch before, but really liked this way of doing it – it makes it much easier to spot the highlights.  Titus Maclaren and Raj Dhatt from SSW have done a terrific job producing this video.  And thanks to Lei Xu for doing the camera and recording job.  Thanks guys ! Also, if you are at TechEd Amsterdam 2012, go and listen to Adam Cogan in his session on “A modern architecture review: Using the new code review tools” Friday 29th, 10.15-11.30 and Marcel de Vries session on “Intellitrace, what is it and how can I use it to my benefit” Wednesday 27th, 5-6.15 The highlights points out some important practices.  I’ll elaborate on a few of them here: Add instructions on how to compile the solution.  You do this by adding a text file with instructions to the solution, and keep it under source control.  These instructions should contain what is needed on top of a standard install of Visual Studio.  I do a lot of code reviews, and more often that not, I am not even able to compile the program, because they have used some tool or library that needs to be installed.  The same applies to any new developer who enters into the team, so do this to increase your productivity when the team changes, or a team member switches computer. Don’t forget to document what you have to configure on the computer, the IIS being a common one. The more automatic you can do this, the better.  Use NuGet to get down libraries. When the text document gets more than say, half a page, with a bunch of different things to do, convert it into a powershell script instead.  The metrics warning levels.  These are very conservatively set by Microsoft.  You rarely see anything but green, and besides, you should have color scales for each of the metrics.  I have a blog post describing a more appropriate set of levels, based on both research work and industry “best practices”.  The essential limits are: Cyclomatic complexity and coupling:  Higher numbers are worse On method levels: Green :  From 0 to 10 Yellow:  From 10 to 20  (some say 15).   Acceptable, but have a look to see if there is something unneeded here. Red: From 20 to 40:   Action required, get these down. Bleeding Red: Above 40   This is the real red alert.  Immediate action!  (My invention, as people have asked what do I do when I have cyclomatic complexity of 150.  The only answer I could think of was: RUN! ) Maintainability index:  Lower numbers are worse, scale from 0 to 100. On method levels: Green:  60 to 100 Yellow:  40 – 60.    You will always have methods here too, accept the higher ones, take a look at those who are down to the lower limit.  Check up against the other metrics.) Red:  20 – 40:  Action required, fix these. Bleeding red:  Below 20.  Immediate action required. When doing metrics analysis, you should leave the generated code out.  You do this by adding attributes, unfortunately Microsoft has “forgotten” to add these to all their stuff, so you might have to add them to some of the code.  It most cases it can be done so that it is not overwritten by a new round of code generation.  Take a look a my blog post here for details on how to do that. Class level metrics might also be useful, at least for coupling and maintenance.  But it is much more difficult to set any fixed limits on those.  Any metric aggregations on higher level tend to be pretty useless, as the number of methods vary pretty much, and there are little science on what number of methods can be regarded as good or bad.  NDepend have a recommendation, but they say it may vary too.  And in these days of data binding, the number might be pretty high, as properties counts as methods.  However, if you take the worst case situations, classes with more than 20 methods are suspicious, and coupling and cyclomatic complexity go red above 20, so any classes with more than 20x20 = 400 for these measures should be checked over. In the video we mention the SOLID principles, coined by “Uncle Bob” (Richard Martin). One of them, the Dependency Inversion principle we discuss in the video.  It is important to note that this principle is NOT on whether you should use a Dependency Inversion Container or not, it is about how you design the interfaces and interactions between your classes.  The Dependency Inversion Container is just one technique which is based on this principle, but which main purpose is to isolate things you would like to change at runtime, for example if you implement a plug in architecture.  Overuse of a Dependency Inversion Container is however, NOT a good thing.  It should be used for a purpose and not as a general DI solution.  The general DI solution and thinking however is useful far beyond the DIC.   You should always “program to an abstraction”, and not to the concreteness.  We also talk a bit about the GRASP patterns, a term coined by Craig Larman in his book Applying UML and design patterns. GRASP patterns stand for General Responsibility Assignment Software Patterns and describe fundamental principles of object design and responsibility assignment.  What I find great with these patterns is that they is another way to focus on the responsibility of a class.  One of the things I most often found that is broken in software designs, is that the class lack responsibility, and as a result there are a lot of classes mucking around in the internals of the other classes.  We also discuss the term “Code Smells”.  This term was invented by Kent Beck and Martin Fowler when they worked with Fowler’s “Refactoring” book. A code smell is a set of “bad” coding practices, which are the drivers behind a corresponding set of refactorings.  Here is a good list of the smells, and their corresponding refactor patterns. See also this.

    Read the article

  • CodePlex Daily Summary for Sunday, June 06, 2010

    CodePlex Daily Summary for Sunday, June 06, 2010New ProjectsActive Worlds Dot Net Wrapper (Based on AwSdk): Active Worlds Dot Net Wrapper (Based on AwSdk)Combina: Smart calculator for large combinatorial calculations.Concurrent Cache: ConcurrentCache is a smart output cache library extending OutputCacheProvider. It consists of in memory, cache files and compressed files modes and...Decay: Personal use. For learningFazTalk: FazTalk is a suite of tools and products that are designed to improve collaboration and workflow interactions. FazTalk takes an innovative approach...grouped: A peer to peer text editor, written in C# [update] I wrote this little thing a while back and even forgot about it, I stopped coding for more tha...HitchARide MVC 2 Sample: An MVC 2 sample written as part of the Microsoft 2010 London Web Camp based on the wireframes at http://schematics.earthware.co.uk/hitcharide. Not...Inspiration.Web: Description: A simple (but entertaining) ASP.NET MVC (C#) project to suggest random code names for projects. Intended audience: People who ne...NetFileBrowser - TinyMCE: tinyMCE file plugin with asp.netOil Slick Live Feeds: All live feeds from BP's Remotely Operated VehiclesParticle Lexer: Parser and Tokenizer libraryPdf Form Tool: Pdf Form Tool demonstrates how the iTextSharp library could be used to fill PDF forms. The input data is provided as a csv file. The application ...Planning Poker Windows Mobile 7: This project is a Planning Poker application for Windows Mobile 7 (and later?). RandomRat: RandomRat is a program for generating random sets that meet specific criteriaScience.NET: A scientific library written in managed code. It supports advanced mathematics (algebra system, sequences, statistics, combinatorics...), data stru...Spider Compiler: Spider Compiler parses the input of a spider programming source file and compiles it (with help of csc.exe; the C#-Compiler) to an exe-file. This p...Sununpro: sunun's project for study by team foundation server.TFS Buddy: An application that manipulates your I-Buddy whenever something happens in your Team Foundation ServerValveSoft: ValveSysWiiMote Physics: WiiMote Physics is an application that allows you to retrieve data from your WiiMote or Balance Board and display it in real-time. It has a number...WinGet: WinGet is a download manager for Windows. You can drag links onto the WinGet Widget and it will download a file on the selected folder. It is dev...XProject.NET: A project management and team collaboration platformNew Releases.NET DiscUtils: Version 0.9 Preview: This release is still under development. New features available in this release: Support for accessing short file names stored in WIM files Incr...Active Worlds Dot Net Wrapper (Based on AwSdk): Active World Dot Net Wrapper (0.0.1.85): Based on AwSdk 85AwSdk UnOfficial Wrapper Howto Use: C# using AwWrapper; VB.Net Import AwWrapperAjaxControlToolkit additional extenders: ZhecheAjaxControls for .NET3.5: Used AJAX Control Toolkit Release Notes - April 12th 2010 Release Version 40412. Fixed deadlock in long operation canceling Some other fixesAnyCAD: AnyCAD.v1.2.ENU.Install: http://www.anycad.net Parametric Modeling *3D: Sphere, Box, Cylinder, Cone •2D: Line, Rectangle, Arc, Arch, Circle, Spline, Polygon •Feature: Extr...Community Forums NNTP bridge: Community Forums NNTP Bridge V29: Release of the Community Forums NNTP Bridge to access the social and anwsers MS forums with a single, open source NNTP bridge. This release has ad...Concurrent Cache: 1.0: This is the first release for the ConcurrentCache library.Configuration Section Designer: 2.0.0: This is the first Beta Release for VS 2010 supportDoxygen Browser Addin for VS: Doxygen Browser Addin - v0.1.4 Beta: Support for Visual Studio 2010 improved the logging of errors (Event Logs) Fixed some issues/bugs Hot key for navigation "Control + F1, Contr...Folder Bookmarks: Folder Bookmarks 1.6.2: The latest version of Folder Bookmarks (1.6.2), with new UI changes. Once you have extracted the file, do not delete any files/folders. They are n...HERB.IQ: Beta 0.1 Source code release 5: Beta 0.1 Source code release 5Inspiration.Web: Initial release (deployment package): Initial release (deployment package)NetFileBrowser - TinyMCE: Demo Project: Demo ProjectNetFileBrowser - TinyMCE: NetFileBrowser: NetImageBrowserNLog - Advanced .NET Logging: Nightly Build 2010.06.05.001: Changes since the last build:2010-06-04 23:29:42 Jarek Kowalski Massive update to documentation generator. 2010-05-28 15:41:42 Jarek Kowalski upda...Oil Slick Live Feeds: Oil Slick Live Feeds 0.1: A the first release, with feeds from the MS Skandi, Boa Deep C, Enterprise and Q4000. They are live streams from the ROV's monitoring the damaged...Pcap.Net: Pcap.Net 0.7.0 (46671): Pcap.Net - June 2010 Release Pcap.Net is a .NET wrapper for WinPcap written in C++/CLI and C#. It Features almost all WinPcap features and includes...sqwarea: Sqwarea 0.0.289.0 (alpha): API supportTFS Buddy: TFS Buddy First release (Beta 1): This is the first release of the TFS Buddy.Visual Studio DSite: Looping Animation (Visual C++ 2008): A solider firing a bullet that loops and displays an explosion everytime it hits the edge of the form.WiiMote Physics: WiiMote Physics v4.0: v4.0.0.1 Recovered from existing compiled assembly after hard drive failure Now requires .NET 4.0 (it seems to make it run faster) Added new c...WinGet: Alpha 1: First Alpha of WinGet. It includes all the planned features but it contains many bugs. Packaged using 7-Zip and ClickOnce.Most Popular ProjectsWBFS ManagerRawrAJAX Control ToolkitMicrosoft SQL Server Product Samples: DatabaseSilverlight ToolkitWindows Presentation Foundation (WPF)PHPExcelpatterns & practices – Enterprise LibraryMicrosoft SQL Server Community & SamplesASP.NETMost Active ProjectsCommunity Forums NNTP bridgeRawrpatterns & practices – Enterprise LibraryGMap.NET - Great Maps for Windows Forms & PresentationN2 CMSIonics Isapi Rewrite FilterStyleCopsmark C# LibraryFarseer Physics Enginepatterns & practices: Composite WPF and Silverlight

    Read the article

  • Giving a Bomberman AI intelligent bomb placement

    - by Paul Manta
    I'm trying to implement an AI algorithm for Bomberman. Currently I have a working but not very smart rudimentary implementation (the current AI is overzealous in placing bombs). This is the first AI I've ever tried implementing and I'm a bit stuck. The more sophisticated algorithms I have in mind (the ones that I expect to make better decisions) are too convoluted to be good solutions. What general tips do you have for implementing a Bomberman AI? Are there radically different approaches for making the bot either more defensive or offensive? Edit: Current algorithm My current algorithm goes something like this (pseudo-code): 1) Try to place a bomb and then find a cell that is safe from all the bombs, including the one that you just placed. To find that cell, iterate over the four directions; if you can find any safe divergent cell and reach it in time (eg. if the direction is up or down, look for a cell that is found to the left or right of this path), then it's safe to place a bomb and move in that direction. 2) If you can't find and safe divergent cells, try NOT placing a bomb and look again. This time you'll only need to look for a safe cell in only one direction (you don't have to diverge from it). 3) If you still can't find a safe cell, don't do anything. for $(direction) in (up, down, left, right): place bomb at current location if (can find and reach divergent safe cell in current $(direction)): bomb = true move = $(direction) return for $(direction) in (up, down, left, right): do not place bomb at current location if (any safe cell in the current $(direction)): bomb = false move = $(direction) return else: bomb = false move = stay_put This algorithm makes the bot very trigger-happy (it'll place bombs very frequently). It doesn't kill itself, but it does have a habit of making itself vulnerable by going into dead ends where it can be blocked and killed by the other players. Do you have any suggestions on how I might improve this algorithm? Or maybe I should try something completely different? One of the problems with this algorithm is that it tends to leave the bot with very few (frequently just one) safe cells on which it can stand. This is because the bot leaves a trail of bombs behind it, as long as it doesn't kill itself. However, leaving a trail of bombs behind leaves few places where you can hide. If one of the other players or bots decide to place a bomb somewhere near you, it often happens that you have no place to hide and you die. I need a better way to decide when to place bombs.

    Read the article

  • Use Autoruns to Manually Clean an Infected PC

    - by Mark Virtue
    There are many anti-malware programs out there that will clean your system of nasties, but what happens if you’re not able to use such a program?  Autoruns, from SysInternals (recently acquired by Microsoft), is indispensable when removing malware manually. There are a few reasons why you may need to remove viruses and spyware manually: Perhaps you can’t abide running resource-hungry and invasive anti-malware programs on your PC You might need to clean your mom’s computer (or someone else who doesn’t understand that a big flashing sign on a website that says “Your computer is infected with a virus – click HERE to remove it” is not a message that can necessarily be trusted) The malware is so aggressive that it resists all attempts to automatically remove it, or won’t even allow you to install anti-malware software Part of your geek credo is the belief that anti-spyware utilities are for wimps Autoruns is an invaluable addition to any geek’s software toolkit.  It allows you to track and control all programs (and program components) that start automatically with Windows (or with Internet Explorer).  Virtually all malware is designed to start automatically, so there’s a very strong chance that it can be detected and removed with the help of Autoruns. We have covered how to use Autoruns in an earlier article, which you should read if you need to first familiarize yourself with the program. Autoruns is a standalone utility that does not need to be installed on your computer.  It can be simply downloaded, unzipped and run (link below).  This makes is ideally suited for adding to your portable utility collection on your flash drive. When you start Autoruns for the first time on a computer, you are presented with the license agreement: After agreeing to the terms, the main Autoruns window opens, showing you the complete list of all software that will run when your computer starts, when you log in, or when you open Internet Explorer: To temporarily disable a program from launching, uncheck the box next to it’s entry.  Note:  This does not terminate the program if it is running at the time – it merely prevents it from starting next time.  To permanently prevent a program from launching, delete the entry altogether (use the Delete key, or right-click and choose Delete from the context-menu)).  Note:  This does not remove the program from your computer – to remove it completely you need to uninstall the program (or otherwise delete it from your hard disk). Suspicious Software It can take a fair bit of experience (read “trial and error”) to become adept at identifying what is malware and what is not.  Most of the entries presented in Autoruns are legitimate programs, even if their names are unfamiliar to you.  Here are some tips to help you differentiate the malware from the legitimate software: If an entry is digitally signed by a software publisher (i.e. there’s an entry in the Publisher column) or has a “Description”, then there’s a good chance that it’s legitimate If you recognize the software’s name, then it’s usually okay.  Note that occasionally malware will “impersonate” legitimate software, but adopting a name that’s identical or similar to software you’re familiar with (e.g. “AcrobatLauncher” or “PhotoshopBrowser”).  Also, be aware that many malware programs adopt generic or innocuous-sounding names, such as “Diskfix” or “SearchHelper” (both mentioned below). Malware entries usually appear on the Logon tab of Autoruns (but not always!) If you open up the folder that contains the EXE or DLL file (more on this below), an examine the “last modified” date, the dates are often from the last few days (assuming that your infection is fairly recent) Malware is often located in the C:\Windows folder or the C:\Windows\System32 folder Malware often only has a generic icon (to the left of the name of the entry) If in doubt, right-click the entry and select Search Online… The list below shows two suspicious looking entries:  Diskfix and SearchHelper These entries, highlighted above, are fairly typical of malware infections: They have neither descriptions nor publishers They have generic names The files are located in C:\Windows\System32 They have generic icons The filenames are random strings of characters If you look in the C:\Windows\System32 folder and locate the files, you’ll see that they are some of the most recently modified files in the folder (see below) Double-clicking on the items will take you to their corresponding registry keys: Removing the Malware Once you’ve identified the entries you believe to be suspicious, you now need to decide what you want to do with them.  Your choices include: Temporarily disable the Autorun entry Permanently delete the Autorun entry Locate the running process (using Task Manager or similar) and terminating it Delete the EXE or DLL file from your disk (or at least move it to a folder where it won’t be automatically started) or all of the above, depending upon how certain you are that the program is malware. To see if your changes succeeded, you will need to reboot your machine, and check any or all of the following: Autoruns – to see if the entry has returned Task Manager (or similar) – to see if the program was started again after the reboot Check the behavior that led you to believe that your PC was infected in the first place.  If it’s no longer happening, chances are that your PC is now clean Conclusion This solution isn’t for everyone and is most likely geared to advanced users. Usually using a quality Antivirus application does the trick, but if not Autoruns is a valuable tool in your Anti-Malware kit. Keep in mind that some malware is harder to remove than others.  Sometimes you need several iterations of the steps above, with each iteration requiring you to look more carefully at each Autorun entry.  Sometimes the instant that you remove the Autorun entry, the malware that is running replaces the entry.  When this happens, we need to become more aggressive in our assassination of the malware, including terminating programs (even legitimate programs like Explorer.exe) that are infected with malware DLLs. Shortly we will be publishing an article on how to identify, locate and terminate processes that represent legitimate programs but are running infected DLLs, in order that those DLLs can be deleted from the system. Download Autoruns from SysInternals Similar Articles Productive Geek Tips Using Autoruns Tool to Track Startup Applications and Add-onsHow To Get Detailed Information About Your PCSUPERAntiSpyware Portable is the Must-Have Spyware Removal Tool You NeedQuick Tip: Windows Vista Temp Files DirectoryClear Recent Commands From the Run Dialog in Windows XP TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Revo Uninstaller Pro Registry Mechanic 9 for Windows PC Tools Internet Security Suite 2010 PCmover Professional 15 Great Illustrations by Chow Hon Lam Easily Sync Files & Folders with Friends & Family Amazon Free Kindle for PC Download Stretch popurls.com with a Stylish Script (Firefox) OldTvShows.org – Find episodes of Hitchcock, Soaps, Game Shows and more Download Microsoft Office Help tab

    Read the article

  • C#/.NET Little Wonders: The Timeout static class

    - by James Michael Hare
    Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. When I started the “Little Wonders” series, I really wanted to pay homage to parts of the .NET Framework that are often small but can help in big ways.  The item I have to discuss today really is a very small item in the .NET BCL, but once again I feel it can help make the intention of code much clearer and thus is worthy of note. The Problem - Magic numbers aren’t very readable or maintainable In my first Little Wonders Post (Five Little Wonders That Make Code Better) I mention the TimeSpan factory methods which, I feel, really help the readability of constructed TimeSpan instances. Just to quickly recap that discussion, ask yourself what the TimeSpan specified in each case below is 1: // Five minutes? Five Seconds? 2: var fiveWhat1 = new TimeSpan(0, 0, 5); 3: var fiveWhat2 = new TimeSpan(0, 0, 5, 0); 4: var fiveWhat3 = new TimeSpan(0, 0, 5, 0, 0); You’d think they’d all be the same unit of time, right?  After all, most overloads tend to tack additional arguments on the end.  But this is not the case with TimeSpan, where the constructor forms are:     TimeSpan(int hours, int minutes, int seconds);     TimeSpan(int days, int hours, int minutes, int seconds);     TimeSpan(int days, int hours, int minutes, int seconds, int milliseconds); Notice how in the 4 and 5 parameter version we suddenly have the parameter days slipping in front of hours?  This can make reading constructors like those above much harder.  Fortunately, there are TimeSpan factory methods to help make your intention crystal clear: 1: // Ah! Much clearer! 2: var fiveSeconds = TimeSpan.FromSeconds(5); These are great because they remove all ambiguity from the reader!  So in short, magic numbers in constructors and methods can be ambiguous, and anything we can do to clean up the intention of the developer will make the code much easier to read and maintain. Timeout – Readable identifiers for infinite timeout values In a similar way to TimeSpan, let’s consider specifying timeouts for some of .NET’s (or our own) many methods that allow you to specify timeout periods. For example, in the TPL Task class, there is a family of Wait() methods that can take TimeSpan or int for timeouts.  Typically, if you want to specify an infinite timeout, you’d just call the version that doesn’t take a timeout parameter at all: 1: myTask.Wait(); // infinite wait But there are versions that take the int or TimeSpan for timeout as well: 1: // Wait for 100 ms 2: myTask.Wait(100); 3:  4: // Wait for 5 seconds 5: myTask.Wait(TimeSpan.FromSeconds(5); Now, if we want to specify an infinite timeout to wait on the Task, we could pass –1 (or a TimeSpan set to –1 ms), which what the .NET BCL methods with timeouts use to represent an infinite timeout: 1: // Also infinite timeouts, but harder to read/maintain 2: myTask.Wait(-1); 3: myTask.Wait(TimeSpan.FromMilliseconds(-1)); However, these are not as readable or maintainable.  If you were writing this code, you might make the mistake of thinking 0 or int.MaxValue was an infinite timeout, and you’d be incorrect.  Also, reading the code above it isn’t as clear that –1 is infinite unless you happen to know that is the specified behavior. To make the code like this easier to read and maintain, there is a static class called Timeout in the System.Threading namespace which contains definition for infinite timeouts specified as both int and TimeSpan forms: Timeout.Infinite An integer constant with a value of –1 Timeout.InfiniteTimeSpan A static readonly TimeSpan which represents –1 ms (only available in .NET 4.5+) This makes our calls to Task.Wait() (or any other calls with timeouts) much more clear: 1: // intention to wait indefinitely is quite clear now 2: myTask.Wait(Timeout.Infinite); 3: myTask.Wait(Timeout.InfiniteTimeSpan); But wait, you may say, why would we care at all?  Why not use the version of Wait() that takes no arguments?  Good question!  When you’re directly calling the method with an infinite timeout that’s what you’d most likely do, but what if you are just passing along a timeout specified by a caller from higher up?  Or perhaps storing a timeout value from a configuration file, and want to default it to infinite? For example, perhaps you are designing a communications module and want to be able to shutdown gracefully, but if you can’t gracefully finish in a specified amount of time you want to force the connection closed.  You could create a Shutdown() method in your class, and take a TimeSpan or an int for the amount of time to wait for a clean shutdown – perhaps waiting for client to acknowledge – before terminating the connection.  So, assume we had a pub/sub system with a class to broadcast messages: 1: // Some class to broadcast messages to connected clients 2: public class Broadcaster 3: { 4: // ... 5:  6: // Shutdown connection to clients, wait for ack back from clients 7: // until all acks received or timeout, whichever happens first 8: public void Shutdown(int timeout) 9: { 10: // Kick off a task here to send shutdown request to clients and wait 11: // for the task to finish below for the specified time... 12:  13: if (!shutdownTask.Wait(timeout)) 14: { 15: // If Wait() returns false, we timed out and task 16: // did not join in time. 17: } 18: } 19: } We could even add an overload to allow us to use TimeSpan instead of int, to give our callers the flexibility to specify timeouts either way: 1: // overload to allow them to specify Timeout in TimeSpan, would 2: // just call the int version passing in the TotalMilliseconds... 3: public void Shutdown(TimeSpan timeout) 4: { 5: Shutdown(timeout.TotalMilliseconds); 6: } Notice in case of this class, we don’t assume the caller wants infinite timeouts, we choose to rely on them to tell us how long to wait.  So now, if they choose an infinite timeout, they could use the –1, which is more cryptic, or use Timeout class to make the intention clear: 1: // shutdown the broadcaster, waiting until all clients ack back 2: // without timing out. 3: myBroadcaster.Shutdown(Timeout.Infinite); We could even add a default argument using the int parameter version so that specifying no arguments to Shutdown() assumes an infinite timeout: 1: // Modified original Shutdown() method to add a default of 2: // Timeout.Infinite, works because Timeout.Infinite is a compile 3: // time constant. 4: public void Shutdown(int timeout = Timeout.Infinite) 5: { 6: // same code as before 7: } Note that you can’t default the ShutDown(TimeSpan) overload with Timeout.InfiniteTimeSpan since it is not a compile-time constant.  The only acceptable default for a TimeSpan parameter would be default(TimeSpan) which is zero milliseconds, which specified no wait, not infinite wait. Summary While Timeout.Infinite and Timeout.InfiniteTimeSpan are not earth-shattering classes in terms of functionality, they do give you very handy and readable constant values that you can use in your programs to help increase readability and maintainability when specifying infinite timeouts for various timeouts in the BCL and your own applications. Technorati Tags: C#,CSharp,.NET,Little Wonders,Timeout,Task

    Read the article

  • Building KPIs to monitor your business Its not really about the Technology

    When I have discussions with people about Business Intelligence, one of the questions the inevitably come up is about building KPIs and how to accomplish that. From a technical level the concept of a KPI is very simple, almost too simple in that it is like the tip of an iceberg floating above the water. The key to that iceberg is not really the tip, but the mass of the iceberg that is hidden beneath the surface upon which the tip sits. The analogy of the iceberg is not meant to indicate that the foundation of the KPI is overly difficult or complex. The disparity in size in meant to indicate that the larger thing that needs to be defined is not the technical tip, but the underlying business definition of what the KPI means. From a technical perspective the KPI consists of primarily the following items: Actual Value This is the actual value data point that is being measured. An example would be something like the amount of sales. Target Value This is the target goal for the KPI. This is a number that can be measured against Actual Value. An example would be $10,000 in monthly sales. Target Indicator Range This is the definition of ranges that define what type of indicator the user will see comparing the Actual Value to the Target Value. Most often this is defined by stoplight, but can be any indicator that is going to show a status in a quick fashion to the user. Typically this would be something like: Red Light = Actual Value more than 5% below target; Yellow Light = Within 5% of target either direction; Green Light = More than 5% higher than Target Value Status\Trend Indicator This is an optional attribute of a KPI that is typically used to show some kind of trend. The vast majority of these indicators are used to show some type of progress against a previous period. As an example, the status indicator might be used to show how the monthly sales compare to last month. With this type of indicator there needs to be not only a definition of what the ranges are for your status indictor, but then also what value the number needs to be compared against. So now we have an idea of what data points a KPI consists of from a technical perspective lets talk a bit about tools. As you can see technically there is not a whole lot to them and the choice of technology is not as important as the definition of the KPIs, which we will get to in a minute. There are many different types of tools in the Microsoft BI stack that you can use to expose your KPI to the business. These include Performance Point, SharePoint, Excel, and SQL Reporting Services. There are pluses and minuses to each technology and the right technology is based a lot on your goals and how you want to deliver the information to the users. Additionally, there are other non-Microsoft tools that can be used to expose KPI indicators to your business users. Regardless of the technology used as your front end, the heavy lifting of KPI is in the business definition of the values and benchmarks for that KPI. The discussion about KPIs is very dependent on the history of an organization and how much they are exposed to the attributes of a KPI. Often times when discussing KPIs with a business contact who has not been exposed to KPIs the discussion tends to also be a session educating the business user about what a KPI is and what goes into the definition of a KPI. The majority of times the business user has an idea of what their actual values are and they have been tracking those numbers for some time, generally in Excel and all manually. So they will know the amount of sales last month along with sales two years ago in the same month. Where the conversation tends to get stuck is when you start discussing what the target value should be. The actual value is answering the What and How much questions. When you are talking about the Target values you are asking the question Is this number good or bad. Typically, the user will know whether or not the value is good or bad, but most of the time they are not able to quantify what is good or bad. Their response is usually something like I just know. Because they have been watching the sales quantity for years now, they can tell you that a 5% decrease in sales this month might actually be a good thing, maybe because the salespeople are all waiting until next month when the new versions come out. It can sometimes be very hard to break the business people of this habit. One of the fears generally is that the status indicator is not subjective. Thus, in the scenario above, the business user is going to be fearful that their boss, just looking at a negative red indicator, is going to haul them out to the woodshed for a bad month. But, on the flip side, if all you are displaying is the amount of sales, only a person with knowledge of last month sales and the target amount for this month would have any idea if $10,000 in sales is good or not. Here is where a key point about KPIs needs to be communicated to both the business user and any user who might be viewing the results of that KPI. The KPI is just one tool that is used to report on business performance. The KPI is meant as a quick indicator of one business statistic. It is not meant to tell the entire story. It does not answer the question Why. Its primary purpose is to objectively and quickly expose an area of the business that might warrant more review. There is always going to be the need to do further analysis on any potential negative or neutral KPI. So, hopefully, once you have convinced your business user to come up with some target numbers and ranges for status indicators, you then need to take the next step and help them answer the Why question. The main question here to ask is, Okay, you see the indicator and you need to discover why the number is what is, where do you go?. The answer is usually a combination of sources. A sales manager might have some of the following items at their disposal (Marketing report showing a decrease in the promotional discounts for the month, Pricing Report showing the reduction of prices of older models, an Inventory Report showing the discontinuation of a particular product line, or a memo showing the ending of a large affiliate partnership. The answers to the question Why are never as simple as a single indicator value. Bring able to quickly get to this information is all about designing how a user accesses the KPIs and then also how easily they can get to the additional information they need. This is where a Dashboard mentality can come in handy. For example, the business user can have a dashboard that shows their KPIs, but also has links to some of the common reports that they run regarding Sales Data. The users boss may have the same KPIs on their dashboard, but instead of links to individual reports they are going to have a link to a status report that was created by the user that pulls together all the data about the KPI in a summary format the users boss can review. So some of the key things to think about when building or evaluating KPIs for your organization: Technology should not be the driving factor KPIs are of little value without some indicator for whether a value is good, bad or neutral. KPIs only give an answer to the Is this number good\bad? question Make sure the ability to drill into the Why of a KPI is close at hand and relevant to the user who is viewing the KPI. The KPI is a key business tool when defined properly to help monitor business performance across the enterprise in an objective and consistent manner. At times it might feel like the process of defining the business aspects of a KPI can sometimes be arduous, the payoff in the end can far outweigh the costs. Some of the benefits of going through this process are a better understanding of the key metrics for an organization and the measure of those metrics and a consistent snapshot of business performance that can be utilized across the organization. And I think that these are benefits to any organization regardless of the technology or the implementation.Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Using MVP, how to create a view from another view, linked with the same model object

    - by Dinaiz
    Background We use the Model-View-Presenter design pattern along with the abstract factory pattern and the "signal/slot" pattern in our application, to fullfill 2 main requirements Enhance testability (very lightweight GUI, every action can be simulated in unit tests) Make the "view" totally independant from the rest, so we can change the actual view implementation, without changing anything else In order to do so our code is divided in 4 layers : Core : which holds the model Presenter : which manages interactions between the view interfaces (see bellow) and the core View Interfaces : they define the signals and slots for a View, but not the implementation Views : the actual implementation of the views When the presenter creates or deals with views, it uses an abstract factory and only knows about the view interfaces. It does the signal/slot binding between views interfaces. It doesn't care about the actual implementation. In the "views" layer, we have a concrete factory which deals with implementations. The signal/slot mechanism is implemented using a custom framework built upon boost::function. Really, what we have is something like that : http://martinfowler.com/eaaDev/PassiveScreen.html Everything works fine. The problem However, there's a problem I don't know how to solve. Let's take for example a very simple drag and drop example. I have two ContainersViews (ContainerView1, ContainerView2). ContainerView1 has an ItemView1. I drag the ItemView1 from ContainerView1 to ContainerView2. ContainerView2 must create an ItemView2, of a different type, but which "points" to the same model object as ItemView1. So the ContainerView2 gets a callback called for the drop action with ItemView1 as a parameter. It calls ContainerPresenterB passing it ItemViewB In this case we are only dealing with views. In MVP-PV, views aren't supposed to know anything about the presenter nor the model, right ? How can I create the ItemView2 from the ItemView1, not knowing which model object is ItemView1 representing ? I thought about adding an "itemId" to every view, this id being the id of the core object the view represents. So in pseudo code, ContainerPresenter2 would do something like itemView2=abstractWidgetFactory.createItemView2(); this.add(itemView2,itemView1.getCoreObjectId()) I don't get too much into details. That just work. The problem I have here is that those itemIds are just like pointers. And pointers can be dangling. Imagine that by mistake, I delete itemView1, and this deletes coreObject1. The itemView2 will have a coreObjectId which represents an invalid coreObject. Isn't there a more elegant and "bulletproof" solution ? Even though I never did ObjectiveC or macOSX programming, I couldn't help but notice that our framework is very similar to Cocoa framework. How do they deal with this kind of problem ? Couldn't find more in-depth information about that on google. If someone could shed some light on this. I hope this question isn't too confusing ...

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

  • How To Approach 360 Degree Snake

    - by Austin Brunkhorst
    I've recently gotten into XNA and must say I love it. As sort of a hello world game I decided to create the classic game "Snake". The 90 degree version was very simple and easy to implement. But as I try to make a version of it that allows 360 degree rotation using left and right arrows, I've come into sort of a problem. What i'm doing now stems from the 90 degree version: Iterating through each snake body part beginning at the tail, and ending right before the head. This works great when moving every 100 milliseconds. The problem with this is that it makes for a choppy style of gameplay as technically the game progresses at only 6 fps rather than it's potential 60. I would like to move the snake every game loop. But unfortunately because the snake moves at the rate of it's head's size it goes way too fast. This would mean that the head would need to move at a much smaller increment such as (2, 2) in it's direction rather than what I have now (32, 32). Because I've been working on this game off and on for a couple of weeks while managing school I think that I've been thinking too hard on how to accomplish this. It's probably a simple solution, i'm just not catching it. Here's some pseudo code for what I've tried based off of what makes sense to me. I can't really think of another way to do it. for(int i = SnakeLength - 1; i > 0; i--){ current = SnakePart[i], next = SnakePart[i - 1]; current.x = next.x - (current.width * cos(next.angle)); current.y = next.y - (current.height * sin(next.angle)); current.angle = next.angle; } SnakeHead.x += cos(SnakeAngle) * SnakeSpeed; SnakeHead.y += sin(SnakeAngle) * SnakeSpeed; This produces something like this: Code in Action. As you can see each part always stays behind the head and doesn't make a "Trail" effect. A perfect example of what i'm going for can be found here: Data Worm. Not the viewport rotation but the trailing effect of the triangles. Thanks for any help!

    Read the article

  • SortedDictionary and SortedList

    - by Simon Cooper
    Apart from Dictionary<TKey, TValue>, there's two other dictionaries in the BCL - SortedDictionary<TKey, TValue> and SortedList<TKey, TValue>. On the face of it, these two classes do the same thing - provide an IDictionary<TKey, TValue> interface where the iterator returns the items sorted by the key. So what's the difference between them, and when should you use one rather than the other? (as in my previous post, I'll assume you have some basic algorithm & datastructure knowledge) SortedDictionary We'll first cover SortedDictionary. This is implemented as a special sort of binary tree called a red-black tree. Essentially, it's a binary tree that uses various constraints on how the nodes of the tree can be arranged to ensure the tree is always roughly balanced (for more gory algorithmical details, see the wikipedia link above). What I'm concerned about in this post is how the .NET SortedDictionary is actually implemented. In .NET 4, behind the scenes, the actual implementation of the tree is delegated to a SortedSet<KeyValuePair<TKey, TValue>>. One example tree might look like this: Each node in the above tree is stored as a separate SortedSet<T>.Node object (remember, in a SortedDictionary, T is instantiated to KeyValuePair<TKey, TValue>): class Node { public bool IsRed; public T Item; public SortedSet<T>.Node Left; public SortedSet<T>.Node Right; } The SortedSet only stores a reference to the root node; all the data in the tree is accessed by traversing the Left and Right node references until you reach the node you're looking for. Each individual node can be physically stored anywhere in memory; what's important is the relationship between the nodes. This is also why there is no constructor to SortedDictionary or SortedSet that takes an integer representing the capacity; there are no internal arrays that need to be created and resized. This may seen trivial, but it's an important distinction between SortedDictionary and SortedList that I'll cover later on. And that's pretty much it; it's a standard red-black tree. Plenty of webpages and datastructure books cover the algorithms behind the tree itself far better than I could. What's interesting is the comparions between SortedDictionary and SortedList, which I'll cover at the end. As a side point, SortedDictionary has existed in the BCL ever since .NET 2. That means that, all through .NET 2, 3, and 3.5, there has been a bona-fide sorted set class in the BCL (called TreeSet). However, it was internal, so it couldn't be used outside System.dll. Only in .NET 4 was this class exposed as SortedSet. SortedList Whereas SortedDictionary didn't use any backing arrays, SortedList does. It is implemented just as the name suggests; two arrays, one containing the keys, and one the values (I've just used random letters for the values): The items in the keys array are always guarenteed to be stored in sorted order, and the value corresponding to each key is stored in the same index as the key in the values array. In this example, the value for key item 5 is 'z', and for key item 8 is 'm'. Whenever an item is inserted or removed from the SortedList, a binary search is run on the keys array to find the correct index, then all the items in the arrays are shifted to accomodate the new or removed item. For example, if the key 3 was removed, a binary search would be run to find the array index the item was at, then everything above that index would be moved down by one: and then if the key/value pair {7, 'f'} was added, a binary search would be run on the keys to find the index to insert the new item, and everything above that index would be moved up to accomodate the new item: If another item was then added, both arrays would be resized (to a length of 10) before the new item was added to the arrays. As you can see, any insertions or removals in the middle of the list require a proportion of the array contents to be moved; an O(n) operation. However, if the insertion or removal is at the end of the array (ie the largest key), then it's only O(log n); the cost of the binary search to determine it does actually need to be added to the end (excluding the occasional O(n) cost of resizing the arrays to fit more items). As a side effect of using backing arrays, SortedList offers IList Keys and Values views that simply use the backing keys or values arrays, as well as various methods utilising the array index of stored items, which SortedDictionary does not (and cannot) offer. The Comparison So, when should you use one and not the other? Well, here's the important differences: Memory usage SortedDictionary and SortedList have got very different memory profiles. SortedDictionary... has a memory overhead of one object instance, a bool, and two references per item. On 64-bit systems, this adds up to ~40 bytes, not including the stored item and the reference to it from the Node object. stores the items in separate objects that can be spread all over the heap. This helps to keep memory fragmentation low, as the individual node objects can be allocated wherever there's a spare 60 bytes. In contrast, SortedList... has no additional overhead per item (only the reference to it in the array entries), however the backing arrays can be significantly larger than you need; every time the arrays are resized they double in size. That means that if you add 513 items to a SortedList, the backing arrays will each have a length of 1024. To conteract this, the TrimExcess method resizes the arrays back down to the actual size needed, or you can simply assign list.Capacity = list.Count. stores its items in a continuous block in memory. If the list stores thousands of items, this can cause significant problems with Large Object Heap memory fragmentation as the array resizes, which SortedDictionary doesn't have. Performance Operations on a SortedDictionary always have O(log n) performance, regardless of where in the collection you're adding or removing items. In contrast, SortedList has O(n) performance when you're altering the middle of the collection. If you're adding or removing from the end (ie the largest item), then performance is O(log n), same as SortedDictionary (in practice, it will likely be slightly faster, due to the array items all being in the same area in memory, also called locality of reference). So, when should you use one and not the other? As always with these sort of things, there are no hard-and-fast rules. But generally, if you: need to access items using their index within the collection are populating the dictionary all at once from sorted data aren't adding or removing keys once it's populated then use a SortedList. But if you: don't know how many items are going to be in the dictionary are populating the dictionary from random, unsorted data are adding & removing items randomly then use a SortedDictionary. The default (again, there's no definite rules on these sort of things!) should be to use SortedDictionary, unless there's a good reason to use SortedList, due to the bad performance of SortedList when altering the middle of the collection.

    Read the article

  • SQL SERVER – Weekly Series – Memory Lane – #050

    - by Pinal Dave
    Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Executing Remote Stored Procedure – Calling Stored Procedure on Linked Server In this example we see two different methods of how to call Stored Procedures remotely.  Connection Property of SQL Server Management Studio SSMS A very simple example of the how to build connection properties for SQL Server with the help of SSMS. Sample Example of RANKING Functions – ROW_NUMBER, RANK, DENSE_RANK, NTILE SQL Server has a total of 4 ranking functions. Ranking functions return a ranking value for each row in a partition. All the ranking functions are non-deterministic. T-SQL Script to Add Clustered Primary Key Jr. DBA asked me three times in a day, how to create Clustered Primary Key. I gave him following sample example. That was the last time he asked “How to create Clustered Primary Key to table?” 2008 2008 – TRIM() Function – User Defined Function SQL Server does not have functions which can trim leading or trailing spaces of any string at the same time. SQL does have LTRIM() and RTRIM() which can trim leading and trailing spaces respectively. SQL Server 2008 also does not have TRIM() function. User can easily use LTRIM() and RTRIM() together and simulate TRIM() functionality. http://www.youtube.com/watch?v=1-hhApy6MHM 2009 Earlier I have written two different articles on the subject Remove Bookmark Lookup. This article is as part 3 of original article. Please read the first two articles here before continuing reading this article. Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup – Part 2 Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup – Part 3 Interesting Observation – Query Hint – FORCE ORDER SQL Server never stops to amaze me. As regular readers of this blog already know that besides conducting corporate training, I work on large-scale projects on query optimizations and server tuning projects. In one of the recent projects, I have noticed that a Junior Database Developer used the query hint Force Order; when I asked for details, I found out that the basic concept was not properly understood by him. Queries Waiting for Memory Allocation to Execute In one of the recent projects, I was asked to create a report of queries that are waiting for memory allocation. The reason was that we were doubtful regarding whether the memory was sufficient for the application. The following query can be useful in similar cases. Queries that do not have to wait on a memory grant will not appear in the result set of following query. 2010 Quickest Way to Identify Blocking Query and Resolution – Dirty Solution As the title suggests, this is quite a dirty solution; it’s not as elegant as you expect. However, it works totally fine. Simple Explanation of Data Type Precedence While I was working on creating a question for SQL SERVER – SQL Quiz – The View, The Table and The Clustered Index Confusion, I had actually created yet another question along with this question. However, I felt that the one which is posted on the SQL Quiz is much better than this one because what makes that more challenging question is that it has a multiple answer. Encrypted Stored Procedure and Activity Monitor I recently had received questionable if any stored procedure is encrypted can we see its definition in Activity Monitor.Answer is - No. Let us do a quick test. Let us create following Stored Procedure and then launch the Activity Monitor and check the text. Indexed View always Use Index on Table A single table can have maximum 249 non clustered indexes and 1 clustered index. In SQL Server 2008, a single table can have maximum 999 non clustered indexes and 1 clustered index. It is widely believed that a table can have only 1 clustered index, and this belief is true. I have some questions for all of you. Let us assume that I am creating view from the table itself and then create a clustered index on it. In my view, I am selecting the complete table itself. 2011 Detecting Database Case Sensitive Property using fn_helpcollations() I received a question on how to determine the case sensitivity of the database. The quick answer to this is to identify the collation of the database and check the properties of the collation. I have previously written how one can identify database collation. Once you have figured out the collation of the database, you can put that in the WHERE condition of the following T-SQL and then check the case sensitivity from the description. Server Side Paging in SQL Server CE (Compact Edition) SQL Server Denali is coming up with new T-SQL of Paging. I have written about the same earlier.SQL SERVER – Server Side Paging in SQL Server Denali – A Better Alternative,  SQL SERVER – Server Side Paging in SQL Server Denali Performance Comparison, SQL SERVER – Server Side Paging in SQL Server Denali – Part2 What is very interesting is that SQL Server CE 4.0 have the same feature introduced. Here is the quick example of the same. To run the script in the example, you will have to do installWebmatrix 4.0 and download sample database. Once done you can run following script. Why I am Going to Attend PASS Summit Unite 2011 The four-day event will be marked by a lot of learning, sharing, and networking, which will help me increase both my knowledge and contacts. Every year, PASS Summit provides me a golden opportunity to build my network as well as to identify and meet potential customers or employees. 2012 Manage Help Settings – CTRL + ALT + F1 This is very interesting read as my daughter once accidently came across a screen in SQL Server Management Studio. It took me 2-3 minutes to figure out how she has created the same screen. Recover the Accidentally Renamed Table “I accidentally renamed table in my SSMS. I was scrolling very fast and I made mistakes. It was either because I double clicked or clicked on F2 (shortcut key for renaming). However, I have made the mistake and now I have no idea how to fix this. If you have renamed the table, I think you pretty much is out of luck. Here are few things which you can do which can give you an idea about what your table name can be if you are lucky. Identify Numbers of Non Clustered Index on Tables for Entire Database Here is the script which will give you numbers of non clustered indexes on any table in entire database. Identify Most Resource Intensive Queries – SQL in Sixty Seconds #029 – Video Here is the complete complete script which I have used in the SQL in Sixty Seconds Video. Thanks Harsh for important Tip in the comment. http://www.youtube.com/watch?v=3kDHC_Tjrns Advanced Data Quality Services with Melissa Data – Azure Data Market For the purposes of the review, I used a database I had in an Excel spreadsheet with name and address information. Upon a cursory inspection, there are miscellaneous problems with these records; some addresses are missing ZIP codes, others missing a city, and some records are slightly misspelled or have unparsed suites. With DQS, I can easily add a knowledge base to help standardize my values, such as for state abbreviations. But how do I know that my address is correct? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • How to design a scalable notification system?

    - by Trent
    I need to write a notification system manager. Here is my requirements: I need to be able to send a Notification on different platforms, which may be totally different (for exemple, I need to be able to send either an SMS or an E-mail). Sometimes the notification may be the same for all recipients for a given platform, but sometimes it may be a notification per recipients (or several) per platform. Each notification can contain platform specific payload (for exemple an MMS can contains a sound or an image). The system need to be scalable, I need to be able to send a very large amount of notification without crashing either the application or the server. It is a two step process, first a customer may type a message and choose a platform to send to, and the notification(s) should be created to be processed either real-time either later. Then the system needs to send the notification to the platform provider. For now, I end up with some though but I don't know how scalable it will be or if it is a good design. I've though of the following objects (in a pseudo language): a generic Notification object: class Notification { String $message; Payload $payload; Collection<Recipient> $recipients; } The problem with the following objects is what if I've 1.000.000 recipients ? Even if the Recipient object is very small, it'll take too much memory. I could also create one Notification per recipient, but some platform providers requires me to send it in batch, meaning I need to define one Notification with several Recipients. Each created notification could be stored in a persistent storage like a DB or Redis. Would it be a good it to aggregate this later to make sure it is scalable? On the second step, I need to process this notification. But how could I distinguish the notification to the right platform provider? Should I use an object like MMSNotification extending an abstract Notification? or something like Notification.setType('MMS')? To allow to process a lot of notification at the same time, I think a messaging queue system like RabbitMQ may be the right tool. Is it? It would allow me to queue a lot of notification and have several worker to pop notification and process them. But what if I need to batch the recipients as seen above? Then I imagine a NotificationProcessor object for which I could I add NotificationHandler each NotificationHandler would be in charge to connect the platform provider and perform notification. I can also use an EventManager to allow pluggable behavior. Any feedbacks or ideas? Thanks for giving your time. Note: I'm used to work in PHP and it is likely the language of my choice.

    Read the article

  • PostSharp, Obfuscation, and IL

    - by Simon Cooper
    Aspect-oriented programming (AOP) is a relatively new programming paradigm. Originating at Xerox PARC in 1994, the paradigm was first made available for general-purpose development as an extension to Java in 2001. From there, it has quickly been adapted for use in all the common languages used today. In the .NET world, one of the primary AOP toolkits is PostSharp. Attributes and AOP Normally, attributes in .NET are entirely a metadata construct. Apart from a few special attributes in the .NET framework, they have no effect whatsoever on how a class or method executes within the CLR. Only by using reflection at runtime can you access any attributes declared on a type or type member. PostSharp changes this. By declaring a custom attribute that derives from PostSharp.Aspects.Aspect, applying it to types and type members, and running the resulting assembly through the PostSharp postprocessor, you can essentially declare 'clever' attributes that change the behaviour of whatever the aspect has been applied to at runtime. A simple example of this is logging. By declaring a TraceAttribute that derives from OnMethodBoundaryAspect, you can automatically log when a method has been executed: public class TraceAttribute : PostSharp.Aspects.OnMethodBoundaryAspect { public override void OnEntry(MethodExecutionArgs args) { MethodBase method = args.Method; System.Diagnostics.Trace.WriteLine( String.Format( "Entering {0}.{1}.", method.DeclaringType.FullName, method.Name)); } public override void OnExit(MethodExecutionArgs args) { MethodBase method = args.Method; System.Diagnostics.Trace.WriteLine( String.Format( "Leaving {0}.{1}.", method.DeclaringType.FullName, method.Name)); } } [Trace] public void MethodToLog() { ... } Now, whenever MethodToLog is executed, the aspect will automatically log entry and exit, without having to add the logging code to MethodToLog itself. PostSharp Performance Now this does introduce a performance overhead - as you can see, the aspect allows access to the MethodBase of the method the aspect has been applied to. If you were limited to C#, you would be forced to retrieve each MethodBase instance using Type.GetMethod(), matching on the method name and signature. This is slow. Fortunately, PostSharp is not limited to C#. It can use any instruction available in IL. And in IL, you can do some very neat things. Ldtoken C# allows you to get the Type object corresponding to a specific type name using the typeof operator: Type t = typeof(Random); The C# compiler compiles this operator to the following IL: ldtoken [mscorlib]System.Random call class [mscorlib]System.Type [mscorlib]System.Type::GetTypeFromHandle( valuetype [mscorlib]System.RuntimeTypeHandle) The ldtoken instruction obtains a special handle to a type called a RuntimeTypeHandle, and from that, the Type object can be obtained using GetTypeFromHandle. These are both relatively fast operations - no string lookup is required, only direct assembly and CLR constructs are used. However, a little-known feature is that ldtoken is not just limited to types; it can also get information on methods and fields, encapsulated in a RuntimeMethodHandle or RuntimeFieldHandle: // get a MethodBase for String.EndsWith(string) ldtoken method instance bool [mscorlib]System.String::EndsWith(string) call class [mscorlib]System.Reflection.MethodBase [mscorlib]System.Reflection.MethodBase::GetMethodFromHandle( valuetype [mscorlib]System.RuntimeMethodHandle) // get a FieldInfo for the String.Empty field ldtoken field string [mscorlib]System.String::Empty call class [mscorlib]System.Reflection.FieldInfo [mscorlib]System.Reflection.FieldInfo::GetFieldFromHandle( valuetype [mscorlib]System.RuntimeFieldHandle) These usages of ldtoken aren't usable from C# or VB, and aren't likely to be added anytime soon (Eric Lippert's done a blog post on the possibility of adding infoof, methodof or fieldof operators to C#). However, PostSharp deals directly with IL, and so can use ldtoken to get MethodBase objects quickly and cheaply, without having to resort to string lookups. The kicker However, there are problems. Because ldtoken for methods or fields isn't accessible from C# or VB, it hasn't been as well-tested as ldtoken for types. This has resulted in various obscure bugs in most versions of the CLR when dealing with ldtoken and methods, and specifically, generic methods and methods of generic types. This means that PostSharp was behaving incorrectly, or just plain crashing, when aspects were applied to methods that were generic in some way. So, PostSharp has to work around this. Without using the metadata tokens directly, the only way to get the MethodBase of generic methods is to use reflection: Type.GetMethod(), passing in the method name as a string along with information on the signature. Now, this works fine. It's slower than using ldtoken directly, but it works, and this only has to be done for generic methods. Unfortunately, this poses problems when the assembly is obfuscated. PostSharp and Obfuscation When using ldtoken, obfuscators don't affect how PostSharp operates. Because the ldtoken instruction directly references the type, method or field within the assembly, it is unaffected if the name of the object is changed by an obfuscator. However, the indirect loading used for generic methods was breaking, because that uses the name of the method when the assembly is put through the PostSharp postprocessor to lookup the MethodBase at runtime. If the name then changes, PostSharp can't find it anymore, and the assembly breaks. So, PostSharp needs to know about any changes an obfuscator does to an assembly. The way PostSharp does this is by adding another layer of indirection. When PostSharp obfuscation support is enabled, it includes an extra 'name table' resource in the assembly, consisting of a series of method & type names. When PostSharp needs to lookup a method using reflection, instead of encoding the method name directly, it looks up the method name at a fixed offset inside that name table: MethodBase genericMethod = typeof(ContainingClass).GetMethod(GetNameAtIndex(22)); PostSharp.NameTable resource: ... 20: get_Prop1 21: set_Prop1 22: DoFoo 23: GetWibble When the assembly is later processed by an obfuscator, the obfuscator can replace all the method and type names within the name table with their new name. That way, the reflection lookups performed by PostSharp will now use the new names, and everything will work as expected: MethodBase genericMethod = typeof(#kGy).GetMethod(GetNameAtIndex(22)); PostSharp.NameTable resource: ... 20: #kkA 21: #zAb 22: #EF5a 23: #2tg As you can see, this requires direct support by an obfuscator in order to perform these rewrites. Dotfuscator supports it, and now, starting with SmartAssembly 6.6.4, SmartAssembly does too. So, a relatively simple solution to a tricky problem, with some CLR bugs thrown in for good measure. You don't see those every day!

    Read the article

  • It's called College.

    - by jeffreyabecker
    Today I saw yet another 'GUID vs int as your primary key' article. Like most of the ones I've read this was filled with technical misrepresentations and out-right fallices. Chef's famous line that "There's a time and a place for everything children" applies here. GUIDs have distinct advantages and disadvantages which should be considered when choosing a data type for the primary key. Fallacy 1: "Its easier" An integer data type(tinyint, smallint, int, bigint) is a better artifical key than a GUID because its easier to remember. I'm a firm believer that your artifical primary keys should be opaque gibberish. PK's are an implementation detail which should never be exposed to the user or relied on for business logic. If you want things to come back in an order, add and ORDER BY clause and SortOrder fields. If you want a human-usable look-up add a business key with a unique constraint. If you want to know what order things were inserted into a table add a timestamp. Fallacy 2: "Size Matters" For many applications, the size of the artifical primary key is going to be irrelevant. The particular article which kicked this post off stated repeatedly that joining against an int has better performance than joining against a GUID. In computer science the performance of your algorithm is always a function of the number of data points. This still holds true for databases. Unless your table is very large, the performance difference between an int and a guid probably isnt going to be mesurable let alone noticeable. My personal experience is that the performance becomes an issue when you start having billions of rows in the table. At this point, you should probably start looking to move from int to bigint so the effective space/performance gain isnt as much as you'd think. GUID Advantages: Insert-ability / Mergeability: You can reliably insert guids into tables without key collisions. Database Independence: Saving entities to the database often requires knowing ids. With identity based ids the id must be selected back after every insert. GUIDs can be generated application-side allowing much faster inserts. GUID Disadvantages: Generatability: You can calculate the next id for an integer pk pretty easily in your head but will need a program to generate GUIDs. Solution: "Select top 100 newid() from sysobjects" Fragmentation: most GUID generation algorithms generate pseudo random GUIDs. This can cause inserts into the middle of your clustered index. Solutions: add a default of newsequentialid() or use GuidComb in NHibernate.

    Read the article

  • Testing Workflows &ndash; Test-After

    - by Timothy Klenke
    Originally posted on: http://geekswithblogs.net/TimothyK/archive/2014/05/30/testing-workflows-ndash-test-after.aspxIn this post I’m going to outline a few common methods that can be used to increase the coverage of of your test suite.  This won’t be yet another post on why you should be doing testing; there are plenty of those types of posts already out there.  Assuming you know you should be testing, then comes the problem of how do I actual fit that into my day job.  When the opportunity to automate testing comes do you take it, or do you even recognize it? There are a lot of ways (workflows) to go about creating automated tests, just like there are many workflows to writing a program.  When writing a program you can do it from a top-down approach where you write the main skeleton of the algorithm and call out to dummy stub functions, or a bottom-up approach where the low level functionality is fully implement before it is quickly wired together at the end.  Both approaches are perfectly valid under certain contexts. Each approach you are skilled at applying is another tool in your tool belt.  The more vectors of attack you have on a problem – the better.  So here is a short, incomplete list of some of the workflows that can be applied to increasing the amount of automation in your testing and level of quality in general.  Think of each workflow as an opportunity that is available for you to take. Test workflows basically fall into 2 categories:  test first or test after.  Test first is the best approach.  However, this post isn’t about the one and only best approach.  I want to focus more on the lesser known, less ideal approaches that still provide an opportunity for adding tests.  In this post I’ll enumerate some test-after workflows.  In my next post I’ll cover test-first. Bug Reporting When someone calls you up or forwards you a email with a vague description of a bug its usually standard procedure to create or verify a reproduction plan for the bug via manual testing and log that in a bug tracking system.  This can be problematic.  Often reproduction plans when written down might skip a step that seemed obvious to the tester at the time or they might be missing some crucial environment setting. Instead of data entry into a bug tracking system, try opening up the test project and adding a failing unit test to prove the bug.  The test project guarantees that all aspects of the environment are setup properly and no steps are missing.  The language in the test project is much more precise than the English that goes into a bug tracking system. This workflow can easily be extended for Enhancement Requests as well as Bug Reporting. Exploratory Testing Exploratory testing comes in when you aren’t sure how the system will behave in a new scenario.  The scenario wasn’t planned for in the initial system requirements and there isn’t an existing test for it.  By definition the system behaviour is “undefined”. So write a new unit test to define that behaviour.  Add assertions to the tests to confirm your assumptions.  The new test becomes part of the living system specification that is kept up to date with the test suite. Examples This workflow is especially good when developing APIs.  When you are finally done your production API then comes the job of writing documentation on how to consume the API.  Good documentation will also include code examples.  Don’t let these code examples merely exist in some accompanying manual; implement them in a test suite. Example tests and documentation do not have to be created after the production API is complete.  It is best to write the example code (tests) as you go just before the production code. Smoke Tests Every system has a typical use case.  This represents the basic, core functionality of the system.  If this fails after an upgrade the end users will be hosed and they will be scratching their heads as to how it could be possible that an update got released with this core functionality broken. The tests for this core functionality are referred to as “smoke tests”.  It is a good idea to have them automated and run with each build in order to avoid extreme embarrassment and angry customers. Coverage Analysis Code coverage analysis is a tool that reports how much of the production code base is exercised by the test suite.  In Visual Studio this can be found under the Test main menu item. The tool will report a total number for the code coverage, which can be anywhere between 0 and 100%.  Coverage Analysis shouldn’t be used strictly for numbers reporting.  Companies shouldn’t set minimum coverage targets that mandate that all projects must have at least 80% or 100% test coverage.  These arbitrary requirements just invite gaming of the coverage analysis, which makes the numbers useless. The analysis tool will break down the coverage by the various classes and methods in projects.  Instead of focusing on the total number, drill down into this view and see which classes have high or low coverage.  It you are surprised by a low number on a class this is an opportunity to add tests. When drilling through the classes there will be generally two types of reaction to a surprising low test coverage number.  The first reaction type is a recognition that there is low hanging fruit to be picked.  There may be some classes or methods that aren’t being tested, which could easy be.  The other reaction type is “OMG”.  This were you find a critical piece of code that isn’t under test.  In both cases, go and add the missing tests. Test Refactoring The general theme of this post up to this point has been how to add more and more tests to a test suite.  I’ll step back from that a bit and remind that every line of code is a liability.  Each line of code has to be read and maintained, which costs money.  This is true regardless whether the code is production code or test code. Remember that the primary goal of the test suite is that it be easy to read so that people can easily determine the specifications of the system.  Make sure that adding more and more tests doesn’t interfere with this primary goal. Perform code reviews on the test suite as often as on production code.  Hold the test code up to the same high readability standards as the production code.  If the tests are hard to read then change them.  Look to remove duplication.  Duplicate setup code between two or more test methods that can be moved to a shared function.  Entire test methods can be removed if it is found that the scenario it tests is covered by other tests.  Its OK to delete a test that isn’t pulling its own weight anymore. Remember to only start refactoring when all the test are green.  Don’t refactor the tests and the production code at the same time.  An automated test suite can be thought of as a double entry book keeping system.  The unchanging, passing production code serves as the tests for the test suite while refactoring the tests. As with all refactoring, it is best to fit this into your regular work rather than asking for time later to get it done.  Fit this into the standard red-green-refactor cycle.  The refactor step no only applies to production code but also the tests, but not at the same time.  Perhaps the cycle should be called red-green-refactor production-refactor tests (not quite as catchy).   That about covers most of the test-after workflows I can think of.  In my next post I’ll get into test-first workflows.

    Read the article

  • Deterministic/Consistent Unique Masking

    - by Dinesh Rajasekharan-Oracle
    One of the key requirements while masking data in large databases or multi database environment is to consistently mask some columns, i.e. for a given input the output should always be the same. At the same time the masked output should not be predictable. Deterministic masking also eliminates the need to spend enormous amount of time spent in identifying data relationships, i.e. parent and child relationships among columns defined in the application tables. In this blog post I will explain different ways of consistently masking the data across databases using Oracle Data Masking and Subsetting The readers of post should have minimal knowledge on Oracle Enterprise Manager 12c, Application Data Modeling, Data Masking concepts. For more information on these concepts, please refer to Oracle Data Masking and Subsetting document Oracle Data Masking and Subsetting 12c provides four methods using which users can consistently yet irreversibly mask their inputs. 1. Substitute 2. SQL Expression 3. Encrypt 4. User Defined Function SUBSTITUTE The substitute masking format replaces the original value with a value from a pre-created database table. As the method uses a hash based algorithm in the back end the mappings are consistent. For example consider DEPARTMENT_ID in EMPLOYEES table is replaced with FAKE_DEPARTMENT_ID from FAKE_TABLE. The substitute masking transformation that all occurrences of DEPARTMENT_ID say ‘101’ will be replaced with ‘502’ provided same substitution table and column is used , i.e. FAKE_TABLE.FAKE_DEPARTMENT_ID. The following screen shot shows the usage of the Substitute masking format with in a masking definition: Note that the uniqueness of the masked value depends on the number of columns being used in the substitution table i.e. if the original table contains 50000 unique values, then for the masked output to be unique and deterministic the substitution column should also contain 50000 unique values without which only consistency is maintained but not uniqueness. SQL EXPRESSION SQL Expression replaces an existing value with the output of a specified SQL Expression. For example while masking an EMPLOYEES table the EMAIL_ID of an employee has to be in the format EMPLOYEE’s [email protected] while FIRST_NAME and LAST_NAME are the actual column names of the EMPLOYEES table then the corresponding SQL Expression will look like %FIRST_NAME%||’.’||%LAST_NAME%||’@COMPANY.COM’. The advantage of this technique is that if you are masking FIRST_NAME and LAST_NAME of the EMPLOYEES table than the corresponding EMAIL ID will be replaced accordingly by the masking scripts. One of the interesting aspect’s of a SQL Expressions is that you can use sub SQL expressions, which means that you can write a nested SQL and use it as SQL Expression to address a complex masking business use cases. SQL Expression can also be used to consistently replace value with hashed value using Oracle’s PL/SQL function ORA_HASH. The following SQL Expression will help in the previous example for replacing the DEPARTMENT_IDs with a hashed number ORA_HASH (%DEPARTMENT_ID%, 1000) The following screen shot shows the usage of encrypt masking format with in the masking definition: ORA_HASH takes three arguments: 1. Expression which can be of any data type except LONG, LOB, User Defined Type [nested table type is allowed]. In the above example I used the Original value as expression. 2. Number of hash buckets which can be number between 0 and 4294967295. The default value is 4294967295. You can also co-relate the number of hash buckets to a range of numbers. In the above example above the bucket value is specified as 1000, so the end result will be a hashed number in between 0 and 1000. 3. Seed, can be any number which decides the consistency, i.e. for a given seed value the output will always be same. The default seed is 0. In the above SQL Expression a seed in not specified, so it to 0. If you have to use a non default seed then the function will look like. ORA_HASH (%DEPARTMENT_ID%, 1000, 1234 The uniqueness depends on the input and the number of hash buckets used. However as ORA_HASH uses a 32 bit algorithm, considering birthday paradox or pigeonhole principle there is a 0.5 probability of collision after 232-1 unique values. ENCRYPT Encrypt masking format uses a blend of 3DES encryption algorithm, hashing, and regular expression to produce a deterministic and unique masked output. The format of the masked output corresponds to the specified regular expression. As this technique uses a key [string] to encrypt the data, the same string can be used to decrypt the data. The key also acts as seed to maintain consistent outputs for a given input. The following screen shot shows the usage of encrypt masking format with in the masking definition: Regular Expressions may look complex for the first time users but you will soon realize that it’s a simple language. There are many resources in internet, oracle documentation, oracle learning library, my oracle support on writing a Regular Expressions, out of all the following My Oracle Support document helped me to get started with Regular Expressions: Oracle SQL Support for Regular Expressions[Video](Doc ID 1369668.1) USER DEFINED FUNCTION [UDF] User Defined Function or UDF provides flexibility for the users to code their own masking logic in PL/SQL, which can be called from masking Defintion. The standard format of an UDF in Oracle Data Masking and Subsetting is: Function udf_func (rowid varchar2, column_name varchar2, original_value varchar2) returns varchar2; Where • rowid is the row identifier of the column that needs to be masked • column_name is the name of the column that needs to be masked • original_value is the column value that needs to be masked You can achieve deterministic masking by using Oracle’s built in hash functions like, ORA_HASH, DBMS_CRYPTO.MD4, DBMS_CRYPTO.MD5, DBMS_UTILITY. GET_HASH_VALUE.Please refers to the Oracle Database Documentation for more information on the Oracle Hash functions. For example the following masking UDF generate deterministic unique hexadecimal values for a given string input: CREATE OR REPLACE FUNCTION RD_DUX (rid varchar2, column_name varchar2, orig_val VARCHAR2) RETURN VARCHAR2 DETERMINISTIC PARALLEL_ENABLE IS stext varchar2 (26); no_of_characters number(2); BEGIN no_of_characters:=6; stext:=substr(RAWTOHEX(DBMS_CRYPTO.HASH(UTL_RAW.CAST_TO_RAW(text),1)),0,no_of_characters); RETURN stext; END; The uniqueness depends on the input and length of the string and number of bits used by hash algorithm. In the above function MD4 hash is used [denoted by argument 1 in the DBMS_CRYPTO.HASH function which is a 128 bit algorithm which produces 2^128-1 unique hashed values , however this is limited by the length of the input string which is 6, so only 6^6 unique values will be generated. Also do not forget about the birthday paradox/pigeonhole principle mentioned earlier in this post. An another example is to consistently replace characters or numbers preserving the length and special characters as shown below: CREATE OR REPLACE FUNCTION RD_DUS(rid varchar2,column_name varchar2,orig_val VARCHAR2) RETURN VARCHAR2 DETERMINISTIC PARALLEL_ENABLE IS stext varchar2(26); BEGIN DBMS_RANDOM.SEED(orig_val); stext:=TRANSLATE(orig_val,'ABCDEFGHILKLMNOPQRSTUVWXYZ',DBMS_RANDOM.STRING('U',26)); stext:=TRANSLATE(stext,'abcdefghijklmnopqrstuvwxyz',DBMS_RANDOM.STRING('L',26)); stext:=TRANSLATE(stext,'0123456789',to_char(DBMS_RANDOM.VALUE(1,9))); stext:=REPLACE(stext,'.','0'); RETURN stext; END; The following screen shot shows the usage of an UDF with in a masking definition: To summarize, Oracle Data Masking and Subsetting helps you to consistently mask data across databases using one or all of the methods described in this post. It saves the hassle of identifying the parent-child relationships defined in the application table. Happy Masking

    Read the article

  • Profiling Startup Of VS2012 &ndash; SpeedTrace Profiler

    - by Alois Kraus
    SpeedTrace is a relatively unknown profiler made a company called Ipcas. A single professional license does cost 449€+VAT. For the test I did use SpeedTrace 4.5 which is currently Beta. Although it is cheaper than dotTrace it has by far the most options to influence how profiling does work. First you need to create a tracing project which does configure tracing for one process type. You can start the application directly from the profiler or (much more interesting) it does attach to a specific process when it is started. For this you need to check “Trace the specified …” radio button and enter the process name in the “Process Name of the Trace” edit box. You can even selectively enable tracing for processes with a specific command line. Then you need to activate the trace project by pressing the Activate Project button and you are ready to start VS as usual. If you want to profile the next 10 VS instances that you start you can set the Number of Processes counter to e.g. 10. This is immensely helpful if you are trying to profile only the next 5 started processes. As you can see there are many more tabs which do allow to influence tracing in a much more sophisticated way. SpeedTrace is the only profiler which does not rely entirely on the profiling Api of .NET. Instead it does modify the IL code (instrumentation on the fly) to write tracing information to disc which can later be analyzed. This approach is not only very fast but it does give you unprecedented analysis capabilities. Once the traces are collected they do show up in your workspace where you can open the trace viewer. I do skip the other windows because this view is by far the most useful one. You can sort the methods not only by Wall Clock time but also by CPU consumption and wait time which none of the other products support in their views at the same time. If you want to optimize for CPU consumption sort by CPU time. If you want to find out where most time is spent you need Clock Total time and Clock Waiting. There you can directly see if the method did take long because it did wait on something or it did really execute stuff that did take so long. Once you have found a method you want to drill deeper you can double click on a method to get to the Caller/Callee view which is similar to the JetBrains Method Grid view. But this time you do see much more. In the middle is the clicked method. Above are the methods that call you and below are the methods that you do directly call. Normally you would then start digging deeper to find the end of the chain where the slow method worth optimizing is located. But there is a shortcut. You can press the magic   button to calculate the aggregation of all called methods. This is displayed in the lower left window where you can see each method call and how long it did take. There you can also sort to see if this call stack does only contain methods (e.g. WCF connect calls which you cannot make faster) not worth optimizing. YourKit has a similar feature where it is called Callees List. In the Functions tab you have in the context menu also many other useful analysis options One really outstanding feature is the View Call History Drilldown. When you select this one you get not a sum of all method invocations but a list with the duration of each method call. This is not surprising since SpeedTrace does use tracing to get its timings. There you can get many useful graphs how this method did behave over time. Did it become slower at some point in time or was only the first call slow? The diagrams and the list will tell you that. That is all fine but what should I do when one method call was slow? I want to see from where it was coming from. No problem select the method in the list hit F10 and you get the call stack. This is a life saver if you e.g. search for serialization problems. Today Serializers are used everywhere. You want to find out from where the 5s XmlSerializer.Deserialize call did come from? Hit F10 and you get the call stack which did invoke the 5s Deserialize call. The CPU timeline tab is also useful to find out where long pauses or excessive CPU consumption did happen. Click in the graph to get the Thread Stacks window where you can get a quick overview what all threads were doing at this time. This does look like the Stack Traces feature in YourKit. Only this time you get the last called method first which helps to quickly see what all threads were executing at this moment. YourKit does generate a rather long list which can be hard to go through when you have many threads. The thread list in the middle does not give you call stacks or anything like that but you see which methods were found most often executing code by the profiler which is a good indication for methods consuming most CPU time. This does sound too good to be true? I have not told you the best part yet. The best thing about this profiler is the staff behind it. When I do see a crash or some other odd behavior I send a mail to Ipcas and I do get usually the next day a mail that the problem has been fixed and a download link to the new version. The guys at Ipcas are even so helpful to log in to your machine via a Citrix Client to help you to get started profiling your actual application you want to profile. After a 2h telco I was converted from a hater to a believer of this tool. The fast response time might also have something to do with the fact that they are actively working on 4.5 to get out of the door. But still the support is by far the best I have encountered so far. The only downside is that you should instrument your assemblies including the .NET Framework to get most accurate numbers. You can profile without doing it but then you will see very high JIT times in your process which can severely affect the correctness of the measured timings. If you do not care about exact numbers you can also enable in the main UI in the Data Trace tab logging of method arguments of primitive types. If you need to know what files at which times were opened by your application you can find it out without a debugger. Since SpeedTrace does read huge trace files in its reader you should perhaps use a 64 bit machine to be able to analyze bigger traces as well. The memory consumption of the trace reader is too high for my taste. But they did promise for the next version to come up with something much improved.

    Read the article

  • External File Upload Optimizations for Windows Azure

    - by rgillen
    [Cross posted from here: http://rob.gillenfamily.net/post/External-File-Upload-Optimizations-for-Windows-Azure.aspx] I’m wrapping up a bit of the work we’ve been doing on data movement optimizations for cloud computing and the latest set of data yielded some interesting points I thought I’d share. The work done here is not really rocket science but may, in some ways, be slightly counter-intuitive and therefore seemed worthy of posting. Summary: for those who don’t like to read detailed posts or don’t have time, the synopsis is that if you are uploading data to Azure, block your data (even down to 1MB) and upload in parallel. Set your block size based on your source file size, but if you must choose a fixed value, use 1MB. Following the above will result in significant performance gains… upwards of 10x-24x and a reduction in overall file transfer time of upwards of 90% (eg, uploading a 1GB file averaged 46.37 minutes prior to optimizations and averaged 1.86 minutes afterwards). Detail: For those of you who want more detail, or think that the claims at the end of the preceding paragraph are over-reaching, what follows is information and code supporting these claims. As the title would indicate, these tests were run from our research facility pointing to the Azure cloud (specifically US North Central as it is physically closest to us) and do not represent intra-cloud results… we have performed intra-cloud tests and the overall results are similar in notion but the data rates are significantly different as well as the tipping points for the various block sizes… this will be detailed separately). We started by building a very simple console application that would loop through a directory and upload each file to Azure storage. This application used the shipping storage client library from the 1.1 version of the azure tools. The only real variation from the client library is that we added code to collect and record the duration (in ms) and size (in bytes) for each file transferred. The code is available here. We then created a directory that had a collection of files for the following sizes: 2KB, 32KB, 64KB, 128KB, 512KB, 1MB, 5MB, 10MB, 25MB, 50MB, 100MB, 250MB, 500MB, 750MB, and 1GB (50 files for each size listed). These files contained randomly-generated binary data and do not benefit from compression (a separate discussion topic). Our file generation tool is available here. The baseline was established by running the application described above against the directory containing all of the data files. This application uploads the files in a random order so as to avoid transferring all of the files of a given size sequentially and thereby spreading the affects of periodic Internet delays across the collection of results.  We then ran some scripts to split the resulting data and generate some reports. The raw data collected for our non-optimized tests is available via the links in the Related Resources section at the bottom of this post. For each file size, we calculated the average upload time (and standard deviation) and the average transfer rate (and standard deviation). As you likely are aware, transferring data across the Internet is susceptible to many transient delays which can cause anomalies in the resulting data. It is for this reason that we randomized the order of source file processing as well as executed the tests 50x for each file size. We expect that these steps will yield a sufficiently balanced set of results. Once the baseline was collected and analyzed, we updated the test harness application with some methods to split the source file into user-defined block sizes and then to upload those blocks in parallel (using the PutBlock() method of Azure storage). The parallelization was handled by simply relying on the Parallel Extensions to .NET to provide a Parallel.For loop (see linked source for specific implementation details in Program.cs, line 173 and following… less than 100 lines total). Once all of the blocks were uploaded, we called PutBlockList() to assemble/commit the file in Azure storage. For each block transferred, the MD5 was calculated and sent ensuring that the bits that arrived matched was was intended. The timer for the blocked/parallelized transfer method wraps the entire process (source file splitting, block transfer, MD5 validation, file committal). A diagram of the process is as follows: We then tested the affects of blocking & parallelizing the transfers by running the updated application against the same source set and did a parameter sweep on the block size including 256KB, 512KB, 1MB, 2MB, and 4MB (our assumption was that anything lower than 256KB wasn’t worth the trouble and 4MB is the maximum size of a block supported by Azure). The raw data for the parallel tests is available via the links in the Related Resources section at the bottom of this post. This data was processed and then compared against the single-threaded / non-optimized transfer numbers and the results were encouraging. The Excel version of the results is available here. Two semi-obvious points need to be made prior to reviewing the data. The first is that if the block size is larger than the source file size you will end up with a “negative optimization” due to the overhead of attempting to block and parallelize. The second is that as the files get smaller, the clock-time cost of blocking and parallelizing (overhead) is more apparent and can tend towards negative optimizations. For this reason (and is supported in the raw data provided in the linked worksheet) the charts and dialog below ignore source file sizes less than 1MB. (click chart for full size image) The chart above illustrates some interesting points about the results: When the block size is smaller than the source file, performance increases but as the block size approaches and then passes the source file size, you see decreasing benefit to the point of negative gains (see the values for the 1MB file size) For some of the moderately-sized source files, small blocks (256KB) are best As the size of the source file gets larger (see values for 50MB and up), the smallest block size is not the most efficient (presumably due, at least in part, to the increased number of blocks, increased number of individual transfer requests, and reassembly/committal costs). Once you pass the 250MB source file size, the difference in rate for 1MB to 4MB blocks is more-or-less constant The 1MB block size gives the best average improvement (~16x) but the optimal approach would be to vary the block size based on the size of the source file.    (click chart for full size image) The above is another view of the same data as the prior chart just with the axis changed (x-axis represents file size and plotted data shows improvement by block size). It again highlights the fact that the 1MB block size is probably the best overall size but highlights the benefits of some of the other block sizes at different source file sizes. This last chart shows the change in total duration of the file uploads based on different block sizes for the source file sizes. Nothing really new here other than this view of the data highlights the negative affects of poorly choosing a block size for smaller files.   Summary What we have found so far is that blocking your file uploads and uploading them in parallel results in significant performance improvements. Further, utilizing extension methods and the Task Parallel Library (.NET 4.0) make short work of altering the shipping client library to provide this functionality while minimizing the amount of change to existing applications that might be using the client library for other interactions.   Related Resources Source code for upload test application Source code for random file generator ODatas feed of raw data from non-optimized transfer tests Experiment Metadata Experiment Datasets 2KB Uploads 32KB Uploads 64KB Uploads 128KB Uploads 256KB Uploads 512KB Uploads 1MB Uploads 5MB Uploads 10MB Uploads 25MB Uploads 50MB Uploads 100MB Uploads 250MB Uploads 500MB Uploads 750MB Uploads 1GB Uploads Raw Data OData feeds of raw data from blocked/parallelized transfer tests Experiment Metadata Experiment Datasets Raw Data 256KB Blocks 512KB Blocks 1MB Blocks 2MB Blocks 4MB Blocks Excel worksheet showing summarizations and comparisons

    Read the article

  • ArchBeat Link-o-Rama for 2012-07-11

    - by Bob Rhubart
    Is the future of retail showrooming? | GigaOm "The digital shopper isn’t just digital and she expects to be served seamlessly across all channels, physical and digital," reports GigaOm. Twenty years into the Internet era and the changes just keep coming. Solution architects take note... Agile Bureaucracy: When Practices become Principles | Jim Highsmith.com "Principles and values are a critical part of keeping individuals in organizations aligned and engaged," says Agile guru Jim Highsmith, "but the more pseudo-principles are piled on top of principles, the less and less organizations are able to adapt." Oracle Fusion Applications 11g Basics | Michel Schildmeijer "We are trying to build up a Oracle Fusion Apps environment on a Exalogic system, though still on bare metal, because officially there still is no Oracle VM available yet on Exalogic," says Michel Schildmeijer, an Oracle Fusion Middleware Architect at Qualogy. "It is a bit of a challenge, but getting to know the basics and which components the install, build and configure phase use, might bring you a step further on the way." Process Centric Banking: Loan Origination Solution | Manish Palaparthy This interesting, detailed post by Manish Palaparthy explains the process behind the execution of a proof-of-concept for a Fusion Middleware-based loan-origination solution for a bank. The solution incorporates Oracle BPM Suite, Webcenter, and ADF technolgies in a SOA infrastructure. How eBay and Facebook are Cleaning Up Data Centers | Amy Gallo - HBR The Cloud has needs! As reported by Amy Gallo in an article in the Harvard Business Review, "The electricity demand of data centers and the telecommunications network is rivaling that of most nations. If the cloud were itself a country, it would rank fifth in the world on energy demand behind the U.S., China, Russia, and Japan." Do WebLogic configuration from ANT | Edwin Biemond "With WebLogic WLST you can script the creation of all your Application DataSources or SOA Integration artifacts( like JMS etc)," says Oracle ACE Edwin Biemond. "This is necessary if your domain contains many WebLogic artifacts or you have more then one WebLogic environment. If so, you want to script this so you can configure a new WebLogic domain in minutes and you can repeat this task with always the same result." Oracle Special-Edition E-Book: Cloud Architecture for Dummies Learn how to architect and model your cloud implementation to drive efficiency and leverage economies of scale with Cloud Architecture for Dummies, a free Oracle e-book. (Registration required.) Thought for the Day "One of the best things to come out of the home computer revolution could be the general and widespread understanding of how severely limited logic really is." — Frank Herbert Source: SoftwareQuotes.com

    Read the article

  • Benchmarking MySQL Replication with Multi-Threaded Slaves

    - by Mat Keep
    0 0 1 1145 6530 Homework 54 15 7660 14.0 Normal 0 false false false EN-US JA X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:12.0pt; font-family:Cambria; mso-ascii-font-family:Cambria; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Cambria; mso-hansi-theme-font:minor-latin; mso-ansi-language:EN-US;} The objective of this benchmark is to measure the performance improvement achieved when enabling the Multi-Threaded Slave enhancement delivered as a part MySQL 5.6. As the results demonstrate, Multi-Threaded Slaves delivers 5x higher replication performance based on a configuration with 10 databases/schemas. For real-world deployments, higher replication performance directly translates to: · Improved consistency of reads from slaves (i.e. reduced risk of reading "stale" data) · Reduced risk of data loss should the master fail before replicating all events in its binary log (binlog) The multi-threaded slave splits processing between worker threads based on schema, allowing updates to be applied in parallel, rather than sequentially. This delivers benefits to those workloads that isolate application data using databases - e.g. multi-tenant systems deployed in cloud environments. Multi-Threaded Slaves are just one of many enhancements to replication previewed as part of the MySQL 5.6 Development Release, which include: · Global Transaction Identifiers coupled with MySQL utilities for automatic failover / switchover and slave promotion · Crash Safe Slaves and Binlog · Optimized Row Based Replication · Replication Event Checksums · Time Delayed Replication These and many more are discussed in the “MySQL 5.6 Replication: Enabling the Next Generation of Web & Cloud Services” Developer Zone article  Back to the benchmark - details are as follows. Environment The test environment consisted of two Linux servers: · one running the replication master · one running the replication slave. Only the slave was involved in the actual measurements, and was based on the following configuration: - Hardware: Oracle Sun Fire X4170 M2 Server - CPU: 2 sockets, 6 cores with hyper-threading, 2930 MHz. - OS: 64-bit Oracle Enterprise Linux 6.1 - Memory: 48 GB Test Procedure Initial Setup: Two MySQL servers were started on two different hosts, configured as replication master and slave. 10 sysbench schemas were created, each with a single table: CREATE TABLE `sbtest` (    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,    `k` int(10) unsigned NOT NULL DEFAULT '0',    `c` char(120) NOT NULL DEFAULT '',    `pad` char(60) NOT NULL DEFAULT '',    PRIMARY KEY (`id`),    KEY `k` (`k`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 10,000 rows were inserted in each of the 10 tables, for a total of 100,000 rows. When the inserts had replicated to the slave, the slave threads were stopped. The slave data directory was copied to a backup location and the slave threads position in the master binlog noted. 10 sysbench clients, each configured with 10 threads, were spawned at the same time to generate a random schema load against each of the 10 schemas on the master. Each sysbench client executed 10,000 "update key" statements: UPDATE sbtest set k=k+1 WHERE id = <random row> In total, this generated 100,000 update statements to later replicate during the test itself. Test Methodology: The number of slave workers to test with was configured using: SET GLOBAL slave_parallel_workers=<workers> Then the slave IO thread was started and the test waited for all the update queries to be copied over to the relay log on the slave. The benchmark clock was started and then the slave SQL thread was started. The test waited for the slave SQL thread to finish executing the 100k update queries, doing "select master_pos_wait()". When master_pos_wait() returned, the benchmark clock was stopped and the duration calculated. The calculated duration from the benchmark clock should be close to the time it took for the SQL thread to execute the 100,000 update queries. The 100k queries divided by this duration gave the benchmark metric, reported as Queries Per Second (QPS). Test Reset: The test-reset cycle was implemented as follows: · the slave was stopped · the slave data directory replaced with the previous backup · the slave restarted with the slave threads replication pointer repositioned to the point before the update queries in the binlog. The test could then be repeated with identical set of queries but a different number of slave worker threads, enabling a fair comparison. The Test-Reset cycle was repeated 3 times for 0-24 number of workers and the QPS metric calculated and averaged for each worker count. MySQL Configuration The relevant configuration settings used for MySQL are as follows: binlog-format=STATEMENT relay-log-info-repository=TABLE master-info-repository=TABLE As described in the test procedure, the slave_parallel_workers setting was modified as part of the test logic. The consequence of changing this setting is: 0 worker threads:    - current (i.e. single threaded) sequential mode    - 1 x IO thread and 1 x SQL thread    - SQL thread both reads and executes the events 1 worker thread:    - sequential mode    - 1 x IO thread, 1 x Coordinator SQL thread and 1 x Worker thread    - coordinator reads the event and hands it to the worker who executes 2+ worker threads:    - parallel execution    - 1 x IO thread, 1 x Coordinator SQL thread and 2+ Worker threads    - coordinator reads events and hands them to the workers who execute them Results Figure 1 below shows that Multi-Threaded Slaves deliver ~5x higher replication performance when configured with 10 worker threads, with the load evenly distributed across our 10 x schemas. This result is compared to the current replication implementation which is based on a single SQL thread only (i.e. zero worker threads). Figure 1: 5x Higher Performance with Multi-Threaded Slaves The following figure shows more detailed results, with QPS sampled and reported as the worker threads are incremented. The raw numbers behind this graph are reported in the Appendix section of this post. Figure 2: Detailed Results As the results above show, the configuration does not scale noticably from 5 to 9 worker threads. When configured with 10 worker threads however, scalability increases significantly. The conclusion therefore is that it is desirable to configure the same number of worker threads as schemas. Other conclusions from the results: · Running with 1 worker compared to zero workers just introduces overhead without the benefit of parallel execution. · As expected, having more workers than schemas adds no visible benefit. Aside from what is shown in the results above, testing also demonstrated that the following settings had a very positive effect on slave performance: relay-log-info-repository=TABLE master-info-repository=TABLE For 5+ workers, it was up to 2.3 times as fast to run with TABLE compared to FILE. Conclusion As the results demonstrate, Multi-Threaded Slaves deliver significant performance increases to MySQL replication when handling multiple schemas. This, and the other replication enhancements introduced in MySQL 5.6 are fully available for you to download and evaluate now from the MySQL Developer site (select Development Release tab). You can learn more about MySQL 5.6 from the documentation  Please don’t hesitate to comment on this or other replication blogs with feedback and questions. Appendix – Detailed Results

    Read the article

  • A star algorithm implementation problems

    - by bryan226
    I’m having some trouble implementing the A* algorithm in a 2D tile based game. The problem is basically that the algorithm gets stuck when something gets in its direct way (e.g. walls) Note that it only allows Horizontal and Vertical movement. Here's a picture as it works fine across the map without something in its direct way: (Green tile = destination, Blue = In closed list, Green = in open list) This is what happens if I try to walk 'around' a wall: I calculate costs with the F = G + H formula: G = 1 Cost per Step H = 10 Cost per Step //Count how many tiles are between current-tile & destination-tile The functions: short c_astar::GuessH(short Startx,short Starty,short Destinationx,short Destinationy) { hgeVector Start, Destination; Start.x = Startx; Start.y = Starty; Destination.x = Destinationx; Destination.y = Destinationy; short a = 0; short b = 0; if(Start.x > Destination.x) a = Start.x - Destination.x; else a = Destination.x - Start.x; if(Start.y > Destination.y) b = Start.y - Destination.y; else b = Destination.y - Start.y; return (a+b)*10; } short c_astar::GuessG(short Startx,short Starty,short Currentx,short Currenty) { hgeVector Start, Destination; Start.x = Startx; Start.y = Starty; Destination.x = Currentx; Destination.y = Currenty; short a = 0; short b = 0; if(Start.x > Destination.x) a = Start.x - Destination.x; else a = Destination.x - Start.x; if(Start.y > Destination.y) b = Start.y - Destination.y; else b = Destination.y - Start.y; return (a+b); } At the end of the loop I check which tile is the cheapest to go according to its F value: Then some quick checks are done for each tile (UP,DOWN,LEFT,RIGHT): //...CX are holding the F value of the TILE specified // Info: C0 = Center (Current) // C1 = UP // C2 = DOWN // C3 = LEFT // C4 = RIGHT //Quick checks if(((C1 < C2) && (C1 < C3) && (C1 < C4))) { Current.y -= 1; bSimilar = false; if(DEBUG) hge->System_Log("C1 < ALL"); } //.. same for C2,C3 & C4 If there are multiple tiles with the same F value: It’s actually a switch for DOWNLEFT,UPRIGHT.. etc. Here’s one of it: case UPRIGHT: { //UP Temporary = Current; Temporary.y -= 1; bTileStatus[0] = IsTileWalkable(Temporary.x,Temporary.y); if(bTileStatus[0]) { //Proceed normal we are OK & walkable Tilex.Tile = map.at(Temporary.y).at(Temporary.x); //Search in lists if(SearchInClosedList(Tilex.Tile.ID,C0)) bFoundInClosedList[0] = true; if(SearchInOpenList(Tilex.Tile.ID,C0)) bFoundInOpenList[0] = true; //RIGHT Temporary = Current; Temporary.x += 1; bTileStatus[1] = IsTileWalkable(Temporary.x,Temporary.y); if(bTileStatus[1]) { //Proceed normal we are OK & walkable Tilex.Tile = map.at(Temporary.y).at(Temporary.x); //Search in lists if(SearchInClosedList(Tilex.Tile.ID,C0)) bFoundInClosedList[1] = true; if(SearchInOpenList(Tilex.Tile.ID,C0)) bFoundInOpenList[1] = true; //************************************************* // Purpose: ClosedList behavior //************************************************* if(bFoundInClosedList[0] && !bFoundInClosedList[1]) { //UP found in ClosedList. Go RIGHT return RIGHT; } if(!bFoundInClosedList[0] && bFoundInClosedList[1]) { //RIGHT found in ClosedList. Go UP return UP; } if(bFoundInClosedList[0] && bFoundInClosedList[1]) { //Both found in ClosedList. Random value switch(hge->Random_Int(8,9)) { case 8: return UP; break; case 9: return RIGHT; break; } } //************************************************* // Purpose: OpenList behavior //************************************************* if(bFoundInOpenList[0] && !bFoundInOpenList[1]) { //UP found in OpenList. Go RIGHT return RIGHT; } if(!bFoundInOpenList[0] && bFoundInOpenList[1]) { //RIGHT found in OpenList. Go UP return UP; } if(bFoundInOpenList[0] && bFoundInOpenList[1]) { //Both found in OpenList. Random value switch(hge->Random_Int(8,9)) { case 8: return UP; break; case 9: return RIGHT; break; } } } else if(!bTileStatus[1]) { //RIGHT is not walkable OR out of range //Choose UP return UP; } } else if(!bTileStatus[0]) { //UP is not walkable OR out of range //Fast check RIGHT Temporary = Current; Temporary.x += 1; bTileStatus[1] = IsTileWalkable(Temporary.x,Temporary.y); if(bTileStatus[1]) { return RIGHT; } else return FAILED; //Failed, no valid path found! } } break; A log for the second picture: (Cut down to ten passes, because it’s just repeating itself) ----------------------------------------------------- PASS: 1 | C1: 211 | C2: 191 | C3: 211 | C4: 191 DOWN + RIGHT SIMILAR Going DOWN ----------------------------------------------------- PASS: 2 | C1: 200 | C2: 182 | C3: 202 | C4: 182 DOWN + RIGHT SIMILAR Going DOWN ----------------------------------------------------- PASS: 3 | C1: 191 | C2: 193 | C3: 193 | C4: 173 C4 < ALL Tile(12.000000,6.000000) not walkable. MAX_F_VALUE set. ----------------------------------------------------- PASS: 4 | C1: 182 | C2: 184 | C3: 182 | C4: 999 UP + LEFT SIMILAR Going UP Tile(12.000000,5.000000) not walkable. MAX_F_VALUE set. ----------------------------------------------------- PASS: 5 | C1: 191 | C2: 173 | C3: 191 | C4: 999 C2 < ALL Tile(12.000000,6.000000) not walkable. MAX_F_VALUE set. ----------------------------------------------------- PASS: 6 | C1: 182 | C2: 184 | C3: 182 | C4: 999 UP + LEFT SIMILAR Going UP Tile(12.000000,5.000000) not walkable. MAX_F_VALUE set. ----------------------------------------------------- PASS: 7 | C1: 191 | C2: 173 | C3: 191 | C4: 999 C2 < ALL Tile(12.000000,6.000000) not walkable. MAX_F_VALUE set. ----------------------------------------------------- PASS: 8 | C1: 182 | C2: 184 | C3: 182 | C4: 999 UP + LEFT SIMILAR Going LEFT ----------------------------------------------------- PASS: 9 | C1: 191 | C2: 193 | C3: 193 | C4: 173 C4 < ALL Tile(12.000000,6.000000) not walkable. MAX_F_VALUE set. ----------------------------------------------------- PASS: 10 | C1: 182 | C2: 184 | C3: 182 | C4: 999 UP + LEFT SIMILAR Going LEFT ----------------------------------------------------- Its always going after the cheapest F value, which seems to be wrong. If someone could point me to the right direction I'd be thankful. Regards, bryan226

    Read the article

  • PostSharp, Obfuscation, and IL

    - by Simon Cooper
    Aspect-oriented programming (AOP) is a relatively new programming paradigm. Originating at Xerox PARC in 1994, the paradigm was first made available for general-purpose development as an extension to Java in 2001. From there, it has quickly been adapted for use in all the common languages used today. In the .NET world, one of the primary AOP toolkits is PostSharp. Attributes and AOP Normally, attributes in .NET are entirely a metadata construct. Apart from a few special attributes in the .NET framework, they have no effect whatsoever on how a class or method executes within the CLR. Only by using reflection at runtime can you access any attributes declared on a type or type member. PostSharp changes this. By declaring a custom attribute that derives from PostSharp.Aspects.Aspect, applying it to types and type members, and running the resulting assembly through the PostSharp postprocessor, you can essentially declare 'clever' attributes that change the behaviour of whatever the aspect has been applied to at runtime. A simple example of this is logging. By declaring a TraceAttribute that derives from OnMethodBoundaryAspect, you can automatically log when a method has been executed: public class TraceAttribute : PostSharp.Aspects.OnMethodBoundaryAspect { public override void OnEntry(MethodExecutionArgs args) { MethodBase method = args.Method; System.Diagnostics.Trace.WriteLine( String.Format( "Entering {0}.{1}.", method.DeclaringType.FullName, method.Name)); } public override void OnExit(MethodExecutionArgs args) { MethodBase method = args.Method; System.Diagnostics.Trace.WriteLine( String.Format( "Leaving {0}.{1}.", method.DeclaringType.FullName, method.Name)); } } [Trace] public void MethodToLog() { ... } Now, whenever MethodToLog is executed, the aspect will automatically log entry and exit, without having to add the logging code to MethodToLog itself. PostSharp Performance Now this does introduce a performance overhead - as you can see, the aspect allows access to the MethodBase of the method the aspect has been applied to. If you were limited to C#, you would be forced to retrieve each MethodBase instance using Type.GetMethod(), matching on the method name and signature. This is slow. Fortunately, PostSharp is not limited to C#. It can use any instruction available in IL. And in IL, you can do some very neat things. Ldtoken C# allows you to get the Type object corresponding to a specific type name using the typeof operator: Type t = typeof(Random); The C# compiler compiles this operator to the following IL: ldtoken [mscorlib]System.Random call class [mscorlib]System.Type [mscorlib]System.Type::GetTypeFromHandle( valuetype [mscorlib]System.RuntimeTypeHandle) The ldtoken instruction obtains a special handle to a type called a RuntimeTypeHandle, and from that, the Type object can be obtained using GetTypeFromHandle. These are both relatively fast operations - no string lookup is required, only direct assembly and CLR constructs are used. However, a little-known feature is that ldtoken is not just limited to types; it can also get information on methods and fields, encapsulated in a RuntimeMethodHandle or RuntimeFieldHandle: // get a MethodBase for String.EndsWith(string) ldtoken method instance bool [mscorlib]System.String::EndsWith(string) call class [mscorlib]System.Reflection.MethodBase [mscorlib]System.Reflection.MethodBase::GetMethodFromHandle( valuetype [mscorlib]System.RuntimeMethodHandle) // get a FieldInfo for the String.Empty field ldtoken field string [mscorlib]System.String::Empty call class [mscorlib]System.Reflection.FieldInfo [mscorlib]System.Reflection.FieldInfo::GetFieldFromHandle( valuetype [mscorlib]System.RuntimeFieldHandle) These usages of ldtoken aren't usable from C# or VB, and aren't likely to be added anytime soon (Eric Lippert's done a blog post on the possibility of adding infoof, methodof or fieldof operators to C#). However, PostSharp deals directly with IL, and so can use ldtoken to get MethodBase objects quickly and cheaply, without having to resort to string lookups. The kicker However, there are problems. Because ldtoken for methods or fields isn't accessible from C# or VB, it hasn't been as well-tested as ldtoken for types. This has resulted in various obscure bugs in most versions of the CLR when dealing with ldtoken and methods, and specifically, generic methods and methods of generic types. This means that PostSharp was behaving incorrectly, or just plain crashing, when aspects were applied to methods that were generic in some way. So, PostSharp has to work around this. Without using the metadata tokens directly, the only way to get the MethodBase of generic methods is to use reflection: Type.GetMethod(), passing in the method name as a string along with information on the signature. Now, this works fine. It's slower than using ldtoken directly, but it works, and this only has to be done for generic methods. Unfortunately, this poses problems when the assembly is obfuscated. PostSharp and Obfuscation When using ldtoken, obfuscators don't affect how PostSharp operates. Because the ldtoken instruction directly references the type, method or field within the assembly, it is unaffected if the name of the object is changed by an obfuscator. However, the indirect loading used for generic methods was breaking, because that uses the name of the method when the assembly is put through the PostSharp postprocessor to lookup the MethodBase at runtime. If the name then changes, PostSharp can't find it anymore, and the assembly breaks. So, PostSharp needs to know about any changes an obfuscator does to an assembly. The way PostSharp does this is by adding another layer of indirection. When PostSharp obfuscation support is enabled, it includes an extra 'name table' resource in the assembly, consisting of a series of method & type names. When PostSharp needs to lookup a method using reflection, instead of encoding the method name directly, it looks up the method name at a fixed offset inside that name table: MethodBase genericMethod = typeof(ContainingClass).GetMethod(GetNameAtIndex(22)); PostSharp.NameTable resource: ... 20: get_Prop1 21: set_Prop1 22: DoFoo 23: GetWibble When the assembly is later processed by an obfuscator, the obfuscator can replace all the method and type names within the name table with their new name. That way, the reflection lookups performed by PostSharp will now use the new names, and everything will work as expected: MethodBase genericMethod = typeof(#kGy).GetMethod(GetNameAtIndex(22)); PostSharp.NameTable resource: ... 20: #kkA 21: #zAb 22: #EF5a 23: #2tg As you can see, this requires direct support by an obfuscator in order to perform these rewrites. Dotfuscator supports it, and now, starting with SmartAssembly 6.6.4, SmartAssembly does too. So, a relatively simple solution to a tricky problem, with some CLR bugs thrown in for good measure. You don't see those every day!

    Read the article

< Previous Page | 229 230 231 232 233 234 235 236 237 238 239 240  | Next Page >