Search Results

Search found 2108 results on 85 pages for 'differences'.


  • CodePlex Daily Summary for Sunday, May 09, 2010

    CodePlex Daily Summary for Sunday, May 09, 2010

    New Projects
    - Artificial Spy: ASPX, C#, XML. This is a big project for creating an application for: People Search. People connection. Background check. Crime Prevention. Socia...
    - Chef Framework: CHEF: CSS, HTML, Events, & Functions. A collection of libraries to help build concerns-separated websites.
    - Crabit Full File Manger: File manager system
    - EPiMVC - EPiServer CMS with ASP.NET MVC: A framework for using EPiServer CMS with ASP.NET MVC.
    - Fimyid IX: All my projects in ONE!
    - Hosting Folder Sizes: When you run a hosting environment with the ability to upload files, and you're charging per gigabyte per month, you need quick statistics about di...
    - Let's Up: A tool that breaks you every 50 minutes to protect your health.
    - MediaXenter: MediaXenter
    - MSClub BY: Microsoft community web-site project
    - OrbisArca: Windows Mobile game
    - Spatial Gateway: A common and standardized way of accessing spatial data stored in different datastores. This also includes functionality for replication across dif...
    - Squiggle - A Free open source Lan Messenger: Squiggle is a free LAN messenger that does not require a server. Just download and run it and you're ready to talk to everyone on your LAN. Squi...
    - Video Downloader: Video Downloader makes it easier for developers to generate download links for videos from You-Tube. You'll no longer have to search through source...
    - ViewModelSupport: This is not an MVVM framework. It is just the base class I use to reduce the friction when writing ViewModels. Making it public to share my ideas.
    - WPF CCTV Surveillance Control with IP Cameras.: Integration of IP video in WPF. IP camera. WPF dynamic control and events in a video system. Surveillance system. CCTV system. .Net

    New Releases
    - .NET Extensions - Extension Methods Library: Release 2010.07: Added some extension methods for ICollection<T> and IList<T> for demonstrating the differences between both interfaces: - ICollection<T>.AddRangeU...
    - bitly.net: bitly.dll: This is a .NET DLL that works with the online URL shortening service bit.ly. Compiled using .NET 4.0 but the source code should run with version 2 ...
    - Crabit Full File Manger: Crabit V1.0: First file manager system preview
    - CSharp Intellisense: V1.9: This is a major release that focused on bug fixes, tooltip support and styling.
    - Fimyid IX: fimyid ix 1.0: New licence!
    - Gherkin editor: Beta: Added support for i18n (all languages supported by Gherkin/Cucumber are supported). Removed auto-completion of statements like As a user and followi...
    - Grunty OS: GruntyOSAlphaSC: Grunty OS source
    - HKGolden Express: HKGoldenExpress (Build 201005081830): New features: Users can post new messages or reply to a message. Special thanks for help from members 劉佳偉 (ID: 179892) and Maize. (ID: 142974). Bu...
    - iLove SharePoint: Lookup Field with Picker 2010: Just forget the fuc**** dropdowns! Requirements: SharePoint Foundation 2010. Features: single- and multi-selection mode, search in pick...
    - LazyNet: LazyNet Beta 2: Refresh Network bug fixed.
    - LazyNet: LazyNEt_Beta3: Beta 3 release, better network refreshing
    - LazyNet: LazyNetBeta3_SRC: Refresh Network
    - Let's Up: 1.0 (Build 100509): This is the first version
    - Live Distributed Objects: Windows Installer r48444 (2010-05-08): Current development snapshot
    - MDownloader: MDownloader-0.15.12.58576: Fixed presenting Hotfile's captcha. Fixed FilesTube searching. Fixed determining Rapidshare postpone period. Fixed minor bugs.
    - NSIS Autorun: NSIS Autorun 0.1.7: This release includes source code, executable binaries and example materials.
    - SharpDevelop: SharpDevelop 3.2: Release notes: http://community.sharpdevelop.net/forums/t/11165.aspx
    - Silverlight SDK for Bing: Silverlight SDK for Bing 1.4: Build for Visual Studio 2010 and Silverlight 4. Issues addressed: 10337, 10342, 10367, 10804, 10805, 10806, 10807. Downloads: Silverlight SDK For B...
    - sqwarea: Sqwarea 0.0.252.0 (alpha): This release corrects a critical bug in Persistence.GameProvider.GetNextKingId. We strongly recommend you to upgrade to this version.
    - Stratosphere: Stratosphere 1.0.5.1: Added many features to Amazon Web Services Shell (AwsSh). Improved scalable table reader for SimpleDB multi-valued attributes. Added more functio...
    - TechEdOneNoter: TechEdOneNoter version 2010.5.9.2010: TechEdOneNoter is a utility to create OneNote pages based on sessions selected in the TechEd North America 2010 Session Builder. This is the first...
    - Video Downloader: Version 1.0: Version 1.0. See Home Page for usage and more information. Please remember changes at You-Tube can prevent this software from working.
    - Visual Studio - Lua Language Support: May 8th Update: Release notes: This release adds collapsible functions and tables. What's new: collapsible functions and tables; using latest version of Irony. Maj...
    - WPF CCTV Surveillance Control with IP Cameras.: Hungry Foxx CCTV Preview: This release is a sample of the complete program. The idea is to offer a solution for integrators of CCTV systems or d...
    - XsltDb - DotNetNuke XSLT module: 01.01.08: Bugs fixed: 17204, 17203. Many new features, but undocumented yet. I'm going to update the docs in a week or two, but...

    Most Popular Projects
    - WBFS Manager
    - Rawr
    - AJAX Control Toolkit
    - Microsoft SQL Server Product Samples: Database
    - Silverlight Toolkit
    - Windows Presentation Foundation (WPF)
    - patterns & practices – Enterprise Library
    - Microsoft SQL Server Community & Samples
    - ASP.NET
    - PHPExcel

    Most Active Projects
    - patterns & practices – Enterprise Library
    - Rawr
    - The Information Literacy Education Learning Environment (ILE)
    - AJAX Control Framework
    - Caliburn: An Application Framework for WPF and Silverlight
    - Mirror Testing System
    - jQuery Library for SharePoint Web Services
    - patterns & practices - Unity
    - BlogEngine.NET
    - TweetSharp

    Read the article

  • ASP.NET MVC 3 Hosting :: Rolling with Razor in MVC v3 Preview

    - by mbridge
    Razor is an alternate view engine for ASP.NET MVC. It was introduced in the "WebMatrix" tool and has now been released as part of the ASP.NET MVC 3 Preview 1. Basically, Razor allows us to replace the clunky <% %> syntax with a much cleaner coding model, which integrates very nicely with HTML. Additionally, it provides some really nice features for master page type scenarios, and you don't lose access to any of the features you are currently familiar with, such as HTML helper methods. First, download and install the ASP.NET MVC 3 Preview 1. You can find this at http://www.microsoft.com/downloads/details.aspx?FamilyID=cb42f741-8fb1-4f43-a5fa-812096f8d1e8&displaylang=en. Now, follow these steps to create your first ASP.NET MVC project using Razor:
    1. Open Visual Studio 2010.
    2. Create a new project. Select File->New->Project (Shift Control N).
    3. You will see the list of project types, which should look similar to what's shown.
    4. Select "ASP.NET MVC 3 Web Application (Razor)." Set the application name to RazorTest and the path to c:\projects\RazorTest for this tutorial. If you accidentally select ASPX, you will end up with the standard ASP.NET view engine and template, which isn't what you want.
    5. For this tutorial, and ONLY for this tutorial, select "No, do not create a unit test project." In general, you should create and use a unit test project. Code without unit tests is kind of like diet ice cream. It just isn't very good.
    Now, once we have this done, our brand new project will be created. In all likelihood, Visual Studio will leave you looking at the "HomeController.cs" class, as shown below. Immediately, you should notice one difference. The Index action used to look like: public ActionResult Index() { ViewData["Message"] = "Welcome to ASP.Net MVC!"; return View(); } While this will still compile and run just fine, ASP.NET MVC 3 has a much nicer way of doing this: public ActionResult Index() { ViewModel.Message = "Welcome to ASP.Net MVC!"; return View(); } Instead of using ViewData we are using the new ViewModel object, which uses the new dynamic typing of .NET 4.0 to allow us to express ourselves much more cleanly. This isn't a tutorial on ALL of MVC 3, but the ViewModel concept is one we will need as we dig into Razor. What comes in the box? When we create a project using the ASP.NET MVC 3 template with Razor, we get a standard project setup, just like we did in ASP.NET MVC 2.0 but with some differences. Instead of seeing ".aspx" view files and ".ascx" files, we see files with the ".cshtml" extension, which is the default Razor extension. Before we discuss the details of a Razor file, one thing to keep in mind is that since this is an extremely early preview, IntelliSense is not currently enabled with the Razor view engine. This is promised as an update before the final release. Just like with the ASPX view engine, the convention of the folder name for a set of views matching the controller name without the word "Controller" still stands. Similarly, each action in the controller will usually have a corresponding view file in the appropriate view directory. Remember, in ASP.NET MVC, convention over configuration is key to successful development! The initial template organizes views in the following folders, located in the project under Views:
    - Account – The default account management views used by the Account controller. Each file represents a distinct view.
    - Home – Views corresponding to the appropriate actions within the home controller.
- Shared – This contains common view objects used by multiple views.  Within here, master pages are stored, as well as partial page views (user controls).  By convention, these partial views are named “_XXXPartial.cshtml” where XXX is the appropriate name, such as _LogonPartial.cshtml.  Additionally, display templates are stored under here. With this in mind, let us take a look at the index.cshtml file under the home view directory.  When you open up index.cshtml you should see 1:   @inherits System.Web.Mvc.WebViewPage 2:  @{ 3:          View.Title = "Home Page"; 4:       LayoutPage = "~/Views/Shared/_Layout.cshtml"; 5:   } 6:  <h2>@View.Message</h2> 7:  <p> 8:     To learn more about ASP.NET MVC visit <a href="http://asp.net/mvc" title="ASP.NET MVC     9:    Website">http://asp.net/mvc</a>. 10:  </p> So looking through this, we observe the following facts: Line 1 imports the base page that all views (using Razor) are based on, which is System.Web.Mvc.WebViewPage.  Note that this is different than System.Web.MVC.ViewPage which is used by asp.net MVC 2.0 Also note that instead of the <% %> syntax, we use the very simple ‘@’ sign.  The View Engine contains enough context sensitive logic that it can even distinguish between @ in code and @ in an email.  It’s a very clean markup.  Line 2 introduces the idea of a code block in razor.  A code block is a scoping mechanism just like it is in a normal C# class.  It is designated by @{… }  and any C# code can be placed in between.  Note that this is all server side code just like it is when using the aspx engine and <% %>.  Line 3 allows us to set the page title in the client page’s file.  This is a new feature which I’ll talk more about when we get to master pages, but it is another of the nice things razor brings to asp.net mvc development. Line 4 is where we specify our “master” page, but as you can see, you can place it almost anywhere you want, because you tell it where it is located.  A Layout Page is similar to a master page, but it gains a bit when it comes to flexibility.  Again, we’ll come back to this in a later installment.  Line 6 and beyond is where we display the contents of our view.  No more using <%: %> intermixed with code.  Instead, we get to use very clean syntax such as @View.Message.  This is a lot easier to read than <%:@View.Message%> especially when intermixed with html.  For example: <p> My name is @View.Name and I live at @View.Address </p> Compare this to the equivalent using the aspx view engine <p> My name is <%:View.Name %> and I live at <%: View.Address %> </p> While not an earth shaking simplification, it is easier on the eyes.  As  we explore other features, this clean markup will become more and more valuable.
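    To make the ViewModel idea a little more concrete, here is a minimal sketch of a controller action using the dynamic ViewModel property bag. The action name and property names (Details, Name, Address) are invented for illustration; only the ViewModel-in-the-controller / View-in-the-view pairing comes from the preview itself.

        // Hypothetical action: set properties on the dynamic ViewModel bag.
        public ActionResult Details()
        {
            ViewModel.Name = "Jane Doe";          // rendered in the view as @View.Name
            ViewModel.Address = "1 Example Road"; // rendered in the view as @View.Address
            return View();
        }

    The corresponding Details.cshtml view would then render those values with the @View.Name and @View.Address syntax shown above, with no <% %> blocks anywhere.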

    Read the article

  • Using Lightbox with _Screen

    Although I have to admit that I discovered Bernard Bout's ideas and concepts about implementing a lightbox in Visual FoxPro quite a while ago, there was no "spare" time in active projects that allowed me to have a closer look into his solution(s). Luckily, these days I received a demand to focus a little bit more on this. This article describes the steps to integrate and make use of Bernard's lightbox class in combination with _Screen in Visual FoxPro. The requirement in this project was to be able to visually lock the whole application (_Screen area) and guide the user to a piece of information that should not be ignored easily. Depending on its importance, any current user activity should be interrupted and focus put onto the notification.
    Getting the "meat", eh, source code
    Please check out Bernard's blog on Foxite directly in order to get the latest and greatest version. At the time of writing this article I use version 6.0 as described in this blog entry: The Fastest Lightbox Ever. The Lightbox class is sub-classed from the imgCanvas class from the GdiPlusX project on VFPx and therefore you need to have the source code of GdiPlusX as well, and integrate it into your development environment. The version I use is available here: Release GDIPlusX 1.20. As soon as you open the bbGdiLightbox class for the first time, VFP might ask you to update the reference to the gdiplusx.vcx. As we have the sources, that is no problem, and you have access to Bernard's code. The class itself is pretty easy to understand: some properties that you do not need to change and three methods: Setup(), ShowLightbox() and BeforeDraw().
    The challenge - _Screen or not?
    Reading Bernard's article about the fastest lightbox ever, he states the following: "The class will only work on a form. It will not support any other containers" Really? And what about _Screen? Isn't that a form class, too? Yes, of course it is, but nonetheless trying to use _Screen directly will fail. Well, let's have a look at the code to see why:

        WITH This
            .Left = 0
            .Top = 0
            .Height = ThisForm.Height
            .Width = ThisForm.Width
            .ZOrder(0)
            .Visible = .F.
        ENDWITH

    During the setup of the lightbox, as well as while capturing the image as a replacement for your forms and controls, the object reference Thisform is used. Which is a little bit restrictive in my opinion, but let's continue. The second issue lies in the method ShowLightbox() and is introduced by the call to .Bitmap.FromScreen():

        Lparameters tlVisiblilty
        * tlVisiblilty - show or hide (T/F)
        * grab a screen dump with controls
        IF tlVisiblilty
            Local loCaptureBmp As xfcBitmap
            Local lnTitleHeight, lnLeftBorder, lnTopBorder, lcImage, loImage
            lnTitleHeight = IIF(ThisForm.TitleBar = 1,Sysmetric(9),0)
            lnLeftBorder = IIF(ThisForm.BorderStyle < 2,0,Sysmetric(3))
            lnTopBorder = IIF(ThisForm.BorderStyle < 2,0,Sysmetric(4))
            With _Screen.System.Drawing
                loCaptureBmp = .Bitmap.FromScreen(ThisForm.HWnd,;
                    lnLeftBorder,;
                    lnTopBorder+lnTitleHeight,;
                    ThisForm.Width,;
                    ThisForm.Height)
            ENDWITH
            * save it to a property
            This.capturebmp = loCaptureBmp
            ThisForm.SetAll("Visible",.F.)
            This.DraW()
            This.Visible = .T.
        ELSE
            ThisForm.SetAll("Visible",.T.)
            This.Visible = .F.
        ENDIF

    My first trials in using the class ended in an exception - GdiPlusError:OutOfMemory - thrown by the Bitmap object. Frankly speaking, this happened mainly because of my lack of knowledge about GdiPlusX. After reading some documentation, especially about the FromScreen() method, I experimented a little bit. Capturing the visible area of _Screen actually was not the real problem; the problem was the dimensions I specified for the bitmap.
    The modifications - step by step
    First of all, the idea is to get rid of the restrictive object references on Thisform and to change them into either This.Parent or, more generically, This.oForm (even better: This.oControl). The Lightbox.Setup() method now sets the necessary object reference like so:

        *====================================================================
        * Initial setup
        * Default value: This.oControl = "This.Parent"
        * Alternative: This.oControl = "_Screen"
        *====================================================================
        With This
            .oControl = Evaluate(.oControl)
            If Vartype(.oControl) == T_OBJECT
                .Anchor = 0
                .Left = 0
                .Top = 0
                .Width = .oControl.Width
                .Height = .oControl.Height
                .Anchor = 15
                .ZOrder(0)
                .Visible = .F.
            EndIf
        Endwith

    Also, based on other developers' comments in Bernard's articles on his lightbox concept and evolution, I found the source code to handle the differences between a form and _Screen, which goes into Lightbox.ShowLightbox() like this:

        *====================================================================
        * tlVisibility - show or hide (T/F)
        * grab a screen dump with controls
        *====================================================================
        Lparameters tlVisibility
        Local loControl
        m.loControl = This.oControl
        If m.tlVisibility
            Local loCaptureBmp As xfcBitmap
            Local lnTitleHeight, lnLeftBorder, lnTopBorder, lcImage, loImage
            lnTitleHeight = Iif(m.loControl.TitleBar = 1,Sysmetric(9),0)
            lnLeftBorder = Iif(m.loControl.BorderStyle < 2,0,Sysmetric(3))
            lnTopBorder = Iif(m.loControl.BorderStyle < 2,0,Sysmetric(4))
            With _Screen.System.Drawing
                If Upper(m.loControl.Name) == Upper("Screen")
                    loCaptureBmp = .Bitmap.FromScreen(m.loControl.HWnd)
                Else
                    loCaptureBmp = .Bitmap.FromScreen(m.loControl.HWnd,;
                        lnLeftBorder,;
                        lnTopBorder+lnTitleHeight,;
                        m.loControl.Width,;
                        m.loControl.Height)
                EndIf
            Endwith
            * save it to a property
            This.CaptureBmp = loCaptureBmp
            m.loControl.SetAll("Visible",.F.)
            This.Draw()
            This.Visible = .T.
        Else
            This.CaptureBmp = .Null.
            m.loControl.SetAll("Visible",.T.)
            This.Visible = .F.
        Endif

    Are we done? Almost... Bernard says it clearly in his article: "Just drop the class on a form and call it as shown." It did not become clear to me at first with _Screen, but, yeah, he is right. Dropping the class on a form provides a permanent link between those two classes; it creates a valid This.Parent object reference. Bearing in mind that the lightbox class cannot be "dropped" on the _Screen, we have to create the same type of binding during runtime execution like so:

        *====================================================================
        * Create global lightbox component
        *====================================================================
        Local llOk, loException As Exception
        m.llOk = .F.
        m.loException = .Null.
        If Not Vartype(_Screen.Lightbox) == "O"
            Try
                _Screen.AddObject("Lightbox", "bbGdiLightbox")
            Catch To m.loException
                Assert .F. Message m.loException.Message
            EndTry
        EndIf
        m.llOk = (Vartype(_Screen.Lightbox) == "O")
        Return m.llOk

    Through runtime instantiation we create a valid binding to This.Parent in the lightbox object, and the code works as expected with _Screen.
    Ease your life: Use properties instead of constants
    Having a closer look at the BeforeDraw() method might whet your appetite to simplify the code a little bit. Looking at the sample screenshots in Bernard's article you see several forms in different colors. This got me to modify the code like so:

        *====================================================================
        * Apply the actual lightbox effect on the captured bitmap.
        *====================================================================
        If Vartype(This.CaptureBmp) == T_OBJECT
            Local loGfx As xfcGraphics
            loGfx = This.oGfx
            With _Screen.System.Drawing
                loGfx.DrawImage(This.CaptureBmp,This.Rectangle,This.Rectangle,.GraphicsUnit.Pixel)
                * change the colours as needed here
                * possible colours are (220,128,0,0),(220,0,0,128) etc.
                loBrush = .SolidBrush.New(.Color.FromArgb( ;
                    This.Opacity, .Color.FromRGB(This.BorderColor)))
                loGfx.FillRectangle(loBrush,This.Rectangle)
            Endwith
        Endif

    Create an additional property Opacity to specify the degree of translucency you would like to have, without the need to change the code in each instance of the class. This way you only need to change the values of Opacity and BorderColor to tweak the appearance of your lightbox. This could be quite helpful to signal different levels of importance (i.e. green, yellow, orange, red, etc.) of notifications to the users of the application.
    Final thoughts
    Using the lightbox concept in combination with _Screen instead of forms is possible. Jim Wiggins already comments in Bernard's article that one could loop through the _Screen.Forms collection in order to cascade the lightbox visibility to all active forms. Good idea. But honestly, I believe that instead of looping over all forms one could use _Screen.SetAll("ShowLightbox", .T./.F., "Form") with a Form.ShowLightbox_Access method to gain more speed. The modifications described above might provide even more features to your applications while consuming fewer resources. Additionally, the restriction to capture only forms does not exist anymore. Using _Screen you are able to capture and cover anything. The captured area of _Screen does not include any toolbars, docked windows, or menus. Therefore, it is advisable to take this concept to a higher level and to combine it with additional classes that handle the state of toolbars, docked windows and menus. Which I did for the customer's project.

    Read the article

  • SQL SERVER – 3 Online SQL Courses at Pluralsight and Free Learning Resources

    - by pinaldave
    Usain Bolt is an inspiration for all. He broke his own record multiple times because he wanted to do better! Read more about him on wikipedia. He is great and indeed fastest man on the planet. Usain Bolt – World’s Fastest Man “Can you teach me SQL Server Performance Tuning?” This is one of the most popular questions which I receive all the time. The answer is YES. I would love to do performance tuning training for anyone, anywhere.  It is my favorite thing to do, and it is my favorite thing to train others in.  If possible, I would love to do training 24 hours a day, 7 days a week, 365 days a year.  To me, it doesn’t feel like a job. Of course, as much as I would love to do performance tuning 24/7/365, obviously I am just one human being and can only be in one place t one time.  It is also very difficult to train more than one person at a time, and it is difficult to train two or more people at a time, especially when the two people are at different levels.  I am also limited by geography.  I live in India, and adjust to my own time zone.  Trying to teach a live course from India to someone whose time zone is 12 or more hours off of mine is very difficult.  If I am trying to teach at 2 am, I am sure I am not at my best! There was only one solution to scale – Online Trainings. I have built 3 different courses on SQL Server Performance Tuning with Pluralsight. Now I have no problem – I am 100% scalable and available 24/7 and 365. You can make me say the same things again and again till you find it right. I am in your mobile, PC as well as on XBOX. This is why I am such a big fan of online courses.  I have recorded many performance tuning classes and you can easily access them online, at your own time.  And don’t think that just because these aren’t live classes you won’t be able to get any feedback from me.  I encourage all my viewers to go ahead and ask me questions by e-mail, Twitter, Facebook, or whatever way you can get a hold of me. Here are details of three of my courses with Pluralsight. I suggest you go over the description of the course. As an author of the course, I have few FREE codes for watching the free courses. Please leave a comment with your valid email address, I will send a few of them to random winners. SQL Server Performance: Introduction to Query Tuning  SQL Server performance tuning is an art to master – for developers and DBAs alike. This course takes a systematic approach to planning, analyzing, debugging and troubleshooting common query-related performance problems. This includes an introduction to understanding execution plans inside SQL Server. In this almost four hour course we cover following important concepts. Introduction 10:22 Execution Plan Basics 45:59 Essential Indexing Techniques 20:19 Query Design for Performance 50:16 Performance Tuning Tools 01:15:14 Tips and Tricks 25:53 Checklist: Performance Tuning 07:13 The duration of each module is mentioned besides the name of the module. SQL Server Performance: Indexing Basics This course teaches you how to master the art of performance tuning SQL Server by better understanding indexes. In this almost two hour course we cover following important concepts. Introduction 02:03 Fundamentals of Indexing 22:21 Practical Indexing Implementation Techniques 37:25 Index Maintenance 16:33 Introduction to ColumnstoreIndex 08:06 Indexing Practical Performance Tips and Tricks 24:56 Checklist : Index and Performance 07:29 The duration of each module is mentioned besides the name of the module. 
SQL Server Questions and Answers This course is designed to help you better understand how to use SQL Server effectively. The course presents many of the common misconceptions about SQL Server, and then carefully debunks those misconceptions with clear explanations and short but compelling demos, showing you how SQL Server really works. In this almost 2 hours and 15 minutes course we cover following important concepts. Introduction 00:54 Retrieving IDENTITY value using @@IDENTITY 08:38 Concepts Related to Identity Values 04:15 Difference between WHERE and HAVING 05:52 Order in WHERE clause 07:29 Concepts Around Temporary Tables and Table Variables 09:03 Are stored procedures pre-compiled? 05:09 UNIQUE INDEX and NULLs problem 06:40 DELETE VS TRUNCATE 06:07 Locks and Duration of Transactions 15:11 Nested Transaction and Rollback 09:16 Understanding Date/Time Datatypes 07:40 Differences between VARCHAR and NVARCHAR datatypes 06:38 Precedence of DENY and GRANT security permissions 05:29 Identify Blocking Process 06:37 NULLS usage with Dynamic SQL 08:03 Appendix Tips and Tricks with Tools 20:44 The duration of each module is mentioned besides the name of the module. SQL in Sixty Seconds You will have to login and to get subscribed to the courses to view them. Here are my free video learning resources SQL in Sixty Seconds. These are 60 second video which I have built on various subjects related to SQL Server. Do let me know what you think about them? Here are three of my latest videos: Identify Most Resource Intensive Queries – SQL in Sixty Seconds #028 Copy Column Headers from Resultset – SQL in Sixty Seconds #027 Effect of Collation on Resultset – SQL in Sixty Seconds #026 You can watch and learn at your own pace.  Then you can easily ask me any questions you have.  E-mail is easiest, but for really tough questions I’m willing to talk on Skype, Gtalk, or even Facebook chat.  Please do watch and then talk with me, I am always available on the internet! Here is the video of the world’s fastest man.Usain St. Leo Bolt inspires us that we all do better than best. We can go the next level of our own record. We all can improve if we have a will and dedication.  Watch the video from 5:00 mark. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL in Sixty Seconds, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQL Training, SQLServer, T SQL, Technology, Video

    Read the article

  • On StringComparison Values

    - by Jesse
    When you use the .NET Framework's String.Equals and String.Compare methods, do you use an overload that accepts a StringComparison enumeration value? If not, you should, because the value provided for that StringComparison argument can have a big impact on the results of your string comparison. The StringComparison enumeration defines values that fall into three different major categories:
    - Culture-sensitive comparison using a specific culture, defaulted to the Thread.CurrentThread.CurrentCulture value (StringComparison.CurrentCulture and StringComparison.CurrentCultureIgnoreCase)
    - Invariant culture comparison (StringComparison.InvariantCulture and StringComparison.InvariantCultureIgnoreCase)
    - Ordinal (byte-by-byte) comparison (StringComparison.Ordinal and StringComparison.OrdinalIgnoreCase)
    There is a lot of great material available that details the technical ins and outs of these different string comparison approaches. If you're at all interested in the topic these two MSDN articles are worth a read:
    - Best Practices For Using Strings in the .NET Framework: http://msdn.microsoft.com/en-us/library/dd465121.aspx
    - How To Compare Strings: http://msdn.microsoft.com/en-us/library/cc165449.aspx
    Those articles cover the technical details of string comparison well enough that I'm not going to reiterate them here, other than to say that the upshot is that you typically want to use the culture-sensitive comparison whenever you're comparing strings that were entered by or will be displayed to users, and the ordinal comparison in nearly all other cases. So where does that leave the invariant culture comparisons? The "Best Practices For Using Strings in the .NET Framework" article has the following to say: "On balance, the invariant culture has very few properties that make it useful for comparison. It does comparison in a linguistically relevant manner, which prevents it from guaranteeing full symbolic equivalence, but it is not the choice for display in any culture. One of the few reasons to use StringComparison.InvariantCulture for comparison is to persist ordered data for a cross-culturally identical display. For example, if a large data file that contains a list of sorted identifiers for display accompanies an application, adding to this list would require an insertion with invariant-style sorting." I don't know about you, but I feel like that paragraph is a bit lacking. Are there really any "real world" reasons to use the invariant culture comparison? I think the answer to this question is "yes", but in order to understand why, we should first think about what the invariant culture comparison really does. The invariant culture comparison is really just a culture-sensitive comparison using a special invariant culture (Michael Kaplan has a great post on the history of the invariant culture on his blog: http://blogs.msdn.com/b/michkap/archive/2004/12/29/344136.aspx). This means that the invariant culture comparison will apply the linguistic customs defined by the invariant culture, which are guaranteed not to differ between different machines or execution contexts. This sort of consistency does prove useful if you need to maintain a list of strings that are sorted in a meaningful and consistent way regardless of the user viewing them or the machine on which they are being viewed.
    Example: Prototype Names
    Let's say that you work for a large multi-national toy company with branch offices in 10 different countries.
    Each year the company would work on 15-25 new toy prototypes, each of which is assigned a "code name" while it is under development. Coming up with fun new code names is a big part of the company culture that everyone really enjoys, so to be fair the CEO of the company spent a lot of time coming up with a prototype naming scheme that would be fun for everyone to participate in, fair to all of the different branch locations, and accessible to all members of the organization regardless of the country they were from and the language that they spoke:
    - Each new prototype will get a code name that begins with the letter following the previously created name, using the alphabetical order of the Latin/Roman alphabet. Each new year, prototype names start back at "A".
    - The country that leads the prototype development effort gets to choose the name in their native language. (An appropriate Romanization system will be used for countries where the primary language is not written in the Latin/Roman alphabet. For example, the Pinyin system could be used for Chinese.)
    - To avoid repeating names, a list of all current and past prototype names will be maintained on each branch location's company intranet site.
    Assuming that maintaining a single pre-sorted list is not feasible among all of the highly distributed intranet implementations, what string comparison method would you use to sort each year's list of prototype names so that the list is both meaningful and consistent regardless of the country within which the list is being viewed? Sorting the list with a culture-sensitive comparison using the default configured culture on each country's intranet server would probably work most of the time, but subtle differences between cultures could mean that two different people would see a list that was sorted slightly differently. The CEO wants the prototype names to be a unifying aspect of company culture and is adamant that everyone see the same list sorted in the same order, but there's no way to guarantee a consistent sort across different cultures using the culture-sensitive string comparison rules. The culture-sensitive sort would produce a meaningful list for the specific user viewing it, but it wouldn't always be consistent between different users. Sorting with the ordinal comparison would certainly be consistent regardless of the user viewing it, but would it be meaningful? Let's say that the current year's prototype name list looks like this: Antílope (Spanish), Babouin (French), Čahoun (Czech), Diamond (English), Flosse (German). If you were to sort this list using ordinal rules you'd end up with: Antílope, Babouin, Diamond, Flosse, Čahoun. This sort is no good because the entry for "C" appears at the bottom of the list, after "F". This is because the Czech entry for the letter "C" makes use of a diacritic (accent mark). The ordinal string comparison does a byte-by-byte comparison of the code points that make up each character in the string, and the code point for the "C" with the diacritic mark is higher than any letter without a diacritic mark, which pushes that entry to the bottom of the sorted list. The CEO wants each country to be able to create prototype names in their native language, which means we need to allow for names that might begin with letters that have diacritics, so ordinal sorting kills the meaningfulness of the list. As it turns out, this situation is actually well-suited for the invariant culture comparison.
The invariant culture accounts for linguistically relevant factors like the use of diacritics but will provide a consistent sort across all machines that perform the sort. Now that we’ve walked through this example, the following line from the “Best Practices For Using Strings in the .NET Framework” makes a lot more sense: One of the few reasons to use StringComparison.InvariantCulture for comparison is to persist ordered data for a cross-culturally identical display That line describes the prototype name example perfectly: we need a way to persist ordered data for a cross-culturally identical display. While this example is 100% made-up, I think it illustrates that there are indeed real-world situations where the invariant culture comparison is useful.
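    To see the difference in running code, here is a short C# sketch (using the made-up prototype names from the example above) that contrasts an ordinal sort with an invariant-culture sort:

        using System;
        using System.Collections.Generic;

        class PrototypeNameSort
        {
            static void Main()
            {
                var names = new List<string> { "Antílope", "Babouin", "Čahoun", "Diamond", "Flosse" };

                // Ordinal: raw code-point comparison, so "Čahoun" drops to the end of the list.
                names.Sort(StringComparer.Ordinal);
                Console.WriteLine(string.Join(", ", names));

                // Invariant culture: linguistically aware but identical on every machine,
                // so "Čahoun" sorts with the other "C" entries regardless of the local culture.
                names.Sort(StringComparer.InvariantCulture);
                Console.WriteLine(string.Join(", ", names));
            }
        }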

    Read the article

  • Fun with Aggregates

    - by Paul White
    There are interesting things to be learned from even the simplest queries.  For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join.  The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units.  The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too?  To answer that, let’s look at some other ways to formulate this query.  This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones.  The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above.  It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join!  The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join.  In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting.  The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set.  
In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL).  To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709;   -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case.  The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate.  This is something I would like to see exposed in show plan so I suggested it on Connect.  Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all.  Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans.  Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :)  The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier.  What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too).  Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). 
The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places.  It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates.  As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

    Read the article

  • Conversion of BizTalk Projects to Use the New WCF-SAP Adaptor

    - by Geordie
    We are in the process of upgrading our BizTalk Environment from BizTalk 2006 R2 to BizTalk 2010. The SAP adaptor in BizTalk 2010 is an all new and more powerful WCF-SAP adaptor. When my colleagues tested out the new adaptor they discovered that the format of the data extracted from SAP was not identical to the old adaptor. This is not a big deal if the structure of the messages from SAP is simple. In this case we were receiving the delivery and invoice iDocs. Both these structures are complex especially the delivery document. Over the past few years I have tweaked the delivery mapping to remove bugs from original mapping. The idea of redoing these maps did not appeal and due to the current work load was not even an option. I opted for a rather crude alternative of pulling in the iDoc in the new typed format and then adding a static map at the start of the orchestration to convert the data to the old schema.  Note WCF-SAP data formats (on the binding tab of the configuration dialog box is the ‘RecieiveIdocFormat’ field): Typed:  Returns a XML document with the hierarchy represented in XML and all fields being represented by XML tags. RFC: Returns an XML document with the hierarchy represented in XML but the iDoc lines in flat file format. String: This returns the iDoc in a format that is closest to the original flat file format but is still wrapped with some top level XML tags. The files also contained some strange characters at the end of each line. I started with the invoice document and it was quite straight forward to add the mapping but this is where my problems started. The orchestrations for these documents are dynamic and so require the identity of the partner to be able to correctly configure the orchestration. The partner identity is in the EDI_DC40 segment of the iDoc. In the old project the RECPRN node of the segment was promoted. The code to set a variable to the partner ID was now failing. After lot of head scratching I discovered the problem was due to the addition of Namespaces to the fields in the EDI_DC40 segment. To overcome this I needed to use an xPath query with a Namespace Manager. This had to be done in custom code. I now tried to repeat the process with the delivery document. Unfortunately when we tried to get sample typed data from SAP an exception was thrown. The adapter "WCF-SAP" raised an error message. Details "Microsoft.ServiceModel.Channels.Common.XmlReaderGenerationException: The segment or group definition E2EDKA1001 was not found in the IDoc metadata. The UniqueId of the IDoc type is: IDOCTYP/3/DESADV01/ZASNEXT1/640. For Receive operations, the SAP adapter does not support unreleased segments.   Our guess is that when the WCF-SAP adaptor tries to down load the data it retrieves a data schema from SAP. For some reason the schema does not match the data. This may be due to the version of SAP we are running or due to a customization. Either way resolving this problem did not look easy. When doing some research on this problem I found an article showing me how to get the data from SAP using the WCF-SAP adaptor without any XML tags. http://blogs.msdn.com/b/adapters/archive/2007/10/05/receiving-idocs-getting-the-raw-idoc-data.aspx Reproduction of Mustansir blog: Since the WCF based SAP Adapter is ... well, WCF based, all data flowing in and out of the adapter is encapsulated within a SOAP message. Which means there are those pesky xml tags all over the place. 
    If you want to receive an Idoc from SAP, you can receive it in "Typed" format (in which case each column in each segment of the idoc appears within its own xml tag), or you can receive it in "String" format (in which case there are just 2 xml tags at the top, the raw xml data in string/flat file format, and the 2 closing xml tags). In "String" format, an incoming idoc (for ORDERS05, containing 5 data records) would look like: <ReceiveIdoc ><idocData>EDI_DC40 8000000000001064985620 E2EDK01005 800000000000106498500000100000001 E2EDK14 8000000000001064985000002000000020111000 E2EDK14 8000000000001064985000003000000020081000 E2EDK14 80000000000010649850000040000000200710 E2EDK14 80000000000010649850000050000000200600</idocData></ReceiveIdoc> (I have trimmed part of the control record so that it fits cleanly here on one line). Now, you're only interested in the IDOC data, and don't care much for the XML tags. It isn't that difficult to write your own pipeline component, or even some logic in the orchestration to remove the tags, right? Well, you don't need to write any extra code at all - the WCF Adapter can help you here! During the configuration of your one-way Receive Location using WCF-Custom, navigate to the Messages tab. Under the section "Inbound BizTalk Message Body", select the "Path" radio button, and:
    (a) Enter the body path expression as: /*[local-name()='ReceiveIdoc']/*[local-name()='idocData']
    (b) Choose "String" for the Node Encoding.
    What we've done is use an XPath to pull out the value of the "idocData" node from the XML. Your Receive Location will now emit text containing only the idoc data. You can at this point, for example, use the Flat File pipeline component to convert the flat text into a different xml format based on some other schema you already have, and receive your version of the xml formatted message in your orchestration. This was potentially a much easier solution than adding the static maps to the orchestrations, and it overcame the issue with 'Typed' delivery documents. Not quite so fast… Note: When I followed Mustansir's blog the characters at the end of each line disappeared. After configuring the adaptor and passing the iDoc data into the original flat file receive pipelines I was receiving exceptions: There was a failure executing the receive pipeline: "PAPINETPipelines.DeliveryFlatFileReceive, CustomerIntegration2.PAPINET.Pipelines, Version=1.0.0.0, Culture=neutral, PublicKeyToken=4ca3635fbf092bbb" Source: "Pipeline " Receive Port: "recSAP_Delivery" URI: "D:\CustomerIntegration2\SAP\Delivery\*.xml" Reason: An error occurred when parsing the incoming document: "Unexpected data found while looking for: 'Z2EDPZ7' The current definition being parsed is E2EDP07GRP. The stream offset where the error occured is 8859. The line number where the error occured is 23. The column where the error occured is 0.". Although the new flat file looked the same as the old one, there was a difference. In the original file all lines in the document were exactly 1064 characters long. In the new file all lines were truncated at the last alphanumeric character. The final piece of the puzzle was to add a custom pipeline component to pad all the lines to 1064 characters. This component was added to the decode node of the custom delivery and invoice flat file disassembler pipelines.
    Execute method of the custom pipeline component:

        public IBaseMessage Execute(IPipelineContext pc, IBaseMessage inmsg)
        {
            // Convert the body stream to a string
            Stream s = null;
            IBaseMessagePart bodyPart = inmsg.BodyPart;

            // NOTE inmsg.BodyPart.Data is implemented only as a setter in the http adapter API and a
            // getter and setter for the file adapter. Use GetOriginalDataStream to get data instead.
            if (bodyPart != null)
                s = bodyPart.GetOriginalDataStream();

            string newMsg = string.Empty;
            string strLine;
            try
            {
                StreamReader sr = new StreamReader(s);
                strLine = sr.ReadLine();
                while (strLine != null)
                {
                    // Pad each line back out to the fixed 1064-character iDoc record length
                    strLine = strLine.PadRight(1064, ' ') + "\r\n";
                    newMsg += strLine;
                    strLine = sr.ReadLine();
                }
                sr.Close();
            }
            catch (IOException)
            {
                throw new Exception("Error occurred trying to pad the message to 1064 characters");
            }

            // Convert back to a stream and set it on the Data property
            inmsg.BodyPart.Data = new MemoryStream(Encoding.UTF8.GetBytes(newMsg));
            // Reset the position of the stream to zero
            inmsg.BodyPart.Data.Position = 0;
            return inmsg;
        }
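    One note on the implementation above: concatenating each padded line onto newMsg with += allocates a new string on every iteration, which can get expensive for large iDocs. A minimal alternative sketch of the same loop using StringBuilder (this is my own variation, not part of the original component; it assumes the same source stream s and a using System.Text; directive):

        // Hypothetical variation: build the padded message with StringBuilder instead of string concatenation.
        StringBuilder padded = new StringBuilder();
        string line;
        using (StreamReader reader = new StreamReader(s))
        {
            while ((line = reader.ReadLine()) != null)
            {
                // Same padding rule as above: fixed 1064-character records plus CRLF
                padded.Append(line.PadRight(1064, ' '));
                padded.Append("\r\n");
            }
        }
        string newMsg = padded.ToString();

    The rest of the Execute method would be unchanged.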

    Read the article

  • Silverlight for Everyone!!

    - by subodhnpushpak
    Someone asked me to compare Silverlight / HTML development. I realized that the question can be answered in many ways. Below is a high-level comparison between an HTML/JavaScript client and a Silverlight client, and why Silverlight was chosen over an HTML/JavaScript client (based on the type of users and the major functionality provided):
    1. For end users
    - Browser compatibility: Silverlight is a plug-in and requires installation first. However, it provides a consistent look and feel across all browsers. For HTML/DHTML, there is a need to tweak JavaScript for each of the supported browsers; in fact, tags like <span> and <div> work differently on different browsers/versions. So HTML works on most systems, but it also requires a lot of coding effort to adhere to all standards/browsers/versions.
    - Out-of-browser support: No support in HTML. Third-party tools like Google Gears offer some functionality, but there are lots of issues around platform and accessibility. Silverlight has out-of-the-box support for running out of the browser and provides features like drag and drop onto the application surface.
    - Cut, copy and paste: HTML is displayed in the browser, which in turn provides facilities for cut, copy and paste. Silverlight (especially 4) provides rich cut-copy-paste features, along with full control over what end users can cut, copy and paste, and advanced features like visual tree printing.
    - Rich user experience: HTML can provide some rich experience by use of JavaScript libraries like jQuery. However, extensive use of JavaScript, combined with the various browser versions and the JavaScript they support, makes the solution cumbersome. Silverlight is meant for an RIA experience.
    - User data storage on the client: In HTML only a small amount of data can be stored, and only in cookies. In Silverlight larger data may be stored, and in a secure way, which improves response times.
    - Post back: In HTML/JavaScript the post back can be avoided by use of AJAX, but extensive use of AJAX can be a bottleneck as the browser stack is used for the calls, and both look-and-feel and data travel over the network. In Silverlight everything runs on the client side; calls are made to the server ONLY for data, which also reduces network traffic in the long run.
    2. For Developers
    - Coding effort: HTML/JavaScript can take a considerable amount of code if the features (requirements) are rich. For AJAX-like interfaces, knowledge of third-party kits like DOJO / Yahoo UI / jQuery is required, and these have a steep learning curve. The ASP.NET coding world revolves mostly around <table> tags for alignment, whereas most popular tools produce <div> tags, which requires lots of tweaking. AJAX calls can be a performance bottleneck if the calls are many. In Silverlight, coding is in C#, which is managed code; XAML is very intuitive, and Blend can be used to provide the look and feel. Event handling is much cleaner than in JavaScript. Silverlight provides for many clean patterns like MVVM and composable applications. Each call to the server is asynchronous in Silverlight, AJAX is built in, and threading can be done on the client side itself to provide better responsiveness.
    - Debugging: Debugging in HTML/JavaScript is difficult; as JavaScript is interpreted, there is NO compile-time error checking. Debugging in Silverlight is very helpful; as it is compiled, it provides rich features for both compile-time and run-time error handling.
    - Multi-targeting browsers: HTML/JavaScript have different rendering behaviours in different browsers and their versions, and JavaScript has to be written to smooth over the differences in browser behaviour. Silverlight works exactly the same in all browsers and works on almost all popular browsers.
    - Multi-targeting the desktop: No support in HTML/JavaScript. Silverlight is very close to WPF; both platforms may be easily targeted while maintaining the same source code.
    - Rich toolkit: HTML/JavaScript have a limited toolkit of controls. Silverlight provides a rich set of controls including graphs, audio, video, layout, etc.
    3. For Architects
    - Design patterns: Silverlight provides for patterns like MVVM (MVC) and a rich (fat) client architecture. This segregates the "separation of concerns" very clearly: the client (Silverlight) does what it is expected to do, and the server does what is expected of it. In the HTML/JavaScript world most of the processing is done on the server side. (A minimal ViewModel sketch appears at the end of this post.)
    - Extensibility: Silverlight provides a great deal of extensibility, as custom controls may be built. Extensibility is NOT restricted by the browser but by the plug-in Silverlight runs in. HTML/JavaScript works in a certain way, and extensibility is generally done on the server side rather than the client end; the client side is restricted by the limitations of the browser.
    - Performance: Silverlight provides local storage which may be used for caching data; this reduces response time. As processing can be done on the client side itself, there is no need for server round trips, which decreases the turnaround time. The look and feel of the application is downloaded ONLY initially; afterwards ONLY data is fetched from the server.
    - Security: Silverlight is compiled code downloaded as a .XAP; compared to HTML/JavaScript, it provides a more secure, sandboxed approach. Cross-scripting is inherently prohibited in Silverlight by default. If proper guidelines are followed, Silverlight provides a much more robust security mechanism than the HTML/JavaScript world. For example, finding the server address in obfuscated JavaScript is easier than in a compressed, compiled, obfuscated Silverlight .XAP file.
    Some of these features (like offline and Canvas support) will be available in HTML5. However, the timelines are not encouraging at all. According to Ian Hickson, editor of the HTML5 specification, the specification is expected to reach the W3C Candidate Recommendation stage during 2012, and W3C Recommendation in the year 2022 or later. See http://en.wikipedia.org/wiki/HTML5 for details. The above is MY opinion. I would love to hear yours; do let me know via comments. Technorati Tags: Silverlight
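    To make the MVVM point a little more concrete, here is a minimal, hypothetical ViewModel sketch in C#. The class and property names are invented for illustration; the only technique shown is the standard INotifyPropertyChanged change notification that Silverlight data binding relies on.

        using System.ComponentModel;

        // Minimal ViewModel: a single bindable property with change notification.
        public class CustomerViewModel : INotifyPropertyChanged
        {
            private string _name;

            public string Name
            {
                get { return _name; }
                set
                {
                    if (_name != value)
                    {
                        _name = value;
                        OnPropertyChanged("Name");
                    }
                }
            }

            public event PropertyChangedEventHandler PropertyChanged;

            protected void OnPropertyChanged(string propertyName)
            {
                var handler = PropertyChanged;
                if (handler != null)
                    handler(this, new PropertyChangedEventArgs(propertyName));
            }
        }

    A XAML element bound with {Binding Name, Mode=TwoWay} would then stay in sync with this property on the client, which is the kind of client-side separation of concerns described above.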

    Read the article

  • .NET to iOS: From WinForms to the iPad

    - by RobertChipperfield
    One of the great things about working at Red Gate is getting to play with new technology - and right now, that means mobile. A few weeks ago, we decided that a little research into the tablet computing arena was due, and purely from a numbers point of view, that suggested the iPad as a good target device. A quick trip to iPhoneDevCon in San Diego later, and Marine and I came back full of ideas, and with some concept of how iOS development was meant to work. Here's how we went from there to the release of Stacks & Heaps, our geeky take on the classic "Snakes & Ladders" game. Step 1: Buy a Mac I've played with many operating systems in my time: from the original BBC Model B, through DOS, Windows, Linux, and others, but I'd so far managed to avoid buying fruit-flavoured computer hardware! If you want to develop for the iPhone, iPad or iPod Touch, that's the first thing that needs to change. If you've not used OS X before, the first thing you'll realise is that everything is different! In the interests of avoiding a flame war in the comments section, I'll only go so far as to say that a lot of my Windows-flavoured muscle memory no longer worked. If you're in the UK, you'll also realise your keyboard is lacking a # key, and that " and @ are the other way around from normal. The wonderful Ukelele keyboard layout editor restores some sanity here, as long as you don't look at the keyboard when you're typing. I couldn't give up the PC entirely, but a handy application called Synergy comes to the rescue - it lets you share a single keyboard and mouse between multiple machines. There's a few limitations: Alt-Tab always seems to go to the Mac, and Windows 7's UAC dialogs require the local mouse for security reasons, but it gets you a long way at least. Step 2: Register as an Apple Developer You can register as an Apple Developer free of charge, and that lets you download XCode and the iOS SDK. You also get the iPhone / iPad emulator, which is handy, since you'll need to be a paid member before you can deploy your apps to a real device. You can either enroll as an individual, or as a company. They both cost the same ($99/year), but there's a few differences between them. If you register as a company, you can add multiple developers to your team (all for the same $99 - not $99 per developer), and you get to use your company name in the App Store. However, you'll need to send off significantly more documentation to Apple, and I suspect the process takes rather longer than for an individual, where they just need to verify some credit card details. Here's a tip: if you're registering as a company, do so as early as possible. The approval process can take a while to complete, so get the application in in plenty of time. Step 3: Learn to love the square brackets! Objective-C is the language of the iPad. C and C++ are also supported, and if you're doing some serious game development, you'll probably spend most of your time in C++ talking OpenGL, but for forms-based apps, you'll be interacting with a lot of the Objective-C SDK. Like shifting from Ctrl-C to Cmd-C, it feels a little odd at first, with the familiar string.format(.) turning into: NSString *myString = [NSString stringWithFormat:@"Hello world, it's %@", [NSDate date]]; Thankfully XCode's auto-complete is normally passable, if not up to Visual Studio's standards, which coupled with a huge amount of content on Stack Overflow means you'll soon get to grips with the API. 
You'll need to get used to some terminology changes, though; here's an incomplete approximation: Coming from a .NET background, there's some luxuries you no longer have developing Objective C in XCode: Generics! Remember back in .NET 1.1, when all collections were just objects? Yup, we're back there now. ReSharper. Or, more generally, very much refactoring support. The not-many-keystrokes to rename a class, its file, and al references to it in Visual Studio turns into a much more painful experience in XCode. Garbage collection. This is actually rather less of an issue than you might expect: if you follow the rules, the reference counting provided by Objective C gets you a long way without too much pain. Circular references are their usual problematic self, though. Decent exception handling. You do have exceptions, but they're nowhere near as widely used. Generally, if something goes wrong, you get nil (see translation table above) back. Which brings me on to. Calling a method on a nil object isn't a failure - it just returns nil itself! There's many arguments for and against this, but personally I fall into the "stuff should fail as quickly and explicitly as possible" camp. Less specifically, I found that there's more chance of code failing at runtime rather than getting caught at compile-time: using the @selector(.) syntax to pass a method signature isn't (can't be) checked at compile-time, so the first you know about a typo is a crash when you try and call it. The solution to this is of course lots of great testing, both automated and manual, but I still find comfort in provably correct type safety being enforced in addition to testing. Step 4: Submit to the App Store Assuming you want to distribute to more than a handful of devices, you're going to need to submit your app to the Apple App Store. There's a few gotchas in terms of getting builds signed with the right certificates, and you'll be bouncing around between XCode and iTunes Connect a fair bit, but eventually you get everything checked off the to-do list, and are ready to upload your first binary! With some amount of anticipation, I pressed the Upload button in XCode, ready to release our creation into the world, but was instead greeted by an error informing me my XML file was malformed. Uh. A little Googling later, and it turned out that a simple rename from "Stacks&Heaps.app" to "StacksAndHeaps.app" worked around an XML escaping bug, and we were good to go. The next step is to wait for approval (or otherwise). After a couple of weeks of intensive development, this part is agonising. Did we make it? The Apple jury is still out at the moment, but our fingers are firmly crossed! In the meantime, you can see some screenshots and leave us your email address if you'd like us to get in touch when it does go live at the MobileFoo website. Step 5: Profit! Actually, that wasn't the idea here: Stacks & Heaps is free; there's no adverts, and we're not going to sell all your data either. So why did we do it? We wanted to get an idea of what it's like to move from coding for a desktop environment, to something completely different. We don't know whether in a year's time, the iPad will still be the dominant force, or whether Android will have smoothed out some bugs, tweaked the performance, and polished the UI, but I think it's a fairly sure bet that the tablet form factor is here to stay. We want to meet people who are using it, start chatting to them, and find out about some of the pain they're feeling. 
What better way to do that than do it ourselves, and get to write a cool game in the process?
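    The nil-messaging point is easier to picture with a rough C# analogy: the null-conditional operator (C# 6) gives you much the same "quietly return null" semantics that Objective-C gives you by default. The Person and Address types below are purely illustrative:

    // Rough C# analogy of Objective-C's nil messaging; the types are purely illustrative.
    public class Address { public string City { get; set; } }
    public class Person  { public Address Address { get; set; } }

    public static class NilMessagingDemo
    {
        public static void Main()
        {
            Person person = null;

            // Plain member access fails fast, as the post's author prefers:
            // string city = person.Address.City;   // throws NullReferenceException

            // The null-conditional operator behaves like Objective-C's nil messaging:
            // the whole chain quietly evaluates to null instead of failing.
            string city = person?.Address?.City;
            System.Console.WriteLine(city ?? "(no city - the null was swallowed)");
        }
    }

    Whether you prefer the silent propagation or the immediate exception is exactly the fail-fast trade-off described above.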

    Read the article

  • SQL SERVER – Faster SQL Server Databases and Applications – Power and Control with SafePeak Caching Options

    - by Pinal Dave
    Update: This blog post is written based on the SafePeak, which is available for free download. Today, I’d like to examine more closely one of my preferred technologies for accelerating SQL Server databases, SafePeak. Safepeak’s software provides a variety of advanced data caching options, techniques and tools to accelerate the performance and scalability of SQL Server databases and applications. I’d like to look more closely at some of these options, as some of these capabilities could help you address lagging database and performance on your systems. To better understand the available options, it is best to start by understanding the difference between the usual “Basic Caching” vs. SafePeak’s “Dynamic Caching”. Basic Caching Basic Caching (or the stale and static cache) is an ability to put the results from a query into cache for a certain period of time. It is based on TTL, or Time-to-live, and is designed to stay in cache no matter what happens to the data. For example, although the actual data can be modified due to DML commands (update/insert/delete), the cache will still hold the same obsolete query data. Meaning that with the Basic Caching is really static / stale cache.  As you can tell, this approach has its limitations. Dynamic Caching Dynamic Caching (or the non-stale cache) is an ability to put the results from a query into cache while maintaining the cache transaction awareness looking for possible data modifications. The modifications can come as a result of: DML commands (update/insert/delete), indirect modifications due to triggers on other tables, executions of stored procedures with internal DML commands complex cases of stored procedures with multiple levels of internal stored procedures logic. When data modification commands arrive, the caching system identifies the related cache items and evicts them from cache immediately. In the dynamic caching option the TTL setting still exists, although its importance is reduced, since the main factor for cache invalidation (or cache eviction) become the actual data updates commands. Now that we have a basic understanding of the differences between “basic” and “dynamic” caching, let’s dive in deeper. SafePeak: A comprehensive and versatile caching platform SafePeak comes with a wide range of caching options. Some of SafePeak’s caching options are automated, while others require manual configuration. Together they provide a complete solution for IT and Data managers to reach excellent performance acceleration and application scalability for  a wide range of business cases and applications. Automated caching of SQL Queries: Fully/semi-automated caching of all “read” SQL queries, containing any types of data, including Blobs, XMLs, Texts as well as all other standard data types. SafePeak automatically analyzes the incoming queries, categorizes them into SQL Patterns, identifying directly and indirectly accessed tables, views, functions and stored procedures; Automated caching of Stored Procedures: Fully or semi-automated caching of all read” stored procedures, including procedures with complex sub-procedure logic as well as procedures with complex dynamic SQL code. 
All procedures are analyzed in advance by SafePeak’s  Metadata-Learning process, their SQL schemas are parsed – resulting with a full understanding of the underlying code, objects dependencies (tables, views, functions, sub-procedures) enabling automated or semi-automated (manually review and activate by a mouse-click) cache activation, with full understanding of the transaction logic for cache real-time invalidation; Transaction aware cache: Automated cache awareness for SQL transactions (SQL and in-procs); Dynamic SQL Caching: Procedures with dynamic SQL are pre-parsed, enabling easy cache configuration, eliminating SQL Server load for parsing time and delivering high response time value even in most complicated use-cases; Fully Automated Caching: SQL Patterns (including SQL queries and stored procedures) that are categorized by SafePeak as “read and deterministic” are automatically activated for caching; Semi-Automated Caching: SQL Patterns categorized as “Read and Non deterministic” are patterns of SQL queries and stored procedures that contain reference to non-deterministic functions, like getdate(). Such SQL Patterns are reviewed by the SafePeak administrator and in usually most of them are activated manually for caching (point and click activation); Fully Dynamic Caching: Automated detection of all dependent tables in each SQL Pattern, with automated real-time eviction of the relevant cache items in the event of “write” commands (a DML or a stored procedure) to one of relevant tables. A default setting; Semi Dynamic Caching: A manual cache configuration option enabling reducing the sensitivity of specific SQL Patterns to “write” commands to certain tables/views. An optimization technique relevant for cases when the query data is either known to be static (like archive order details), or when the application sensitivity to fresh data is not critical and can be stale for short period of time (gaining better performance and reduced load); Scheduled Cache Eviction: A manual cache configuration option enabling scheduling SQL Pattern cache eviction based on certain time(s) during a day. A very useful optimization technique when (for example) certain SQL Patterns can be cached but are time sensitive. Example: “select customers that today is their birthday”, an SQL with getdate() function, which can and should be cached, but the data stays relevant only until 00:00 (midnight); Parsing Exceptions Management: Stored procedures that were not fully parsed by SafePeak (due to too complex dynamic SQL or unfamiliar syntax), are signed as “Dynamic Objects” with highest transaction safety settings (such as: Full global cache eviction, DDL Check = lock cache and check for schema changes, and more). The SafePeak solution points the user to the Dynamic Objects that are important for cache effectiveness, provides easy configuration interface, allowing you to improve cache hits and reduce cache global evictions. Usually this is the first configuration in a deployment; Overriding Settings of Stored Procedures: Override the settings of stored procedures (or other object types) for cache optimization. For example, in case a stored procedure SP1 has an “insert” into table T1, it will not be allowed to be cached. However, it is possible that T1 is just a “logging or instrumentation” table left by developers. 
By overriding the settings a user can allow caching of the problematic stored procedure; Advanced Cache Warm-Up: Creating an XML-based list of queries and stored procedure (with lists of parameters) for periodically automated pre-fetching and caching. An advanced tool allowing you to handle more rare but very performance sensitive queries pre-fetch them into cache allowing high performance for users’ data access; Configuration Driven by Deep SQL Analytics: All SQL queries are continuously logged and analyzed, providing users with deep SQL Analytics and Performance Monitoring. Reduce troubleshooting from days to minutes with database objects and SQL Patterns heat-map. The performance driven configuration helps you to focus on the most important settings that bring you the highest performance gains. Use of SafePeak SQL Analytics allows continuous performance monitoring and analysis, easy identification of bottlenecks of both real-time and historical data; Cloud Ready: Available for instant deployment on Amazon Web Services (AWS). As you can see, there are many options to configure SafePeak’s SQL Server database and application acceleration caching technology to best fit a lot of situations. If you’re not familiar with their technology, they offer free-trial software you can download that comes with a free “help session” to help get you started. You can access the free trial here. Also, SafePeak is available to use on Amazon Cloud. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL
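    The Basic vs. Dynamic distinction described above is easier to see in code. Here is a minimal conceptual sketch in C# — not SafePeak's implementation — of a TTL-only cache versus one that also evicts entries when a dependent table is written to:

    // Conceptual sketch only, not SafePeak's implementation.
    // Basic caching: entries live until their TTL expires, even if the underlying data changes.
    // Dynamic caching: a write to a table also evicts every cached query that depends on it.
    using System;
    using System.Collections.Generic;

    public class QueryCache
    {
        private class Entry { public object Result; public DateTime Expires; public HashSet<string> Tables; }
        private readonly Dictionary<string, Entry> _cache = new Dictionary<string, Entry>();

        public void Put(string sql, object result, TimeSpan ttl, params string[] dependentTables)
        {
            _cache[sql] = new Entry
            {
                Result = result,
                Expires = DateTime.UtcNow + ttl,
                Tables = new HashSet<string>(dependentTables, StringComparer.OrdinalIgnoreCase)
            };
        }

        public bool TryGet(string sql, out object result)
        {
            Entry e;
            if (_cache.TryGetValue(sql, out e) && e.Expires > DateTime.UtcNow) { result = e.Result; return true; }
            result = null;
            return false;                                    // expired or never cached
        }

        // "Dynamic" part: call this when a DML command (or a proc containing one) touches a table.
        public void OnTableWritten(string table)
        {
            var stale = new List<string>();
            foreach (var pair in _cache)
                if (pair.Value.Tables.Contains(table)) stale.Add(pair.Key);
            foreach (var key in stale) _cache.Remove(key);   // immediate eviction, no waiting for TTL
        }
    }

    With Basic Caching only the TTL check in TryGet exists; the transaction awareness SafePeak adds corresponds to the OnTableWritten step, driven by its parsing of incoming DML and stored procedure code.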

    Read the article

  • Use BGInfo to Build a Database of System Information of Your Network Computers

    - by Sysadmin Geek
    One of the more popular tools of the Sysinternals suite among system administrators is BGInfo, which tacks real-time system information onto your desktop wallpaper when you first log in. For obvious reasons, having information such as system memory, available hard drive space and system up time (among others) right in front of you is very convenient when you are managing several systems. A little known feature of this handy utility is the ability to have system information automatically saved to a SQL database or some other data file. With a few minutes of setup work you can easily configure BGInfo to record system information for all your network computers in a centralized storage location. You can then use this data to monitor or report on these systems however you see fit. BGInfo Setup If you are familiar with BGInfo, you can skip this section. However, if you have never used this tool, it takes just a few minutes to set up in order to capture the data you are looking for. When you first open BGInfo, a timer will be counting down in the upper right corner. Click the countdown button to keep the interface up so we can edit the settings. Now edit the information you want to capture from the available fields on the right. Since all the output will be redirected to a central location, don't worry about configuring the layout or formatting. Configuring the Storage Database BGInfo supports the ability to store information in several database formats: SQL Server Database, Access Database, Excel and Text File. To configure this option, open File > Database. Using a Text File The simplest, and perhaps most practical, option is to store the BGInfo data in a comma separated text file. This format allows the file to be opened in Excel or imported into a database. To use a text file or any other file system type (Excel or MS Access), simply provide the UNC path to the respective file. The account running the task that writes to this file will need read/write access to both the share and the NTFS file permissions. When using a text file, the only option is to have BGInfo create a new entry each time the capture process is run, which will add a new line to the respective CSV text file. Using a SQL Database If you prefer to have the data dropped straight into a SQL Server database, BGInfo supports this as well. This requires a bit of additional configuration, but overall it is very easy. The first step is to create a database where the information will be stored. Additionally, you will want to create a user account to fill data into this table (and this table only). For your convenience, this script creates a new database and user account (run this as Administrator on your SQL Server machine):
    @SET Server=%ComputerName%.
    @SET Database=BGInfo
    @SET UserName=BGInfo
    @SET Password=password
    SQLCMD -S "%Server%" -E -Q "Create Database [%Database%]"
    SQLCMD -S "%Server%" -E -Q "Create Login [%UserName%] With Password=N'%Password%', DEFAULT_DATABASE=[%Database%], CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF"
    SQLCMD -S "%Server%" -E -d "%Database%" -Q "Create User [%UserName%] For Login [%UserName%]"
    SQLCMD -S "%Server%" -E -d "%Database%" -Q "EXEC sp_addrolemember N'db_owner', N'%UserName%'"
    Note the SQL user account must have 'db_owner' permissions on the database in order for BGInfo to work correctly. This is why you should have a SQL user account specifically for this database. Next, configure BGInfo to connect to this database by clicking on the SQL button. Fill out the connection properties according to your database settings. 
Select whether to keep only one entry per computer or to keep a history of each system. The data will then be dropped directly into a table named "BGInfoTable" in the respective database. Configure User Desktop Options While the primary function of BGInfo is to alter the user's desktop by adding system info as part of the wallpaper, for our use here we want to leave the user's wallpaper alone so this process runs without altering any of the user's settings. Click the Desktops button and configure the Wallpaper modifications to not alter anything. Preparing the Deployment Now we are all set to deploy the configuration to the individual machines so we can start capturing the system data. If you have not done so already, click the Apply button to create the first entry in your data repository. If all is configured correctly, you should be able to open your data file or database and see the entry for the respective machine. Now click the File > Save As menu option and save the configuration as "BGInfoCapture.bgi". Deploying to Client Machines Deployment to the respective client machines is pretty straightforward. No installation is required; you just need to copy BGInfo.exe and BGInfoCapture.bgi to each machine and place them in the same directory. Once in place, just run the command: BGInfo.exe BGInfoCapture.bgi /Timer:0 /Silent /NoLicPrompt Of course, you probably want the capture process to run on a schedule. This command creates a Scheduled Task to run the capture process at 8 AM every morning and assumes you copied the required files to the root of your C drive: SCHTASKS /Create /SC DAILY /ST 08:00 /TN "System Info" /TR "C:\BGInfo.exe C:\BGInfoCapture.bgi /Timer:0 /Silent /NoLicPrompt" Adjust as needed; the end result is a scheduled task that runs the command above. Download BGInfo from Sysinternals
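    Once entries are accumulating in the BGInfoTable, reporting on them is just a query away. Here is a minimal C# sketch; the connection details follow the assumptions of the setup script above (local server, BGInfo database and BGInfo login), so adjust them for your environment:

    // Minimal sketch: dump the rows BGInfo has written to the BGInfoTable.
    // Connection details mirror the assumptions of the setup script above - adjust for your environment.
    using System;
    using System.Data.SqlClient;

    class BgInfoReport
    {
        static void Main()
        {
            var connectionString = "Server=localhost;Database=BGInfo;User Id=BGInfo;Password=password;";
            using (var connection = new SqlConnection(connectionString))
            using (var command = new SqlCommand("SELECT * FROM BGInfoTable", connection))
            {
                connection.Open();
                using (var reader = command.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        // Column names depend on the fields you selected in BGInfo, so print them generically.
                        for (int i = 0; i < reader.FieldCount; i++)
                            Console.Write("{0}={1}  ", reader.GetName(i), reader.GetValue(i));
                        Console.WriteLine();
                    }
                }
            }
        }
    }

    Printing the columns generically avoids guessing at column names, since those depend on the fields you selected in the BGInfo interface.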

    Read the article

  • A Good Developer is So Hard to Find

    - by James Michael Hare
    Let me start out by saying I want to damn the writers of the Toughest Developer Puzzle Ever – 2. It is eating every last shred of my free time! But as I've been churning through each puzzle and marvelling at the brain teasers and trivia within, I began to think about interviewing developers and why it seems to be so hard to find good ones. The problem is, no matter how hard we try to find the perfect way to separate the wheat from the chaff, inevitably someone will get hired who falls far short of expectations, or someone who could have been an excellent team member will get passed over for missing a piece of trivia or a tricky brain teaser. In shops that are primarily software-producing businesses or other heavily IT-oriented businesses (Microsoft, Amazon, etc.) there often exists a much tighter bond between HR and the hiring development staff, because development is their life-blood. Unfortunately, many of us work in places where IT is viewed as a cost or just a means to an end. In these shops, too often, HR and development staff may work against each other due to differences in opinion as to what a good developer is or what one is worth. It seems that if you ask two different people what makes a good developer, often you will get three different opinions. With the exception of those shops that are purely development-centric (you guys have it much easier!), most other shops have management who have very little knowledge about the development process. Their view can often be that development is simply a skill that one learns, and that once it is acquired, that developer can produce widgets as well as the next, like workers on an assembly-line floor. On the other side, you have many developers who feel that software development is an art unto itself, and that the ability to create the purest design, or know the most obscure keywords, or write the shortest possible obfuscated piece of code, is what makes a good coder. So is it a skill? An art? Or something entirely in between? Saying that software is merely a skill and one just needs to learn the syntax and tools would be akin to saying anyone who knows English and can use Word can write a 300 page book that is accurate, meaningful, and stays true to the point. This just isn't so. It takes more than mere skill to take words and form a sentence, join those sentences into paragraphs, and those paragraphs into a document. I've interviewed candidates who could answer obscure syntax and keyword questions and, once they were hired, could not code effectively at all. So development must be more than a skill. But on the other end, we have art. Is development an art? Is our end result to produce art? I can marvel at a piece of code -- see it as concise and beautiful -- and yet that code must perform some stated function with accuracy and efficiency and maintainability. None of these three things have anything to do with art, per se. Art is beauty for its own sake and is a wonderful thing. But if you apply that same thought to development it just doesn't hold. I've had developers tell me that all that matters is the end result, and that how you code it is entirely part of the art, and I couldn't disagree more. Yes, the end result, the accuracy, is the prime criterion to be met. But if code is not maintainable and efficient, it would be just as useless as a beautiful car that breaks down once a week or that gets 2 miles to the gallon.  
Yes, it may work in that it moves you from point A to point B and is pretty as hell, but if it can't be maintained or is not efficient, it's not a good solution. So development must be something less than art. In the end, I feel like development is a matter of craftsmanship. We use our tools and we use our skills and set about constructing something that satisfies a purpose and yet is also elegant and efficient. There is skill involved, and there is an art, but really it boils down to being able to craft code. Crafting code is far more than writing code. Anyone can write code if they know the syntax, but very few people can actually craft code that solves a purpose and craft it well. So this is what I want to find: I want to find code craftsmen! But how? I used to ask coding-trivia questions a long time ago, and many people still fall back on this. The thought is that if you ask the candidate some piece of coding trivia and they know the answer, it must follow that they can craft good code. For example: What C++ keyword can be applied to a class/struct field to allow it to be changed even from a const instance of that class/struct? (answer: mutable) So what do we prove if a candidate can answer this? Only that they know what mutable means. One would hope that this would imply that they'd know how to use it and, more importantly, when and if it should ever be used! But it rarely does! The problem with trivia questions is that you will either: Approve a really good developer who knows what some obscure keyword is (good) Reject a really good developer who never needed to use that keyword or is too inexperienced to know how to use it (bad) Approve a really bad developer who googled "C++ Interview Questions" and studied like hell but can't craft (very bad) Many HR departments love these kinds of tests because they are short and easy to defend if a legal issue arises over hiring decisions. After all, it's easy to say a person wasn't hired because they scored 30 out of 100 on some trivia test. But unfortunately, you've eliminated a large part of your potential developer pool and possibly hired a few duds. There are times I've hired candidates who knew every trivia question I could throw at them and couldn't craft. And then there are times I've interviewed candidates who failed all my trivia but whom I took a chance on, and they were my best finds ever. So if not trivia, then what? Brain teasers? The thought is that these types of questions measure the thinking power of a candidate. The problem is, once again, you will either: Approve a good candidate who has never heard the problem and can solve it (good) Reject a good candidate who just happens not to see the "catch" because they're nervous or it may be really obscure (bad) Approve a candidate who has studied enough interview brain teasers (once again, you can google 'em) to recognize the "catch" or who knows the answer already (bad). Once again, you're eliminating good candidates and possibly accepting bad candidates. In these cases, testing someone with brain teasers only tests their ability to answer brain teasers, not their ability to craft code. So how do we measure someone's ability to craft code? Here's a novel idea: have them code! Give them a computer and a compiler, or a whiteboard and a pen, or paper and pencil, and have them construct a piece of code. It just makes sense that if we're going to hire someone to code, we should actually watch them code. 
When they're done, we can judge them on several criteria: Correctness - does the candidate's solution accurately solve the problem proposed? Accuracy - is the candidate's solution reasonably syntactically correct? Efficiency - did the candidate write or use the more efficient data structures or algorithms for the job? Maintainability - was the candidate's code free of obfuscation and clever tricks that diminish readability? Persona - are they eager and willing or aloof and egotistical?  Will they work well within your team? It may sound simple, or it may sound crazy, but when I'm looking to hire a developer, I want to see them actually develop well-crafted code.
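    To picture what "watch them code" looks like in practice, here is a hypothetical exercise of the kind described above, together with the sort of well-crafted C# answer you would hope to see (both the problem and the code are purely illustrative):

    // Purely illustrative interview exercise: count how often each word appears in a sentence.
    // A well-crafted answer is correct, uses an appropriate data structure, and stays readable.
    using System;
    using System.Collections.Generic;

    public static class WordCount
    {
        public static IDictionary<string, int> Count(string sentence)
        {
            var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
            foreach (var word in sentence.Split(new[] { ' ', ',', '.', ';' }, StringSplitOptions.RemoveEmptyEntries))
            {
                int current;
                counts.TryGetValue(word, out current);
                counts[word] = current + 1;
            }
            return counts;
        }
    }

    The conversation that follows -- why a Dictionary rather than a List, whether case should matter, how they would test it -- usually reveals far more about craftsmanship than any obscure-keyword question.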

    Read the article

  • Oracle's Global Single Schema

    - by david.butler(at)oracle.com
    Maximizing business process efficiencies in a heterogeneous environment is very difficult. The difficulty stems from the fact that the various applications across the Information Technology (IT) landscape employ different integration standards, different message passing strategies, and different workflow engines. Vendors such as Oracle and others are delivering tools to help IT organizations manage the complexities introduced by these differences. But the one remaining intractable problem impacting efficient operations is the fact that these applications have different definitions for the same business data. Business data is your business information codified for computer programs to use. A good data model will represent the way your organization does business. The computer applications your organization deploys to improve operational efficiency are built to operate on the business data organized into this schema.  If the schema does not represent how you do business, the applications on that schema cannot provide the features you need to achieve the desired efficiencies. Business processes span these applications. Data problems break these processes rendering them far less efficient than they need to be to achieve organization goals. Thus, the expected return on the investment in these applications is never realized. The success of all business processes depends on the availability of accurate master data.  Clearly, the solution to this problem is to consolidate all the master data an organization uses to run its business. Then clean it up, augment it, govern it, and connect it back to the applications that need it. Until now, this obvious solution has been difficult to achieve because no one had defined a data model sufficiently broad, deep and flexible enough to support transaction processing on all key business entities and serve as a master superset to all other operational data models deployed in heterogeneous IT environments. Today, the situation has changed. Oracle has created an operational data model (aka schema) that can support accurate and consistent master data across heterogeneous IT systems. This is foundational for providing a way to consolidate and integrate master data without having to replace investments in existing applications. This Global Single Schema (GSS) represents a revolutionary breakthrough that allows for true master data consolidation. Oracle has deep knowledge of applications dating back to the early 1990s.  It developed applications in the areas of Supply Chain Management (SCM), Product Lifecycle Management (PLM), Enterprise Resource Planning (ERP), Customer Relationship Management (CRM), Human Capital Management (HCM), Financials and Manufacturing. In addition, Oracle applications were delivered for key industries such as Communications, Financial Services, Retail, Public Sector, High Tech Manufacturing (HTM) and more. Expertise in all these areas drove requirements for GSS. The following figure illustrates Oracle's unique position that enabled the creation of the Global Single Schema. GSS Requirements Gathering GSS defines all the key business entities and attributes including Customers, Contacts, Suppliers, Accounts, Products, Services, Materials, Employees, Installed Base, Sites, Assets, and Inventory to name just a few. In addition, Oracle delivers GSS pre-integrated with a wide variety of operational applications.  Business Process Automation EBusiness is about maximizing operational efficiency. 
At the highest level, these 'operations' span all that you do as an organization.  The following figure illustrates some of these high-level business processes. Enterprise Business Processes Supplies are procured. Assets are maintained. Materials are stored. Inventory is accumulated. Products and Services are engineered, produced and sold. Customers are serviced. And across this entire spectrum, Employees do the procuring, supporting, engineering, producing, selling and servicing. Not shown, but not to be overlooked, are the accounting and the financial processes associated with all this procuring, manufacturing, and selling activity. Supporting all these applications is the master data. When this data is fragmented and inconsistent, the business processes fail and inefficiencies multiply. But imagine having all the data under these operational business processes in one place. ·            The same accurate and timely customer data will be provided to all your operational applications from the call center to the point of sale. ·            The same accurate and timely supplier data will be provided to all your operational applications from supply chain planning to procurement. ·            The same accurate and timely product information will be available to all your operational applications from demand chain planning to marketing. You would have a single version of the truth about your assets, financial information, customers, suppliers, employees, products and services to support your business automation processes as they flow across your business applications. All company and partner personnel will access the same exact data entity across all your channels and across all your lines of business. Oracle's Global Single Schema enables this vision of a single version of the truth across the heterogeneous operational applications supporting the entire enterprise. Global Single Schema Oracle's Global Single Schema organizes hundreds of thousands of attributes into 165 major schema objects supporting over 180 business application modules. It is designed for international operations, and extensibility.  The schema is delivered with a full set of public Application Programming Interfaces (APIs) and an Integration Repository with modern Service Oriented Architecture interfaces to make data available as a services (DaaS) to business processes and enable operations in heterogeneous IT environments. ·         Key tables can be extended with unlimited numbers of additional attributes and attribute groups for maximum flexibility.  o    This enables model extensions that reflect business entities unique to your organization's operations. ·         The schema is multi-organization enabled so data manipulation can be controlled along organizational boundaries. ·         It uses variable byte Unicode to support over 31 languages. ·         The schema encodes flexible date and flexible address formats for easy localizations. No matter how complex your business is, Oracle's Global Single Schema can hold your business objects and support your global operations. Oracle's Global Single Schema identifies and defines the business objects an enterprise needs within the context of its business operations. The interrelationships between the business objects are also contained within the GSS data model. Their presence expresses fundamental business rules for the interaction between business entities. The following figure illustrates some of these connections.   
Interconnected Business Entities Interconnected business processes require interconnected business data. No other MDM vendor has this capability. Everyone else has either a single entity they can master or separate, disconnected models for the various business entities. Higher-level integrations are made available, but that is a weak architectural alternative to data-level integration in this critically important aspect of Master Data Management.

    Read the article

  • SortedDictionary and SortedList

    - by Simon Cooper
    Apart from Dictionary<TKey, TValue>, there are two other dictionaries in the BCL - SortedDictionary<TKey, TValue> and SortedList<TKey, TValue>. On the face of it, these two classes do the same thing - provide an IDictionary<TKey, TValue> interface where the iterator returns the items sorted by the key. So what's the difference between them, and when should you use one rather than the other? (as in my previous post, I'll assume you have some basic algorithm & data structure knowledge) SortedDictionary We'll first cover SortedDictionary. This is implemented as a special sort of binary tree called a red-black tree. Essentially, it's a binary tree that uses various constraints on how the nodes of the tree can be arranged to ensure the tree is always roughly balanced (for more gory algorithmic details, see the wikipedia link above). What I'm concerned about in this post is how the .NET SortedDictionary is actually implemented. In .NET 4, behind the scenes, the actual implementation of the tree is delegated to a SortedSet<KeyValuePair<TKey, TValue>>. One example tree might look like this: Each node in the above tree is stored as a separate SortedSet<T>.Node object (remember, in a SortedDictionary, T is instantiated to KeyValuePair<TKey, TValue>): class Node { public bool IsRed; public T Item; public SortedSet<T>.Node Left; public SortedSet<T>.Node Right; } The SortedSet only stores a reference to the root node; all the data in the tree is accessed by traversing the Left and Right node references until you reach the node you're looking for. Each individual node can be physically stored anywhere in memory; what's important is the relationship between the nodes. This is also why there is no constructor to SortedDictionary or SortedSet that takes an integer representing the capacity; there are no internal arrays that need to be created and resized. This may seem trivial, but it's an important distinction between SortedDictionary and SortedList that I'll cover later on. And that's pretty much it; it's a standard red-black tree. Plenty of webpages and data structure books cover the algorithms behind the tree itself far better than I could. What's interesting is the comparison between SortedDictionary and SortedList, which I'll cover at the end. As a side point, SortedDictionary has existed in the BCL ever since .NET 2. That means that, all through .NET 2, 3, and 3.5, there has been a bona-fide sorted set class in the BCL (called TreeSet). However, it was internal, so it couldn't be used outside System.dll. Only in .NET 4 was this class exposed as SortedSet. SortedList Whereas SortedDictionary didn't use any backing arrays, SortedList does. It is implemented just as the name suggests; two arrays, one containing the keys, and one the values (I've just used random letters for the values): The items in the keys array are always guaranteed to be stored in sorted order, and the value corresponding to each key is stored in the same index as the key in the values array. In this example, the value for key item 5 is 'z', and for key item 8 is 'm'. Whenever an item is inserted or removed from the SortedList, a binary search is run on the keys array to find the correct index, then all the items in the arrays are shifted to accommodate the new or removed item. 
For example, if the key 3 was removed, a binary search would be run to find the array index the item was at, then everything above that index would be moved down by one: and then if the key/value pair {7, 'f'} was added, a binary search would be run on the keys to find the index to insert the new item, and everything above that index would be moved up to accommodate the new item: If another item was then added, both arrays would be resized (to a length of 10) before the new item was added to the arrays. As you can see, any insertions or removals in the middle of the list require a proportion of the array contents to be moved; an O(n) operation. However, if the insertion or removal is at the end of the array (ie the largest key), then it's only O(log n); just the cost of the binary search to determine that it does actually need to be added at the end (excluding the occasional O(n) cost of resizing the arrays to fit more items). As a side effect of using backing arrays, SortedList offers IList Keys and Values views that simply use the backing keys or values arrays, as well as various methods utilising the array index of stored items, which SortedDictionary does not (and cannot) offer. The Comparison So, when should you use one and not the other? Well, here are the important differences: Memory usage SortedDictionary and SortedList have very different memory profiles. SortedDictionary... has a memory overhead of one object instance, a bool, and two references per item. On 64-bit systems, this adds up to ~40 bytes, not including the stored item and the reference to it from the Node object. stores the items in separate objects that can be spread all over the heap. This helps to keep memory fragmentation low, as the individual node objects can be allocated wherever there's a spare 60 bytes. In contrast, SortedList... has no additional overhead per item (only the reference to it in the array entries); however, the backing arrays can be significantly larger than you need; every time the arrays are resized they double in size. That means that if you add 513 items to a SortedList, the backing arrays will each have a length of 1024. To counteract this, the TrimExcess method resizes the arrays back down to the actual size needed, or you can simply assign list.Capacity = list.Count. stores its items in a contiguous block of memory. If the list stores thousands of items, this can cause significant problems with Large Object Heap memory fragmentation as the array resizes, which SortedDictionary doesn't have. Performance Operations on a SortedDictionary always have O(log n) performance, regardless of where in the collection you're adding or removing items. In contrast, SortedList has O(n) performance when you're altering the middle of the collection. If you're adding or removing from the end (ie the largest item), then performance is O(log n), same as SortedDictionary (in practice, it will likely be slightly faster, due to the array items all being in the same area in memory, also called locality of reference). So, when should you use one and not the other? As always with these sorts of things, there are no hard-and-fast rules. But generally, if you: need to access items using their index within the collection are populating the dictionary all at once from sorted data aren't adding or removing keys once it's populated then use a SortedList. 
But if you: don't know how many items are going to be in the dictionary are populating the dictionary from random, unsorted data are adding & removing items randomly then use a SortedDictionary. The default (again, there's no definite rules on these sort of things!) should be to use SortedDictionary, unless there's a good reason to use SortedList, due to the bad performance of SortedList when altering the middle of the collection.
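    As a quick illustration of that guidance, here is a small sketch contrasting the two collections (the element counts are arbitrary; the point is where the insertions land):

    // Sketch contrasting the two sorted dictionaries discussed above.
    using System;
    using System.Collections.Generic;

    class SortedComparison
    {
        static void Main()
        {
            // Random insertion order: each insert into a SortedList shifts array contents (O(n)),
            // while a SortedDictionary just rebalances its red-black tree (O(log n)).
            var random = new Random(42);
            var sortedDictionary = new SortedDictionary<int, string>();
            foreach (var key in RandomKeys(random, 10000))
                sortedDictionary[key] = "value";

            // Pre-sorted data appended at the end is the SortedList sweet spot:
            // every Add lands at the tail, so it's only a binary search plus an append.
            var sortedList = new SortedList<int, string>(capacity: 10000);
            for (int key = 0; key < 10000; key++)
                sortedList.Add(key, "value");

            // Index-based access is something only SortedList offers.
            Console.WriteLine("Third-smallest key: {0}", sortedList.Keys[2]);
        }

        static IEnumerable<int> RandomKeys(Random random, int count)
        {
            for (int i = 0; i < count; i++)
                yield return random.Next(int.MaxValue);
        }
    }

    The random-order insertions are where SortedList pays its O(n) shifting cost, while appending pre-sorted data and index-based access through Keys are exactly the scenarios where it shines.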

    Read the article

  • Microsoft Business Intelligence Seminar 2011

    - by DavidWimbush
    I was lucky enough to attend the maiden presentation of this at Microsoft Reading yesterday. It was pretty gripping stuff not only because of what was said but also because of what could only be hinted at. Here's what I took away from the day. (Disclaimer: I'm not a BI guru, just a reasonably experienced BI developer, so I may have misunderstood or misinterpreted a few things. Particularly when so much of the talk was about the vision and subtle hints of what is coming. Please comment if you think I've got anything wrong. I'm also not going to even try to cover Master Data Services as I struggled to imagine how you would actually use it.) I was a bit worried when I learned that the whole day was going to be presented by one guy but Rafal Lukawiecki is a very engaging speaker. He's going to be presenting this about 20 times around the world over the coming months. If you get a chance to hear him speak, I say go for it. No doubt some of the hints will become clearer as Denali gets closer to RTM. Firstly, things are definitely happening in the SQL Server Reporting and BI world. Traditionally IT would build a data warehouse, then cubes on top of that, and then publish them in a structured and controlled way. But, just as with many IT projects in general, by the time it's finished the business has moved on and the system no longer meets their requirements. This not sustainable and something more agile is needed but there has to be some control. Apparently we're going to be hearing the catchphrase 'Balancing agility with control' a lot. More users want more access to more data. Can they define what they want? Of course not, but they'll recognise it when they see it. It's estimated that only 28% of potential BI users have meaningful access to the data they need, so there is a real pent-up demand. The answer looks like: give them some self-service tools so they can experiment and see what works, and then IT can help to support the results. It's estimated that 32% of Excel users are comfortable with its analysis tools such as pivot tables. It's the power user's preferred tool. Why fight it? That's why PowerPivot is an Excel add-in and that's why they released a Data Mining add-in for it as well. It does appear that the strategy is going to be to use Reporting Services (in SharePoint mode), PowerPivot, and possibly something new (smiles and hints but no details) to create reports and explore data. Everything will be published and managed in SharePoint which gives users the ability to mash-up, share and socialise what they've found out. SharePoint also gives IT tools to understand what people are looking at and where to concentrate effort. If PowerPivot report X becomes widely used, it's time to check that it shows what they think it does and perhaps get it a bit more under central control. There was more SharePoint detail that went slightly over my head regarding where Excel Services and Excel Web Application fit in, the differences between them, and the suggestion that it is likely they will one day become one (but not in the immediate future). That basic pattern is set to be expanded upon by further exploiting Vertipaq (the columnar indexing engine that enables PowerPivot to store and process a lot of data fast and in a small memory footprint) to provide scalability 'from the desktop to the data centre', and some yet to be detailed advances in 'frictionless deployment' (part of which is about making the difference between local and the cloud pretty much irrelevant). 
Excel looks like becoming Microsoft's primary BI client. It already has: the ability to consume cubes; strong visualisation tools; slicers (which are part of Excel, not PowerPivot); a data mining add-in; and PowerPivot. A major hurdle for self-service BI is presenting the data in a consumable format. You can't just give users PowerPivot and a server with a copy of the OLTP database(s). Building cubes is labour intensive and doesn't always give the user what they need. This is where the BI Semantic Model (BISM) comes in. I gather it's a layer of metadata you define that can combine multiple data sources (and types of data source) into a clear 'interface' that users can work with. It comes with a new query language called DAX. SSAS cubes are unlikely to go away overnight because, with their pre-calculated results, they are still the most efficient way to work with really big data sets. A few other random titbits that came up: Reporting Services is going to get some good new stuff in Denali. Keep an eye on www.projectbotticelli.com for the slides. You can also view last year's seminar sessions which covered a lot of the same ground as far as the overall strategy is concerned. They plan to add more material as Denali's features are publicly exposed. Check out the PASS keynote address for a showing of Yahoo's SQL BI servers. Apparently they wheeled the rack out on stage still plugged in and running! Check out the Excel 2010 Data Mining Add-Ins. 32 bit only at present but 64 bit is on the way. There are lots of data sets, many of them free, at the Windows Azure Marketplace Data Market (where you can also get ESRI shape files). If you haven't already seen it, have a look at the Silverlight Pivot Viewer (http://weblogs.asp.net/scottgu/archive/2010/06/29/silverlight-pivotviewer-now-available.aspx). The Bing Maps Data Connector is worth a look if you're into spatial stuff (http://www.bing.com/community/site_blogs/b/maps/archive/2010/07/13/data-connector-sql-server-2008-spatial-amp-bing-maps.aspx).

    Read the article

  • Premature-Optimization and Performance Anxiety

    - by James Michael Hare
    While writing my post analyzing the new .NET 4 ConcurrentDictionary class (here), I fell into one of the classic blunders that I myself always love to warn about. After analyzing the timing differences between a Dictionary with locking and the new ConcurrentDictionary class, I noted that the ConcurrentDictionary was faster with read-heavy multi-threaded operations. Then, I made the classic blunder of thinking that because the original Dictionary with locking was faster for those write-heavy uses, it was the best choice for those types of tasks. In short, I fell into the premature-optimization anti-pattern. Basically, the premature-optimization anti-pattern is when a developer is coding very early for a perceived (whether rightly or wrongly) performance gain and sacrificing good design and maintainability in the process. At best, the performance gains are usually negligible, and at worst they can either negatively impact performance or degrade maintainability so much that time to market suffers or the code becomes very fragile due to the complexity. Keep in mind the distinction above. I'm not talking about valid performance decisions. There are decisions one should make when designing and writing an application that are valid performance decisions. Examples of this are knowing the best data structures for a given situation (Dictionary versus List, for example) and choosing performant algorithms (linear search vs. binary search). But these in my mind are macro optimizations. The error is not in deciding to use a better data structure or algorithm; the anti-pattern as stated above is when you attempt to over-optimize early on in such a way that it sacrifices maintainability. In my case, I was actually considering trading the safety and maintainability gains of the ConcurrentDictionary (no locking required) for a slight performance gain by using the Dictionary with locking. This would have been a mistake, as I would be trading maintainability (ConcurrentDictionary requires no locking, which helps readability) and safety (ConcurrentDictionary is safe for iteration even while being modified, and you don't risk the developer locking incorrectly) -- and I fell for it even when I knew to watch out for it. I think in my case, and it may be true for others as well, a large part of it was due to the time I was trained as a developer. I began college in the 90s when C and C++ were king and hardware speed and memory were still relatively priceless commodities and not to be squandered. In those days, using a long instead of a short could waste precious resources, and as such, we were taught to try to minimize space and favor performance. This is why in many cases such early code-bases were very hard to maintain. I don't know how many times I heard back then to avoid too many function calls because of the overhead -- and in fact just last year I heard a new hire in the company where I work declare that she didn't want to refactor a long method because of function call overhead. Now back then, that may have been a valid concern, but with today's modern hardware, even if you're calling a trivial method in an extremely tight loop (which chances are the JIT compiler would optimize anyway), the results of removing method calls to speed up performance are negligible for the great majority of applications. Now, obviously, there are those applications where speed is absolutely king (for example drivers, computer games, operating systems) where such sacrifices may be made.  
But I would strongly advise against such optimization because of its cost. Many folks who are performing an optimization think it's always a win-win: they're simply adding speed to the application, so what could possibly be wrong with that? What they don't realize is the cost of their choice. For every piece of straightforward code that you obfuscate with performance enhancements, you risk introducing bugs and adding to the long-term technical debt of the application. It will become so fragile over time that maintenance will become a nightmare. I've seen such applications in places I have worked. There are times I've seen applications where the designer was so obsessed with performance that they even designed their own memory management system for the application to try to squeeze out every ounce of performance. Unfortunately, the application's stability often suffers as a result, and it is very difficult for anyone other than the original designer to maintain. I've even seen this recently when I heard a C++ developer bemoaning that in VS2010 the iterators are about twice as slow as they used to be because Microsoft added range checking (probably as part of the 0x standard implementation). To me this was almost a joke. Twice as slow sounds bad, but it's almost never as bad as you think -- especially if you're gaining safety. The only time twice is really that much slower is when once was too slow to begin with. Think about it. 2 minutes is slow as a response time because 1 minute is slow. But if an iterator takes 1 microsecond to move one position and a new, safer iterator takes 2 microseconds, this is trivial! The only way you'd ever really notice this would be in iterating a collection just for the sake of iterating (i.e. no other operations). To my mind, the added safety makes the extra time worth it. Always favor safety and maintainability when you can. I know it can be a hard habit to break, especially if you started your career early or in a language such as C where developers are very performance conscious. But in reality, these types of micro-optimizations only end up hurting you in the long run. Remember the two laws of optimization. I'm not sure where I first heard these, but they are so true: For beginners: Do not optimize. For experts: Do not optimize yet. This is so true. If you're a beginner, resist the urge to optimize at all costs. And if you are an expert, delay that decision. As long as you have chosen the right data structures and algorithms for your task, your performance will probably be more than sufficient. Chances are it will be network, database, or disk hits that will be your slow-down, not your code. As they say, 98% of your code's bottleneck is in 2% of your code, so premature optimization may add maintenance and safety debt that won't have any measurable impact. Instead, code for maintainability and safety, and then, and only then, when you find a true bottleneck, go back and optimize further.
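    To make the trade-off concrete, here is a minimal sketch of the two approaches discussed -- a Dictionary guarded by a lock versus a ConcurrentDictionary -- applied to a simple shared counter (the counter scenario is illustrative):

    // Minimal sketch of the two approaches discussed above, applied to a shared word counter.
    using System.Collections.Concurrent;
    using System.Collections.Generic;

    public class LockedCounter
    {
        private readonly Dictionary<string, int> _counts = new Dictionary<string, int>();
        private readonly object _lock = new object();

        public void Increment(string key)
        {
            lock (_lock)   // every reader and writer must remember to take this lock
            {
                int current;
                _counts.TryGetValue(key, out current);
                _counts[key] = current + 1;
            }
        }
    }

    public class ConcurrentCounter
    {
        private readonly ConcurrentDictionary<string, int> _counts = new ConcurrentDictionary<string, int>();

        public void Increment(string key)
        {
            // No explicit locking to get wrong, and the dictionary stays safe to enumerate while it's modified.
            _counts.AddOrUpdate(key, 1, (k, current) => current + 1);
        }
    }

    Even if the locked version benchmarks slightly faster for a write-heavy workload, the second version is the one that is hard to use incorrectly, which is exactly the maintainability-and-safety argument above.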

    Read the article

  • SQL SERVER – SSIS Look Up Component – Cache Mode – Notes from the Field #028

    - by Pinal Dave
    [Notes from Pinal]: Lots of people think that SSIS is all about arranging various operations together in one logical flow. Well, the understanding is absolutely correct, but the implementation of the same is not as easy as it seems. Similarly most of the people think lookup component is just component which does look up for additional information and does not pay much attention to it. Due to the same reason they do not pay attention to the same and eventually get very bad performance. Linchpin People are database coaches and wellness experts for a data driven world. In this 28th episode of the Notes from the Fields series database expert Tim Mitchell (partner at Linchpin People) shares very interesting conversation related to how to write a good lookup component with Cache Mode. In SQL Server Integration Services, the lookup component is one of the most frequently used tools for data validation and completion.  The lookup component is provided as a means to virtually join one set of data to another to validate and/or retrieve missing values.  Properly configured, it is reliable and reasonably fast. Among the many settings available on the lookup component, one of the most critical is the cache mode.  This selection will determine whether and how the distinct lookup values are cached during package execution.  It is critical to know how cache modes affect the result of the lookup and the performance of the package, as choosing the wrong setting can lead to poorly performing packages, and in some cases, incorrect results. Full Cache The full cache mode setting is the default cache mode selection in the SSIS lookup transformation.  Like the name implies, full cache mode will cause the lookup transformation to retrieve and store in SSIS cache the entire set of data from the specified lookup location.  As a result, the data flow in which the lookup transformation resides will not start processing any data buffers until all of the rows from the lookup query have been cached in SSIS. The most commonly used cache mode is the full cache setting, and for good reason.  The full cache setting has the most practical applications, and should be considered the go-to cache setting when dealing with an untested set of data. With a moderately sized set of reference data, a lookup transformation using full cache mode usually performs well.  Full cache mode does not require multiple round trips to the database, since the entire reference result set is cached prior to data flow execution. There are a few potential gotchas to be aware of when using full cache mode.  First, you can see some performance issues – memory pressure in particular – when using full cache mode against large sets of reference data.  If the table you use for the lookup is very large (either deep or wide, or perhaps both), there’s going to be a performance cost associated with retrieving and caching all of that data.  Also, keep in mind that when doing a lookup on character data, full cache mode will always do a case-sensitive (and in some cases, space-sensitive) string comparison even if your database is set to a case-insensitive collation.  This is because the in-memory lookup uses a .NET string comparison (which is case- and space-sensitive) as opposed to a database string comparison (which may be case sensitive, depending on collation).  
There’s a relatively easy workaround in which you can use the UPPER() or LOWER() function in the pipeline data and the reference data to ensure that case differences do not impact the success of your lookup operation.  Again, neither of these present a reason to avoid full cache mode, but should be used to determine whether full cache mode should be used in a given situation. Full cache mode is ideally useful when one or all of the following conditions exist: The size of the reference data set is small to moderately sized The size of the pipeline data set (the data you are comparing to the lookup table) is large, is unknown at design time, or is unpredictable Each distinct key value(s) in the pipeline data set is expected to be found multiple times in that set of data Partial Cache When using the partial cache setting, lookup values will still be cached, but only as each distinct value is encountered in the data flow.  Initially, each distinct value will be retrieved individually from the specified source, and then cached.  To be clear, this is a row-by-row lookup for each distinct key value(s). This is a less frequently used cache setting because it addresses a narrower set of scenarios.  Because each distinct key value(s) combination requires a relational round trip to the lookup source, performance can be an issue, especially with a large pipeline data set to be compared to the lookup data set.  If you have, for example, a million records from your pipeline data source, you have the potential for doing a million lookup queries against your lookup data source (depending on the number of distinct values in the key column(s)).  Therefore, one has to be keenly aware of the expected row count and value distribution of the pipeline data to safely use partial cache mode. Using partial cache mode is ideally suited for the conditions below: The size of the data in the pipeline (more specifically, the number of distinct key column) is relatively small The size of the lookup data is too large to effectively store in cache The lookup source is well indexed to allow for fast retrieval of row-by-row values No Cache As you might guess, selecting no cache mode will not add any values to the lookup cache in SSIS.  As a result, every single row in the pipeline data set will require a query against the lookup source.  Since no data is cached, it is possible to save a small amount of overhead in SSIS memory in cases where key values are not reused.  In the real world, I don’t see a lot of use of the no cache setting, but I can imagine some edge cases where it might be useful. As such, it’s critical to know your data before choosing this option.  Obviously, performance will be an issue with anything other than small sets of data, as the no cache setting requires row-by-row processing of all of the data in the pipeline. I would recommend considering the no cache mode only when all of the below conditions are true: The reference data set is too large to reasonably be loaded into SSIS memory The pipeline data set is small and is not expected to grow There are expected to be very few or no duplicates of the key values(s) in the pipeline data set (i.e., there would be no benefit from caching these values) Conclusion The cache mode, an often-overlooked setting on the SSIS lookup component, represents an important design decision in your SSIS data flow.  Choosing the right lookup cache mode directly impacts the fidelity of your results and the performance of package execution.  
Know how this selection impacts your ETL loads, and you’ll end up with more reliable, faster packages. If you want me to take a look at your server and its settings, or if your server is facing any issue, we can Fix Your SQL Server. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Notes from the Field, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: SSIS
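As a footnote to the case-sensitivity gotcha described above, here is a hedged sketch of the UPPER()/LOWER() workaround; the dbo.Customer table and its columns are assumed names used purely for illustration, not objects from the article.

    -- Reference query used by the lookup component (instead of pointing it at the table directly):
    -- normalize the join column so the in-memory, case- and space-sensitive comparison is predictable.
    SELECT
        c.CustomerID,
        UPPER(LTRIM(RTRIM(c.EmailAddress))) AS EmailAddressKey   -- assumed column names
    FROM dbo.Customer AS c;

    -- On the pipeline side, add a Derived Column transformation with the matching expression,
    -- for example UPPER(LTRIM(RTRIM(EmailAddress))), and map that derived column to
    -- EmailAddressKey on the Columns page of the lookup.

The key design point is that both sides of the comparison are normalized the same way, so the .NET string comparison inside the cached lookup behaves like the case-insensitive database collation you are used to.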

    Read the article

  • WebSocket Applications using Java: JSR 356 Early Draft Now Available (TOTD #183)

    - by arungupta
    WebSocket provide a full-duplex and bi-directional communication protocol over a single TCP connection. JSR 356 is defining a standard API for creating WebSocket applications in the Java EE 7 Platform. This Tip Of The Day (TOTD) will provide an introduction to WebSocket and how the JSR is evolving to support the programming model. First, a little primer on WebSocket! WebSocket is a combination of IETF RFC 6455 Protocol and W3C JavaScript API (still a Candidate Recommendation). The protocol defines an opening handshake and data transfer. The API enables Web pages to use the WebSocket protocol for two-way communication with the remote host. Unlike HTTP, there is no need to create a new TCP connection and send a chock-full of headers for every message exchange between client and server. The WebSocket protocol defines basic message framing, layered over TCP. Once the initial handshake happens using HTTP Upgrade, the client and server can send messages to each other, independent from the other. There are no pre-defined message exchange patterns of request/response or one-way between client and and server. These need to be explicitly defined over the basic protocol. The communication between client and server is pretty symmetric but there are two differences: A client initiates a connection to a server that is listening for a WebSocket request. A client connects to one server using a URI. A server may listen to requests from multiple clients on the same URI. Other than these two difference, the client and server behave symmetrically after the opening handshake. In that sense, they are considered as "peers". After a successful handshake, clients and servers transfer data back and forth in conceptual units referred as "messages". On the wire, a message is composed of one or more frames. Application frames carry payload intended for the application and can be text or binary data. Control frames carry data intended for protocol-level signaling. Now lets talk about the JSR! The Java API for WebSocket is worked upon as JSR 356 in the Java Community Process. This will define a standard API for building WebSocket applications. This JSR will provide support for: Creating WebSocket Java components to handle bi-directional WebSocket conversations Initiating and intercepting WebSocket events Creation and consumption of WebSocket text and binary messages The ability to define WebSocket protocols and content models for an application Configuration and management of WebSocket sessions, like timeouts, retries, cookies, connection pooling Specification of how WebSocket application will work within the Java EE security model Tyrus is the Reference Implementation for JSR 356 and is already integrated in GlassFish 4.0 Promoted Builds. And finally some code! The API allows to create WebSocket endpoints using annotations and interface. This TOTD will show a simple sample using annotations. A subsequent blog will show more advanced samples. A POJO can be converted to a WebSocket endpoint by specifying @WebSocketEndpoint and @WebSocketMessage. @WebSocketEndpoint(path="/hello")public class HelloBean {     @WebSocketMessage    public String sayHello(String name) {         return "Hello " + name + "!";     }} @WebSocketEndpoint marks this class as a WebSocket endpoint listening at URI defined by the path attribute. The @WebSocketMessage identifies the method that will receive the incoming WebSocket message. This first method parameter is injected with payload of the incoming message. 
In this case it is assumed that the payload is text-based. It can also be of the type byte[] in case the payload is binary. A custom object may be specified if decoders attribute is specified in the @WebSocketEndpoint. This attribute will provide a list of classes that define how a custom object can be decoded. This method can also take an optional Session parameter. This is injected by the runtime and capture a conversation between two endpoints. The return type of the method can be String, byte[] or a custom object. The encoders attribute on @WebSocketEndpoint need to define how a custom object can be encoded. The client side is an index.jsp with embedded JavaScript. The JSP body looks like: <div style="text-align: center;"> <form action="">     <input onclick="say_hello()" value="Say Hello" type="button">         <input id="nameField" name="name" value="WebSocket" type="text"><br>    </form> </div> <div id="output"></div> The code is relatively straight forward. It has an HTML form with a button that invokes say_hello() method and a text field named nameField. A div placeholder is available for displaying the output. Now, lets take a look at some JavaScript code: <script language="javascript" type="text/javascript"> var wsUri = "ws://localhost:8080/HelloWebSocket/hello";     var websocket = new WebSocket(wsUri);     websocket.onopen = function(evt) { onOpen(evt) };     websocket.onmessage = function(evt) { onMessage(evt) };     websocket.onerror = function(evt) { onError(evt) };     function init() {         output = document.getElementById("output");     }     function say_hello() {      websocket.send(nameField.value);         writeToScreen("SENT: " + nameField.value);     } This application is deployed as "HelloWebSocket.war" (download here) on GlassFish 4.0 promoted build 57. So the WebSocket endpoint is listening at "ws://localhost:8080/HelloWebSocket/hello". A new WebSocket connection is initiated by specifying the URI to connect to. The JavaScript API defines callback methods that are invoked when the connection is opened (onOpen), closed (onClose), error received (onError), or a message from the endpoint is received (onMessage). The client API has several send methods that transmit data over the connection. This particular script sends text data in the say_hello method using nameField's value from the HTML shown earlier. Each click on the button sends the textbox content to the endpoint over a WebSocket connection and receives a response based upon implementation in the sayHello method shown above. How to test this out ? Download the entire source project here or just the WAR file. Download GlassFish4.0 build 57 or later and unzip. Start GlassFish as "asadmin start-domain". Deploy the WAR file as "asadmin deploy HelloWebSocket.war". Access the application at http://localhost:8080/HelloWebSocket/index.jsp. After clicking on "Say Hello" button, the output would look like: Here are some references for you: WebSocket - Protocol and JavaScript API JSR 356: Java API for WebSocket - Specification (Early Draft) and Implementation (already integrated in GlassFish 4 promoted builds) Subsequent blogs will discuss the following topics (not necessary in that order) ... Binary data as payload Custom payloads using encoder/decoder Error handling Interface-driven WebSocket endpoint Java client API Client and Server configuration Security Subprotocols Extensions Other topics from the API Capturing WebSocket on-the-wire messages
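The script above registers onOpen, onMessage and onError callbacks, but the excerpt does not show those handler bodies or the writeToScreen helper it calls. Here is a minimal sketch of what they might look like, using only the standard WebSocket and DOM APIs; apart from the "output" div id taken from the JSP shown earlier, everything here is an assumption rather than code from the sample application.

    function onOpen(evt) {
        writeToScreen("CONNECTED");                 // handshake completed
    }

    function onMessage(evt) {
        writeToScreen("RECEIVED: " + evt.data);     // evt.data carries the text returned by sayHello()
    }

    function onError(evt) {
        writeToScreen("ERROR: " + evt.data);
    }

    // Appends a line to the <div id="output"> placeholder shown in the JSP above.
    function writeToScreen(message) {
        var output = document.getElementById("output");
        var p = document.createElement("p");
        p.innerHTML = message;
        output.appendChild(p);
    }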

    Read the article

  • Configuration "diff" across Oracle WebCenter Sites instances

    - by Mark Fincham-Oracle
    Problem Statement With many Oracle WebCenter Sites environments - how do you know if the various configuration assets and settings are in sync across all of those environments? Background At Oracle we typically have a "W" shaped set of environments.  For the "Production" environments we typically have a disaster recovery clone as well and sometimes additional QA environments alongside the production management environment. In the case of www.java.com we have 10 different environments. All configuration assets/settings (CSElements, Templates, Start Menus etc..) start life on the Development Management environment and are then published downstream to other environments as part of the software development lifecycle. Ensuring that each of these 10 environments has the same set of Templates, CSElements, StartMenus, TreeTabs etc.. is impossible to do efficiently without automation. Solution Summary  The solution comprises of two components. A JSON data feed from each environment. A simple HTML page that consumes these JSON data feeds.  Data Feed: Create a JSON WebService on each environment. The WebService is no more than a SiteEntry + CSElement. The CSElement queries various DB tables to obtain details of the assets/settings returning this data in a JSON feed. Report: Create a simple HTML page that uses JQuery to fetch the JSON feed from each environment and display the results in a table. Since all assets (CSElements, Templates etc..) are published between environments they will have the same last modified date. If the last modified date of an asset is different in the JSON feed or is mising from an environment entirely then highlight that in the report table. Example Solution Details Step 1: Create a Site Entry + CSElement that outputs JSON Site Entry & CSElement Setup  The SiteEntry should be uncached so that the most recent configuration information is returned at all times. In the CSElement set the contenttype accordingly: Step 2: Write the CSElement Logic The basic logic, that we repeat for each asset or setting that we are interested in, is to query the DB using <ics:sql> and then loop over the resultset with <ics:listloop>. For example: <ics:sql sql="SELECT name,updateddate FROM Template WHERE status != 'VO'" listname="TemplateList" table="Template" /> "templates": [ <ics:listloop listname="TemplateList"> {"name":"<ics:listget listname="TemplateList"  fieldname="name"/>", "modified":"<ics:listget listname="TemplateList"  fieldname="updateddate"/>"}, </ics:listloop> ], A comprehensive list of SQL queries to fetch each configuration asset/settings can be seen in the appendix at the end of this article. For the generation of the JSON data structure you could use Jettison (the library ships with the 11.1.1.8 version of the product), native Java 7 capabilities or (as the above example demonstrates) you could roll-your-own JSON output but that is not advised. Step 3: Create an HTML Report The JavaScript logic looks something like this.. 
1) Create a list of JSON feeds to fetch: ENVS['dev-mgmngt'] = 'http://dev-mngmnt.example.com/sites/ContentServer?d=&pagename=settings.json'; ENVS['dev-dlvry'] = 'http://dev-dlvry.example.com/sites/ContentServer?d=&pagename=settings.json';  ENVS['test-mngmnt'] = 'http://test-mngmnt.example.com/sites/ContentServer?d=&pagename=settings.json';  ENVS['test-dlvry'] = 'http://test-dlvry.example.com/sites/ContentServer?d=&pagename=settings.json';   2) Create a function to get the JSON feeds: function getDataForEnvironment(url){ return $.ajax({ type: 'GET', url: url, dataType: 'jsonp', beforeSend: function (jqXHR, settings){ jqXHR.originalEnv = env; jqXHR.originalUrl = url; }, success: function(json, status, jqXHR) { console.log('....success fetching: ' + jqXHR.originalUrl); // store the returned data in ALLDATA ALLDATA[jqXHR.originalEnv] = json; }, error: function(jqXHR, status, e) { console.log('....ERROR: Failed to get data from [' + url + '] ' + status + ' ' + e); } }); } 3) Fetch each JSON feed: for (var env in ENVS) { console.log('Fetching data for env [' + env +'].'); var promisedData = getDataForEnvironment(ENVS[env]); promisedData.success(function (data) {}); }  4) For each configuration asset or setting create a table in the report For example, CSElements: 1) Get a list of unique CSElement names from all of the returned JSON data. 2) For each unique CSElement name, create a row in the table  3) Select 1 environment to represent the master or ideal state (e.g. "Everything should be like Production Delivery") 4) For each environment, compare the last modified date of this envs CSElement to the master. Highlight any differences in last modified date or missing CSElements. 5) Repeat...    Appendix This section contains various SQL statements that can be used to retrieve configuration settings from the DB.  
Templates  <ics:sql sql="SELECT name,updateddate FROM Template WHERE status != 'VO'" listname="TemplateList" table="Template" /> CSElements <ics:sql sql="SELECT name,updateddate FROM CSElement WHERE status != 'VO'" listname="CSEList" table="CSElement" /> Start Menus <ics:sql sql="select sm.id, sm.cs_name, sm.cs_description, sm.cs_assettype, sm.cs_assetsubtype, sm.cs_itemtype, smr.cs_rolename, p.name from StartMenu sm, StartMenu_Sites sms, StartMenu_Roles smr, Publication p where sm.id=sms.ownerid and sm.id=smr.cs_ownerid and sms.pubid=p.id order by sm.id" listname="startList" table="Publication,StartMenu,StartMenu_Roles,StartMenu_Sites"/>  Publishing Configurations <ics:sql sql="select id, name, description, type, dest, factors from PubTarget" listname="pubTargetList" table="PubTarget" /> Tree Tabs <ics:sql sql="select tt.id, tt.title, tt.tooltip, p.name as pubname, ttr.cs_rolename, ttsect.name as sectname from TreeTabs tt, TreeTabs_Roles ttr, TreeTabs_Sect ttsect,TreeTabs_Sites ttsites LEFT JOIN Publication p  on p.id=ttsites.pubid where p.id is not null and tt.id=ttsites.ownerid and ttsites.pubid=p.id and tt.id=ttr.cs_ownerid and tt.id=ttsect.ownerid order by tt.id" listname="treeTabList" table="TreeTabs,TreeTabs_Roles,TreeTabs_Sect,TreeTabs_Sites,Publication" />  Filters <ics:sql sql="select name,description,classname from Filters" listname="filtersList" table="Filters" /> Attribute Types <ics:sql sql="select id,valuetype,name,updateddate from AttrTypes where status != 'VO'" listname="AttrList" table="AttrTypes" /> WebReference Patterns <ics:sql sql="select id,webroot,pattern,assettype,name,params,publication from WebReferencesPatterns" listname="WebRefList" table="WebReferencesPatterns" /> Device Groups <ics:sql sql="select id,devicegroupsuffix,updateddate,name from DeviceGroup" listname="DeviceList" table="DeviceGroup" /> Site Entries <ics:sql sql="select se.id,se.name,se.pagename,se.cselement_id,se.updateddate,cse.rootelement from SiteEntry se LEFT JOIN CSElement cse on cse.id = se.cselement_id where se.status != 'VO'" listname="SiteList" table="SiteEntry,CSElement" /> Webroots <ics:sql sql="select id,name,rooturl,updatedby,updateddate from WebRoot" listname="webrootList" table="WebRoot" /> Page Definitions <ics:sql sql="select pd.id, pd.name, pd.updatedby, pd.updateddate, pd.description, pdt.attributeid, pa.name as nameattr, pdt.requiredflag, pdt.ordinal from PageDefinition pd, PageDefinition_TAttr pdt, PageAttribute pa where pd.status != 'VO' and pa.id=pdt.attributeid and pdt.ownerid=pd.id order by pd.id,pdt.ordinal" listname="pageDefList" table="PageDefinition,PageAttribute,PageDefinition_TAttr" /> FW_Application <ics:sql sql="select id,name,updateddate from FW_Application where status != 'VO'" listname="FWList" table="FW_Application" /> Custom Elements <ics:sql sql="select elementname from ElementCatalog where elementname like 'CustomElements%'" listname="elementList" table="ElementCatalog" />
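Step 4 of the report is described above only in prose, so here is a hedged JavaScript sketch of one way the per-asset comparison could be written once the ALLDATA object has been filled by the fetch loop; the 'templates' array shape matches the JSON produced in step 2, while the master environment name and the plain-text output are assumptions for illustration.

    var MASTER = 'test-dlvry';   // assumed "ideal state" environment

    // Build the set of unique asset names seen in any environment.
    function uniqueNames(assetKey) {
        var names = {};
        for (var env in ALLDATA) {
            var assets = ALLDATA[env][assetKey] || [];
            for (var i = 0; i < assets.length; i++) {
                names[assets[i].name] = true;
            }
        }
        return Object.keys(names);
    }

    // Returns the modified date of a named asset in one environment, or null if it is missing there.
    function modifiedDate(env, assetKey, name) {
        var assets = ALLDATA[env][assetKey] || [];
        for (var i = 0; i < assets.length; i++) {
            if (assets[i].name === name) { return assets[i].modified; }
        }
        return null;
    }

    // Compare every environment against the master; flag missing assets and date mismatches.
    function compare(assetKey) {
        var rows = [];
        var names = uniqueNames(assetKey);
        for (var i = 0; i < names.length; i++) {
            var masterDate = modifiedDate(MASTER, assetKey, names[i]);
            for (var env in ALLDATA) {
                var d = modifiedDate(env, assetKey, names[i]);
                if (d !== masterDate) {
                    rows.push(names[i] + ' differs in ' + env + ' (' + d + ' vs ' + masterDate + ')');
                }
            }
        }
        return rows;
    }

    // Example: console.log(compare('templates').join('\n'));

In the real report each flagged row would be rendered into the HTML table and highlighted rather than returned as text, but the comparison logic is the same.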

    Read the article

  • Developing a Cost Model for Cloud Applications

    - by BuckWoody
    Note - please pay attention to the date of this post. As much as I attempt to make the information below accurate, the nature of distributed computing means that components, units and pricing will change over time. The definitive costs for Microsoft Windows Azure and SQL Azure are located here, and are more accurate than anything you will see in this post: http://www.microsoft.com/windowsazure/offers/  When writing software that is run on a Platform-as-a-Service (PaaS) offering like Windows Azure / SQL Azure, one of the questions you must answer is how much the system will cost. I will not discuss the comparisons between on-premise costs (which are nigh impossible to calculate accurately) versus cloud costs, but instead focus on creating a general model for estimating costs for a given application. You should be aware that there are (at this writing) two billing mechanisms for Windows and SQL Azure: “Pay-as-you-go” or consumption, and “Subscription” or commitment. Conceptually, you can consider the former a pay-as-you-go cell phone plan, where you pay by the unit used (at a slightly higher rate) and the latter as a standard cell phone plan where you commit to a contract and thus pay lower rates. In this post I’ll stick with the pay-as-you-go mechanism for simplicity, which should be the maximum cost you would pay. From there you may be able to get a lower cost if you use the other mechanism. In any case, the model you create should hold. Developing a good cost model is essential. As a developer or architect, you’ll most certainly be asked how much something will cost, and you need to have a reliable way to estimate that. Businesses and Organizations have been used to paying for servers, software licenses, and other infrastructure as an up-front cost, and power, people to the systems and so on as an ongoing (and sometimes not factored) cost. When presented with a new paradigm like distributed computing, they may not understand the true cost/value proposition, and that’s where the architect and developer can guide the conversation to make a choice based on features of the application versus the true costs. The two big buckets of use-types for these applications are customer-based and steady-state. In the customer-based use type, each successful use of the program results in a sale or income for your organization. Perhaps you’ve written an application that provides the spot-price of foo, and your customer pays for the use of that application. In that case, once you’ve estimated your cost for a successful traversal of the application, you can build that into the price you charge the user. It’s a standard restaurant model, where the price of the meal is determined by the cost of making it, plus any profit you can make. In the second use-type, the application will be used by a more-or-less constant number of processes or users and no direct revenue is attached to the system. A typical example is a customer-tracking system used by the employees within your company. In this case, the cost model is often created “in reverse” - meaning that you pilot the application, monitor the use (and costs) and that cost is held steady. This is where the comparison with an on-premise system becomes necessary, even though it is more difficult to estimate those on-premise true costs. For instance, do you know exactly how much cost the air conditioning is because you have a team of system administrators? 
This may sound trivial, but that, along with the insurance for the building, the wiring, and every other part of the system is in fact a cost to the business. There are three primary methods that I’ve been successful with in estimating the cost. None are perfect, all are demand-driven. The general process is to lay out a matrix of: components units cost per unit and then multiply that times the usage of the system, based on which components you use in the program. That sounds a bit simplistic, but using those metrics in a calculation becomes more detailed. In all of the methods that follow, you need to know your application. The components for a PaaS include computing instances, storage, transactions, bandwidth and in the case of SQL Azure, database size. In most cases, architects start with the first model and progress through the other methods to gain accuracy. Simple Estimation The simplest way to calculate costs is to architect the application (even UML or on-paper, no coding involved) and then estimate which of the components you’ll use, and how much of each will be used. Microsoft provides two tools to do this - one is a simple slider-application located here: http://www.microsoft.com/windowsazure/pricing-calculator/  The other is a tool you download to create an “Return on Investment” (ROI) spreadsheet, which has the advantage of leading you through various questions to estimate what you plan to use, located here: https://roianalyst.alinean.com/msft/AutoLogin.do?d=176318219048082115  You can also just create a spreadsheet yourself with a structure like this: Program Element Azure Component Unit of Measure Cost Per Unit Estimated Use of Component Total Cost Per Component Cumulative Cost               Of course, the consideration with this model is that it is difficult to predict a system that is not running or hasn’t even been developed. Which brings us to the next model type. Measure and Project A more accurate model is to actually write the code for the application, using the Software Development Kit (SDK) which can run entirely disconnected from Azure. The code should be instrumented to estimate the use of the application components, logging to a local file on the development system. A series of unit and integration tests should be run, which will create load on the test system. You can use standard development concepts to track this usage, and even use Windows Performance Monitor counters. The best place to start with this method is to use the Windows Azure Diagnostics subsystem in your code, which you can read more about here: http://blogs.msdn.com/b/sumitm/archive/2009/11/18/introducing-windows-azure-diagnostics.aspx This set of API’s greatly simplifies tracking the application, and in fact you can use this information for more than just a cost model. After you have the tracking logs, you can plug the numbers into ay of the tools above, which should give a representative cost or in some cases a unit cost. The consideration with this model is that the SDK fabric is not a one-to-one comparison with performance on the actual Windows Azure fabric. Those differences are usually smaller, but they do need to be considered. Also, you may not be able to accurately predict the load on the system, which might lead to an architectural change, which changes the model. This leads us to the next, most accurate method for a cost model. 
Sample and Estimate Using standard statistical and other predictive math, once the application is deployed you will get a bill each month from Microsoft for your Azure usage. The bill is quite detailed, and you can export the data from it to do analysis, and using methods like regression and so on project out into the future what the costs will be. I normally advise that the architect also extrapolate a unit cost from those metrics as well. This is the information that should be reported back to the executives that pay the bills: the past cost, future projected costs, and unit cost “per click” or “per transaction”, as your case warrants. The challenge here is in the model itself - statistical methods are not foolproof, and the larger the sample (in this case I recommend the entire population, not a smaller sample) is key. References and Tools Articles: http://blogs.msdn.com/b/patrick_butler_monterde/archive/2010/02/10/windows-azure-billing-overview.aspx http://technet.microsoft.com/en-us/magazine/gg213848.aspx http://blog.codingoutloud.com/2011/06/05/azure-faq-how-much-will-it-cost-me-to-run-my-application-on-windows-azure/ http://blogs.msdn.com/b/johnalioto/archive/2010/08/25/10054193.aspx http://geekswithblogs.net/iupdateable/archive/2010/02/08/qampa-how-can-i-calculate-the-tco-and-roi-when.aspx   Other Tools: http://cloud-assessment.com/ http://communities.quest.com/community/cloud_tools
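To make the components x units x cost-per-unit matrix concrete, here is a small JavaScript sketch of the simple-estimation arithmetic; the component names, unit counts and rates are placeholders for illustration only and are not current Azure prices (use the official pricing pages linked at the top of the post for real numbers).

    // Placeholder rates only; real prices come from the official pricing calculator.
    var estimate = [
        { component: 'Compute (small instance)', units: 2 * 730, unitLabel: 'instance-hours/month', rate: 0.12 },
        { component: 'Storage',                  units: 50,      unitLabel: 'GB/month',             rate: 0.15 },
        { component: 'Storage transactions',     units: 100,     unitLabel: '10k transactions',     rate: 0.01 },
        { component: 'Outbound bandwidth',       units: 20,      unitLabel: 'GB/month',             rate: 0.15 },
        { component: 'SQL Azure database',       units: 1,       unitLabel: 'database/month',       rate: 9.99 }
    ];

    var total = 0;
    estimate.forEach(function (row) {
        var cost = row.units * row.rate;
        total += cost;
        console.log(row.component + ': ' + row.units + ' ' + row.unitLabel +
                    ' x $' + row.rate + ' = $' + cost.toFixed(2));
    });
    console.log('Estimated monthly cost: $' + total.toFixed(2));

The same structure carries over to the "measure and project" method: you simply replace the estimated unit counts with the counts logged by your instrumentation.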

    Read the article

  • Waterfall Model (SDLC) vs. Prototyping Model

    The characters in the fable of the Tortoise and the Hare can easily be used to demonstrate the similarities and differences between the Waterfall and Prototyping software development models. This children's fable is about a race between a consistently slow-moving but steadfast turtle and an extremely fast but unreliable rabbit. After closely comparing each character’s attributes with both software development models, a pattern appears: the Waterfall Model closely resembles the Tortoise in that it is typically a slow-moving process broken up into multiple sequential steps that must be executed in a strict linear order. The Tortoise can be quoted several times in the story saying “Slow and steady wins the race.” This is the perfect mantra for the Waterfall Model, which is often seen as cumbersome and slow moving. Waterfall Model Phases Requirement Analysis & Definition This phase focuses on defining requirements for the project to be developed and determining whether the project is even feasible. Requirements are collected by analyzing existing systems and functionality against the needs of the business and the desires of the end users. The desired output of this phase is a list of specific requirements from the business that are to be designed and implemented in the subsequent steps. In addition, this phase is used to determine whether any value will be gained by completing the project. System Design This phase focuses primarily on the architectural design of a system and how it will interact within itself and with other existing applications. Projects at this level should be viewed at a high level so that actual implementation details are decided in the implementation phase; however, major environmental decisions, such as hardware and platform choices, are typically made here. The basic goal of this phase is to design the application at the system level, defining classes, interfaces, and their interactions. Decisions about scalability, distribution, and reliability should also be considered throughout. The desired output of this phase is a functional design document that states all of the architectural decisions made for the project, along with diagrams such as sequence and class diagrams. Software Design This phase focuses primarily on refining the decisions found in the functional design document. Classes and interfaces are further broken down into logical modules based on the interfaces and interactions previously identified. The output of this phase is a formal design document. Implementation / Coding This phase focuses primarily on implementing the previously defined modules as units of code. These units are developed independently and are integrated as the system is put together into a whole. Software Integration & Verification This phase focuses on testing each of the units of code developed as well as testing the system as a whole. There are two basic types of testing at this phase: unit tests and integration tests. Unit tests are built to test the functionality of a code unit and to ensure that it performs its desired task. Integration testing tests the system as a whole because it focuses on the results of combining specific units of code and validating them against expected results. The output of this phase is a test plan that includes tests with expected and actual results.
System Verification This phase primarily focuses on testing the system as a whole against the list of project requirements and the desired operating environment. Operation & Maintenance This phase primarily focuses on handing the completed project over to the customer so that they can verify that all of their original requirements have been met. This phase also validates the correctness of those requirements and whether any changes need to be made. In addition, any problems not resolved in the previous phase are handled here. The Waterfall Model’s linear and sequential methodology offers a project certain advantages and disadvantages. Advantages of the Waterfall Model: Simple to implement and execute for projects and/or company-wide; Limited demand on resources; Large emphasis on documentation. Disadvantages of the Waterfall Model: Completed phases cannot be revisited, regardless of whether issues arise within a project; Accurate requirements are rarely gathered prior to the completion of the requirements phase, due to the lack of clarity about the client’s desires; Small changes or errors that arise in applications may cause additional problems; The client cannot change any requirements once the requirements phase has been completed, leaving them no option for changes as their needs and desires evolve; Excess documentation; Phases are cumbersome and slow moving. Learn more about the Major Process in the Software Development Life Cycle and the Waterfall Model. Conversely, the Hare shares traits with the Prototyping software development model, in which ideas are rapidly converted into basic working examples and subsequent changes are made to quickly align the project with the customer's desires as they are formulated and as the software strays from the customer's vision. The basic concept of prototyping is to eliminate the need for fully defined project requirements up front. Projects are allowed to grow as the customer's needs and requests grow. Projects are initially designed according to basic requirements and are refined as the requirements become clearer. This process allows customers to feel their way around the application to ensure that they are getting exactly what they want. This model also works well for determining the feasibility of certain approaches to an application, since prototypes allow specific functionality to be implemented quickly as examples of particular techniques. Advantages of Prototyping: Active participation from users and customers; Allows customers to change their minds when specifying requirements; Customers get a better understanding of the system as it is developed; Earlier bug/error detection; Promotes communication with customers; The prototype could be used as the final production version; Reduced time needed to develop applications compared to the Waterfall method. Disadvantages of Prototyping: Promotes constantly redefining project requirements, which can cause major system rewrites; Potential for increased complexity as the scope of the system expands; The customer could mistake the prototype for the working version; Implementation compromises can increase complexity when applying updates and/or fixes. When companies are trying to decide between the Waterfall model and the Prototyping model, they need to evaluate the benefits and disadvantages of both.
Smaller companies, or projects with major time constraints, typically head for more of a Prototyping approach because it can reduce the time needed to complete the project: the focus is on building the product rather than on defining requirements and scope before the project starts. On the other hand, companies with well-defined requirements and the time to generate proper documentation should steer towards more of a Waterfall approach, because they are in a position to obtain clarified requirements and to design an optimal solution before coding begins.

    Read the article

  • What's new in Solaris 11.1?

    - by Karoly Vegh
    Solaris 11.1 is released. This is the first release update since Solaris 11 11/11, the versioning has been changed from MM/YY style to 11.1 highlighting that this is Solaris 11 Update 1.  Solaris 11 itself has been great. What's new in Solaris 11.1? Allow me to pick some new features from the What's New PDF that can be found in the official Oracle Solaris 11.1 Documentation. The updates are very numerous, I really can't include all.  I. New AI Automated Installer RBAC profiles have been introduced to enable delegation of installation tasks. II. The interactive installer now supports installing the OS to iSCSI targets. III. ASR (Auto Service Request) and OCM (Oracle Configuration Manager) have been enabled by default to proactively provide support information and create service requests to speed up support processes. This is optional and can be disabled but helps a lot in supportcases. For further information, see: http://oracle.com/goto/solarisautoreg IV. The new command svcbundle helps you to create SMF manifests without having to struggle with XML editing. (btw, do you know the interactive editprop subcommand in svccfg? The listprop/setprop subcommands are great for scripting and automating, but for an interactive property editing session try, for example, this: svccfg -s svc:/application/pkg/system-repository:default editprop )  V. pfedit: Ever wondered how to delegate editing permissions to certain files? It is well known "sudo /usr/bin/vi /etc/hosts" is not the right way, for sudo elevates the complete vi process to admin levels, and the user can "break" out of the session as root with simply starting a shell from that vi. Now, the new pfedit command provides a solution exactly to this challenge - an auditable, secure, per-user configurable editing possibility. See the pfedit man page for examples.   VI. rsyslog, the popular logging daemon (filters, SSL, formattable output, SQL collect...) has been included in Solaris 11.1 as an alternative to syslog.  VII: Zones: Solaris Zones - as a major Solaris differentiator - got lots of love in terms of new features: ZOSS - Zones on Shared Storage: Placing your zones to shared storage (FC, iSCSI) has never been this easy - via zonecfg.  parallell updates - with S11's bootenvironments updating zones was no problem and meant no downtime anyway, but still, now you can update them parallelly, a way faster update action if you are running a large number of zones. This is like parallell patching in Solaris 10, but with all the IPS/ZFS/S11 goodness.  per-zone fstype statistics: Running zones on a shared filesystems complicate the I/O debugging, since ZFS collects all the random writes and delivers them sequentially to boost performance. Now, over kstat you can find out which zone's I/O has an impact on the other ones, see the examples in the documentation: http://docs.oracle.com/cd/E26502_01/html/E29024/gmheh.html#scrolltoc Zones got RDSv3 protocol support for InfiniBand, and IPoIB support with Crossbow's anet (automatic vnic creation) feature.  NUMA I/O support for Zones: customers can now determine the NUMA I/O topology of the system from within zones.  VIII: Security got a lot of attention too:  Automated security/audit reporting, with builtin reporting templates e.g. for PCI (payment card industry) audits.  PAM is now configureable on a per-user basis instead of system wide, allowing different authentication requirements for different users  SSH in Solaris 11.1 now supports running in FIPS 140-2 mode, that is, in a U.S. 
government security accredited fashion.  SHA512/224 and SHA512/256 cryptographic hash functions are implemented in a FIPS-compliant way - and on a T4 implemented in silicon! That is, goverment-approved cryptography at HW-speed.  Generally, Solaris is currently under evaluation to be both FIPS and Common Criteria certified.  IX. Networking, as one of the core strengths of Solaris 11, has been extended with:  Data Center Bridging (DCB) - not only setups where network and storage share the same fabric (FCoE, anyone?) can have Quality-of-Service requirements. DCB enables peers to distinguish traffic based on priorities. Your NICs have to support DCB, see the documentation, and additional information on Wikipedia. DataLink MultiPathing, DLMP, enables link aggregation to span across multiple switches, even between those of different vendors. But there are essential differences to the good old bandwidth-aggregating LACP, see the documentation: http://docs.oracle.com/cd/E26502_01/html/E28993/gmdlu.html#scrolltoc VNIC live migration is now supported from one physical NIC to another on-the-fly  X. Data management:  FedFS, (Federated FileSystem) is new, it relies on Solaris 11's NFS referring mechanism to join separate shares of different NFS servers into a single filesystem namespace. The referring system has been there since S11 11/11, in Solaris 11.1 FedFS uses a LDAP - as the one global nameservice to bind them all.  The iSCSI initiator now uses the T4 CPU's HW-implemented CRC32 algorithm - thus improving iSCSI throughput while reducing CPU utilization on a T4 Storage locking improvements are now RAC aware, speeding up throughput with better locking-communication between nodes up to 20%!  XI: Kernel performance optimizations: The new Virtual Memory subsystem ("VM2") scales now to 100+ TB Memory ranges.  The memory predictor monitors large memory page usage, and adjust memory page sizes to applications' needs OSM, the Optimized Shared Memory allows Oracle DBs' SGA to be resized online XII: The Power Aware Dispatcher in now by default enabled, reducing power consumption of idle CPUs. Also, the LDoms' Power Management policies and the poweradm settings in Solaris 11 OS will cooperate. XIII: x86 boot: upgrade to the (Grand Unified Bootloader) GRUB2. Because grub2 differs in the configuration syntactically from grub1, one shall not edit the new grub configuration (grub.cfg) but use the new bootadm features to update it. GRUB2 adds UEFI support and also support for disks over 2TB. XIV: Improved viewing of per-CPU statistics of mpstat. This one might seem of less importance at first, but nowadays having better sorting/filtering possibilities on a periodically updated mpstat output of 256+ vCPUs can be a blessing. XV: Support for Solaris Cluster 4.1: The What's New document doesn't actually mention this one, since OSC 4.1 has not been released at the time 11.1 was. But since then it is available, and it requires Solaris 11.1. And it's only a "pkg update" away. ...aand I seriously need to stop here. There's a lot I missed, Edge Virtual Bridging, lofi tuning, ZFS sharing and crypto enhancements, USB3.0, pulseaudio, trusted extensions updates, etc - but if I mention all those then I effectively copy the What's New document. Which I recommend reading now anyway, it is a great extract of the 300+ new projects and RFE-followups in S11.1. And this blogpost is a summary of that extract.  For closing words, allow me to come back to Request For Enhancements, RFEs. Any customer can request features. 
Open up a Support Request, explain that this is an RFE, and describe the feature you or your company would like to have implemented in S11. The more SRs are collected for an RFE, the better its chances of getting implemented. Feel free to provide feedback about the product, as well as about the Solaris 11.1 Documentation using the "Feedback" button there. Both the Solaris engineers and the documentation writers are eager to hear your input. Feel free to comment about this post too. Except that it's too long ;)  wbr, charlie

    Read the article

  • The Faces in the Crowdsourcing

    - by Applications User Experience
    By Jeff Sauro, Principal Usability Engineer, Oracle Imagine having access to a global workforce of hundreds of thousands of people who can perform tasks or provide feedback on a design quickly and almost immediately. Distributing simple tasks not easily done by computers to the masses is called "crowdsourcing" and until recently was an interesting concept, but due to practical constraints wasn't used often. Enter Amazon.com. For five years, Amazon has hosted a service called Mechanical Turk, which provides an easy interface to the crowds. The service has almost half a million registered, global users performing a quarter of a million human intelligence tasks (HITs). HITs are submitted by individuals and companies in the U.S. and pay from $.01 for simple tasks (such as determining if a picture is offensive) to several dollars (for tasks like transcribing audio). What do we know about the people who toil away in this digital crowd? Can we rely on the work done in this anonymous marketplace? A rendering of the actual Mechanical Turk (from Wikipedia) Knowing who is behind Amazon's Mechanical Turk is fitting, considering the history of the actual Mechanical Turk. In the late 1800's, a mechanical chess-playing machine awed crowds as it beat master chess players in what was thought to be a mechanical miracle. It turned out that the creator, Wolfgang von Kempelen, had a small person (also a chess master) hiding inside the machine operating the arms to provide the illusion of automation. The field of human computer interaction (HCI) is quite familiar with gathering user input and incorporating it into all stages of the design process. It makes sense then that Mechanical Turk was a popular discussion topic at the recent Computer Human Interaction usability conference sponsored by the Association for Computing Machinery in Atlanta. It is already being used as a source for input on Web sites (for example, Feedbackarmy.com) and behavioral research studies. Two papers shed some light on the faces in this crowd. One paper tells us about the shifting demographics from mostly stay-at-home moms to young men in India. The second paper discusses the reliability and quality of work from the workers. Just who exactly would spend time doing tasks for pennies? In "Who are the crowdworkers?" University of California researchers Ross, Silberman, Zaldivar and Tomlinson conducted a survey of Mechanical Turk worker demographics and compared it to a similar survey done two years before. The initial survey reported workers consisting largely of young, well-educated women living in the U.S. with annual household incomes above $40,000. The more recent survey reveals a shift in demographics largely driven by an influx of workers from India. Indian workers went from 5% to over 30% of the crowd, and this block is largely male (two-thirds) with a higher average education than U.S. workers, and 64% report an annual income of less than $10,000 (keeping in mind $1 has a lot more purchasing power in India). This shifting demographic certainly has implications as language and culture can play critical roles in the outcome of HITs. Of course, the demographic data came from paying Turkers $.10 to fill out a survey, so there is some question about both a self-selection bias (characteristics which cause Turks to take this survey may be unrepresentative of the larger population), not to mention whether we can really trust the data we get from the crowd. 
Crowds can perform tasks or provide feedback on a design quickly and almost immediately for usability testing. (Photo attributed to victoriapeckham Flikr While having immediate access to a global workforce is nice, one major problem with Mechanical Turk is the incentive structure. Individuals and companies that deploy HITs want quality responses for a low price. Workers, on the other hand, want to complete the task and get paid as quickly as possible, so that they can get on to the next task. Since many HITs on Mechanical Turk are surveys, how valid and reliable are these results? How do we know whether workers are just rushing through the multiple-choice responses haphazardly answering? In "Are your participants gaming the system?" researchers at Carnegie Mellon (Downs, Holbrook, Sheng and Cranor) set up an experiment to find out what percentage of their workers were just in it for the money. The authors set up a 30-minute HIT (one of the more lengthy ones for Mechanical Turk) and offered a very high $4 to those who qualified and $.20 to those who did not. As part of the HIT, workers were asked to read an email and respond to two questions that determined whether workers were likely rushing through the HIT and not answering conscientiously. One question was simple and took little effort, while the second question required a bit more work to find the answer. Workers were led to believe other factors than these two questions were the qualifying aspect of the HIT. Of the 2000 participants, roughly 1200 (or 61%) answered both questions correctly. Eighty-eight percent answered the easy question correctly, and 64% answered the difficult question correctly. In other words, about 12% of the crowd were gaming the system, not paying enough attention to the question or making careless errors. Up to about 40% won't put in more than a modest effort to get paid for a HIT. Young men and those that considered themselves in the financial industry tended to be the most likely to try to game the system. There wasn't a breakdown by country, but given the demographic information from the first article, we could infer that many of these young men come from India, which makes language and other cultural differences a factor. These articles raise questions about the role of crowdsourcing as a means for getting quick user input at low cost. While compensating users for their time is nothing new, the incentive structure and anonymity of Mechanical Turk raises some interesting questions. How complex of a task can we ask of the crowd, and how much should these workers be paid? Can we rely on the information we get from these professional users, and if so, how can we best incorporate it into designing more usable products? Traditional usability testing will still play a central role in enterprise software. Crowdsourcing doesn't replace testing; instead, it makes certain parts of gathering user feedback easier. One can turn to the crowd for simple tasks that don't require specialized skills and get a lot of data fast. As more studies are conducted on Mechanical Turk, I suspect we will see crowdsourcing playing an increasing role in human computer interaction and enterprise computing. References: Downs, J. S., Holbrook, M. B., Sheng, S., and Cranor, L. F. 2010. Are your participants gaming the system?: screening mechanical turk workers. In Proceedings of the 28th international Conference on Human Factors in Computing Systems (Atlanta, Georgia, USA, April 10 - 15, 2010). CHI '10. ACM, New York, NY, 2399-2402. 
Link: http://doi.acm.org/10.1145/1753326.1753688 Ross, J., Irani, L., Silberman, M. S., Zaldivar, A., and Tomlinson, B. 2010. Who are the crowdworkers?: shifting demographics in mechanical turk. In Proceedings of the 28th of the international Conference Extended Abstracts on Human Factors in Computing Systems (Atlanta, Georgia, USA, April 10 - 15, 2010). CHI EA '10. ACM, New York, NY, 2863-2872. Link: http://doi.acm.org/10.1145/1753846.1753873

    Read the article

  • Oracle OpenWorld 2013 – Wrap up by Sven Bernhardt

    - by JuergenKress
    OOW 2013 is over and we’re heading home, so it is time to lean back and reflecting about the impressions we have from the conference. First of all: OOW was great! It was a pleasure to be a part of it. As already mentioned in our last blog article: It was the biggest OOW ever. Parallel to the conference the America’s Cup took place in San Francisco and the Oracle Team America won. Amazing job by the team and again congratulations from our side Back to the conference. The main topics for us are: Oracle SOA / BPM Suite 12c Adaptive Case management (ACM) Big Data Fast Data Cloud Mobile Below we will go a little more into detail, what are the key takeaways regarding the mentioned points: Oracle SOA / BPM Suite 12c During the five days at OOW, first details of the upcoming major release of Oracle SOA Suite 12c and Oracle BPM Suite 12c have been introduced. Some new key features are: Managed File Transfer (MFT) for transferring big files from a source to a target location Enhanced REST support by introducing a new REST binding Introduction of a generic cloud adapter, which can be used to connect to different cloud providers, like Salesforce Enhanced analytics with BAM, which has been totally reengineered (BAM Console now also runs in Firefox!) Introduction of templates (OSB pipelines, component templates, BPEL activities templates) EM as a single monitoring console OSB design-time integration into JDeveloper (Really great!) Enterprise modeling capabilities in BPM Composer These are only a few points from what is coming with 12c. We are really looking forward for the new realese to come out, because this seems to be really great stuff. The suite becomes more and more integrated. From 10g to 11g it was an evolution in terms of developing SOA-based applications. With 12c, Oracle continues it’s way – very impressive. Adaptive Case Management Another fantastic topic was Adaptive Case Management (ACM). The Oracle PMs did a great job especially at the demo grounds in showing the upcoming Case Management UI (will be available in 11g with the next BPM Suite MLR Patch), the roadmap and the differences between traditional business process modeling. They have been very busy during the conference because a lot of partners and customers have been interested Big Data Big Data is one of the current hype themes. Because of huge data amounts from different internal or external sources, the handling of these data becomes more and more challenging. Companies have a need for analyzing the data to optimize their business. The challenge is here: the amount of data is growing daily! To store and analyze the data efficiently, it is necessary to have a scalable and flexible infrastructure. Here it is important that hardware and software are engineered to work together. Therefore several new features of the Oracle Database 12c, like the new in-memory option, have been presented by Larry Ellison himself. From a hardware side new server machines like Fujitsu M10 or new processors, such as Oracle’s new M6-32 have been announced. The performance improvements, when using one of these hardware components in connection with the improved software solutions were really impressive. For more details about this, please take look at our previous blog post. 
Regarding Big Data, Oracle also introduced their Big Data architecture, which consists of: Oracle Big Data Appliance that is preconfigured with Hadoop Oracle Exdata which stores a huge amount of data efficently, to achieve optimal query performance Oracle Exalytics as a fast and scalable Business analytics system Analysis of the stored data can be performed using SQL, by streaming the data directly from Hadoop to an Oracle Database 12c. Alternatively the analysis can be directly implemented in Hadoop using “R”. In addition Oracle BI Tools can be used to analyze the data. Fast Data Fast Data is a complementary approach to Big Data. A huge amount of mostly unstructured data comes in via different channels with a high frequency. The analysis of these data streams is also important for companies, because the incoming data has to be analyzed regarding business-relevant patterns in real-time. Therefore these patterns must be identified efficiently and performant. To do so, in-memory grid solutions in combination with Oracle Coherence and Oracle Event Processing demonstrated very impressive how efficient real-time data processing can be. One example for Fast Data solutions that was shown during the OOW was the analysis of twitter streams regarding customer satisfaction. The feeds with negative words like “bad” or “worse” have been filtered and after a defined treshold has been reached in a certain timeframe, a business event was triggered. Cloud Another key trend in the IT market is of course Cloud Computing and what it means for companies and their businesses. Oracle announced their Cloud strategy and vision – companies can focus on their real business while all of the applications are available via Cloud. This also includes Oracle Database or Oracle Weblogic, so that companies can also build, deploy and run their own applications within the cloud. Three different approaches have been introduced: Infrastructure as a Service (IaaS) Platform as a Service (PaaS) Software as a Service (SaaS) Using the IaaS approach only the infrastructure components will be managed in the Cloud. Customers will be very flexible regarding memory, storage or number of CPUs because those parameters can be adjusted elastically. The PaaS approach means that besides the infrastructure also the platforms (such as databases or application servers) necessary for running applications will be provided within the Cloud. Here customers can also decide, if installation and management of these infrastructure components should be done by Oracle. The SaaS approach describes the most complete one, hence all applications a company uses are managed in the Cloud. Oracle is planning to provide all of their applications, like ERP systems or HR applications, as Cloud services. In conclusion this seems to be a very forward-thinking strategy, which opens up new possibilities for customers to manage their infrastructure and applications in a flexible, scalable and future-oriented manner. As you can see, our OOW days have been very very interresting. We collected many helpful informations for our projects. The new innovations presented at the confernce are great and being part of this was even greater! We are looking forward to next years’ conference! 
Links: http://www.oracle.com/openworld/index.html http://thecattlecrew.wordpress.com/2013/09/23/first-impressions-from-oracle-open-world-2013 SOA & BPM Partner Community For regular information on Oracle SOA Suite become a member in the SOA & BPM Partner Community for registration please visit www.oracle.com/goto/emea/soa (OPN account required) If you need support with your account please contact the Oracle Partner Business Center. Blog Twitter LinkedIn Facebook Wiki Mix Forum Technorati Tags: cattleCrew,Sven Bernhard,OOW2013,SOA Community,Oracle SOA,Oracle BPM,Community,OPN,Jürgen Kress


  • Fun with Aggregates

    - by Paul White
    There are interesting things to be learned from even the simplest queries.  For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join.  The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units.  The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too?  To answer that, let’s look at some other ways to formulate this query.  This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones.  The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above.  It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join!  The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join.  In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting.  The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set.  
    In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL).  To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist:

    -- Scalar aggregate
    SELECT COUNT_BIG(*)
    FROM Production.TransactionHistory AS th
    WHERE th.ProductID = 709;

    -- Vector aggregate
    SELECT COUNT_BIG(*)
    FROM Production.TransactionHistory AS th
    WHERE th.ProductID = 709
    GROUP BY th.ProductID;

    The estimated execution plans for these two statements are almost identical. You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case.  The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away.

    In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate.  This is something I would like to see exposed in show plan so I suggested it on Connect.  Anyway, the results of running the two queries show the difference at runtime: the scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all.

    Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero:

    SELECT p.Name
    FROM Production.Product AS p
    WHERE EXISTS
    (
        SELECT *
        FROM Production.TransactionHistory AS th
        WHERE th.ProductID = p.ProductID
        HAVING COUNT_BIG(*) BETWEEN 1 AND 9
    );

    The query now returns the correct 23 rows. Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans.  Let’s try adding a redundant GROUP BY instead of changing the HAVING clause:

    SELECT p.Name
    FROM Production.Product AS p
    WHERE EXISTS
    (
        SELECT *
        FROM Production.TransactionHistory AS th
        WHERE th.ProductID = p.ProductID
        GROUP BY th.ProductID
        HAVING COUNT_BIG(*) < 10
    );

    Not only do we now get correct results (23 rows), but the resulting execution plan is remarkable. I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :)  The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier.  What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set:

    SELECT p.Name
    FROM Production.Product AS p
    WHERE EXISTS
    (
        SELECT *
        FROM Production.TransactionHistory AS th
        WHERE th.ProductID = p.ProductID
        GROUP BY ()
        HAVING COUNT_BIG(*) < 10
    );

    I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too).  Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated).
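    As a quick aside (not part of the original article), the empty-set behaviour described above is easy to verify without AdventureWorks: a scalar COUNT over an empty input returns zero, while the other standard aggregates return NULL:

    -- Scalar aggregates over an empty set: COUNT_BIG gives 0, the rest give NULL.
    SELECT
        COUNT_BIG(*) AS cnt,      -- 0
        SUM(v.x)     AS total,    -- NULL
        MIN(v.x)     AS smallest, -- NULL
        MAX(v.x)     AS largest   -- NULL
    FROM (VALUES (1), (2)) AS v (x)
    WHERE 1 = 0;                  -- guarantees an empty input set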
    The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms):

    WHERE Clause

    SELECT p.Name
    FROM Production.Product AS p
    WHERE
    (
        SELECT COUNT_BIG(*)
        FROM Production.TransactionHistory AS th
        WHERE th.ProductID = p.ProductID
        GROUP BY ()
    ) < 10;

    APPLY

    SELECT p.Name
    FROM Production.Product AS p
    CROSS APPLY
    (
        SELECT NULL
        FROM Production.TransactionHistory AS th
        WHERE th.ProductID = p.ProductID
        GROUP BY ()
        HAVING COUNT_BIG(*) < 10
    ) AS ca (dummy);

    FROM Clause

    SELECT q1.Name
    FROM
    (
        SELECT p.Name, cnt =
        (
            SELECT COUNT_BIG(*)
            FROM Production.TransactionHistory AS th
            WHERE th.ProductID = p.ProductID
            GROUP BY ()
        )
        FROM Production.Product AS p
    ) AS q1
    WHERE q1.cnt < 10;

    This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :)

    SELECT q.Name
    FROM
    (
        SELECT p.Name, cnt =
        (
            SELECT SUM(1)
            FROM Production.TransactionHistory AS th
            WHERE th.ProductID = p.ProductID
        )
        FROM Production.Product AS p
    ) AS q
    WHERE q.cnt < 10;

    The semantics of SQL aggregates are rather odd in places.  It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates.  As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both.

    © 2012 Paul White
    Twitter: @SQL_Kiwi
    email: [email protected]

