Search Results

Search found 13749 results on 550 pages for 'reason'.

Page 162/550 | < Previous Page | 158 159 160 161 162 163 164 165 166 167 168 169 | Next Page >

From Binary to Data Structures

- by Cédric Menzi

Table of Contents Introduction PE file format and COFF header COFF file header BaseCoffReader Byte4ByteCoffReader UnsafeCoffReader ManagedCoffReader Conclusion History This article is also available on CodeProject Introduction Sometimes, you want to parse well-formed binary data and bring it into your objects to do some dirty stuff with it. In the Windows world most data structures are stored in special binary format. Either we call a WinApi function or we want to read from special files like images, spool files, executables or may be the previously announced Outlook Personal Folders File. Most specifications for these files can be found on the MSDN Libarary: Open Specification In my example, we are going to get the COFF (Common Object File Format) file header from a PE (Portable Executable). The exact specification can be found here: PECOFF PE file format and COFF header Before we start we need to know how this file is formatted. The following figure shows an overview of the Microsoft PE executable format. Source: Microsoft Our goal is to get the PE header. As we can see, the image starts with a MS-DOS 2.0 header with is not important for us. From the documentation we can read "...After the MS DOS stub, at the file offset specified at offset 0x3c, is a 4-byte...". With this information we know our reader has to jump to location 0x3c and read the offset to the signature. The signature is always 4 bytes that ensures that the image is a PE file. The signature is: PE\0\0. To prove this we first seek to the offset 0x3c, read if the file consist the signature. So we need to declare some constants, because we do not want magic numbers. private const int PeSignatureOffsetLocation = 0x3c; private const int PeSignatureSize = 4; private const string PeSignatureContent = "PE"; Then a method for moving the reader to the correct location to read the offset of signature. With this method we always move the underlining Stream of the BinaryReader to the start location of the PE signature. private void SeekToPeSignature(BinaryReader br) { // seek to the offset for the PE signagure br.BaseStream.Seek(PeSignatureOffsetLocation, SeekOrigin.Begin); // read the offset int offsetToPeSig = br.ReadInt32(); // seek to the start of the PE signature br.BaseStream.Seek(offsetToPeSig, SeekOrigin.Begin); } Now, we can check if it is a valid PE image by reading of the next 4 byte contains the content PE. private bool IsValidPeSignature(BinaryReader br) { // read 4 bytes to get the PE signature byte[] peSigBytes = br.ReadBytes(PeSignatureSize); // convert it to a string and trim \0 at the end of the content string peContent = Encoding.Default.GetString(peSigBytes).TrimEnd('\0'); // check if PE is in the content return peContent.Equals(PeSignatureContent); } With this basic functionality we have a good base reader class to try the different methods of parsing the COFF file header. COFF file header The COFF header has the following structure: Offset Size Field 0 2 Machine 2 2 NumberOfSections 4 4 TimeDateStamp 8 4 PointerToSymbolTable 12 4 NumberOfSymbols 16 2 SizeOfOptionalHeader 18 2 Characteristics If we translate this table to code, we get something like this: [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)] public struct CoffHeader { public MachineType Machine; public ushort NumberOfSections; public uint TimeDateStamp; public uint PointerToSymbolTable; public uint NumberOfSymbols; public ushort SizeOfOptionalHeader; public Characteristic Characteristics; } BaseCoffReader All readers do the same thing, so we go to the patterns library in our head and see that Strategy pattern or Template method pattern is sticked out in the bookshelf. I have decided to take the template method pattern in this case, because the Parse() should handle the IO for all implementations and the concrete parsing should done in its derived classes. public CoffHeader Parse() { using (var br = new BinaryReader(File.Open(_fileName, FileMode.Open, FileAccess.Read, FileShare.Read))) { SeekToPeSignature(br); if (!IsValidPeSignature(br)) { throw new BadImageFormatException(); } return ParseInternal(br); } } protected abstract CoffHeader ParseInternal(BinaryReader br); First we open the BinaryReader, seek to the PE signature then we check if it contains a valid PE signature and rest is done by the derived implementations. Byte4ByteCoffReader The first solution is using the BinaryReader. It is the general way to get the data. We only need to know which order, which data-type and its size. If we read byte for byte we could comment out the first line in the CoffHeader structure, because we have control about the order of the member assignment. protected override CoffHeader ParseInternal(BinaryReader br) { CoffHeader coff = new CoffHeader(); coff.Machine = (MachineType)br.ReadInt16(); coff.NumberOfSections = (ushort)br.ReadInt16(); coff.TimeDateStamp = br.ReadUInt32(); coff.PointerToSymbolTable = br.ReadUInt32(); coff.NumberOfSymbols = br.ReadUInt32(); coff.SizeOfOptionalHeader = (ushort)br.ReadInt16(); coff.Characteristics = (Characteristic)br.ReadInt16(); return coff; } If the structure is as short as the COFF header here and the specification will never changed, there is probably no reason to change the strategy. But if a data-type will be changed, a new member will be added or ordering of member will be changed the maintenance costs of this method are very high. UnsafeCoffReader Another way to bring the data into this structure is using a "magically" unsafe trick. As above, we know the layout and order of the data structure. Now, we need the StructLayout attribute, because we have to ensure that the .NET Runtime allocates the structure in the same order as it is specified in the source code. We also need to enable "Allow unsafe code (/unsafe)" in the project's build properties. Then we need to add the following constructor to the CoffHeader structure. [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)] public struct CoffHeader { public CoffHeader(byte[] data) { unsafe { fixed (byte* packet = &data[0]) { this = *(CoffHeader*)packet; } } } } The "magic" trick is in the statement: this = *(CoffHeader*)packet;. What happens here? We have a fixed size of data somewhere in the memory and because a struct in C# is a value-type, the assignment operator = copies the whole data of the structure and not only the reference. To fill the structure with data, we need to pass the data as bytes into the CoffHeader structure. This can be achieved by reading the exact size of the structure from the PE file. protected override CoffHeader ParseInternal(BinaryReader br) { return new CoffHeader(br.ReadBytes(Marshal.SizeOf(typeof(CoffHeader)))); } This solution is the fastest way to parse the data and bring it into the structure, but it is unsafe and it could introduce some security and stability risks. ManagedCoffReader In this solution we are using the same approach of the structure assignment as above. But we need to replace the unsafe part in the constructor with the following managed part: [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)] public struct CoffHeader { public CoffHeader(byte[] data) { IntPtr coffPtr = IntPtr.Zero; try { int size = Marshal.SizeOf(typeof(CoffHeader)); coffPtr = Marshal.AllocHGlobal(size); Marshal.Copy(data, 0, coffPtr, size); this = (CoffHeader)Marshal.PtrToStructure(coffPtr, typeof(CoffHeader)); } finally { Marshal.FreeHGlobal(coffPtr); } } } Conclusion We saw that we can parse well-formed binary data to our data structures using different approaches. The first is probably the clearest way, because we know each member and its size and ordering and we have control about the reading the data for each member. But if add member or the structure is going change by some reason, we need to change the reader. The two other solutions use the approach of the structure assignment. In the unsafe implementation we need to compile the project with the /unsafe option. We increase the performance, but we get some security risks.

Read the article
LINQ to SQL and missing Many to Many EntityRefs

- by Rick Strahl

Ran into an odd behavior today with a many to many mapping of one of my tables in LINQ to SQL. Many to many mappings aren’t transparent in LINQ to SQL and it maps the link table the same way the SQL schema has it when creating one. In other words LINQ to SQL isn’t smart about many to many mappings and just treats it like the 3 underlying tables that make up the many to many relationship. Iain Galloway has a nice blog entry about Many to Many relationships in LINQ to SQL. I can live with that – it’s not really difficult to deal with this arrangement once mapped, especially when reading data back. Writing is a little more difficult as you do have to insert into two entities for new records, but nothing that can’t be handled in a small business object method with a few lines of code. When I created a database I’ve been using to experiment around with various different OR/Ms recently I found that for some reason LINQ to SQL was completely failing to map even to the linking table. As it turns out there’s a good reason why it fails, can you spot it below? (read on :-}) Here is the original database layout: There’s an items table, a category table and a link table that holds only the foreign keys to the Items and Category tables for a typical M->M relationship. When these three tables are imported into the model the *look* correct – I do get the relationships added (after modifying the entity names to strip the prefix): The relationship looks perfectly fine, both in the designer as well as in the XML document: <Table Name="dbo.wws_Item_Categories" Member="ItemCategories"> <Type Name="ItemCategory"> <Column Name="ItemId" Type="System.Guid" DbType="uniqueidentifier NOT NULL" CanBeNull="false" /> <Column Name="CategoryId" Type="System.Guid" DbType="uniqueidentifier NOT NULL" CanBeNull="false" /> <Association Name="ItemCategory_Category" Member="Categories" ThisKey="CategoryId" OtherKey="Id" Type="Category" /> <Association Name="Item_ItemCategory" Member="Item" ThisKey="ItemId" OtherKey="Id" Type="Item" IsForeignKey="true" /> </Type> </Table> <Table Name="dbo.wws_Categories" Member="Categories"> <Type Name="Category"> <Column Name="Id" Type="System.Guid" DbType="UniqueIdentifier NOT NULL" IsPrimaryKey="true" IsDbGenerated="true" CanBeNull="false" /> <Column Name="ParentId" Type="System.Guid" DbType="UniqueIdentifier" CanBeNull="true" /> <Column Name="CategoryName" Type="System.String" DbType="NVarChar(150)" CanBeNull="true" /> <Column Name="CategoryDescription" Type="System.String" DbType="NVarChar(MAX)" CanBeNull="true" /> <Column Name="tstamp" AccessModifier="Internal" Type="System.Data.Linq.Binary" DbType="rowversion" CanBeNull="true" IsVersion="true" /> <Association Name="ItemCategory_Category" Member="ItemCategory" ThisKey="Id" OtherKey="CategoryId" Type="ItemCategory" IsForeignKey="true" /> </Type> </Table> However when looking at the code generated these navigation properties (also on Item) are completely missing: [global::System.Data.Linq.Mapping.TableAttribute(Name="dbo.wws_Item_Categories")] [global::System.Runtime.Serialization.DataContractAttribute()] public partial class ItemCategory : Westwind.BusinessFramework.EntityBase { private System.Guid _ItemId; private System.Guid _CategoryId; public ItemCategory() { } [global::System.Data.Linq.Mapping.ColumnAttribute(Storage="_ItemId", DbType="uniqueidentifier NOT NULL")] [global::System.Runtime.Serialization.DataMemberAttribute(Order=1)] public System.Guid ItemId { get { return this._ItemId; } set { if ((this._ItemId != value)) { this._ItemId = value; } } } [global::System.Data.Linq.Mapping.ColumnAttribute(Storage="_CategoryId", DbType="uniqueidentifier NOT NULL")] [global::System.Runtime.Serialization.DataMemberAttribute(Order=2)] public System.Guid CategoryId { get { return this._CategoryId; } set { if ((this._CategoryId != value)) { this._CategoryId = value; } } } } Notice that the Item and Category association properties which should be EntityRef properties are completely missing. They’re there in the model, but the generated code – not so much. So what’s the problem here? The problem – it appears – is that LINQ to SQL requires primary keys on all entities it tracks. In order to support tracking – even of the link table entity – the link table requires a primary key. Real obvious ain’t it, especially since the designer happily lets you import the table and even shows the relationship and implicitly the related properties. Adding an Id field as a Pk to the database and then importing results in this model layout: which properly generates the Item and Category properties into the link entity. It’s ironic that LINQ to SQL *requires* the PK in the middle – the Entity Framework requires that a link table have *only* the two foreign key fields in a table in order to recognize a many to many relation. EF actually handles the M->M relation directly without the intermediate link entity unlike LINQ to SQL. [updated from comments – 12/24/2009] Another approach is to set up both ItemId and CategoryId in the database which shows up in LINQ to SQL like this: This also work in creating the Category and Item fields in the ItemCategory entity. Ultimately this is probably the best approach as it also guarantees uniqueness of the keys and so helps in database integrity. It took me a while to figure out WTF was going on here – lulled by the designer to think that the properties should be when they were not. It’s actually a well documented feature of L2S that each entity in the model requires a Pk but of course that’s easy to miss when the model viewer shows it to you and even the underlying XML model shows the Associations properly. This is one of the issue with L2S of course – you have to play by its rules and once you hit one of those rules there’s no way around them – you’re stuck with what it requires which in this case meant changing the database.© Rick Strahl, West Wind Technologies, 2005-2010Posted in ADO.NET LINQ

Read the article
Mulit-tenant ASP.NET MVC – Controllers

- by zowens

Part I – Introduction Part II – Foundation The time has come to talk about controllers in a multi-tenant ASP.NET MVC architecture. This is actually the most critical design decision you will make when dealing with multi-tenancy with MVC. In my design, I took into account the design goals I mentioned in the introduction about inversion of control and what a tenant is to my design. Be aware that this is only one way to achieve multi-tenant controllers. The Premise MvcEx (which is a sample written by Rob Ashton) utilizes dynamic controllers. Essentially a controller is “dynamic” in that multiple action results can be placed in different “controllers” with the same name. This approach is a bit too complicated for my design. I wanted to stick with plain old inheritance when dealing with controllers. The basic premise of my controller design is that my main host defines a set of universal controllers. It is the responsibility of the tenant to decide if the tenant would like to utilize these core controllers. This can be done either by straight usage of the controller or inheritance for extension of the functionality defined by the controller. The controller is resolved by a StructureMap container that is attached to the tenant, as discussed in Part II. Controller Resolution I have been thinking about two different ways to resolve controllers with StructureMap. One way is to use named instances. This is a really easy way to simply pull the controller right out of the container without a lot of fuss. I ultimately chose not to use this approach. The reason for this decision is to ensure that the controllers are named properly. If a controller has a different named instance that the controller type, then the resolution has a significant disconnect and there are no guarantees. The final approach, the one utilized by the sample, is to simply pull all controller types and correlate the type with a controller name. This has a bit of a application start performance disadvantage, but is significantly more approachable for maintainability. For example, if I wanted to go back and add a “ControllerName” attribute, I would just have to change the ControllerFactory to suit my needs. The Code The container factory that I have built is actually pretty simple. That’s really all we need. The most significant method is the GetControllersFor method. This method makes the model from the Container and determines all the concrete types for IController. The thing you might notice is that this doesn’t depend on tenants, but rather containers. You could easily use this controller factory for an application that doesn’t utilize multi-tenancy. public class ContainerControllerFactory : IControllerFactory { private readonly ThreadSafeDictionary<IContainer, IDictionary<string, Type>> typeCache; public ContainerControllerFactory(IContainerResolver resolver) { Ensure.Argument.NotNull(resolver, "resolver"); this.ContainerResolver = resolver; this.typeCache = new ThreadSafeDictionary<IContainer, IDictionary<string, Type>>(); } public IContainerResolver ContainerResolver { get; private set; } public virtual IController CreateController(RequestContext requestContext, string controllerName) { var controllerType = this.GetControllerType(requestContext, controllerName); if (controllerType == null) return null; var controller = this.ContainerResolver.Resolve(requestContext).GetInstance(controllerType) as IController; // ensure the action invoker is a ContainerControllerActionInvoker if (controller != null && controller is Controller && !((controller as Controller).ActionInvoker is ContainerControllerActionInvoker)) (controller as Controller).ActionInvoker = new ContainerControllerActionInvoker(this.ContainerResolver); return controller; } public void ReleaseController(IController controller) { if (controller != null && controller is IDisposable) ((IDisposable)controller).Dispose(); } internal static IEnumerable<Type> GetControllersFor(IContainer container) { Ensure.Argument.NotNull(container); return container.Model.InstancesOf<IController>().Select(x => x.ConcreteType).Distinct(); } protected virtual Type GetControllerType(RequestContext requestContext, string controllerName) { Ensure.Argument.NotNull(requestContext, "requestContext"); Ensure.Argument.NotNullOrEmpty(controllerName, "controllerName"); var container = this.ContainerResolver.Resolve(requestContext); var typeDictionary = this.typeCache.GetOrAdd(container, () => GetControllersFor(container).ToDictionary(x => ControllerFriendlyName(x.Name))); Type found = null; if (typeDictionary.TryGetValue(ControllerFriendlyName(controllerName), out found)) return found; return null; } private static string ControllerFriendlyName(string value) { return (value ?? string.Empty).ToLowerInvariant().Without("controller"); } } One thing to note about my implementation is that we do not use namespaces that can be utilized in the default ASP.NET MVC controller factory. This is something that I don’t use and have no desire to implement and test. The reason I am not using namespaces in this situation is because each tenant has its own namespaces and the routing would not make sense in this case. Because we are using IoC, dependencies are automatically injected into the constructor. For example, a tenant container could implement it’s own IRepository and a controller could be defined in the “main” project. The IRepository from the tenant would be injected into the main project’s controller. This is quite a useful feature. Again, the source code is on GitHub here. Up Next Up next is the view resolution. This is a complicated issue, so be prepared. I hope that you have found this series useful. If you have any questions about my implementation so far, send me an email or DM me on Twitter. I have had a lot of great conversations about multi-tenancy so far and I greatly appreciate the feedback!

Read the article
Behind ASP.NET MVC Mock Objects

- by imran_ku07

Introduction: I think this sentence now become very familiar to ASP.NET MVC developers that "ASP.NET MVC is designed with testability in mind". But what ASP.NET MVC team did for making applications build with ASP.NET MVC become easily testable? Understanding this is also very important because it gives you some help when designing custom classes. So in this article i will discuss some abstract classes provided by ASP.NET MVC team for the various ASP.NET intrinsic objects, including HttpContext, HttpRequest, and HttpResponse for making these objects as testable. I will also discuss that why it is hard and difficult to test ASP.NET Web Forms. Description: Starting from Classic ASP to ASP.NET MVC, ASP.NET Intrinsic objects is extensively used in all form of web application. They provide information about Request, Response, Server, Application and so on. But ASP.NET MVC uses these intrinsic objects in some abstract manner. The reason for this abstraction is to make your application testable. So let see the abstraction. As we know that ASP.NET MVC uses the same runtime engine as ASP.NET Web Form uses, therefore the first receiver of the request after IIS and aspnet_filter.dll is aspnet_isapi.dll. This will start the application domain. With the application domain up and running, ASP.NET does some initialization and after some initialization it will call Application_Start if it is defined. Then the normal HTTP pipeline event handlers will be executed including both HTTP Modules and global.asax event handlers. One of the HTTP Module is registered by ASP.NET MVC is UrlRoutingModule. The purpose of this module is to match a route defined in global.asax. Every matched route must have IRouteHandler. In default case this is MvcRouteHandler which is responsible for determining the HTTP Handler which returns MvcHandler (which is derived from IHttpHandler). In simple words, Route has MvcRouteHandler which returns MvcHandler which is the IHttpHandler of current request. In between HTTP pipeline events the handler of ASP.NET MVC, MvcHandler.ProcessRequest will be executed and shown as given below, void IHttpHandler.ProcessRequest(HttpContext context) { this.ProcessRequest(context); } protected virtual void ProcessRequest(HttpContext context) { // HttpContextWrapper inherits from HttpContextBase HttpContextBase ctxBase = new HttpContextWrapper(context); this.ProcessRequest(ctxBase); } protected internal virtual void ProcessRequest(HttpContextBase ctxBase) { . . . } HttpContextBase is the base class. HttpContextWrapper inherits from HttpContextBase, which is the parent class that include information about a single HTTP request. This is what ASP.NET MVC team did, just wrap old instrinsic HttpContext into HttpContextWrapper object and provide opportunity for other framework to provide their own implementation of HttpContextBase. For example public class MockHttpContext : HttpContextBase { . . . } As you can see, it is very easy to create your own HttpContext. That's what did the third party mock frameworks like TypeMock, Moq, RhinoMocks, or NMock2 to provide their own implementation of ASP.NET instrinsic objects classes. The key point to note here is the types of ASP.NET instrinsic objects. In ASP.NET Web Form and ASP.NET MVC. For example in ASP.NET Web Form the type of Request object is HttpRequest (which is sealed) and in ASP.NET MVC the type of Request object is HttpRequestBase. This is one of the reason that makes test in ASP.NET WebForm is difficult. because their is no base class and the HttpRequest class is sealed, therefore it cannot act as a base class to others. On the other side ASP.NET MVC always uses a base class to give a chance to third parties and unit test frameworks to create thier own implementation ASP.NET instrinsic object. Therefore we can say that in ASP.NET MVC, instrinsic objects are of type base classes (for example HttpContextBase) .Actually these base classes had it's own implementation of same interface as the intrinsic objects it abstracts. It includes only virtual members which simply throws an exception. ASP.NET MVC also provides the corresponding wrapper classes (for example, HttpRequestWrapper) which provides a concrete implementation of the base classes in the form of ASP.NET intrinsic object. Other wrapper classes may be defined by third parties in the form of a mock object for testing purpose. So we can say that a Request object in ASP.NET MVC may be HttpRequestWrapper or may be MockRequestWrapper(assuming that MockRequestWrapper class is used for testing purpose). Here is list of ASP.NET instrinsic and their implementation in ASP.NET MVC in the form of base and wrapper classes. Base Class Wrapper Class ASP.NET Intrinsic Object Description HttpApplicationStateBase HttpApplicationStateWrapper Application HttpApplicationStateBase abstracts the intrinsic Application object HttpBrowserCapabilitiesBase HttpBrowserCapabilitiesWrapper HttpBrowserCapabilities HttpBrowserCapabilitiesBase abstracts the HttpBrowserCapabilities class HttpCachePolicyBase HttpCachePolicyWrapper HttpCachePolicy HttpCachePolicyBase abstracts the HttpCachePolicy class HttpContextBase HttpContextWrapper HttpContext HttpContextBase abstracts the intrinsic HttpContext object HttpFileCollectionBase HttpFileCollectionWrapper HttpFileCollection HttpFileCollectionBase abstracts the HttpFileCollection class HttpPostedFileBase HttpPostedFileWrapper HttpPostedFile HttpPostedFileBase abstracts the HttpPostedFile class HttpRequestBase HttpRequestWrapper Request HttpRequestBase abstracts the intrinsic Request object HttpResponseBase HttpResponseWrapper Response HttpResponseBase abstracts the intrinsic Response object HttpServerUtilityBase HttpServerUtilityWrapper Server HttpServerUtilityBase abstracts the intrinsic Server object HttpSessionStateBase HttpSessionStateWrapper Session HttpSessionStateBase abstracts the intrinsic Session object HttpStaticObjectsCollectionBase HttpStaticObjectsCollectionWrapper HttpStaticObjectsCollection HttpStaticObjectsCollectionBase abstracts the HttpStaticObjectsCollection class Summary: ASP.NET MVC provides a set of abstract classes for ASP.NET instrinsic objects in the form of base classes, allowing someone to create their own implementation. In addition, ASP.NET MVC also provide set of concrete classes in the form of wrapper classes. This design really makes application easier to test and even application may replace concrete implementation with thier own implementation, which makes ASP.NET MVC very flexable.

Read the article
Tips on installing Visual Studio 2010 SP1

- by Jon Galloway

Visual Studio SP1 went up on MSDN downloads (here) on March 8, and will be released publicly on March 10 here. Release announcements: Soma: Visual Studio 2010 enhancements Jason Zander: Announcing Visual Studio 2010 Service Pack 1 I started on this post with tips on installing VS2010 SP1 when I realized I’ve been writing these up for Visual Studio and .NET framework SP releases for a while (e.g. VS2008 / .NET 3.5 SP1 post, VS2005 SP1 post). Looking back the years of Visual Studio SP installs (and remembering when we’d get up to SP6 for a Visual Studio release), I’m happy to see that it just keeps getting easier. Service Packs are a lot less finicky about requiring beta software to be uninstalled, install more quickly, and are just generally a lot less scary. If I can’t have a jetpack, at least my future provided me faster, easier service packs. Disclaimer: These tips are just general things I've picked up over the years. I don't have any inside knowledge here. If you see anything wrong, be sure to let me know in the comments. You may want to check the readme file before installing - it's short, and it's in that new-fangled HTML format. On with the tips! Before starting, uninstall Visual Studio features you don't use Visual Studio service packs (and other Microsoft service packs as well) install patches for the specific features you’ve got installed. This is a big reason to always do a custom install when you first install Visual Studio, but it’s not difficult to update your existing installation. Here’s the quick way to do that: Tap the windows key and type “add or remove programs” and press enter (or click on the “Add or remove programs” link if you must). Type “Visual Studio 2010” in the search box in the upper right corner, click on the Visual Studio program (the one with the VS infinity looking logo) and click on Uninstall/Change. Click on Add or Remove Features The next part’s up to you – what features do you actually use? I’ve been doing primarily ASP.NET MVC development in C# lately, so I selected Visual C# and Visual Web Developer. Remember that you can install features later if needed, and can also install the express versions if you want. Selecting everything just because it’s there - or you paid for it – means that you install updates for everything, every time. When you’ve made your changes, click on the Update button to uninstall unused features. Shut down all instances of Visual Studio It probably goes without saying that you should close a program down before installing it, partly to avoid the file-in-use-reboot-after-install horror. Additional "hunch / works on my machine" quality tip: On one computer I saw a note in the setup log about Visual Studio a prompt for user input to close Visual Studio, although I never saw the prompt. Just to be sure, I'd personally open up Task Manager and kill any devenv.exe processes I saw running, as it couldn't hurt. Use the web installer I use the Web Installers whenever possible. There’s no point in downloading the DVD unless you’re doing multiple installs or won’t have internet access. The DVD IS is 1.5GB, since it needs to be able to service every possible supported installation option on both x86 and x64. The web installer is 776 KB (smaller than calc.exe), so you can start the installation right away. Like other web installers, the real benefit is that it only installs the updates you need (hence the reason for step 1 – uninstalling unused components). Instead of 1.5GB, my download was roughly 530MB. If you’re installing from MSDN (this link takes you right to the Visual Studio installs), select the first one on the list: The first step in the installation process is to analyze the machine configuration and tell you what needs to be installed. Since I've trimmed down my features, that's a pretty short list. The time's not far off where I may not install SQL Server on my dev machines, just using SQL Server Compact - that would shorten the list further. When I hit next, you can see that the download size has shrunk considerably. When I start the install, note that the installation begins while other components are downloading - another benefit of the web install. On my mid-range desktop machine, the install took 25 minutes. What if it takes longer? According to Heath Stewart (Visual Studio installer guru), average SP1 installs take roughly 45 minutes. An installation which takes hours to complete may be a sign of a problem: see his post Visual Studio 2010 Service Pack 1 installing for over 2 hours could be a sign of a problem. Why so long? Yes, even 25 minutes is a while. Heath's got another blog post explaining why the update can take longer than the initial install (see: A patch may take as long or longer to install than the target product) which explains all the additional steps and complexities a patch needs to deal with, as well as some mitigation steps that deployment authors can take to mitigate the impact. Other things to know about Visual Studio 2010 SP1 Installs over Visual Studio 2010 SP1 Beta That's nice. Previous Visual Studio versions did a number of annoying things when you installed SP's over beta's - fail with weird errors, get part way through and tell you needed to cancel and uninstall first, etc. I've installed this on two machines that had random beta stuff installed without tears. That Readme file you didn't read I mentioned the readme file earlier (http://go.microsoft.com/fwlink/?LinkId=210711 ). Some interesting things I picked up in there: 2.1.3. Visual Studio 2010 Service Pack 1 installation may fail when a USB drive or other removeable drive is connected 2.1.4. Visual Studio must be restarted after Visual Studio 2010 SP1 tooling for SQL Server Compact (Compact) 4.0 is installed 2.2.1. If Visual Studio 2010 Service Pack 1 is uninstalled, Visual Studio 2010 must be reinstalled to restore certain components 2.2.2. If Visual Studio 2010 Service Pack 1 is uninstalled, Visual Studio 2010 must be reinstalled before SP1 can be installed again 2.4.3.1. Async CTP If you installed the pre-SP1 version of Async CTP but did not uninstall it before you installed Visual Studio 2010 SP1, then your computer will be in a state in which the version of the C# compiler in the .NET Framework does not match the C# compiler in Visual Studio. To resolve this issue: After you install Visual Studio 2010 SP1, reinstall the SP1 version of the Async CTP from here. Hardware acceleration for Visual Studio is disabled on Windows XP Visual Studio 2010 SP1 disables hardware acceleration when running on Windows XP (only on XP). You can turn it back on in the Visual Studio options, under Environment / General, as shown below. See Jason Zander's post titled Performance Troubleshooting Article and VS2010 SP1 Change.

Read the article
SQL Server 2008 R2 Reporting Services - The Word is But a Stage (T-SQL Tuesday #006)

- by smisner

Host Michael Coles (blog|twitter) has selected LOB data as the topic for this month's T-SQL Tuesday, so I'll take this opportunity to post an overview of reporting with spatial data types. As part of my work with SQL Server 2008 R2 Reporting Services, I've been exploring the use of spatial data types in the new map data region. You can create a map using any of the following data sources: Map Gallery - a set of Shapefiles for the United States only that ships with Reporting Services ESRI Shapefile - a .shp file conforming to the Environmental Systems Research Institute, Inc. (ESRI) shapefile spatial data format SQL Server spatial data - a query that includes SQLGeography or SQLGeometry data types Rob Farley (blog|twitter) points out today in his T-SQL Tuesday post that using the SQL geography field is a preferable alternative to ESRI shapefiles for storing spatial data in SQL Server. So how do you get spatial data? If you don't already have a GIS application in-house, you can find a variety of sources. Here are a few to get you started: US Census Bureau Website, http://www.census.gov/geo/www/tiger/ Global Administrative Areas Spatial Database, http://biogeo.berkeley.edu/gadm/ Digital Chart of the World Data Server, http://www.maproom.psu.edu/dcw/ In a recent post by Pinal Dave (blog|twitter), you can find a link to free shapefiles for download and a tutorial for using Shape2SQL, a free tool to convert shapefiles into SQL Server data. In my post today, I'll show you how to use combine spatial data that describes boundaries with spatial data in AdventureWorks2008R2 that identifies stores locations to embed a map in a report. Preparing the spatial data First, I downloaded Shapefile data for the administrative boundaries in France and unzipped the data to a local folder. Then I used Shape2SQL to upload the data into a SQL Server database called Spatial. I'm not sure of the reason why, but I had to uncheck the option to create a spatial index to upload the data. Otherwise, the upload appeared to run successfully, but no table appeared in my database. The zip file that I downloaded contained three files, but I didn't know what was in them until I used Shape2SQL to upload the data into tables. Then I found that FRA_adm0 contains spatial data for the country of France, FRA_adm1 contains spatial data for each region, and FRA_adm2 contains spatial data for each department (a subdivision of region). Next I prepared my SQL query containing sales data for fictional stores selling Adventure Works products in France. The Person.Address table in the AdventureWorks2008R2 database (which you can download from Codeplex) contains a SpatialLocation column which I joined - along with several other tables - to the Sales.Customer and Sales.Store tables. I'll be able to superimpose this data on a map to see where these stores are located. I included the SQL script for this query (as well as the spatial data for France) in the downloadable project that I created for this post. Step 1: Using the Map Wizard to Create a Map of France You can build a map without using the wizard, but I find it's rather useful in this case. Whether you use Business Intelligence Development Studio (BIDS) or Report Builder 3.0, the map wizard is the same. I used BIDS so that I could create a project that includes all the files related to this post. To get started, I added an empty report template to the project and named it France Stores. Then I opened the Toolbox window and dragged the Map item to the report body which starts the wizard. Here are the steps to perform to create a map of France: On the Choose a source of spatial data page of the wizard, select SQL Server spatial query, and click Next. On the Choose a dataset with SQL Server spatial data page, select Add a new dataset with SQL Server spatial data. On the Choose a connection to a SQL Server spatial data source page, select New. In the Data Source Properties dialog box, on the General page, add a connecton string like this (changing your server name if necessary): Data Source=(local);Initial Catalog=Spatial Click OK and then click Next. On the Design a query page, add a query for the country shape, like this: select * from fra_adm1 Click Next. The map wizard reads the spatial data and renders it for you on the Choose spatial data and map view options page, as shown below. You have the option to add a Bing Maps layer which shows surrounding countries. Depending on the type of Bing Maps layer that you choose to add (from Road, Aerial, or Hybrid) and the zoom percentage you select, you can view city names and roads and various boundaries. To keep from cluttering my map, I'm going to omit the Bing Maps layer in this example, but I do recommend that you experiment with this feature. It's a nice integration feature. Use the + or - button to rexize the map as needed. (I used the + button to increase the size of the map until its edges were just inside the boundaries of the visible map area (which is called the viewport). You can eliminate the color scale and distance scale boxes that appear in the map area later. Select the Embed map data in this report for faster rendering. The spatial data won't be changing, so there's no need to leave it in the database. However, it does increase the size of the RDL. Click Next. On the Choose map visualization page, select Basic Map. We'll add data for visualization later. For now, we have just the outline of France to serve as the foundation layer for our map. Click Next, and then click Finish. Now click the color scale box in the lower left corner of the map, and press the Delete key to remove it. Then repeat to remove the distance scale box in the lower right corner of the map. Step 2: Add a Map Layer to an Existing Map The map data region allows you to add multiple layers. Each layer is associated with a different data set. Thus far, we have the spatial data that defines the regional boundaries in the first map layer. Now I'll add in another layer for the store locations by following these steps: If the Map Layers windows is not visible, click the report body, and then click twice anywhere on the map data region to display it. Click on the New Layer Wizard button in the Map layers window. And then we start over again with the process by choosing a spatial data source. Select SQL Server spatial query, and click Next. Select Add a new dataset with SQL Server spatial data, and click Next. Click New, add a connection string to the AdventureWorks2008R2 database, and click Next. Add a query with spatial data (like the one I included in the downloadable project), and click Next. The location data now appears as another layer on top of the regional map created earlier. Use the + button to resize the map again to fill as much of the viewport as possible without cutting off edges of the map. You might need to drag the map within the viewport to center it properly. Select Embed map data in this report, and click Next. On the Choose map visualization page, select Basic Marker Map, and click Next. On the Choose color theme and data visualization page, in the Marker drop-down list, change the marker to diamond. There's no particular reason for a diamond; I think it stands out a little better than a circle on this map. Clear the Single color map checkbox as another way to distinguish the markers from the map. You can of course create an analytical map instead, which would change the size and/or color of the markers according to criteria that you specify, such as sales volume of each store, but I'll save that exploration for another post on another day. Click Finish and then click Preview to see the rendered report. Et voilà...c'est fini. Yes, it's a very simple map at this point, but there are many other things you can do to enhance the map. I'll create a series of posts to explore the possibilities. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
The Incremental Architect´s Napkin - #1 - It´s about the money, stupid

- by Ralf Westphal

Originally posted on: http://geekswithblogs.net/theArchitectsNapkin/archive/2014/05/24/the-incremental-architectacutes-napkin---1---itacutes-about-the.aspx Software development is an economic endeavor. A customer is only willing to pay for value. What makes a software valuable is required to become a trait of the software. We as software developers thus need to understand and then find a way to implement requirements. Whether or in how far a customer really can know beforehand what´s going to be valuable for him/her in the end is a topic of constant debate. Some aspects of the requirements might be less foggy than others. Sometimes the customer does not know what he/she wants. Sometimes he/she´s certain to want something - but then is not happy when that´s delivered. Nevertheless requirements exist. And developers will only be paid if they deliver value. So we better focus on doing that. Although is might sound trivial I think it´s important to state the corollary: We need to be able to trace anything we do as developers back to some requirement. You decide to use Go as the implementation language? Well, what´s the customer´s requirement this decision is linked to? You decide to use WPF as the GUI technology? What´s the customer´s requirement? You decide in favor of a layered architecture? What´s the customer´s requirement? You decide to put code in three classes instead of just one? What´s the customer´s requirement behind that? You decide to use MongoDB over MySql? What´s the customer´s requirement behind that? etc. I´m not saying any of these decisions are wrong. I´m just saying whatever you decide be clear about the requirement that´s driving your decision. You have to be able to answer the question: Why do you think will X deliver more value to the customer than the alternatives? Customers are not interested in romantic ideals of hard working, good willing, quality focused craftsmen. They don´t care how and why you work - as long as what you deliver fulfills their needs. They want to trust you to recognize this as your top priority - and then deliver. That´s all. Fundamental aspects of requirements If you´re like me you´re probably not used to such scrutinization. You want to be trusted as a professional developer - and decide quite a few things following your gut feeling. Or by relying on “established practices”. That´s ok in general and most of the time - but still… I think we should be more conscious about our decisions. Which would make us more responsible, even more professional. But without further guidance it´s hard to reason about many of the myriad decisions we´ve to make over the course of a software project. What I found helpful in this situation is structuring requirements into fundamental aspects. Instead of one large heap of requirements then there are smaller blobs. With them it´s easier to check if a decisions falls in their scope. Sure, every project has it´s very own requirements. But all of them belong to just three different major categories, I think. Any requirement either pertains to functionality, non-functional aspects or sustainability. For short I call those aspects: Functionality, because such requirements describe which transformations a software should offer. For example: A calculator software should be able to add and multiply real numbers. An auction website should enable you to set up an auction anytime or to find auctions to bid for. Quality, because such requirements describe how functionality is supposed to work, e.g. fast or secure. For example: A calculator should be able to calculate the sinus of a value much faster than you could in your head. An auction website should accept bids from millions of users. Security of Investment, because functionality and quality need not just be delivered in any way. It´s important to the customer to get them quickly - and not only today but over the course of several years. This aspect introduces time into the “requrements equation”. Security of Investments (SoI) sure is a non-functional requirement. But I think it´s important to not subsume it under the Quality (Q) aspect. That´s because SoI has quite special properties. For one, SoI for software means something completely different from what it means for hardware. If you buy hardware (a car, a hair blower) you find that a worthwhile investment, if the hardware does not change it´s functionality or quality over time. A car still running smoothly with hardly any rust spots after 10 years of daily usage would be a very secure investment. So for hardware (or material products, if you like) “unchangeability” (in the face of usage) is desirable. With software you want the contrary. Software that cannot be changed is a waste. SoI for software means “changeability”. You want to be sure that the software you buy/order today can be changed, adapted, improved over an unforseeable number of years so as fit changes in its usage environment. But that´s not the only reason why the SoI aspect is special. On top of changeability[1] (or evolvability) comes immeasurability. Evolvability cannot readily be measured by counting something. Whether the changeability is as high as the customer wants it, cannot be determined by looking at metrics like Lines of Code or Cyclomatic Complexity or Afferent Coupling. They may give a hint… but they are far, far from precise. That´s because of the nature of changeability. It´s different from performance or scalability. Also it´s because a customer cannot tell upfront, “how much” evolvability he/she wants. Whether requirements regarding Functionality (F) and Q have been met, a customer can tell you very quickly and very precisely. A calculation is missing, the calculation takes too long, the calculation time degrades with increased load, the calculation is accessible to the wrong users etc. That´s all very or at least comparatively easy to determine. But changeability… That´s a whole different thing. Nevertheless over time the customer will develop a feedling if changeability is good enough or degrading. He/she just has to check the development of the frequency of “WTF”s from developers ;-) F and Q are “timeless” requirement categories. Customers want us to deliver on them now. Just focusing on the now, though, is rarely beneficial in the long run. So SoI adds a counterweight to the requirements picture. Customers want SoI - whether they know it or not, whether they state if explicitly or not. In closing A customer´s requirements are not monolithic. They are not all made the same. Rather they fall into different categories. We as developers need to recognize these categories when confronted with some requirement - and take them into account. Only then can we make true professional decisions, i.e. conscious and responsible ones. I call this fundamental trait of software “changeability” and not “flexibility” to distinguish to whom it´s a concern. “Flexibility” to me means, software as is can easily be adapted to a change in its environment, e.g. by tweaking some config data or adding a library which gets picked up by a plug-in engine. “Flexibiltiy” thus is a matter of some user. “Changeability”, on the other hand, to me means, software can easily be changed in its structure to adapt it to new requirements. That´s a matter of the software developer. ?

Read the article
HDMI video connection cuts top and bottom borders of screen

- by Luis Alvarado

Ok this is an extension of another problem I had with a VGA connection and an Nvidia Geforce GT 440 card. Here is goes the explanation of this particular problem: I have a Soneview 32' TV. This TV has many connections including VGA (First reason I bought it), HDMI (Second reason but did not have a HDMI cable at that time) and DVI. I have had this TV for little over a month now, actually I had it to celebrate the release of Ubuntu 11.10 and started using it exactly on that date (I know too much fan there but hey, I like geek stuff). I started using it with the VGA cable. After 2 weeks I bought an Nvidia GT440 card. The previous 9500GT was working correctly with no problems whatsoever. I installed the GT440 and the first problem that I encountered using this latest card is mentioned here: Nvidia GT 440 black screen problem when loading lightdm greeter. The solution to this problem was to actually disconnect then connect again the VGA cable. This would result in the screen showing me the lightdm screen for my login. If I did not disconnect then connect the cable I could be there forever thinking that there is no video signal. I got tired of looking for answers that did not work and for solutions that made me literally have to install Ubuntu again. I just went and bought a HDMI cable and changed the VGA one for that one. It worked and I did not have to disconnect/connect the cable but now I have this problem when using any resolution. My normal resolution is 1920x1080 (This TV is 1080HD) so in VGA I could use this resolution with no problem, but on HDMI am getting the borders cut out. Here is a pic: As you can see from the PIC, the Launcher icons only show less than 50% of their witdh. Forget about the top and bottom parts, I can access them with the mouse but I can not visualize them in the screen. It is like it's outside of the TVs view. Basically there is like 20 to 30 pixels gone from all sides. I searched around and came to running xrand --verbose to see what it could detect from the TV. I got this: cyrex@cyrex:~$ xrandr --verbose xrandr: Failed to get size of gamma for output default Screen 0: minimum 320 x 175, current 1920 x 1080, maximum 1920 x 1080 default connected 1920x1080+0+0 (0x164) normal (normal) 0mm x 0mm Identifier: 0x163 Timestamp: 465485 Subpixel: unknown Clones: CRTC: 0 CRTCs: 0 Transform: 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 filter: 1920x1080 (0x164) 103.7MHz *current h: width 1920 start 0 end 0 total 1920 skew 0 clock 54.0KHz v: height 1080 start 0 end 0 total 1080 clock 50.0Hz 1920x1080 (0x165) 105.8MHz h: width 1920 start 0 end 0 total 1920 skew 0 clock 55.1KHz v: height 1080 start 0 end 0 total 1080 clock 51.0Hz 1920x1080 (0x166) 107.8MHz h: width 1920 start 0 end 0 total 1920 skew 0 clock 56.2KHz v: height 1080 start 0 end 0 total 1080 clock 52.0Hz 1920x1080 (0x167) 109.9MHz h: width 1920 start 0 end 0 total 1920 skew 0 clock 57.2KHz v: height 1080 start 0 end 0 total 1080 clock 53.0Hz 1920x1080 (0x168) 112.0MHz h: width 1920 start 0 end 0 total 1920 skew 0 clock 58.3KHz v: height 1080 start 0 end 0 total 1080 clock 54.0Hz 1920x1080 (0x169) 114.0MHz h: width 1920 start 0 end 0 total 1920 skew 0 clock 59.4KHz v: height 1080 start 0 end 0 total 1080 clock 55.0Hz 1680x1050 (0x16a) 98.8MHz h: width 1680 start 0 end 0 total 1680 skew 0 clock 58.8KHz v: height 1050 start 0 end 0 total 1050 clock 56.0Hz 1680x1050 (0x16b) 100.5MHz h: width 1680 start 0 end 0 total 1680 skew 0 clock 59.9KHz v: height 1050 start 0 end 0 total 1050 clock 57.0Hz 1600x1024 (0x16c) 95.0MHz h: width 1600 start 0 end 0 total 1600 skew 0 clock 59.4KHz v: height 1024 start 0 end 0 total 1024 clock 58.0Hz 1440x900 (0x16d) 76.5MHz h: width 1440 start 0 end 0 total 1440 skew 0 clock 53.1KHz v: height 900 start 0 end 0 total 900 clock 59.0Hz 1360x768 (0x171) 65.8MHz h: width 1360 start 0 end 0 total 1360 skew 0 clock 48.4KHz v: height 768 start 0 end 0 total 768 clock 63.0Hz 1360x768 (0x172) 66.8MHz h: width 1360 start 0 end 0 total 1360 skew 0 clock 49.2KHz v: height 768 start 0 end 0 total 768 clock 64.0Hz 1280x1024 (0x173) 85.2MHz h: width 1280 start 0 end 0 total 1280 skew 0 clock 66.6KHz v: height 1024 start 0 end 0 total 1024 clock 65.0Hz 1280x960 (0x176) 83.6MHz h: width 1280 start 0 end 0 total 1280 skew 0 clock 65.3KHz v: height 960 start 0 end 0 total 960 clock 68.0Hz 1280x960 (0x177) 84.8MHz h: width 1280 start 0 end 0 total 1280 skew 0 clock 66.2KHz v: height 960 start 0 end 0 total 960 clock 69.0Hz 1280x720 (0x178) 64.5MHz h: width 1280 start 0 end 0 total 1280 skew 0 clock 50.4KHz v: height 720 start 0 end 0 total 720 clock 70.0Hz 1280x720 (0x179) 65.4MHz h: width 1280 start 0 end 0 total 1280 skew 0 clock 51.1KHz v: height 720 start 0 end 0 total 720 clock 71.0Hz 1280x720 (0x17a) 66.4MHz h: width 1280 start 0 end 0 total 1280 skew 0 clock 51.8KHz v: height 720 start 0 end 0 total 720 clock 72.0Hz 1152x864 (0x17b) 72.7MHz h: width 1152 start 0 end 0 total 1152 skew 0 clock 63.1KHz v: height 864 start 0 end 0 total 864 clock 73.0Hz 1152x864 (0x17c) 73.7MHz h: width 1152 start 0 end 0 total 1152 skew 0 clock 63.9KHz v: height 864 start 0 end 0 total 864 clock 74.0Hz ....Many Resolutions later... 320x200 (0x1d1) 10.2MHz h: width 320 start 0 end 0 total 320 skew 0 clock 31.8KHz v: height 200 start 0 end 0 total 200 clock 159.0Hz 320x175 (0x1d2) 9.0MHz h: width 320 start 0 end 0 total 320 skew 0 clock 28.0KHz v: height 175 start 0 end 0 total 175 clock 160.0Hz 1920x1080 (0x1dd) 333.8MHz h: width 1920 start 0 end 0 total 1920 skew 0 clock 173.9KHz v: height 1080 start 0 end 0 total 1080 clock 161.0Hz If it helps, the Refresh Rate at 1920x1080 is 60. There is a flickering effect at this resolution using HDMI but not VGA which I imagine is related to the borders cut off issue am asking here. I have also done the following but this will only solve the problem on lower resolutions than 1920x1080 or on others TV (My father has a Sony TV where this problem is also solved): NVIDIA WAY Go to Nvidia-Settings and there will be an option that will have more features if a HDMI cable is connected. In the next pic the option is DFP-1 (CNDLCD) but this name changes depending on what device the PC is connected to: Uncheck Force Full GPU Scaling What this will do for resolutions LOWER than 1920x1080 (At least in my case) is solve the flickering problem and fix the borders cut by the monitor. Save to Xorg.conf file the changes made after changing to a resolution acceptable to your eyes. TV WAY If you TV has OSD Menu and this menu has options for scanning the screen resolution or auto adjusting to it, disable them. Specifically the option about SCAN. If you have an option for AV Mode disable it. Basically disable any option that needs to scan and scale the resolution. Test one by one. In the case of my father's TV this did it. In my case, the Nvidia solved it for lower resolutions. NOTE: In the case this is not solved in the next couple of weeks I will add this as the answer but take into consideration that the issue is still active with 1920x1080 resolutions.

Read the article
The last MVVM you'll ever need?

- by Nuri Halperin

As my MVC projects mature and grow, the need to have some omnipresent, ambient model properties quickly emerge. The application no longer has only one dynamic pieced of data on the page: A sidebar with a shopping cart, some news flash on the side – pretty common stuff. The rub is that a controller is invoked in context of a single intended request. The rest of the data, even though it could be just as dynamic, is expected to appear on it's own. There are many solutions to this scenario. MVVM prescribes creating elaborate objects which expose your new data as a property on some uber-object with more properties exposing the "side show" ambient data. The reason I don't love this approach is because it forces fairly acute awareness of the view, and soon enough you have many MVVM objects laying around, and views have to start doing null-checks in order to ensure you really supplied all the values before binding to them. Ick. Just as unattractive is the ViewData dictionary. It's not strongly typed, and in both this and the MVVM approach someone has to populate these properties – n'est pas? Where does that live? With MVC2, we get the formerly-futures feature Html.RenderAction(). The feature allows you plant a line in a view, of the format: <% Html.RenderAction("SessionInterest", "Session"); %> While this syntax looks very clean, I can't help being bothered by it. MVC was touting a very strong separation of concerns, the Model taking on the role of the business logic, the controller handling route and performing minimal view-choosing operations and the views strictly focused on rendering out angled-bracket tags. The RenderAction() syntax has the view calling some controller and invoking it inline with it's runtime rendering. This – to my taste – embeds too much knowledge of controllers into the view's code – which was allegedly forbidden. The one way flow "Controller Receive Data –> Controller invoke Model –> Controller select view –> Controller Hand data to view" now gets a "View calls controller and gets it's own data" which is not so one-way anymore. Ick. I toyed with some other solutions a bit, including some base controllers, special view classes etc. My current favorite though is making use of the ExpandoObject and dynamic features with C# 4.0. If you follow Phil Haack or read a bit from David Heyden you can see the general picture emerging. The game changer is that using the new dynamic syntax, one can sprout properties on an object and make use of them in the view. Well that beats having a bunch of uni-purpose MVVM's any day! Rather than statically exposed properties, we'll just use the capability of adding members at runtime. Armed with new ideas and syntax, I went to work: First, I created a factory method to enrich the focuse object: public static class ModelExtension { public static dynamic Decorate(this Controller controller, object mainValue) { dynamic result = new ExpandoObject(); result.Value = mainValue; result.SessionInterest = CodeCampBL.SessoinInterest(); result.TagUsage = CodeCampBL.TagUsage(); return result; } } This gives me a nice fluent way to have the controller add the rest of the ambient "side show" items (SessionInterest, TagUsage in this demo) and expose them all as the Model: public ActionResult Index() { var data = SyndicationBL.Refresh(TWEET_SOURCE_URL); dynamic result = this.Decorate(data); return View(result); } So now what remains is that my view knows to expect a dynamic object (rather than statically typed) so that the ASP.NET page compiler won't barf: <%@ Page Language="C#" Title="Ambient Demo" MasterPageFile="~/Views/Shared/Ambient.Master" Inherits="System.Web.Mvc.ViewPage<dynamic>" %> Notice the generic ViewPage<dynamic>. It doesn't work otherwise. In the page itself, Model.Value property contains the main data returned from the controller. The nice thing about this, is that the master page (Ambient.Master) also inherits from the generic ViewMasterPage<dynamic>. So rather than the page worrying about all this ambient stuff, the side bars and panels for ambient data all reside in a master page, and can be rendered using the RenderPartial() syntax: <% Html.RenderPartial("TagCloud", Model.SessionInterest as Dictionary<string, int>); %> Note here that a cast is necessary. This is because although dynamic is magic, it can't figure out what type this property is, and wants you to give it a type so its binder can figure out the right property to bind to at runtime. I use as, you can cast if you like. So there we go – no violation of MVC, no explosion of MVVM models and voila – right? Well, I could not let this go without a tweak or two more. The first thing to improve, is that some views may not need all the properties. In that case, it would be a waste of resources to populate every property. The solution to this is simple: rather than exposing properties, I change d the factory method to expose lambdas - Func<T> really. So only if and when a view accesses a member of the dynamic object does it load the data. public static class ModelExtension { // take two.. lazy loading! public static dynamic LazyDecorate(this Controller c, object mainValue) { dynamic result = new ExpandoObject(); result.Value = mainValue; result.SessionInterest = new Func<Dictionary<string, int>>(() => CodeCampBL.SessoinInterest()); result.TagUsage = new Func<Dictionary<string, int>>(() => CodeCampBL.TagUsage()); return result; } } Now that lazy loading is in place, there's really no reason not to hook up all and any possible ambient property. Go nuts! Add them all in – they won't get invoked unless used. This now requires changing the signature of usage on the ambient properties methods –adding some parenthesis to the master view: <% Html.RenderPartial("TagCloud", Model.SessionInterest() as Dictionary<string, int>); %> And, of course, the controller needs to call LazyDecorate() rather than the old Decorate(). The final touch is to introduce a convenience method to the my Controller class , so that the tedium of calling Decorate() everywhere goes away. This is done quite simply by adding a bunch of methods, matching View(object), View(string,object) signatures of the Controller class: public ActionResult Index() { var data = SyndicationBL.Refresh(TWEET_SOURCE_URL); return AmbientView(data); } //these methods can reside in a base controller for the solution: public ViewResult AmbientView(dynamic data) { dynamic result = ModelExtension.LazyDecorate(this, data); return View(result); } public ViewResult AmbientView(string viewName, dynamic data) { dynamic result = ModelExtension.LazyDecorate(this, data); return View(viewName, result); } The call to AmbientView now replaces any call the View() that requires the ambient data. DRY sattisfied, lazy loading and no need to replace core pieces of the MVC pipeline. I call this a good MVC day. Enjoy!

Read the article
The Proper Use of the VM Role in Windows Azure

- by BuckWoody

At the Professional Developer’s Conference (PDC) in 2010 we announced an addition to the Computational Roles in Windows Azure, called the VM Role. This new feature allows a great deal of control over the applications you write, but some have confused it with our full infrastructure offering in Windows Hyper-V. There is a proper architecture pattern for both of them. Virtualization Virtualization is the process of taking all of the hardware of a physical computer and replicating it in software alone. This means that a single computer can “host” or run several “virtual” computers. These virtual computers can run anywhere - including at a vendor’s location. Some companies refer to this as Cloud Computing since the hardware is operated and maintained elsewhere. IaaS The more detailed definition of this type of computing is called Infrastructure as a Service (Iaas) since it removes the need for you to maintain hardware at your organization. The operating system, drivers, and all the other software required to run an application are still under your control and your responsibility to license, patch, and scale. Microsoft has an offering in this space called Hyper-V, that runs on the Windows operating system. Combined with a hardware hosting vendor and the System Center software to create and deploy Virtual Machines (a process referred to as provisioning), you can create a Cloud environment with full control over all aspects of the machine, including multiple operating systems if you like. Hosting machines and provisioning them at your own buildings is sometimes called a Private Cloud, and hosting them somewhere else is often called a Public Cloud. State-ful and Stateless Programming This paradigm does not create a new, scalable way of computing. It simply moves the hardware away. The reason is that when you limit the Cloud efforts to a Virtual Machine, you are in effect limiting the computing resources to what that single system can provide. This is because much of the software developed in this environment maintains “state” - and that requires a little explanation. “State-ful programming” means that all parts of the computing environment stay connected to each other throughout a compute cycle. The system expects the memory, CPU, storage and network to remain in the same state from the beginning of the process to the end. You can think of this as a telephone conversation - you expect that the other person picks up the phone, listens to you, and talks back all in a single unit of time. In “Stateless” computing the system is designed to allow the different parts of the code to run independently of each other. You can think of this like an e-mail exchange. You compose an e-mail from your system (it has the state when you’re doing that) and then you walk away for a bit to make some coffee. A few minutes later you click the “send” button (the network has the state) and you go to a meeting. The server receives the message and stores it on a mail program’s database (the mail server has the state now) and continues working on other mail. Finally, the other party logs on to their mail client and reads the mail (the other user has the state) and responds to it and so on. These events might be separated by milliseconds or even days, but the system continues to operate. The entire process doesn’t maintain the state, each component does. This is the exact concept behind coding for Windows Azure. The stateless programming model allows amazing rates of scale, since the message (think of the e-mail) can be broken apart by multiple programs and worked on in parallel (like when the e-mail goes to hundreds of users), and only the order of re-assembling the work is important to consider. For the exact same reason, if the system makes copies of those running programs as Windows Azure does, you have built-in redundancy and recovery. It’s just built into the design. The Difference Between Infrastructure Designs and Platform Designs When you simply take a physical server running software and virtualize it either privately or publicly, you haven’t done anything to allow the code to scale or have recovery. That all has to be handled by adding more code and more Virtual Machines that have a slight lag in maintaining the running state of the system. Add more machines and you get more lag, so the scale is limited. This is the primary limitation with IaaS. It’s also not as easy to deploy these VM’s, and more importantly, you’re often charged on a longer basis to remove them. your agility in IaaS is more limited. Windows Azure is a Platform - meaning that you get objects you can code against. The code you write runs on multiple nodes with multiple copies, and it all works because of the magic of Stateless programming. you don’t worry, or even care, about what is running underneath. It could be Windows (and it is in fact a type of Windows Server), Linux, or anything else - but that' isn’t what you want to manage, monitor, maintain or license. You don’t want to deploy an operating system - you want to deploy an application. You want your code to run, and you don’t care how it does that. Another benefit to PaaS is that you can ask for hundreds or thousands of new nodes of computing power - there’s no provisioning, it just happens. And you can stop using them quicker - and the base code for your application does not have to change to make this happen. Windows Azure Roles and Their Use If you need your code to have a user interface, in Visual Studio you add a Web Role to your project, and if the code needs to do work that doesn’t involve a user interface you can add a Worker Role. They are just containers that act a certain way. I’ll provide more detail on those later. Note: That’s a general description, so it’s not entirely accurate, but it’s accurate enough for this discussion. So now we’re back to that VM Role. Because of the name, some have mistakenly thought that you can take a Virtual Machine running, say Linux, and deploy it to Windows Azure using this Role. But you can’t. That’s not what it is designed for at all. If you do need that kind of deployment, you should look into Hyper-V and System Center to create the Private or Public Infrastructure as a Service. What the VM Role is actually designed to do is to allow you to have a great deal of control over the system where your code will run. Let’s take an example. You’ve heard about Windows Azure, and Platform programming. You’re convinced it’s the right way to code. But you have a lot of things you’ve written in another way at your company. Re-writing all of your code to take advantage of Windows Azure will take a long time. Or perhaps you have a certain version of Apache Web Server that you need for your code to work. In both cases, you think you can (or already have) code the the software to be “Stateless”, you just need more control over the place where the code runs. That’s the place where a VM Role makes sense. Recap Virtualizing servers alone has limitations of scale, availability and recovery. Microsoft’s offering in this area is Hyper-V and System Center, not the VM Role. The VM Role is still used for running Stateless code, just like the Web and Worker Roles, with the exception that it allows you more control over the environment of where that code runs.

Read the article
Handling HTTP 404 Error in ASP.NET Web API

- by imran_ku07

Introduction: Building modern HTTP/RESTful/RPC services has become very easy with the new ASP.NET Web API framework. Using ASP.NET Web API framework, you can create HTTP services which can be accessed from browsers, machines, mobile devices and other clients. Developing HTTP services is now become more easy for ASP.NET MVC developer becasue ASP.NET Web API is now included in ASP.NET MVC. In addition to developing HTTP services, it is also important to return meaningful response to client if a resource(uri) not found(HTTP 404) for a reason(for example, mistyped resource uri). It is also important to make this response centralized so you can configure all of 'HTTP 404 Not Found' resource at one place. In this article, I will show you how to handle 'HTTP 404 Not Found' at one place. Description: Let's say that you are developing a HTTP RESTful application using ASP.NET Web API framework. In this application you need to handle HTTP 404 errors in a centralized location. From ASP.NET Web API point of you, you need to handle these situations, No route matched. Route is matched but no {controller} has been found on route. No type with {controller} name has been found. No matching action method found in the selected controller due to no action method start with the request HTTP method verb or no action method with IActionHttpMethodProviderRoute implemented attribute found or no method with {action} name found or no method with the matching {action} name found. Now, let create a ErrorController with Handle404 action method. This action method will be used in all of the above cases for sending HTTP 404 response message to the client. public class ErrorController : ApiController { [HttpGet, HttpPost, HttpPut, HttpDelete, HttpHead, HttpOptions, AcceptVerbs("PATCH")] public HttpResponseMessage Handle404() { var responseMessage = new HttpResponseMessage(HttpStatusCode.NotFound); responseMessage.ReasonPhrase = "The requested resource is not found"; return responseMessage; } } You can easily change the above action method to send some other specific HTTP 404 error response. If a client of your HTTP service send a request to a resource(uri) and no route matched with this uri on server then you can route the request to the above Handle404 method using a custom route. Put this route at the very bottom of route configuration, routes.MapHttpRoute( name: "Error404", routeTemplate: "{*url}", defaults: new { controller = "Error", action = "Handle404" } ); Now you need handle the case when there is no {controller} in the matching route or when there is no type with {controller} name found. You can easily handle this case and route the request to the above Handle404 method using a custom IHttpControllerSelector. Here is the definition of a custom IHttpControllerSelector, public class HttpNotFoundAwareDefaultHttpControllerSelector : DefaultHttpControllerSelector { public HttpNotFoundAwareDefaultHttpControllerSelector(HttpConfiguration configuration) : base(configuration) { } public override HttpControllerDescriptor SelectController(HttpRequestMessage request) { HttpControllerDescriptor decriptor = null; try { decriptor = base.SelectController(request); } catch (HttpResponseException ex) { var code = ex.Response.StatusCode; if (code != HttpStatusCode.NotFound) throw; var routeValues = request.GetRouteData().Values; routeValues["controller"] = "Error"; routeValues["action"] = "Handle404"; decriptor = base.SelectController(request); } return decriptor; } } Next, it is also required to pass the request to the above Handle404 method if no matching action method found in the selected controller due to the reason discussed above. This situation can also be easily handled through a custom IHttpActionSelector. Here is the source of custom IHttpActionSelector, public class HttpNotFoundAwareControllerActionSelector : ApiControllerActionSelector { public HttpNotFoundAwareControllerActionSelector() { } public override HttpActionDescriptor SelectAction(HttpControllerContext controllerContext) { HttpActionDescriptor decriptor = null; try { decriptor = base.SelectAction(controllerContext); } catch (HttpResponseException ex) { var code = ex.Response.StatusCode; if (code != HttpStatusCode.NotFound && code != HttpStatusCode.MethodNotAllowed) throw; var routeData = controllerContext.RouteData; routeData.Values["action"] = "Handle404"; IHttpController httpController = new ErrorController(); controllerContext.Controller = httpController; controllerContext.ControllerDescriptor = new HttpControllerDescriptor(controllerContext.Configuration, "Error", httpController.GetType()); decriptor = base.SelectAction(controllerContext); } return decriptor; } } Finally, we need to register the custom IHttpControllerSelector and IHttpActionSelector. Open global.asax.cs file and add these lines, configuration.Services.Replace(typeof(IHttpControllerSelector), new HttpNotFoundAwareDefaultHttpControllerSelector(configuration)); configuration.Services.Replace(typeof(IHttpActionSelector), new HttpNotFoundAwareControllerActionSelector()); Summary: In addition to building an application for HTTP services, it is also important to send meaningful centralized information in response when something goes wrong, for example 'HTTP 404 Not Found' error. In this article, I showed you how to handle 'HTTP 404 Not Found' error in a centralized location. Hopefully you will enjoy this article too.

Read the article
Wrapping ASP.NET Client Callbacks

- by Ricardo Peres

Client Callbacks are probably the less known (and I dare say, less loved) of all the AJAX options in ASP.NET, which also include the UpdatePanel, Page Methods and Web Services. The reason for that, I believe, is it’s relative complexity: Get a reference to a JavaScript function; Dynamically register function that calls the above reference; Have a JavaScript handler call the registered function. However, it has some the nice advantage of being self-contained, that is, doesn’t need additional files, such as web services, JavaScript libraries, etc, or static methods declared on a page, or any kind of attributes. So, here’s what I want to do: Have a DOM element which exposes a method that is executed server side, passing it a string and returning a string; Have a server-side event that handles the client-side call; Have two client-side user-supplied callback functions for handling the success and error results. I’m going to develop a custom control without user interface that does the registration of the client JavaScript method as well as a server-side event that can be hooked by some handler on a page. My markup will look like this: 1: <script type="text/javascript"> 1: 2: 3: function onCallbackSuccess(result, context) 4: { 5: } 6: 7: function onCallbackError(error, context) 8: { 9: } 10: </script> 2: <my:CallbackControl runat="server" ID="callback" SendAllData="true" OnCallback="OnCallback"/> The control itself looks like this: 1: public class CallbackControl : Control, ICallbackEventHandler 2: { 3: #region Public constructor 4: public CallbackControl() 5: { 6: this.SendAllData = false; 7: this.Async = true; 8: } 9: #endregion 10: 11: #region Public properties and events 12: public event EventHandler<CallbackEventArgs> Callback; 13: 14: [DefaultValue(true)] 15: public Boolean Async 16: { 17: get; 18: set; 19: } 20: 21: [DefaultValue(false)] 22: public Boolean SendAllData 23: { 24: get; 25: set; 26: } 27: 28: #endregion 29: 30: #region Protected override methods 31: 32: protected override void Render(HtmlTextWriter writer) 33: { 34: writer.AddAttribute(HtmlTextWriterAttribute.Id, this.ClientID); 35: writer.RenderBeginTag(HtmlTextWriterTag.Span); 36: 37: base.Render(writer); 38: 39: writer.RenderEndTag(); 40: } 41: 42: protected override void OnInit(EventArgs e) 43: { 44: String reference = this.Page.ClientScript.GetCallbackEventReference(this, "arg", "onCallbackSuccess", "context", "onCallbackError", this.Async); 45: String script = String.Concat("\ndocument.getElementById('", this.ClientID, "').callback = function(arg, context, onCallbackSuccess, onCallbackError){", ((this.SendAllData == true) ? "__theFormPostCollection.length = 0; __theFormPostData = ''; WebForm_InitCallback(); " : String.Empty), reference, ";};\n"); 46: 47: this.Page.ClientScript.RegisterStartupScript(this.GetType(), String.Concat("callback", this.ClientID), script, true); 48: 49: base.OnInit(e); 50: } 51: 52: #endregion 53: 54: #region Protected virtual methods 55: protected virtual void OnCallback(CallbackEventArgs args) 56: { 57: EventHandler<CallbackEventArgs> handler = this.Callback; 58: 59: if (handler != null) 60: { 61: handler(this, args); 62: } 63: } 64: 65: #endregion 66: 67: #region ICallbackEventHandler Members 68: 69: String ICallbackEventHandler.GetCallbackResult() 70: { 71: CallbackEventArgs args = new CallbackEventArgs(this.Context.Items["Data"] as String); 72: 73: this.OnCallback(args); 74: 75: return (args.Result); 76: } 77: 78: void ICallbackEventHandler.RaiseCallbackEvent(String eventArgument) 79: { 80: this.Context.Items["Data"] = eventArgument; 81: } 82: 83: #endregion 84: } And the event argument class: 1: [Serializable] 2: public class CallbackEventArgs : EventArgs 3: { 4: public CallbackEventArgs(String argument) 5: { 6: this.Argument = argument; 7: this.Result = String.Empty; 8: } 9: 10: public String Argument 11: { 12: get; 13: private set; 14: } 15: 16: public String Result 17: { 18: get; 19: set; 20: } 21: } You will notice two properties on the CallbackControl: Async: indicates if the call should be made asynchronously or synchronously (the default); SendAllData: indicates if the callback call will include the view and control state of all of the controls on the page, so that, on the server side, they will have their properties set when the Callback event is fired. The CallbackEventArgs class exposes two properties: Argument: the read-only argument passed to the client-side function; Result: the result to return to the client-side callback function, set from the Callback event handler. An example of an handler for the Callback event would be: 1: protected void OnCallback(Object sender, CallbackEventArgs e) 2: { 3: e.Result = String.Join(String.Empty, e.Argument.Reverse()); 4: } Finally, in order to fire the Callback event from the client, you only need this: 1: <input type="text" id="input"/> 2: <input type="button" value="Get Result" onclick="document.getElementById('callback').callback(callback(document.getElementById('input').value, 'context', onCallbackSuccess, onCallbackError))"/> The syntax of the callback function is: arg: some string argument; context: some context that will be passed to the callback functions (success or failure); callbackSuccessFunction: some function that will be called when the callback succeeds; callbackFailureFunction: some function that will be called if the callback fails for some reason. Give it a try and see if it helps!

Read the article
How I do VCS

- by Wes McClure

After years of dabbling with different version control systems and techniques, I wanted to share some of what I like and dislike in a few blog posts. To start this out, I want to talk about how I use VCS in a team environment. These come in a series of tips or best practices that I try to follow. Note: This list is subject to change in the future. Always use some form of version control for all aspects of software development. Development is an evolution. Looking back at where we were is an invaluable asset in that process. This includes data schemas and documentation. Reverting / reapplying changes is absolutely critical for efficient development. The tools I use: Code: Hg (preferred), SVN Database: TSqlMigrations Documents: Sometimes in code repository, also SharePoint with versioning Always tag a commit (changeset) with comments This is a quick way to describe to someone else (or your future self) what the changeset entails. Be brief but courteous. One or two sentences about the task, not the actual changes. Use precommit hooks or setup the central repository to reject changes without comments. Link changesets to documentation If your project management system integrates with version control, or has a way to externally reference stories, tasks etc then leave a reference in the commit. This helps locate more information about the commit and/or related changesets. It’s best to have a precommit hook or system that requires this information, otherwise it’s easy to forget. Ability to work offline is required, including commits and history Yes this requires a DVCS locally but doesn’t require the central repository to be a DVCS. I prefer to use either Git or Hg but if it isn’t possible to migrate the central repository, it’s still possible for a developer to push / pull changes to that repository from a local Hg or Git repository. Never lock resources (files) in a central repository… Rude! We have merge tools for a reason, merging sucked a long time ago, it doesn’t anymore… stop locking files! This is unproductive, rude and annoying to other team members. Always review everything in your commit. Never ever commit a set of files without reviewing the changes in each. Never add a file without asking yourself, deep down inside, does this belong? If you leave to make changes during a review, start the review over when you come back. Never assume you didn’t touch a file, double check. This is another reason why you want to avoid large, infrequent commits. Requirements for tools Quickly show pending changes for the entire repository. Default action for a resource with pending changes is a diff. Pluggable diff & merge tool Produce a unified diff or a diff of all changes. This is helpful to bulk review changes instead of opening each file. The central repository is not your own personal dump yard. Breaking this rule is a sure fire way to get the F bomb dropped in front of your name, multiple times. If you turn on Visual Studio’s commit on closing studio option, I will personally break your fingers. By the way, the person(s) in charge of this feature should be fired and never be allowed near programming, ever again. Commit (integrate) to the central repository / branch frequently I try to do this before leaving each day, especially without a DVCS. One never knows when they might need to work from remote the following day. Never commit commented out code If it isn’t needed anymore, delete it! If you aren’t sure if it might be useful in the future, delete it! This is why we have history. If you don’t know why it’s commented out, figure it out and then either uncomment it or delete it. Don’t commit build artifacts, user preferences and temporary files. Build artifacts do not belong in VCS, everything in them is present in the code. (ie: bin\*, obj\*, *.dll, *.exe) User preferences are your settings, stop overriding my preferences files! (ie: *.suo and *.user files) Most tools allow you to ignore certain files and Hg/Git allow you to version this as an ignore file. Set this up as a first step when creating a new repository! Be polite when merging unresolved conflicts. Count to 10, cuss, grab a stress ball and realize it’s not a big deal. Actually, it’s an opportunity to let you know that someone else is working in the same area and you might want to communicate with them. Following the other rules, especially committing frequently, will reduce the likelihood of this. Suck it up, we all have to deal with this unintended consequence at times. Just be careful and GET FAMILIAR with your merge tool. It’s really not as scary as you think. I personally prefer KDiff3 as its merging capabilities rock. Don’t blindly merge and then blindly commit your changes, this is rude and unprofessional. Make sure you understand why the conflict occurred and which parts of the code you want to keep. Apply scrutiny when you commit a manual merge: review the diff! Make sure you test the changes (build and run automated tests) Become intimate with your version control system and the tools you use with it. Avoid trial and error as much as is possible, sit down and test the tool out, read some tutorials etc. Create test repositories and walk through common scenarios. Find the most efficient way to do your work. These tools will be used repetitively, so inefficiencies will add up. Sometimes this involves a mix of tools, both GUI and CLI. I like a combination of both Tortoise Hg and hg cli to get the job efficiently. Always tag releases Create a way to find a given release, whether this be in comments or an explicit tag / branch. This should be readily discoverable. Create release branches to patch bugs and then merge the changes back to other development branch(es). If using feature branches, strive for periodic integrations. Feature branches often cause forked code that becomes irreconcilable. Strive to re-integrate somewhat frequently with the branch this code will ultimately be merged into. This will avoid merge conflicts in the future. Feature branches are best when they are mutually exclusive of active development in other branches. Use and abuse local commits , at least one per task in a story. This builds a trail of changes in your local repository that can be pushed to a central repository when the story is complete. Never commit a broken build or failing tests to the central repository. It’s ok for a local commit to break the build and/or tests. In fact, I encourage this if it helps group the changes more logically. This is one of the main reasons I got excited about DVCS, when I wanted more than one changeset for a set of pending changes but some files could be grouped into both changesets (like solution file / project file changes). If you have more than a dozen outstanding changed resources, there should probably be more than one commit involved. Exceptions when maintaining code bases that require shotgun surgery, in this case, it’s a design smell :) Don’t version sensitive information Especially usernames / passwords There is one area I haven’t found a solution I like yet: versioning 3rd party libraries and/or code. I really dislike keeping any assemblies in the repository, but seems to be a common practice for external libraries. Please feel free to share your ideas about this below. -Wes

Read the article
Fun with Aggregates

- by Paul White

There are interesting things to be learned from even the simplest queries. For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join. The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units. The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too? To answer that, let’s look at some other ways to formulate this query. This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones. The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above. It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join! The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join. In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting. The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set. In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL). To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709; -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case. The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate. This is something I would like to see exposed in show plan so I suggested it on Connect. Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all. Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans. Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :) The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier. What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too). Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places. It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates. As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

Read the article
Conversion of BizTalk Projects to Use the New WCF-SAP Adaptor

- by Geordie

We are in the process of upgrading our BizTalk Environment from BizTalk 2006 R2 to BizTalk 2010. The SAP adaptor in BizTalk 2010 is an all new and more powerful WCF-SAP adaptor. When my colleagues tested out the new adaptor they discovered that the format of the data extracted from SAP was not identical to the old adaptor. This is not a big deal if the structure of the messages from SAP is simple. In this case we were receiving the delivery and invoice iDocs. Both these structures are complex especially the delivery document. Over the past few years I have tweaked the delivery mapping to remove bugs from original mapping. The idea of redoing these maps did not appeal and due to the current work load was not even an option. I opted for a rather crude alternative of pulling in the iDoc in the new typed format and then adding a static map at the start of the orchestration to convert the data to the old schema. Note WCF-SAP data formats (on the binding tab of the configuration dialog box is the ‘RecieiveIdocFormat’ field): Typed: Returns a XML document with the hierarchy represented in XML and all fields being represented by XML tags. RFC: Returns an XML document with the hierarchy represented in XML but the iDoc lines in flat file format. String: This returns the iDoc in a format that is closest to the original flat file format but is still wrapped with some top level XML tags. The files also contained some strange characters at the end of each line. I started with the invoice document and it was quite straight forward to add the mapping but this is where my problems started. The orchestrations for these documents are dynamic and so require the identity of the partner to be able to correctly configure the orchestration. The partner identity is in the EDI_DC40 segment of the iDoc. In the old project the RECPRN node of the segment was promoted. The code to set a variable to the partner ID was now failing. After lot of head scratching I discovered the problem was due to the addition of Namespaces to the fields in the EDI_DC40 segment. To overcome this I needed to use an xPath query with a Namespace Manager. This had to be done in custom code. I now tried to repeat the process with the delivery document. Unfortunately when we tried to get sample typed data from SAP an exception was thrown. The adapter "WCF-SAP" raised an error message. Details "Microsoft.ServiceModel.Channels.Common.XmlReaderGenerationException: The segment or group definition E2EDKA1001 was not found in the IDoc metadata. The UniqueId of the IDoc type is: IDOCTYP/3/DESADV01/ZASNEXT1/640. For Receive operations, the SAP adapter does not support unreleased segments. Our guess is that when the WCF-SAP adaptor tries to down load the data it retrieves a data schema from SAP. For some reason the schema does not match the data. This may be due to the version of SAP we are running or due to a customization. Either way resolving this problem did not look easy. When doing some research on this problem I found an article showing me how to get the data from SAP using the WCF-SAP adaptor without any XML tags. http://blogs.msdn.com/b/adapters/archive/2007/10/05/receiving-idocs-getting-the-raw-idoc-data.aspx Reproduction of Mustansir blog: Since the WCF based SAP Adapter is ... well, WCF based, all data flowing in and out of the adapter is encapsulated within a SOAP message. Which means there are those pesky xml tags all over the place. If you want to receive an Idoc from SAP, you can receive it in "Typed" format (in which case each column in each segment of the idoc appears within its own xml tag), or you can receive it in "String" format (in which case there are just 2 xml tags at the top, the raw xml data in string/flat file format, and the 2 closing xml tags). In "String" format, an incoming idoc (for ORDERS05, containing 5 data records) would look like: <ReceiveIdoc ><idocData>EDI_DC40 8000000000001064985620 E2EDK01005 800000000000106498500000100000001 E2EDK14 8000000000001064985000002000000020111000 E2EDK14 8000000000001064985000003000000020081000 E2EDK14 80000000000010649850000040000000200710 E2EDK14 80000000000010649850000050000000200600</idocData></ReceiveIdoc> (I have trimmed part of the control record so that it fits cleanly here on one line). Now, you're only interested in the IDOC data, and don't care much for the XML tags. It isn't that difficult to write your own pipeline component, or even some logic in the orchestration to remove the tags, right? Well, you don't need to write any extra code at all - the WCF Adapter can help you here! During the configuration of your one-way Receive Location using WCF-Custom, navigate to the Messages tab. Under the section "Inbound BizTalk Messge Body", select the "Path" radio button, and: (a) Enter the body path expression as: /*[local-name()='ReceiveIdoc']/*[local-name()='idocData'] (b) Choose "String" for the Node Encoding. What we've done is, used an XPATH to pull out the value of the "idocData" node from the XML. Your Receive Location will now emit text containing only the idoc data. You can at this point, for example, put the Flat File Pipeline component to convert the flat text into a different xml format based on some other schema you already have, and receive your version of the xml formatted message in your orchestration. This was potentially a much easier solution than adding the static maps to the orchestrations and overcame the issue with ‘Typed’ delivery documents. Not quite so fast… Note: When I followed Mustansir’s blog the characters at the end of each line disappeared. After configuring the adaptor and passing the iDoc data into the original flat file receive pipelines I was receiving exceptions. There was a failure executing the receive pipeline: "PAPINETPipelines.DeliveryFlatFileReceive, CustomerIntegration2.PAPINET.Pipelines, Version=1.0.0.0, Culture=neutral, PublicKeyToken=4ca3635fbf092bbb" Source: "Pipeline " Receive Port: "recSAP_Delivery" URI: "D:\CustomerIntegration2\SAP\Delivery\*.xml" Reason: An error occurred when parsing the incoming document: "Unexpected data found while looking for: 'Z2EDPZ7' The current definition being parsed is E2EDP07GRP. The stream offset where the error occured is 8859. The line number where the error occured is 23. The column where the error occured is 0.". Although the new flat file looked the same as the old one there was a differences. In the original file all lines in the document were exactly 1064 character long. In the new file all lines were truncated to the last alphanumeric character. The final piece of the puzzle was to add a custom pipeline component to pad all the lines to 1064 characters. This component was added to the decode node of the custom delivery and invoice flat file disassembler pipelines. Execute method of the custom pipeline component: public IBaseMessage Execute(IPipelineContext pc, IBaseMessage inmsg) { //Convert Stream to a string Stream s = null; IBaseMessagePart bodyPart = inmsg.BodyPart; // NOTE inmsg.BodyPart.Data is implemented only as a setter in the http adapter API and a //getter and setter for the file adapter. Use GetOriginalDataStream to get data instead. if (bodyPart != null) s = bodyPart.GetOriginalDataStream(); string newMsg = string.Empty; string strLine; try { StreamReader sr = new StreamReader(s); strLine = sr.ReadLine(); while (strLine != null) { //Execute padding code if (strLine != null) strLine = strLine.PadRight(1064, ' ') + "\r\n"; newMsg += strLine; strLine = sr.ReadLine(); } sr.Close(); } catch (IOException ex) { throw new Exception("Error occured trying to pad the message to 1064 charactors"); } //Convert back to stream and set to Data property inmsg.BodyPart.Data = new MemoryStream(Encoding.UTF8.GetBytes(newMsg)); ; //reset the position of the stream to zero inmsg.BodyPart.Data.Position = 0; return inmsg; }

Read the article
Handling HumanTask attachments in Oracle BPM 11g PS4FP+ (I)

- by ccasares

Adding attachments to a HumanTask is a feature that exists in Oracle HWF (Human Workflow) since 10g. However, in 11g there have been many improvements on this feature and this entry will try to summarize them. Oracle BPM 11g 11.1.1.5.1 (aka PS4 Feature Pack or PS4FP) introduced two great features: Ability to link attachments at a Task scope or at a Process scope: "Task" attachments are only visible within the scope (lifetime) of a task. This means that, initially, any member of the assignment pattern of the Human Task will be able to handle (add, review or remove) attachments. However, once the task is completed, subsequent human tasks will not have access to them. This does not mean those attachments got lost. Once the human task is completed, attachments can be retrieved in order to, i.e., check them in to a Content Server or to inject them to a new and different human task. Aside note: a "re-initiated" human task will inherit comments and attachments, along with history and -optionally- payload. See here for more info. "Process" attachments are visible within the scope of the process. This means that subsequent human tasks in the same process instance will have access to them. Ability to use Oracle WebCenter Content (previously known as "Oracle UCM") as the backend for the attachments instead of using HWF database backend. This feature adds all content server document lifecycle capabilities to HWF attachments (versioning, RBAC, metadata management, etc). As of today, only Oracle WCC is supported. However, Oracle BPM Suite does include a license of Oracle WCC for the solely usage of document management within BPM scope. Here are some code samples that leverage the above features. Retrieving uploaded attachments -Non UCM- Non UCM attachments (default ones or those that have existed from 10g, and are stored "as-is" in HWK database backend) can be retrieved after the completion of the Human Task. Firstly, we need to know whether any attachment has been effectively uploaded to the human task. There are two ways to find it out: Through an XPath function: Checking the execData/attachment[] structure. For example: Once we are sure one ore more attachments were uploaded to the Human Task, we want to get them. In this example, by "get" I mean to get the attachment name and the payload of the file. Aside note: Oracle HWF lets you to upload two kind of [non-UCM] attachments: a desktop document and a Web URL. This example focuses just on the desktop document one. In order to "retrieve" an uploaded Web URL, you can get it directly from the execData/attachment[] structure. Attachment content (payload) is retrieved through the getTaskAttachmentContents() XPath function: This example shows how to retrieve as many attachments as those had been uploaded to the Human Task and write them to the server using the File Adapter service. The sample process excerpt is as follows: A dummy UserTask using "HumanTask1" Human Task followed by a Embedded Subprocess that will retrieve the attachments (we're assuming at least one attachment is uploaded): and once retrieved, we will write each of them back to a file in the server using a File Adapter service: In detail: We've defined an XSD structure that will hold the attachments (both name and payload): Then, we can create a BusinessObject based on such element (attachmentCollection) and create a variable (named attachmentBPM) of such BusinessObject type. We will also need to keep a copy of the HumanTask output's execData structure. Therefore we need to create a variable of type TaskExecutionData... ...and copy the HumanTask output execData to it: Now we get into the embedded subprocess that will retrieve the attachments' payload. First, and using an XSLT transformation, we feed the attachmentBPM variable with the name of each attachment and setting an empty value to the payload: Please note that we're using the XSLT for-each node to create as many target structures as necessary. Also note that we're setting an Empty text to the payload variable. The reason for this is to make sure the <payload></payload> tag gets created. This is needed when we map the payload to the XML variable later. Aside note: We are assuming that we're retrieving non-UCM attachments. However in real life you might want to check the type of attachment you're handling. The execData/attachment[]/storageType contains the values "UCM" for UCM type attachments, "TASK" for non-UCM ones or "URL" for Web URL ones. Those values are part of the "Ext.Com.Oracle.Xmlns.Bpel.Workflow.Task.StorageTypeEnum" enumeration. Once we have fed the attachmentsBPM structure and so it now contains the name of each of the attachments, it is time to iterate through it and get the payload. Therefore we will use a new embedded subprocess of type MultiInstance, that will iterate over the attachmentsBPM/attachment[] element: In every iteration we will use a Script activity to map the corresponding payload element with the result of the XPath function getTaskAttachmentContents(). Please, note how the target array element is indexed with the loopCounter predefined variable, so that we make sure we're feeding the right element during the array iteration: The XPath function used looks as follows: hwf:getTaskAttachmentContents(bpmn:getDataObject('UserTask1LocalExecData')/ns1:systemAttributes/ns1:taskId, bpmn:getDataObject('attachmentsBPM')/ns:attachment[bpmn:getActivityInstanceAttribute('SUBPROCESS3067107484296', 'loopCounter')]/ns:fileName) where the input parameters are: taskId of the just completed Human Task attachment name we're retrieving the payload from array index (loopCounter predefined variable) Aside note: The reason whereby we're iterating the execData/attachment[] structure through embedded subprocess and not, i.e., using XSLT and for-each nodes, is mostly because the getTaskAttachmentContents() XPath function is currently not available in XSLT mappings. So all this example might be considered as a workaround until this gets fixed/enhanced in future releases. Once this embedded subprocess ends, we will have all attachments (name + payload) in the attachmentsBPM variable, which is the main goal of this sample. But in order to test everything runs fine, we finish the sample writing each attachment to a file. To that end we include a final embedded subprocess to concurrently iterate through each attachmentsBPM/attachment[] element: On each iteration we will use a Service activity that invokes a File Adapter write service. In here we have two important parameters to set. First, the payload itself. The file adapter awaits binary data in base64 format (string). We have to map it using XPath (Simple mapping doesn't recognize a String as a base64-binary valid target): Second, we must set the target filename using the Service Properties dialog box: Again, note how we're making use of the loopCounter index variable to get the right element within the embedded subprocess iteration. Handling UCM attachments will be part of a different and upcoming blog entry. Once I finish will all posts on this matter, I will upload the whole sample project to java.net.

Read the article
Is RTD Stateless or Stateful?

- by [email protected]

Yes. A stateless service is one where each request is an independent transaction that can be processed by any of the servers in a cluster. A stateful service is one where state is kept in a server's memory from transaction to transaction, thus necessitating the proper routing of requests to the right server. The main advantage of stateless systems is simplicity of design. The main advantage of stateful systems is performance. I'm often asked whether RTD is a stateless or stateful service, so I wanted to clarify this issue in depth so that RTD's architecture will be properly understood. The short answer is: "RTD can be configured as a stateless or stateful service." The performance difference between stateless and stateful systems can be very significant, and while in a call center implementation it may be reasonable to use a pure stateless configuration, a web implementation that produces thousands of requests per second is practically impossible with a stateless configuration. RTD's performance is orders of magnitude better than most competing systems. RTD was architected from the ground up to achieve this performance. Features like automatic and dynamic compression of prediction models, automatic translation of metadata to machine code, lack of interpreted languages, and separation of model building from decisioning contribute to achieving this performance level. Because of this focus on performance we decided to have RTD's default configuration work in a stateful manner. By being stateful RTD requests are typically handled in a few milliseconds when repeated requests come to the same session. Now, those readers that have participated in implementations of RTD know that RTD's architecture is also focused on reducing Total Cost of Ownership (TCO) with features like automatic model building, automatic time windows, automatic maintenance of database tables, automatic evaluation of data mining models, automatic management of models partitioned by channel, geography, etcetera, and hot swapping of configurations. How do you reconcile the need for a low TCO and the need for performance? How do you get the performance of a stateful system with the simplicity of a stateless system? The answer is that you make the system behave like a stateless system to the exterior, but you let it automatically take advantage of situations where being stateful is better. For example, one of the advantages of stateless systems is that you can route a message to any server in a cluster, without worrying about sending it to the same server that was handling the session in previous messages. With an RTD stateful configuration you can still route the message to any server in the cluster, so from the point of view of the configuration of other systems, it is the same as a stateless service. The difference though comes in performance, because if the message arrives to the right server, RTD can serve it without any external access to the session's state, thus tremendously reducing processing time. In typical implementations it is not rare to have high percentages of messages routed directly to the right server, while those that are not, are easily handled by forwarding the messages to the right server. This architecture usually provides the best of both worlds with performance and simplicity of configuration. Configuring RTD as a pure stateless service A pure stateless configuration requires session data to be persisted at the end of handling each and every message and reloading that data at the beginning of handling any new message. This is of course, the root of the inefficiency of these configurations. This is also the reason why many "stateless" implementations actually do keep state to take advantage of a request coming back to the same server. Nevertheless, if the implementation requires a pure stateless decision service, this is easy to configure in RTD. The way to do it is: Mark every Integration Point to Close the session at the end of processing the message In the Session entity persist the session data on closing the session In the session entity check if a persisted version exists and load it An excellent solution for persisting the session data is Oracle Coherence, which provides a high performance, distributed cache that minimizes the performance impact of persisting and reloading the session. Alternatively, the session can be persisted to a local database. An interesting feature of the RTD stateless configuration is that it can cope with serializing concurrent requests for the same session. For example, if a web page produces two requests to the decision service, these requests could come concurrently to the decision services and be handled by different servers. Most stateless implementation would have the two requests step onto each other when saving the state, or fail one of the messages. When properly configured, RTD will make one message wait for the other before processing. A Word on Context Using the context of a customer interaction typically significantly increases lift. For example, offer success in a call center could double if the context of the call is taken into account. For this reason, it is important to utilize the contextual information in decision making. To make the contextual information available throughout a session it needs to be persisted. When there is a well defined owner for the information then there is no problem because in case of a session restart, the information can be easily retrieved. If there is no official owner of the information, then RTD can be configured to persist this information. Once again, RTD provides flexibility to ensure high performance when it is adequate to allow for some loss of state in the rare cases of server failure. For example, in a heavy use web site that serves 1000 pages per second the navigation history may be stored in the in memory session. In such sites it is typical that there is no OLTP that stores all the navigation events, therefore if an RTD server were to fail, it would be possible for the navigation to that point to be lost (note that a new session would be immediately established in one of the other servers). In most cases the loss of this navigation information would be acceptable as it would happen rarely. If it is desired to save this information, RTD would persist it every time the visitor navigates to a new page. Note that this practice is preferred whether RTD is configured in a stateless or stateful manner.

Read the article
SQL SERVER – Weekly Series – Memory Lane – #032

- by Pinal Dave

Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 Complete Series of Database Coding Standards and Guidelines SQL SERVER Database Coding Standards and Guidelines – Introduction SQL SERVER – Database Coding Standards and Guidelines – Part 1 SQL SERVER – Database Coding Standards and Guidelines – Part 2 SQL SERVER Database Coding Standards and Guidelines Complete List Download Explanation and Example – SELF JOIN When all of the data you require is contained within a single table, but data needed to extract is related to each other in the table itself. Examples of this type of data relate to Employee information, where the table may have both an Employee’s ID number for each record and also a field that displays the ID number of an Employee’s supervisor or manager. To retrieve the data tables are required to relate/join to itself. Insert Multiple Records Using One Insert Statement – Use of UNION ALL This is very interesting question I have received from new developer. How can I insert multiple values in table using only one insert? Now this is interesting question. When there are multiple records are to be inserted in the table following is the common way using T-SQL. Function to Display Current Week Date and Day – Weekly Calendar Straight blog post with script to find current week date and day based on the parameters passed in the function. 2008 In my beginning years, I have almost same confusion as many of the developer had in their earlier years. Here are two of the interesting question which I have attempted to answer in my early year. Even if you are experienced developer may be you will still like to read following two questions: Order Of Column In Index Order of Conditions in WHERE Clauses Example of DISTINCT in Aggregate Functions Have you ever used DISTINCT with the Aggregation Function? Here is a simple example about how users can do it. Create a Comma Delimited List Using SELECT Clause From Table Column Straight to script example where I explained how to do something easy and quickly. Compound Assignment Operators SQL SERVER 2008 has introduced new concept of Compound Assignment Operators. Compound Assignment Operators are available in many other programming languages for quite some time. Compound Assignment Operators is operator where variables are operated upon and assigned on the same line. PIVOT and UNPIVOT Table Examples Here is a very interesting question – the answer to the question can be YES or NO both. “If we PIVOT any table and UNPIVOT that table do we get our original table?” Read the blog post to get the explanation of the question above. 2009 What is Interim Table – Simple Definition of Interim Table The interim table is a table that is generated by joining two tables and not the final result table. In other words, when two tables are joined they create an interim table as resultset but the resultset is not final yet. It may be possible that more tables are about to join on the interim table, and more operations are still to be applied on that table (e.g. Order By, Having etc). Besides, it may be possible that there is no interim table; sometimes final table is what is generated when the query is run. 2010 Stored Procedure and Transactions If Stored Procedure is transactional then, it should roll back complete transactions when it encounters any errors. Well, that does not happen in this case, which proves that Stored Procedure does not only provide just the transactional feature to a batch of T-SQL. Generate Database Script for SQL Azure When talking about SQL Azure the most common complaint I hear is that the script generated from stand-along SQL Server database is not compatible with SQL Azure. This was true for some time for sure but not any more. If you have SQL Server 2008 R2 installed you can follow the guideline below to generate a script which is compatible with SQL Azure. Convert IN to EXISTS – Performance Talk It is NOT necessary that every time when IN is replaced by EXISTS it gives better performance. However, in our case listed above it does for sure give better performance. You can read about this subject in the associated blog post. Subquery or Join – Various Options – SQL Server Engine Knows the Best Every single time whenever there is a performance tuning exercise, I hear the conversation from developer where some prefer subquery and some prefer join. In this two part blog post, I explain the same in the detail with examples. Part 1 | Part 2 Merge Operations – Insert, Update, Delete in Single Execution MERGE is a new feature that provides an efficient way to do multiple DML operations. In earlier versions of SQL Server, we had to write separate statements to INSERT, UPDATE, or DELETE data based on certain conditions; however, at present, by using the MERGE statement, we can include the logic of such data changes in one statement that even checks when the data is matched and then just update it, and similarly, when the data is unmatched, it is inserted. 2011 Puzzle – Statistics are not updated but are Created Once Here is the quick scenario about my setup. Create Table Insert 1000 Records Check the Statistics Now insert 10 times more 10,000 indexes Check the Statistics – it will be NOT updated – WHY? Question to You – When to use Function and When to use Stored Procedure Personally, I believe that they are both different things - they cannot be compared. I can say, it will be like comparing apples and oranges. Each has its own unique use. However, they can be used interchangeably at many times and in real life (i.e., production environment). I have personally seen both of these being used interchangeably many times. This is the precise reason for asking this question. 2012 In year 2012 I had two interesting series ran on the blog. If there is no fun in learning, the learning becomes a burden. For the same reason, I had decided to build a three part quiz around SEQUENCE. The quiz was to identify the next value of the sequence. I encourage all of you to take part in this fun quiz. Guess the Next Value – Puzzle 1 Guess the Next Value – Puzzle 2 Guess the Next Value – Puzzle 3 Guess the Next Value – Puzzle 4 Simple Example to Configure Resource Governor – Introduction to Resource Governor Resource Governor is a feature which can manage SQL Server Workload and System Resource Consumption. We can limit the amount of CPU and memory consumption by limiting /governing /throttling on the SQL Server. If there are different workloads running on SQL Server and each of the workload needs different resources or when workloads are competing for resources with each other and affecting the performance of the whole server resource governor is a very important task. Tricks to Replace SELECT * with Column Names – SQL in Sixty Seconds #017 – Video Retrieves unnecessary columns and increases network traffic When a new columns are added views needs to be refreshed manually Leads to usage of sub-optimal execution plan Uses clustered index in most of the cases instead of using optimal index It is difficult to debug SQL SERVER – Load Generator – Free Tool From CodePlex The best part of this SQL Server Load Generator is that users can run multiple simultaneous queries again SQL Server using different login account and different application name. The interface of the tool is extremely easy to use and very intuitive as well. A Puzzle – Swap Value of Column Without Case Statement Let us assume there is a single column in the table called Gender. The challenge is to write a single update statement which will flip or swap the value in the column. For example if the value in the gender column is ‘male’ swap it with ‘female’ and if the value is ‘female’ swap it with ‘male’. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
SQL SERVER – Data Sources and Data Sets in Reporting Services SSRS

- by Pinal Dave

This example is from the Beginning SSRS by Kathi Kellenberger. Supporting files are available with a free download from the www.Joes2Pros.com web site. This example is from the Beginning SSRS. Supporting files are available with a free download from the www.Joes2Pros.com web site. Connecting to Your Data? When I was a child, the telephone book was an important part of my life. Maybe I was just a nerd, but I enjoyed getting a new book every year to page through to learn about the businesses in my small town or to discover where some of my school acquaintances lived. It was also the source of maps to my town’s neighborhoods and the towns that surrounded me. To make a phone call, I would need a telephone number. In order to find a telephone number, I had to know how to use the telephone book. That seems pretty simple, but it resembles connecting to any data. You have to know where the data is and how to interact with it. A data source is the connection information that the report uses to connect to the database. You have two choices when creating a data source, whether to embed it in the report or to make it a shared resource usable by many reports. Data Sources and Data Sets A few basic terms will make the upcoming choses make more sense. What database on what server do you want to connect to? It would be better to just ask… “what is your data source?” The connection you need to make to get your reports data is called a data source. If you connected to a data source (like the JProCo database) there may be hundreds of tables. You probably only want data from just a few tables. This means you want to write a specific query against this data source. A query on a data source to get just the records you need for an SSRS report is called a Data Set. Creating a local Data Source You can connect embed a connection from your report directly to your JProCo database which (let’s say) is installed on a server named Reno. If you move JProCo to a new server named Tampa then you need to update the Data Set. If you have 10 reports in one project that were all pointing to the JProCo database on the Reno server then they would all need to be updated at once. It’s possible to make a project level Data Source and have each report use that. This means one change can fix all 10 reports at once. This would be called a Shared Data Source. Creating a Shared Data Source The best advice I can give you is to create shared data sources. The reason I recommend this is that if a database moves to a new server you will have just one place in Report Manager to make the server name change. That one change will update the connection information in all the reports that use that data source. To get started, you will start with a fresh project. Go to Start > All Programs > SQL Server 2012 > Microsoft SQL Server Data Tools to launch SSDT. Once SSDT is running, click New Project to create a new project. Once the New Project dialog box appears, fill in the form, as shown in. Be sure to select Report Server Project this time – not the wizard. Click OK to dismiss the New Project dialog box. You should now have an empty project, as shown in the Solution Explorer. A report is meant to show you data. Where is the data? The first task is to create a Shared Data Source. Right-click on the Shared Data Sources folder and choose Add New Data Source. The Shared Data Source Properties dialog box will launch where you can fill in a name for the data source. By default, it is named DataSource1. The best practice is to give the data source a more meaningful name. It is possible that you will have projects with more than one data source and, by naming them, you can tell one from another. Type the name JProCo for the data source name and click the Edit button to configure the database connection properties. If you take a look at the types of data sources you can choose, you will see that SSRS works with many data platforms including Oracle, XML, and Teradata. Make sure SQL Server is selected before continuing. For this post, I am assuming that you are using a local SQL Server and that you can use your Windows account to log in to the SQL Server. If, for some reason you must use SQL Server Authentication, choose that option and fill in your SQL Server account credentials. Otherwise, just accept Windows Authentication. If your database server was installed locally and with the default instance, just type in Localhost for the Server name. Select the JProCo database from the database list. At this point, the connection properties should look like. If you have installed a named instance of SQL Server, you will have to specify the server name like this: Localhost\InstanceName, replacing the InstanceName with whatever your instance name is. If you are not sure about the named instance, launch the SQL Server Configuration Manager found at Start > All Programs > Microsoft SQL Server 2012 > Configuration Tools. If you have a named instance, the name will be shown in parentheses. A default instance of SQL Server will display MSSQLSERVER; a named instance will display the name chosen during installation. Once you get the connection properties filled in, click OK to dismiss the Connection Properties dialog box and OK again to dismiss the Shared Data Source properties. You now have a data source in the Solution Explorer. What’s next I really need to thank Kathi Kellenberger and Rick Morelan for sharing this material for this 5 day series of posts on SSRS. To get really comfortable with SSRS you will get to know the different SSDT windows, Build reports on your own (without the wizards), Add report headers and footers, Accept user input, create levels, charts, or even maps for visual appeal. You might be surprise to know a small 230 page book starts from the very beginning and covers the steps to do all these items. Beginning SSRS 2012 is a small easy to follow book so you can learn SSRS for less than $20. See Joes2Pros.com for more on this and other books. If you want to learn SSRS in easy to simple words – I strongly recommend you to get Beginning SSRS book from Joes 2 Pros. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Reporting Services, SSRS

Read the article
MySQL for Excel 1.1.3 has been released

- by Javier Treviño

The MySQL Windows Experience Team is proud to announce the release of MySQL for Excel version 1.1.3, the latest addition to the MySQL Installer for Windows. MySQL for Excel is an application plug-in enabling data analysts to very easily access and manipulate MySQL data within Microsoft Excel. It enables you to directly work with a MySQL database from within Microsoft Excel so you can easily do tasks such as: Importing MySQL Data into Excel Exporting Excel data directly into MySQL to a new or existing table Editing MySQL data directly within Excel MySQL for Excel is installed using the MySQL Installer for Windows. The MySQL installer comes in 2 versions Full (150 MB) which includes a complete set of MySQL products with their binaries included in the download Web (1.5 MB - a network install) which will just pull MySQL for Excel over the web and install it when run. You can download MySQL Installer from our official Downloads page at http://dev.mysql.com/downloads/installer/. MySQL for Excel 1.1.3 introduces the following features: Upon saving a Workbook containing Worksheets in Edit Mode, the user is asked if he wants to exit the Edit Mode on all Worksheets before their parent Workbook is saved so the Worksheets are saved unprotected, otherwise the Worksheets will remain protected and the users will be able to unprotect them later retrieving the passkeys from the application log after closing MySQL for Excel. Added background coloring to the column names header row of an Import Data operation to have the same look as the one in an Edit Data operation (i.e. gray-ish background). Connection passwords can be stored securely just like MySQL Workbench does and these secured passwords are shared with Workbench in the same way connections are. Changed the way the MySQL for Excel ribbon toggle button works, instead of just showing or hiding the add-in it actually opens and closes it. Added a connection test before any operation against the database (schema creation, data import, append, export or edition) so the operation dialog is not shown and a friendlier error message is shown. Also this release contains the following bug fixes: Added a check on every connection test for an expired password, if the password has been expired a dialog is now shown to the user to reset the password. Bug #17354118 - DON'T HANDLE EXPIRED PASSWORDS Added code to escape text values to be imported to an Excel worksheet that start with an equals sign so Excel does not treat those values as formulas that will fail evaluation. This is an option turned on by default that can be turned off by users if they wish to import values to be treated as Excel formulas. Bug #17354102 - ERROR IMPORTING TEXT VALUES TO EXCEL STARTING WITH AN EQUALS SIGN Added code to properly check the reason for a failing connection, if it's a failing password the user gets a dialog to retry the connection with a different password until the connection succeeds, a connection error not related to the password is thrown or the user cancels. If the failing connection is not related to a bad password an error message is shown to the users indicating the reason of the failure. Bug #16239007 - CONNECTIONS TO MYSQL SERVICES NOT RUNNING DISPLAY A WRONG PASSWORD ERROR MESSAGE Added global options dialog that can be accessed from the Schema Selection and DB Object Selection panels where the timeouts for the connection to the DB Server and for the query commands can be changed from their default values (15 seconds for the connection timeout and 30 seconds for the query timeout). MySQL Bug #68732, Bug #17191646 - QUERY TIMEOUT CANNOT BE ADJUSTED IN MYSQL FOR EXCEL Changed the Varchar(65,535) data type shown in the Export Data data type combo box to Text since the maximum row size is 65,535 bytes and any autodetected column data type with a length greater than 4,000 should be set to Text actually for the table to be created successfully. MySQL Bug #69779, Bug #17191633 - EXPORT FAILS FOR EXCEL FILES CONTAINING > 4000 CHARACTERS OF TEXT PER CELL Removed code that was replacing all spaces typed by the user in an overriden data type for a new column in an Export Data operation, also improved the data type detection code to flag as invalid data types with parenthesis but without any text inside or where the contents inside the parenthesis are not valid for the specific data type. Bug #17260260 - EXPORT DATA SET TYPE NOT WORKING WITH MEMBER VALUES CONTAINING SPACES Added support for the year data type with a length of 2 or 4 and a validation that valid values are integers between 1901-2155 (for 4-digit years) or between 0-99 (for 2-digit years). Bug #17259915 - EXPORT DATA YEAR DATA TYPE NOT RECOGNIZED IF DECLARED WITH A DISPLAY WIDTH) Fixed code for Export Data operations where users overrode the data type for columns typing Text in the data type combobox, which is a valid data type but was not recognized as such. Bug #17259490 - EXPORT DATA TEXT DATA TYPE NOT RECOGNIZED AS A VALID DATA TYPE Changed the location of the registry where the MySQL for Excel add-in is installed to HKEY_LOCAL_MACHINE instead of HKEY_CURRENT_USER so the add-in is accessible by all users and not only to the user that installed it. For this to work with Excel 2007 a hotfix may be required (see http://support.microsoft.com/kb/976477). MySQL Bug #68746, Bug #16675992 - EXCEL-ADD-IN IS ONLY INSTALLED FOR USER ACCOUNT THAT THE INSTALLATION RUNS UNDER Added support for Excel 2013 Single Document Interface, now that Excel 2013 creates 1 window per workbook also the Excel Add-In maintains an independent custom task pane in each window. MySQL Bug #68792, Bug #17272087 - MYSQL FOR EXCEL SIDEBAR DOES NOT APPEAR IN EXCEL 2013 (WITH WORKAROUND) Included the latest MySQL Utility with a code fix for the COM exception thrown when attempting to open Workbench in the Manage Connections window. Bug #17258966 - MYSQL WORKBENCH NOT OPENED BY CLICKING MANAGE CONNECTIONS HOTLABEL Fixed code for Append Data operations that was not applying a calculated automatic mapping correctly when the source and target tables had different number of columns, some columns with the same name but some of those lying on column indexes beyond the limit of the other source/target table. MySQL Bug #69220, Bug #17278349 - APPEND DOESN'T AUTOMATICALLY DETECT EXCEL COL HEADER WITH SAME NAME AS SQL FIELD Fixed some code for Edit Data operations that was escaping special characters twice (during edition in Excel and then upon sending the query to the MySQL server). MySQL Bug #68669, Bug #17271693 - A BACKSLASH IS INSERTED BEFORE AN APOSTROPHE EDITING TABLE WITH MYSQL FOR EXCEL Upgraded MySQL Utility with latest version that encapsulates dialog base classes and introduces more classes to handle Workbench connections, and removed these from the Excel project. Bug #16500331 - CAN'T DELETE CONNECTIONS CREATED WITHIN ADDIN You can access the MySQL for Excel documentation at http://dev.mysql.com/doc/refman/5.6/en/mysql-for-excel.html You can find our team’s blog at http://blogs.oracle.com/MySQLOnWindows. You can also post questions on our MySQL for Excel forum found at http://forums.mysql.com/. Enjoy and thanks for the support!

Read the article
C# Performance Pitfall – Interop Scenarios Change the Rules

- by Reed

C# and .NET, overall, really do have fantastic performance in my opinion. That being said, the performance characteristics dramatically differ from native programming, and take some relearning if you’re used to doing performance optimization in most other languages, especially C, C++, and similar. However, there are times when revisiting tricks learned in native code play a critical role in performance optimization in C#. I recently ran across a nasty scenario that illustrated to me how dangerous following any fixed rules for optimization can be… The rules in C# when optimizing code are very different than C or C++. Often, they’re exactly backwards. For example, in C and C++, lifting a variable out of loops in order to avoid memory allocations often can have huge advantages. If some function within a call graph is allocating memory dynamically, and that gets called in a loop, it can dramatically slow down a routine. This can be a tricky bottleneck to track down, even with a profiler. Looking at the memory allocation graph is usually the key for spotting this routine, as it’s often “hidden” deep in call graph. For example, while optimizing some of my scientific routines, I ran into a situation where I had a loop similar to: for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i]); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This loop was at a fairly high level in the call graph, and often could take many hours to complete, depending on the input data. As such, any performance optimization we could achieve would be greatly appreciated by our users. After a fair bit of profiling, I noticed that a couple of function calls down the call graph (inside of ProcessElement), there was some code that effectively was doing: // Allocate some data required DataStructure* data = new DataStructure(num); // Call into a subroutine that passed around and manipulated this data highly CallSubroutine(data); // Read and use some values from here double values = data->Foo; // Cleanup delete data; // ... return bar; Normally, if “DataStructure” was a simple data type, I could just allocate it on the stack. However, it’s constructor, internally, allocated it’s own memory using new, so this wouldn’t eliminate the problem. In this case, however, I could change the call signatures to allow the pointer to the data structure to be passed into ProcessElement and through the call graph, allowing the inner routine to reuse the same “data” memory instead of allocating. At the highest level, my code effectively changed to something like: DataStructure* data = new DataStructure(numberToProcess); for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i], data); } delete data; Granted, this dramatically reduced the maintainability of the code, so it wasn’t something I wanted to do unless there was a significant benefit. In this case, after profiling the new version, I found that it increased the overall performance dramatically – my main test case went from 35 minutes runtime down to 21 minutes. This was such a significant improvement, I felt it was worth the reduction in maintainability. In C and C++, it’s generally a good idea (for performance) to: Reduce the number of memory allocations as much as possible, Use fewer, larger memory allocations instead of many smaller ones, and Allocate as high up the call stack as possible, and reuse memory I’ve seen many people try to make similar optimizations in C# code. For good or bad, this is typically not a good idea. The garbage collector in .NET completely changes the rules here. In C#, reallocating memory in a loop is not always a bad idea. In this scenario, for example, I may have been much better off leaving the original code alone. The reason for this is the garbage collector. The GC in .NET is incredibly effective, and leaving the allocation deep inside the call stack has some huge advantages. First and foremost, it tends to make the code more maintainable – passing around object references tends to couple the methods together more than necessary, and overall increase the complexity of the code. This is something that should be avoided unless there is a significant reason. Second, (unlike C and C++) memory allocation of a single object in C# is normally cheap and fast. Finally, and most critically, there is a large advantage to having short lived objects. If you lift a variable out of the loop and reuse the memory, its much more likely that object will get promoted to Gen1 (or worse, Gen2). This can cause expensive compaction operations to be required, and also lead to (at least temporary) memory fragmentation as well as more costly collections later. As such, I’ve found that it’s often (though not always) faster to leave memory allocations where you’d naturally place them – deep inside of the call graph, inside of the loops. This causes the objects to stay very short lived, which in turn increases the efficiency of the garbage collector, and can dramatically improve the overall performance of the routine as a whole. In C#, I tend to: Keep variable declarations in the tightest scope possible Declare and allocate objects at usage While this tends to cause some of the same goals (reducing unnecessary allocations, etc), the goal here is a bit different – it’s about keeping the objects rooted for as little time as possible in order to (attempt) to keep them completely in Gen0, or worst case, Gen1. It also has the huge advantage of keeping the code very maintainable – objects are used and “released” as soon as possible, which keeps the code very clean. It does, however, often have the side effect of causing more allocations to occur, but keeping the objects rooted for a much shorter time. Now – nowhere here am I suggesting that these rules are hard, fast rules that are always true. That being said, my time spent optimizing over the years encourages me to naturally write code that follows the above guidelines, then profile and adjust as necessary. In my current project, however, I ran across one of those nasty little pitfalls that’s something to keep in mind – interop changes the rules. In this case, I was dealing with an API that, internally, used some COM objects. In this case, these COM objects were leading to native allocations (most likely C++) occurring in a loop deep in my call graph. Even though I was writing nice, clean managed code, the normal managed code rules for performance no longer apply. After profiling to find the bottleneck in my code, I realized that my inner loop, a innocuous looking block of C# code, was effectively causing a set of native memory allocations in every iteration. This required going back to a “native programming” mindset for optimization. Lifting these variables and reusing them took a 1:10 routine down to 0:20 – again, a very worthwhile improvement. Overall, the lessons here are: Always profile if you suspect a performance problem – don’t assume any rule is correct, or any code is efficient just because it looks like it should be Remember to check memory allocations when profiling, not just CPU cycles Interop scenarios often cause managed code to act very differently than “normal” managed code. Native code can be hidden very cleverly inside of managed wrappers

Read the article
tile_static, tile_barrier, and tiled matrix multiplication with C++ AMP

- by Daniel Moth

We ended the previous post with a mechanical transformation of the C++ AMP matrix multiplication example to the tiled model and in the process introduced tiled_index and tiled_grid. This is part 2. tile_static memory You all know that in regular CPU code, static variables have the same value regardless of which thread accesses the static variable. This is in contrast with non-static local variables, where each thread has its own copy. Back to C++ AMP, the same rules apply and each thread has its own value for local variables in your lambda, whereas all threads see the same global memory, which is the data they have access to via the array and array_view. In addition, on an accelerator like the GPU, there is a programmable cache, a third kind of memory type if you'd like to think of it that way (some call it shared memory, others call it scratchpad memory). Variables stored in that memory share the same value for every thread in the same tile. So, when you use the tiled model, you can have variables where each thread in the same tile sees the same value for that variable, that threads from other tiles do not. The new storage class for local variables introduced for this purpose is called tile_static. You can only use tile_static in restrict(direct3d) functions, and only when explicitly using the tiled model. What this looks like in code should be no surprise, but here is a snippet to confirm your mental image, using a good old regular C array // each tile of threads has its own copy of locA, // shared among the threads of the tile tile_static float locA[16][16]; Note that tile_static variables are scoped and have the lifetime of the tile, and they cannot have constructors or destructors. tile_barrier In amp.h one of the types introduced is tile_barrier. You cannot construct this object yourself (although if you had one, you could use a copy constructor to create another one). So how do you get one of these? You get it, from a tiled_index object. Beyond the 4 properties returning index objects, tiled_index has another property, barrier, that returns a tile_barrier object. The tile_barrier class exposes a single member, the method wait. 15: // Given a tiled_index object named t_idx 16: t_idx.barrier.wait(); 17: // more code …in the code above, all threads in the tile will reach line 16 before a single one progresses to line 17. Note that all threads must be able to reach the barrier, i.e. if you had branchy code in such a way which meant that there is a chance that not all threads could reach line 16, then the code above would be illegal. Tiled Matrix Multiplication Example – part 2 So now that we added to our understanding the concepts of tile_static and tile_barrier, let me obfuscate rewrite the matrix multiplication code so that it takes advantage of tiling. Before you start reading this, I suggest you get a cup of your favorite non-alcoholic beverage to enjoy while you try to fully understand the code. 01: void MatrixMultiplyTiled(vector<float>& vC, const vector<float>& vA, const vector<float>& vB, int M, int N, int W) 02: { 03: static const int TS = 16; 04: array_view<const float,2> a(M, W, vA); 05: array_view<const float,2> b(W, N, vB); 06: array_view<writeonly<float>,2> c(M,N,vC); 07: parallel_for_each(c.grid.tile< TS, TS >(), 08: [=] (tiled_index< TS, TS> t_idx) restrict(direct3d) 09: { 10: int row = t_idx.local[0]; int col = t_idx.local[1]; 11: float sum = 0.0f; 12: for (int i = 0; i < W; i += TS) { 13: tile_static float locA[TS][TS], locB[TS][TS]; 14: locA[row][col] = a(t_idx.global[0], col + i); 15: locB[row][col] = b(row + i, t_idx.global[1]); 16: t_idx.barrier.wait(); 17: for (int k = 0; k < TS; k++) 18: sum += locA[row][k] * locB[k][col]; 19: t_idx.barrier.wait(); 20: } 21: c[t_idx.global] = sum; 22: }); 23: } Notice that all the code up to line 9 is the same as per the changes we made in part 1 of tiling introduction. If you squint, the body of the lambda itself preserves the original algorithm on lines 10, 11, and 17, 18, and 21. The difference being that those lines use new indexing and the tile_static arrays; the tile_static arrays are declared and initialized on the brand new lines 13-15. On those lines we copy from the global memory represented by the array_view objects (a and b), to the tile_static vanilla arrays (locA and locB) – we are copying enough to fit a tile. Because in the code that follows on line 18 we expect the data for this tile to be in the tile_static storage, we need to synchronize the threads within each tile with a barrier, which we do on line 16 (to avoid accessing uninitialized memory on line 18). We also need to synchronize the threads within a tile on line 19, again to avoid the race between lines 14, 15 (retrieving the next set of data for each tile and overwriting the previous set) and line 18 (not being done processing the previous set of data). Luckily, as part of the awesome C++ AMP debugger in Visual Studio there is an option that helps you find such races, but that is a story for another blog post another time. May I suggest reading the next section, and then coming back to re-read and walk through this code with pen and paper to really grok what is going on, if you haven't already? Cool. Why would I introduce this tiling complexity into my code? Funny you should ask that, I was just about to tell you. There is only one reason we tiled our extent, had to deal with finding a good tile size, ensure the number of threads we schedule are correctly divisible with the tile size, had to use a tiled_index instead of a normal index, and had to understand tile_barrier and to figure out where we need to use it, and double the size of our lambda in terms of lines of code: the reason is to be able to use tile_static memory. Why do we want to use tile_static memory? Because accessing tile_static memory is around 10 times faster than accessing the global memory on an accelerator like the GPU, e.g. in the code above, if you can get 150GB/second accessing data from the array_view a, you can get 1500GB/second accessing the tile_static array locA. And since by definition you are dealing with really large data sets, the savings really pay off. We have seen tiled implementations being twice as fast as their non-tiled counterparts. Now, some algorithms will not have performance benefits from tiling (and in fact may deteriorate), e.g. algorithms that require you to go only once to global memory will not benefit from tiling, since with tiling you already have to fetch the data once from global memory! Other algorithms may benefit, but you may decide that you are happy with your code being 150 times faster than the serial-version you had, and you do not need to invest to make it 250 times faster. Also algorithms with more than 3 dimensions, which C++ AMP supports in the non-tiled model, cannot be tiled. Also note that in future releases, we may invest in making the non-tiled model, which already uses tiling under the covers, go the extra step and use tile_static memory on your behalf, but it is obviously way to early to commit to anything like that, and we certainly don't do any of that today. Comments about this post by Daniel Moth welcome at the original blog.

Read the article
SQL University: What and why of database testing

- by Mladen Prajdic

This is a post for a great idea called SQL University started by Jorge Segarra also famously known as SqlChicken on Twitter. It’s a collection of blog posts on different database related topics contributed by several smart people all over the world. So this week is mine and we’ll be talking about database testing and refactoring. In 3 posts we’ll cover: SQLU part 1 - What and why of database testing SQLU part 2 - What and why of database refactoring SQLU part 2 – Tools of the trade With that out of the way let us sharpen our pencils and get going. Why test a database The sad state of the industry today is that there is very little emphasis on testing in general. Test driven development is still a small niche of the programming world while refactoring is even smaller. The cause of this is the inability of developers to convince themselves and their managers that writing tests is beneficial. At the moment they are mostly viewed as waste of time. This is because the average person (let’s not fool ourselves, we’re all average) is unable to think about lower future costs in relation to little more current work. It’s orders of magnitude easier to know about the current costs in relation to current amount of work. That’s why programmers convince themselves testing is a waste of time. However we have to ask ourselves what tests are really about? Maybe finding bugs? No, not really. If we introduce bugs, we’re likely to write test around those bugs too. But yes we can find some bugs with tests. The main point of tests is to have reproducible repeatability in our systems. By having a code base largely covered by tests we can know with better certainty what a small code change can break in other parts of the system. By having repeatability we can make code changes with confidence, since we know we’ll see what breaks in other tests. And here comes the inability to estimate future costs. By spending just a few more hours writing those tests we’d know instantly what broke where. Imagine we fix a reported bug. We check-in the code, deploy it and the users are happy. Until we get a call 2 weeks later about a certain monthly process has stopped working. What we don’t know is that this process was developed by a long gone coworker and for some reason it relied on that same bug we’ve happily fixed. There’s no way we could’ve known that. We say OK and go in and fix the monthly process. But what we have no clue about is that there’s this ETL job that relied on data from that monthly process. Now that we’ve fixed the process it’s giving unexpected (yet correct since we fixed it) data to the ETL job. So we have to fix that too. But there’s this part of the app we coded that relies on data from that exact ETL job. And just like that we enter the “Loop of maintenance horror”. With the loop eventually comes blame. Here’s a nice tip for all developers and DBAs out there: If you make a mistake man up and admit to it. All of the above is valid for any kind of software development. Keeping this in mind the database is nothing other than just a part of the application. But a big part! One reason why testing a database is even more important than testing an application is that one database is usually accessed from multiple applications and processes. This makes it the central and vital part of the enterprise software infrastructure. Knowing all this can we really afford not to have tests? What to test in a database Now that we’ve decided we’ll dive into this testing thing we have to ask ourselves what needs to be tested? The short answer is: everything. The long answer is: read on! There are 2 main ways of doing tests: Black box and White box testing. Black box testing means we have no idea how the system internals are built and we only have access to it’s inputs and outputs. With it we test that the internal changes to the system haven’t caused the input/output behavior of the system to change. The most important thing to test here are the edge conditions. It’s where most programs break. Having good edge condition tests we can be more confident that the systems changes won’t break. White box testing has the full knowledge of the system internals. With it we test the internal system changes, different states of the application, etc… White and Black box tests should be complementary to each other as they are very much interconnected. Testing database routines includes testing stored procedures, views, user defined functions and anything you use to access the data with. Database routines are your input/output interface to the database system. They count as black box testing. We test then for 2 things: Data and schema. When testing schema we only care about the columns and the data types they’re returning. After all the schema is the contract to the out side systems. If it changes we usually have to change the applications accessing it. One helpful T-SQL command when doing schema tests is SET FMTONLY ON. It tells the SQL Server to return only empty results sets. This speeds up tests because it doesn’t return any data to the client. After we’ve validated the schema we have to test the returned data. There no other way to do this but to have expected data known before the tests executes and comparing that data to the database routine output. Testing Authentication and Authorization helps us validate who has access to the SQL Server box (Authentication) and who has access to certain database objects (Authorization). For desktop applications and windows authentication this works well. But the biggest problem here are web apps. They usually connect to the database as a single user. Please ensure that that user is not SA or an account with admin privileges. That is just bad. Load testing ensures us that our database can handle peak loads. One often overlooked tool for load testing is Microsoft’s OSTRESS tool. It’s part of RML utilities (x86, x64) for SQL Server and can help determine if our database server can handle loads like 100 simultaneous users each doing 10 requests per second. SQL Profiler can also help us here by looking at why certain queries are slow and what to do to fix them. One particular problem to think about is how to begin testing existing databases. First thing we have to do is to get to know those databases. We can’t test something when we don’t know how it works. To do this we have to talk to the users of the applications accessing the database, run SQL Profiler to see what queries are being run, use existing documentation to decipher all the object relationships, etc… The way to approach this is to choose one part of the database (say a logical grouping of tables that go together) and filter our traces accordingly. Once we’ve done that we move on to the next grouping and so on until we’ve covered the whole database. Then we move on to the next one. Database Testing is a topic that we can spent many hours discussing but let this be a nice intro to the world of database testing. See you in the next post.

Read the article
Oracle Tutor: Top 10 to Implement Sustainable Policies and Procedures

- by emily.chorba(at)oracle.com

Overview Your organization (executives, managers, and employees) understands the value of having written business process documents (process maps, procedures, instructions, reference documents, and form abstracts). Policies and procedures should be documented because they help to reduce the range of individual decisions and encourage management by exception: the manager only needs to give special attention to unusual problems, not covered by a specific policy or procedure. As more and more procedures are written to cover recurring situations, managers will begin to make decisions which will be consistent from one functional area to the next.Companies should take a project management approach when implementing an environment for a sustainable documentation program and do the following:1. Identify an Executive Champion2. Put together a winning team3. Assign ownership4. Centralize publishing5. Establish the Document Maintenance Process Up Front6. Document critical activities only7. Document actual practice8. Minimize documentation9. Support continuous improvement10. Keep it simple 1. Identify an Executive ChampionAppoint a top down driver. Select one key individual to be a mentor for the procedure planning team. The individual should be a senior manager, such as your company president, CIO, CFO, the vice-president of quality, manufacturing, or engineering. Written policies and procedures can be important supportive aids when known to express the thinking for the chief executive officer and / or the president and to have his or her full support. 2. Put Together a Winning TeamChoose a strong Project Management Leader and staff the procedure planning team with management members from cross functional groups. Make sure team members have the responsibility - and the authority - to make things happen.The winning team should consist of the Documentation Project Manager, Document Owners (one for each functional area), a Document Controller, and Document Specialists (as needed). The Tutor Implementation Guide has complete job descriptions for these roles. 3. Assign Ownership It is virtually impossible to keep process documentation simple and meaningful if employees who are far removed from the activity itself create it. It is impossible to keep documentation up-to-date when responsibility for the document is not clearly understood.Key to the Tutor methodology, therefore, is the concept of ownership. Each document has a single owner, who is responsible for ensuring that the document is necessary and that it reflects actual practice. The owner must be a person who is knowledgeable about the activity and who has the authority to build consensus among the persons who participate in the activity as well as the authority to define or change the way an activity is performed. The owner must be an advocate of the performers and negotiate, not dictate practices.In the Tutor environment, a document's owner is the only person with the authority to approve an update to that document. 4. Centralize Publishing Although it is tempting (especially in a networked environment and with document management software solutions) to decentralize the control of all documents -- with each owner updating and distributing his own -- Tutor promotes centralized publishing by assigning the Document Administrator (gate keeper) to manage the updates and distribution of the procedures library. 5. Establish a Document Maintenance Process Up Front (and stick to it) Everyone in your organization should know they are invited to suggest changes to procedures and should understand exactly what steps to take to do so. Tutor provides a set of procedures to help your company set up a healthy document control system. There are many document management products available to automate some of the document change and maintenance steps. Depending on the size of your organization, a simple document management system can reduce the effort it takes to track and distribute document changes and updates. Whether your company decides to store the written policies and procedures on a file server or in a database, the essential tasks for maintaining documents are the same, though some tasks are automated. 6. Document Critical Activities Only The best way to keep your documentation simple is to reduce the number of process documents to a bare minimum and to include in those documents only as much detail as is absolutely necessary. The first step to reducing process documentation is to document only those activities that are deemed critical. Not all activities require documentation. In fact, some critical activities cannot and should not be standardized. Others may be sufficiently documented with an instruction or a checklist and may not require a procedure. A document should only be created when it enhances the performance of the employee performing the activity. If it does not help the employee, then there is no reason to maintain the document. Activities that represent little risk (such as project status), activities that cannot be defined in terms of specific tasks (such as product research), and activities that can be performed in a variety of ways (such as advertising) often do not require documentation. Sometimes, an activity will evolve to the point where documentation is necessary. For example, an activity performed by single employee may be straightforward and uncomplicated -- that is, until the activity is performed by multiple employees. Sometimes, it is the interaction between co-workers that necessitates documentation; sometimes, it is the complexity or the diversity of the activity.7. Document Actual Practices The only reason to maintain process documentation is to enhance the performance of the employee performing the activity. And documentation can only enhance performance if it reflects reality -- that is, current best practice. Documentation that reflects an unattainable ideal or outdated practices will end up on the shelf, unused and forgotten.Documenting actual practice means (1) auditing the activity to understand how the work is really performed, (2) identifying best practices with employees who are involved in the activity, (3) building consensus so that everyone agrees on a common method, and (4) recording that consensus.8. Minimize Documentation One way to keep it simple is to document at the highest level possible. That is, include in your documents only as much detail as is absolutely necessary.When writing a document, you should ask yourself, What is the purpose of this document? That is, what problem will it solve?By focusing on this question, you can target the critical information.• What questions are the end users likely to have?• What level of detail is required?• Is any of this information extraneous to the document's purpose? Short, concise documents are user friendly and they are easier to keep up to date. 9. Support Continuous Improvement Employees who perform an activity are often in the best position to identify improvements to the process. In other words, continuous improvement is a natural byproduct of the work itself -- but only if the improvements are communicated to all employees who are involved in the process, and only if there is consensus among those employees.Traditionally, process documentation has been used to dictate performance, to limit employees' actions. In the Tutor environment, process documents are used to communicate improvements identified by employees. How does this work? The Tutor methodology requires a process document to reflect actual practice, so the owner of a document must routinely audit its content -- does the document match what the employees are doing? If it doesn't, the owner has the responsibility to evaluate the process, to build consensus among the employees, to identify "best practices," and to communicate these improvements via a document update. Continuous improvement can also be an outgrowth of corrective action -- but only if the solutions to problems are communicated effectively. The goal should be to solve a problem once and only once, which means not only identifying the solution, but ensuring that the solution becomes part of the process. The Tutor system provides the method through which improvements and solutions are documented and communicated to all affected employees in a cost-effective, timely manner; it ensures that improvements are not lost or confined to a single employee. 10. Keep it Simple Process documents don't have to be complex and unfriendly. In fact, the simpler the format and organization, the more likely the documents will be used. And the simpler the method of maintenance, the more likely the documents will be kept up-to-date. Keep it simply by:• Minimizing skills and training required• Following the established Tutor document format and layout• Avoiding technology just for technology's sake No other rule has as major an impact on the success of your internal documentation as -- keep it simple. Learn More For more information about Tutor, visit Oracle.Com or the Tutor Blog. Post your questions at the Tutor Forum. Emily Chorba Principle Product Manager Oracle Tutor & BPM

Read the article
SQL University: What and why of database refactoring

- by Mladen Prajdic

This is a post for a great idea called SQL University started by Jorge Segarra also famously known as SqlChicken on Twitter. It’s a collection of blog posts on different database related topics contributed by several smart people all over the world. So this week is mine and we’ll be talking about database testing and refactoring. In 3 posts we’ll cover: SQLU part 1 - What and why of database testing SQLU part 2 - What and why of database refactoring SQLU part 3 - Tools of the trade This is a second part of the series and in it we’ll take a look at what database refactoring is and why do it. Why refactor a database To know why refactor we first have to know what refactoring actually is. Code refactoring is a process where we change module internals in a way that does not change that module’s input/output behavior. For successful refactoring there is one crucial thing we absolutely must have: Tests. Automated unit tests are the only guarantee we have that we haven’t broken the input/output behavior before refactoring. If you haven’t go back ad read my post on the matter. Then start writing them. Next thing you need is a code module. Those are views, UDFs and stored procedures. By having direct table access we can kiss fast and sweet refactoring good bye. One more point to have a database abstraction layer. And no, ORM’s don’t fall into that category. But also know that refactoring is NOT adding new functionality to your code. Many have fallen into this trap. Don’t be one of them and resist the lure of the dark side. And it’s a strong lure. We developers in general love to add new stuff to our code, but hate fixing our own mistakes or changing existing code for no apparent reason. To be a good refactorer one needs discipline and focus. Now we know that refactoring is all about changing inner workings of existing code. This can be due to performance optimizations, changing internal code workflows or some other reason. This is a typical black box scenario to the outside world. If we upgrade the car engine it still has to drive on the road (preferably faster) and not fly (no matter how cool that would be). Also be aware that white box tests will break when we refactor. What to refactor in a database Refactoring databases doesn’t happen that often but when it does it can include a lot of stuff. Let us look at a few common cases. Adding or removing database schema objects Adding, removing or changing table columns in any way, adding constraints, keys, etc… All of these can be counted as internal changes not visible to the data consumer. But each of these carries a potential input/output behavior change. Dropping a column can result in views not working anymore or stored procedure logic crashing. Adding a unique constraint shows duplicated data that shouldn’t exist. Foreign keys break a truncate table command executed from an application that runs once a month. All these scenarios are very real and can happen. With the proper database abstraction layer fully covered with black box tests we can make sure something like that does not happen (hopefully at all). Changing physical structures Physical structures include heaps, indexes and partitions. We can pretty much add or remove those without changing the data returned by the database. But the performance can be affected. So here we use our performance tests. We do have them, right? Just by adding a single index we can achieve orders of magnitude performance improvement. Won’t that make users happy? But what if that index causes our write operations to crawl to a stop. again we have to test this. There are a lot of things to think about and have tests for. Without tests we can’t do successful refactoring! Fixing bad code We all have some bad code in our systems. We usually refer to that code as code smell as they violate good coding practices. Examples of such code smells are SQL injection, use of SELECT *, scalar UDFs or cursors, etc… Each of those is huge code smell and can result in major code changes. Take SELECT * from example. If we remove a column from a table the client using that SELECT * statement won’t have a clue about that until it runs. Then it will gracefully crash and burn. Not to mention the widely unknown SELECT * view refresh problem that Tomas LaRock (@SQLRockstar on Twitter) and Colin Stasiuk (@BenchmarkIT on Twitter) talk about in detail. Go read about it, it’s informative. Refactoring this includes replacing the * with column names and most likely change to application using the database. Breaking apart huge stored procedures Have you ever seen seen a stored procedure that was 2000 lines long? I have. It’s not pretty. It hurts the eyes and sucks the will to live the next 10 minutes. They are a maintenance nightmare and turn into things no one dares to touch. I’m willing to bet that 100% of time they don’t have a single test on them. Large stored procedures (and functions) are a clear sign that they contain business logic. General opinion on good database coding practices says that business logic has no business in the database. That’s the applications part. Refactoring such behemoths requires writing lots of edge case tests for the stored procedure input/output behavior and then start to refactor it. First we split the logic inside into smaller parts like new stored procedures and UDFs. Those then get called from the master stored procedure. Once we’ve successfully modularized the database code it’s best to transfer that logic into the applications consuming it. This only leaves the stored procedure with common data manipulation logic. Of course this isn’t always possible so having a plethora of performance and behavior unit tests is absolutely necessary to confirm we’ve actually improved the codebase in some way. Refactoring is not a popular chore amongst developers or managers. The former don’t like fixing old code, the latter can’t see the financial benefit. Remember how we talked about being lousy at estimating future costs in the previous post? But there comes a time when it must be done. Hopefully I’ve given you some ideas how to get started. In the last post of the series we’ll take a look at the tools to use and an example of testing and refactoring.

Read the article

Search Results

Search found 13749 results on 550 pages for 'reason'.

Page 162/550 | < Previous Page | 158 159 160 161 162 163 164 165 166 167 168 169 | Next Page >

- by Cédric Menzi

- by Rick Strahl

- by zowens

- by imran_ku07

- by Jon Galloway

- by smisner

- by Ralf Westphal

- by Luis Alvarado

- by Nuri Halperin

- by BuckWoody

- by imran_ku07

- by Ricardo Peres

- by Wes McClure

- by Paul White

- by Geordie

- by ccasares

- by [email protected]

- by Pinal Dave

- by Pinal Dave

- by Javier Treviño

- by Reed

- by Daniel Moth

- by Mladen Prajdic

- by emily.chorba(at)oracle.com

- by Mladen Prajdic

< Previous Page | 158 159 160 161 162 163 164 165 166 167 168 169 | Next Page >