Search Results

Search found 11897 results on 476 pages for 'dean rather'.

Page 148/476 | < Previous Page | 144 145 146 147 148 149 150 151 152 153 154 155 | Next Page >

Using HTML 5 SessionState to save rendered Page Content

- by Rick Strahl

HTML 5 SessionState and LocalStorage are very useful and super easy to use to manage client side state. For building rich client side or SPA style applications it's a vital feature to be able to cache user data as well as HTML content in order to swap pages in and out of the browser's DOM. What might not be so obvious is that you can also use the sessionState and localStorage objects even in classic server rendered HTML applications to provide caching features between pages. These APIs have been around for a long time and are supported by most relatively modern browsers and even all the way back to IE8, so you can use them safely in your Web applications. SessionState and LocalStorage are easy The APIs that make up sessionState and localStorage are very simple. Both object feature the same API interface which is a simple, string based key value store that has getItem, setItem, removeitem, clear and key methods. The objects are also pseudo array objects and so can be iterated like an array with a length property and you have array indexers to set and get values with. Basic usage for storing and retrieval looks like this (using sessionStorage, but the syntax is the same for localStorage - just switch the objects):// set var lastAccess = new Date().getTime(); if (sessionStorage) sessionStorage.setItem("myapp_time", lastAccess.toString()); // retrieve in another page or on a refresh var time = null; if (sessionStorage) time = sessionStorage.getItem("myapp_time"); if (time) time = new Date(time * 1); else time = new Date(); sessionState stores data that is browser session specific and that has a liftetime of the active browser session or window. Shut down the browser or tab and the storage goes away. localStorage uses the same API interface, but the lifetime of the data is permanently stored in the browsers storage area until deleted via code or by clearing out browser cookies (not the cache). Both sessionStorage and localStorage space is limited. The spec is ambiguous about this - supposedly sessionStorage should allow for unlimited size, but it appears that most WebKit browsers support only 2.5mb for either object. This means you have to be careful what you store especially since other applications might be running on the same domain and also use the storage mechanisms. That said 2.5mb worth of character data is quite a bit and would go a long way. The easiest way to get a feel for how sessionState and localStorage work is to look at a simple example. You can go check out the following example online in Plunker: http://plnkr.co/edit/0ICotzkoPjHaWa70GlRZ?p=preview which looks like this: Plunker is an online HTML/JavaScript editor that lets you write and run Javascript code and similar to JsFiddle, but a bit cleaner to work in IMHO (thanks to John Papa for turning me on to it). The sample has two text boxes with counts that update session/local storage every time you click the related button. The counts are 'cached' in Session and Local storage. The point of these examples is that both counters survive full page reloads, and the LocalStorage counter survives a complete browser shutdown and restart. Go ahead and try it out by clicking the Reload button after updating both counters and then shutting down the browser completely and going back to the same URL (with the same browser). What you should see is that reloads leave both counters intact at the counted values, while a browser restart will leave only the local storage counter intact. The code to deal with the SessionStorage (and LocalStorage not shown here) in the example is isolated into a couple of wrapper methods to simplify the code: function getSessionCount() { var count = 0; if (sessionStorage) { var count = sessionStorage.getItem("ss_count"); count = !count ? 0 : count * 1; } $("#txtSession").val(count); return count; } function setSessionCount(count) { if (sessionStorage) sessionStorage.setItem("ss_count", count.toString()); } These two functions essentially load and store a session counter value. The two key methods used here are: sessionStorage.getItem(key); sessionStorage.setItem(key,stringVal); Note that the value given to setItem and return by getItem has to be a string. If you pass another type you get an error. Don't let that limit you though - you can easily enough store JSON data in a variable so it's quite possible to pass complex objects and store them into a single sessionStorage value:var user = { name: "Rick", id="ricks", level=8 } sessionStorage.setItem("app_user",JSON.stringify(user)); to retrieve it:var user = sessionStorage.getItem("app_user"); if (user) user = JSON.parse(user); Simple! If you're using the Chrome Developer Tools (F12) you can also check out the session and local storage state on the Resource tab: You can also use this tool to refresh or remove entries from storage. What we just looked at is a purely client side implementation where a couple of counters are stored. For rich client centric AJAX applications sessionStorage and localStorage provide a very nice and simple API to store application state while the application is running. But you can also use these storage mechanisms to manage server centric HTML applications when you combine server rendering with some JavaScript to perform client side data caching. You can both store some state information and data on the client (ie. store a JSON object and carry it forth between server rendered HTML requests) or you can use it for good old HTTP based caching where some rendered HTML is saved and then restored later. Let's look at the latter with a real life example. Why do I need Client-side Page Caching for Server Rendered HTML? I don't know about you, but in a lot of my existing server driven applications I have lists that display a fair amount of data. Typically these lists contain links to then drill down into more specific data either for viewing or editing. You can then click on a link and go off to a detail page that provides more concise content. So far so good. But now you're done with the detail page and need to get back to the list, so you click on a 'bread crumbs trail' or an application level 'back to list' button and… …you end up back at the top of the list - the scroll position, the current selection in some cases even filters conditions - all gone with the wind. You've left behind the state of the list and are starting from scratch in your browsing of the list from the top. Not cool! Sound familiar? This a pretty common scenario with server rendered HTML content where it's so common to display lists to drill into, only to lose state in the process of returning back to the original list. Look at just about any traditional forums application, or even StackOverFlow to see what I mean here. Scroll down a bit to look at a post or entry, drill in then use the bread crumbs or tab to go back… In some cases returning to the top of a list is not a big deal. On StackOverFlow that sort of works because content is turning around so quickly you probably want to actually look at the top posts. Not always though - if you're browsing through a list of search topics you're interested in and drill in there's no way back to that position. Essentially anytime you're actively browsing the items in the list, that's when state becomes important and if it's not handled the user experience can be really disrupting. Content Caching If you're building client centric SPA style applications this is a fairly easy to solve problem - you tend to render the list once and then update the page content to overlay the detail content, only hiding the list temporarily until it's used again later. It's relatively easy to accomplish this simply by hiding content on the page and later making it visible again. But if you use server rendered content, hanging on to all the detail like filters, selections and scroll position is not quite as easy. Or is it??? This is where sessionStorage comes in handy. What if we just save the rendered content of a previous page, and then restore it when we return to this page based on a special flag that tells us to use the cached version? Let's see how we can do this. A real World Use Case Recently my local ISP asked me to help out with updating an ancient classifieds application. They had a very busy, local classifieds app that was originally an ASP classic application. The old app was - wait for it: frames based - and even though I lobbied against it, the decision was made to keep the frames based layout to allow rapid browsing of the hundreds of posts that are made on a daily basis. The primary reason they wanted this was precisely for the ability to quickly browse content item by item. While I personally hate working with Frames, I have to admit that the UI actually works well with the frames layout as long as you're running on a large desktop screen. You can check out the frames based desktop site here: http://classifieds.gorge.net/ However when I rebuilt the app I also added a secondary view that doesn't use frames. The main reason for this of course was for mobile displays which work horribly with frames. So there's a somewhat mobile friendly interface to the interface, which ditches the frames and uses some responsive design tweaking for mobile capable operation: http://classifeds.gorge.net/mobile (or browse the base url with your browser width under 800px) Here's what the mobile, non-frames view looks like: As you can see this means that the list of classifieds posts now is a list and there's a separate page for drilling down into the item. And of course… originally we ran into that usability issue I mentioned earlier where the browse, view detail, go back to the list cycle resulted in lost list state. Originally in mobile mode you scrolled through the list, found an item to look at and drilled in to display the item detail. Then you clicked back to the list and BAM - you've lost your place. Because there are so many items added on a daily basis the full list is never fully loaded, but rather there's a "Load Additional Listings" entry at the button. Not only did we originally lose our place when coming back to the list, but any 'additionally loaded' items are no longer there because the list was now rendering as if it was the first page hit. The additional listings, and any filters, the selection of an item all were lost. Major Suckage! Using Client SessionStorage to cache Server Rendered Content To work around this problem I decided to cache the rendered page content from the list in SessionStorage. Anytime the list renders or is updated with Load Additional Listings, the page HTML is cached and stored in Session Storage. Any back links from the detail page or the login or write entry forms then point back to the list page with a back=true query string parameter. If the server side sees this parameter it doesn't render the part of the page that is cached. Instead the client side code retrieves the data from the sessionState cache and simply inserts it into the page. It sounds pretty simple, and the overall the process is really easy, but there are a few gotchas that I'll discuss in a minute. But first let's look at the implementation. Let's start with the server side here because that'll give a quick idea of the doc structure. As I mentioned the server renders data from an ASP.NET MVC view. On the list page when returning to the list page from the display page (or a host of other pages) looks like this: https://classifieds.gorge.net/list?back=True The query string value is a flag, that indicates whether the server should render the HTML. Here's what the top level MVC Razor view for the list page looks like:@model MessageListViewModel @{ ViewBag.Title = "Classified Listing"; bool isBack = !string.IsNullOrEmpty(Request.QueryString["back"]); } <form method="post" action="@Url.Action("list")"> <div id="SizingContainer"> @if (!isBack) { @Html.Partial("List_CommandBar_Partial", Model) <div id="PostItemContainer" class="scrollbox" xstyle="-webkit-overflow-scrolling: touch;"> @Html.Partial("List_Items_Partial", Model) @if (Model.RequireLoadEntry) { <div class="postitem loadpostitems" style="padding: 15px;"> <div id="LoadProgress" class="smallprogressright"></div> <div class="control-progress"> Load additional listings... </div> </div> } </div> } </div> </form> As you can see the query string triggers a conditional block that if set is simply not rendered. The content inside of #SizingContainer basically holds the entire page's HTML sans the headers and scripts, but including the filter options and menu at the top. In this case this makes good sense - in other situations the fact that the menu or filter options might be dynamically updated might make you only cache the list rather than essentially the entire page. In this particular instance all of the content works and produces the proper result as both the list along with any filter conditions in the form inputs are restored. Ok, let's move on to the client. On the client there are two page level functions that deal with saving and restoring state. Like the counter example I showed earlier, I like to wrap the logic to save and restore values from sessionState into a separate function because they are almost always used in several places.page.saveData = function(id) { if (!sessionStorage) return; var data = { id: id, scroll: $("#PostItemContainer").scrollTop(), html: $("#SizingContainer").html() }; sessionStorage.setItem("list_html",JSON.stringify(data)); }; page.restoreData = function() { if (!sessionStorage) return; var data = sessionStorage.getItem("list_html"); if (!data) return null; return JSON.parse(data); }; The data that is saved is an object which contains an ID which is the selected element when the user clicks and a scroll position. These two values are used to reset the scroll position when the data is used from the cache. Finally the html from the #SizingContainer element is stored, which makes for the bulk of the document's HTML. In this application the HTML captured could be a substantial bit of data. If you recall, I mentioned that the server side code renders a small chunk of data initially and then gets more data if the user reads through the first 50 or so items. The rest of the items retrieved can be rather sizable. Other than the JSON deserialization that's Ok. Since I'm using SessionStorage the storage space has no immediate limits. Next is the core logic to handle saving and restoring the page state. At first though this would seem pretty simple, and in some cases it might be, but as the following code demonstrates there are a few gotchas to watch out for. Here's the relevant code I use to save and restore:$( function() { … var isBack = getUrlEncodedKey("back", location.href); if (isBack) { // remove the back key from URL setUrlEncodedKey("back", "", location.href); var data = page.restoreData(); // restore from sessionState if (!data) { // no data - force redisplay of the server side default list window.location = "list"; return; } $("#SizingContainer").html(data.html); var el = $(".postitem[data-id=" + data.id + "]"); $(".postitem").removeClass("highlight"); el.addClass("highlight"); $("#PostItemContainer").scrollTop(data.scroll); setTimeout(function() { el.removeClass("highlight"); }, 2500); } else if (window.noFrames) page.saveData(null); // save when page loads $("#SizingContainer").on("click", ".postitem", function() { var id = $(this).attr("data-id"); if (!id) return true; if (window.noFrames) page.saveData(id); var contentFrame = window.parent.frames["Content"]; if (contentFrame) contentFrame.location.href = "show/" + id; else window.location.href = "show/" + id; return false; }); … The code starts out by checking for the back query string flag which triggers restoring from the client cache. If cached the cached data structure is read from sessionStorage. It's important here to check if data was returned. If the user had back=true on the querystring but there is no cached data, he likely bookmarked this page or otherwise shut down the browser and came back to this URL. In that case the server didn't render any detail and we have no cached data, so all we can do is redirect to the original default list view using window.location. If we continued the page would render no data - so make sure to always check the cache retrieval result. Always! If there is data the it's loaded and the data.html data is restored back into the document by simply injecting the HTML back into the document's #SizingContainer element:$("#SizingContainer").html(data.html); It's that simple and it's quite quick even with a fully loaded list of additional items and on a phone. The actual HTML data is stored to the cache on every page load initially and then again when the user clicks on an element to navigate to a particular listing. The former ensures that the client cache always has something in it, and the latter updates with additional information for the selected element. For the click handling I use a data-id attribute on the list item (.postitem) in the list and retrieve the id from that. That id is then used to navigate to the actual entry as well as storing that Id value in the saved cached data. The id is used to reset the selection by searching for the data-id value in the restored elements. The overall process of this save/restore process is pretty straight forward and it doesn't require a bunch of code, yet it yields a huge improvement in the usability of the site on mobile devices (or anybody who uses the non-frames view). Some things to watch out for As easy as it conceptually seems to simply store and retrieve cached content, you have to be quite aware what type of content you are caching. The code above is all that's specific to cache/restore cycle and it works, but it took a few tweaks to the rest of the script code and server code to make it all work. There were a few gotchas that weren't immediately obvious. Here are a few things to pay attention to: Event Handling Logic Timing of manipulating DOM events Inline Script Code Bookmarking to the Cache Url when no cache exists Do you have inline script code in your HTML? That script code isn't going to run if you restore from cache and simply assign or it may not run at the time you think it would normally in the DOM rendering cycle. JavaScript Event Hookups The biggest issue I ran into with this approach almost immediately is that originally I had various static event handlers hooked up to various UI elements that are now cached. If you have an event handler like:$("#btnSearch").click( function() {…}); that works fine when the page loads with server rendered HTML, but that code breaks when you now load the HTML from cache. Why? Because the elements you're trying to hook those events to may not actually be there - yet. Luckily there's an easy workaround for this by using deferred events. With jQuery you can use the .on() event handler instead:$("#SelectionContainer").on("click","#btnSearch", function() {…}); which monitors a parent element for the events and checks for the inner selector elements to handle events on. This effectively defers to runtime event binding, so as more items are added to the document bindings still work. For any cached content use deferred events. Timing of manipulating DOM Elements Along the same lines make sure that your DOM manipulation code follows the code that loads the cached content into the page so that you don't manipulate DOM elements that don't exist just yet. Ideally you'll want to check for the condition to restore cached content towards the top of your script code, but that can be tricky if you have components or other logic that might not all run in a straight line. Inline Script Code Here's another small problem I ran into: I use a DateTime Picker widget I built a while back that relies on the jQuery date time picker. I also created a helper function that allows keyboard date navigation into it that uses JavaScript logic. Because MVC's limited 'object model' the only way to embed widget content into the page is through inline script. This code broken when I inserted the cached HTML into the page because the script code was not available when the component actually got injected into the page. As the last bullet - it's a matter of timing. There's no good work around for this - in my case I pulled out the jQuery date picker and relied on native <input type="date" /> logic instead - a better choice these days anyway, especially since this view is meant to be primarily to serve mobile devices which actually support date input through the browser (unlike desktop browsers of which only WebKit seems to support it). Bookmarking Cached Urls When you cache HTML content you have to make a decision whether you cache on the client and also not render that same content on the server. In the Classifieds app I didn't render server side content so if the user comes to the page with back=True and there is no cached content I have to a have a Plan B. Typically this happens when somebody ends up bookmarking the back URL. The easiest and safest solution for this scenario is to ALWAYS check the cache result to make sure it exists and if not have a safe URL to go back to - in this case to the plain uncached list URL which amounts to effectively redirecting. This seems really obvious in hindsight, but it's easy to overlook and not see a problem until much later, when it's not obvious at all why the page is not rendering anything. Don't use <body> to replace Content Since we're practically replacing all the HTML in the page it may seem tempting to simply replace the HTML content of the <body> tag. Don't. The body tag usually contains key things that should stay in the page and be there when it loads. Specifically script tags and elements and possibly other embedded content. It's best to create a top level DOM element specifically as a placeholder container for your cached content and wrap just around the actual content you want to replace. In the app above the #SizingContainer is that container. Other Approaches The approach I've used for this application is kind of specific to the existing server rendered application we're running and so it's just one approach you can take with caching. However for server rendered content caching this is a pattern I've used in a few apps to retrofit some client caching into list displays. In this application I took the path of least resistance to the existing server rendering logic. Here are a few other ways that come to mind: Using Partial HTML Rendering via AJAXInstead of rendering the page initially on the server, the page would load empty and the client would render the UI by retrieving the respective HTML and embedding it into the page from a Partial View. This effectively makes the initial rendering and the cached rendering logic identical and removes the server having to decide whether this request needs to be rendered or not (ie. not checking for a back=true switch). All the logic related to caching is made on the client in this case. Using JSON Data and Client RenderingThe hardcore client option is to do the whole UI SPA style and pull data from the server and then use client rendering or databinding to pull the data down and render using templates or client side databinding with knockout/angular et al. As with the Partial Rendering approach the advantage is that there's no difference in the logic between pulling the data from cache or rendering from scratch other than the initial check for the cache request. Of course if the app is a full on SPA app, then caching may not be required even - the list could just stay in memory and be hidden and reactivated. I'm sure there are a number of other ways this can be handled as well especially using AJAX. AJAX rendering might simplify the logic, but it also complicates search engine optimization since there's no content loaded initially. So there are always tradeoffs and it's important to look at all angles before deciding on any sort of caching solution in general. State of the Session SessionState and LocalStorage are easy to use in client code and can be integrated even with server centric applications to provide nice caching features of content and data. In this post I've shown a very specific scenario of storing HTML content for the purpose of remembering list view data and state and making the browsing experience for lists a bit more friendly, especially if there's dynamically loaded content involved. If you haven't played with sessionStorage or localStorage I encourage you to give it a try. There's a lot of cool stuff that you can do with this beyond the specific scenario I've covered here… Resources Overview of localStorage (also applies to sessionStorage) Web Storage Compatibility Modernizr Test Suite© Rick Strahl, West Wind Technologies, 2005-2013Posted in JavaScript HTML5 ASP.NET MVC Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article
Slow login to load-balanced Terminal Server 2008 behind Gateway Server

- by Frans

I have a small load-balanced (using Session Broker) Terminal Server 2008 farm behind a Gateway Server which is accessed from the Internet. The problem I have is that there is a delay of 20-30 seconds if the session broker switches the user to another server during login. I think this is related to the fact that I am forcing the security layer to be RDP rather than SSL. The background The Gateway server has a public routeable IP addres and DNS name so it can be accessed from the Internet and all users come in via this route (the system is used to provide access to hosted applications to external customers). The actual terminal servers only have internal IP addresses. This works really well, except that with a Vista or Windows 7 client, the Remote Desktop client will negotiate with the server to use SSL for the security layer. This then exposes the auto-generated certificate that TS1 or TS2 has - but since they are internal, auto-generated certificates, the client will get a stern warning that the certificate is not valid. I can't give the servers a properly authorised certificate as the servers do not have public routeable IP address or DNS name. Instead, I am using Group Policy to force the connections to be over RDP instead of SSL. \Computer Configuration\Policies\Administrative Templates\Windows Components\Terminal Services\Terminal Server\Security\Require use of specific security layer for remote (RDP) connections The Windows 7 user now gets a much less stern warning that "the server's identity cannot be confirmed" which I can live with. I don't have enough control over the end-user's machines to ask them to install a new root certificate either. TS1 and TS2 are also load-balanced using the Session Broker, which is installed on the Gateway Server. I am using round-robin DNS, so the user's initial connection will go via Gateway1 to either TS1 or TS2. TS1/TS2 will then talk to the session broker and may pass the user to the other server. I.e. the user may get connected to TS2, but after talking to the session broker the user may be passed to TS1, which is where they will run their session. When this switching of servers happens, in my setup, the screen sits with the word "Welcome" for 20-30 seconds after which it flickers, Welcome is shown again and then flashing through nthe normal login screens (i.e. "wait for user profile manager" etc). Having done some research, I think what is happening is that the user is being fully logged on to TS2 (while "Welcome" is shown) before being passed to TS1, where they are then logged in again. It is interesting that normally when you see the ""Welcome" word, the little circle to left rotates. However, it does not rotate during this delay - the screen just looks frozen. This blog post leads me to think that this is because CredSSP is not being used, probably because I am disallowing SSL and forcing RDP. What I have tried I enabled SSL again which removes the "Welcome" delay. However, it seems to introduc a new delay much earlier in the process. Specifically, when the RDP client is saying "initialising connection" - this is now much slower. Quite apart from the fact that my certificate problem precludes me using that solution without considerable difficulty. I tried disabling the load balancing (just remove the servers from the session broker farm) and the connections do not have any delay. The problem is also intermittent in the sense that it only happens when the user gets bumped from one server to another. I tested this by trying to connect directly to TS1 (via the Gateway, of course) and then checking which server I actually got connected to. Just to be sure, I also by-passed the round-robin DNS to see if it had any impact and it doesn't. The setup is essentially in line with MS recommendations here: TS Session Broker Load Balancing Step-by-Step Guide I tried changing to using a dedicated redirector. Basically, rather than using a round-robin DNS, I pointed my DNS to the Gateway server and configured it to be a dedicated redirector (disallow logons, add it to the farm). Same problem, alas. Any ideas or suggestions gratefully received.

Read the article
Internal drives vs USB-3 with external SSD or eSata with External SSD

- by normstorm

I have a need to carry VMWare Virtual Machines with me for work. These are very large files (each VM is 20GB or more) and I carry around about 40 to 50 VM's to simulate different software configurations for different client needs. Key: they won't fit on the internal hard drive of my current laptop. I currently execute the VM's from an external 7200RPM 2.5" USB-2 drive. I keep copies of the VM's on other 5400 external USB-2 drives. The VM's work from this drive, but they are slow, costing me much time and frustration. It can take upwards of 30 minutes just to make a copy of one of the VM's. They can take upwards of 10-15 minutes to fully launch and then they operate sluggishly. I am buying a new laptop (Core I7, 8GB RAM and other high-end specs). I intend to buy an SSD for the O/S volume (C:). This SSD will not be large enough to hold the VM's. I have always wanted a second internal hard drive to operate the VM's. To have two hard drives, though, I am finding that I will have to go to a 17" laptop which would be bulky/heavy. I am instead considering purchasing a 15" laptop with either an eSATA port or USB-3 ports and then purchasing two external drives. One of the drives might be an external SSD (maybe OCX brand) for operating the VM's and the other a 7400RPM 1TB hard drive for carrying around the VM's not currently in use. The question is which options would give me the biggest bang for the buck and the weight: 1) 2nd Internal SSD hard drive. This would mean buying a 17" laptop with two drive "bays". The first bay would hold an SSD drive for the C: drive. I would leave the first bay empty from the manufacture and then purchase/install an aftermarket SSD drive. This second SSD drive would have to be very large (256 GB), which would be expensive. I would still also need another external hard drive for carrying around the VM's not in use. 2) 2nd internal hard drive - 7400 RPM. Again, a 17" laptop would be required, but there are models available with on SSD drive for the C: drive and a second 7200 RPM hard drives. The second drive could probably be large enough to hold the VM's in use as well as those not in use. But would it be fast enough to drive the VM's? 3) USB-3 with External SSD. I could buy a 15" laptop with an SSD drive for the C: drive and a second hard drive for general files. I would operate the VM's from an external USB-3 SSD drive and have a third USB-3 external 7200 RPM drive for holding the VM's not in use. 4) eSATA with External SSD. Ditto, just eSATA instead of USB-3 5) USB-3 with External 7400 RPM drive. Ditto, but the drive running the VM's would be USB-3 attached 7400 RPM drives rather than SSD. 6) eSATA with External 7400 RPM drive. Dittor, but the drive running the VM's would be eSATA attached 7400 RPM drives rather than SSD. Any thoughts on this and any creative solutions?

Read the article
What was scientifically shown to support productivity when organizing/accessing file and folders?

- by Tom Wijsman

I have gathered terabytes of data but it has became a habit to store files and folders to the same folder, that folder could be kind of seen as a Inbox where most files (non-installations) enter my system. This way I end up with a big collections of files that are hard to organize properly, I mostly end up making folders that match their file type but then I still have several gigabytes of data per folder which doesn't make it efficient such that I can productively use the folder. I'd rather do a few clicks than having to search through the files, whether that's by some software product or by looking through the folder. Often the file names themselves are not proper so it would be easier to recognize them if there were few in a folder, rather than thousands of them. Scaling in the structure of directory trees in a computer cluster summarizes this problem as following: The processes of storing and retrieving information are rapidly gaining importance in science as well as society as a whole [1, 2, 3, 4]. A considerable effort is being undertaken, firstly to characterize and describe how publicly available information, for example in the world wide web, is actually organized, and secondly, to design efficient methods to access this information. [1] R. M. Shiffrin and K. B¨orner, Proc. Natl. Acad. Sci. USA 101, 5183 (2004). [2] S. Lawrence, C.L. Giles, Nature 400, 107–109 (1999). [3] R.F.I. Cancho and R.V. Sol, Proc. R. Soc. London, Ser. B 268, 2261 (2001). [4] M. Sigman and G. A. Cecchi, Proc. Natl. Acad. Sci. USA 99, 1742 (2002). It goes further on explaining how the data is usually organized by taking general looks at it, but by looking at the abstract and conclusion it doesn't come with a conclusion or approach which results in a productive organization of a directory hierarchy. So, in essence, this is a problem for which I haven't found a solution yet; and I would love to see a scientific solution to this problem. Upon searching further, I don't seem to find anything useful or free papers that approach this problem so it might be that I'm looking in the wrong place. I've also noted that there are different ways to term this problem, which leads out to different results of papers. Perhaps a paper is out there, but I'm not just using the same terms as that paper uses? They often use more scientific terms. I've once heard a story about an advocate with a laptop which has simply outperformed an advocate with had tons of papers, which shows how proper organization leads to productivity; but that story didn't share details on how the advocate used the laptop or how he had organized his data. But in any case, it was way more useful than how most of us organize our data these days... Advice me how I should organize my data, I'm not looking for suggestions here. I would love to see statistics or scientific measurement approaches that help me confirm that it does help me reach my goal.

Read the article
File copying utility like rsync with error handling like ddrescue, for data recovery from a hard drive with bad sectors or hardware failure

- by purefusion

I have a hard drive with either bad blocks or sectors that are failing to read due to potential mechanical issues, such as a bad disk head, bad motor, or some other issue that is causing the hard drive to read data excruciatingly slowly and with lots of read errors. I'm seeing an average of 50 KB/sec, with some reads dropping below 10 KB/sec, and frequently it gets stuck on a file or sector altogether, usually for quite a long time—from 2-10 minutes or more (when using rsync, before it times out). Speed seems to vary wildly, and it gets stuck on files a lot, and when it finally gets "unstuck" it only seems to last for a short burst before it gets stuck again. The drive is also very quiet with only an occasional sound of files copying (usually when it gets stuck/unstuck for a brief time, before getting stuck again). Thus, there are none of those evil sounds that are normally associated with HDD death. Someone suggested that the problems sounded like they might be caused by a misaligned disk head, which requires a lot of re-reads before it finally reads data with success. Sounds plausible, but I digress... Anyway, the problem with rsync is that it seems to have no decent error handling support. Obviously, it wasn't meant for use in recovering data from failing hard drives, but all the so-called "data recovery" utilities out there that are meant for such use usually focus on recovery of deleted files or messed up partitions, rather than copying files off dying hard drives. Deleted file recovery is not what I need, obviously, so perhaps you can understand my disappointment in not being able to find what I'm after yet. Naturally, this is where you'd probably say "You should use ddrescue!" Well, that's all fine and dandy, but I've already got most of the data backed up, so I just want to recover certain files. I'm not concerned with trying to recover a full partition block-by-block as ddrescue does. I am only interested in rescuing just specific files and directories. Ideally, what I'd like is some sort of cross between rsync and ddrescue: something that lets me specify source and destination as directories of normal files like rsync (rather than two full partitions as ddrescue requires), with a way to skip files with errors in an initial run, and then allows me to attempt recovery of those files with errors in a later run (with a slightly altered command, of course), perhaps even offering an option to specify the number of retry attempts ...just like how ddrescue works with blocks, only I want a utility that works with specific files/directories like rsync does. So am I daydreaming here, or does something out there exist that can do this? Or, maybe even a way to make rsync or ddrescue work in such a way? I'm really open to whatever solutions might work, so long as they let me choose which files I want to "rescue", and can skip files with errors in the initial run, and try/retry those errors again later. So far I've tried rsync with the following options, but it often gets stuck on a file for longer than the timeout, and ideally I'd just like it to move on to the next file and come back later to the files it gets stuck on. I don't think that's possible though. Anyway, here's what I've been using up till now: rsync -avP --stats --block-size=512 --timeout=600 /path/to/source/* /path/to/destination/

Read the article
Using different SSDs types (not only SATA based) as system drive

- by Hubert Kario

Currently I have a Thinkpad X61s and want to make it both a bit faster and a bit more power efficient. For that reason I thought that adding SSD drive would make most sense. Unfortunately, because of financial reasons, buying SSD of over 200GB capacity is out of reach for me (not only it would be worth more than the rest of the laptop, but also I currently have a 500GB drive in it, so even such a drive would be kind of a downgrade for me). During preliminary testing with a cheap Transcend 4GB Class 6 (14MiB/s streaming, 9MiB/s random read) card I experienced boot times to be reduced by half so putting the OS only on it would already would be an improvement. Unfortunately, my system now is about 11GiB in size so anything less than 16GB would be constraining. In this laptop I can connect additional drives on at least 5 different ways: using SATA-ATA converter caddy in the X6 Ultrabase using internal mini PCIe slot using integrated SDHC slot using CardBus (a.k.a PCMCIA or PC Card) slot using USB Thankfully, because I use only Linux on this PC the bootability of them is irrelevant as I can put the /boot partition on internal HDD and / on any of the above mentioned Flash memories (as I already did for the SDHC test). From what I was able to research and from my own experience those options come with rather big downsides or other problems: SATA-ATA caddy It has three downsides: I have to carry the Ultrabse with me at all times (it's not really inconvenient, but those grams do add) and couldn't disconnect it when I want to disconnect the battery It makes the bay unusable for the optical drive and occasional quick access to other hard drives the only caddies I could buy have rather flaky controllers in them so putting my OS on it would hamper its stability Internal mini PCIe slot This would be an ideal solution, if only I could find real PCIe SSDs, not only devices that could talk only SATA or ATA over PCIe mechanical connection (the ones used in Dell Mini or Asus EEE). Theoretically Samsung did release such devices but I couldn't find them in retail anywhere. Integrated SDHC slot It's a nice solution with a single drawback: the fastest 16GB SDHC card on the market can only do around 35MiB/s read and 15MiB/s write while still costing like a normal 40GB SATA SSD that's 10 times faster. Not really cost-effective. CardBus (a.k.a PCMCIA or PC Card) slot Those cards are much faster than the SDHC option (there are ones that can do well over 50MiB/s read in benchmarks) and from what I could find the PCMCIA controller in my laptop does support UDMA so it should be able to deliver comparable speeds. They still cost similarly to SD cards but at least they provide streaming performance comparable to my current HDD. USB That's the worst option. Not only is it limited to 20-30MiB/s by the interface itself the drive would stick out of the laptop so it's a big no no. The question As such I think that going the "CF in a CardBus adapter" route will be the best option. My question is: did anyone try using CF cards in CardBus adapters as system drives with Linux on Thinkpad laptops? Laptops in general? What was the real-world performance? I don't have any CF cards so I can't check how well does it work with suspend/resume, or whatever it's easy to make it work in initramfs (I'm using ArchLinux and SD card was trivial — add 3 modules in single config line and rebuilding initramfs) so any tips/gotchas on this are welcome as well.

Read the article
Scripting Windows Shares - VBS

- by Calvin Piche

So i am totally new to VBS, never used it. I am trying to create multiple shares and i found a Microsoft VBS script that can do this(http://gallery.technet.microsoft.com/scriptcenter/6309d93b-fcc3-4586-b102-a71415244712) My question is, this script only allows for one domain group or user to be added for permissions where i am needing to add a couple with different permissions(got that figured out) Below is the script that i have modified for my needs but just need to add in the second group with the other permissions. If there is an easier way to do this please let me know. 'ShareSetup.vbs '========================================================================== Option Explicit Const FILE_SHARE = 0 Const MAXIMUM_CONNECTIONS = 25 Dim strComputer Dim objWMIService Dim objNewShare strComputer = "." Set objWMIService = GetObject("winmgmts:" & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2") Set objNewShare = objWMIService.Get("Win32_Share") Call sharesec ("C:\Published Apps\Logs01", "Logs01", "Log01", "Support") Call sharesec2 ("C:\Published Apps\Logs01", "Logs01", "Log01", "Domain Admins") Sub sharesec(Fname,shr,info,account) 'Fname = Folder path, shr = Share name, info = Share Description, account = account or group you are assigning share permissions to Dim FSO Dim Services Dim SecDescClass Dim SecDesc Dim Trustee Dim ACE Dim Share Dim InParam Dim Network Dim FolderName Dim AdminServer Dim ShareName FolderName = Fname AdminServer = "\\" & strComputer ShareName = shr Set Services = GetObject("WINMGMTS:{impersonationLevel=impersonate,(Security)}!" & AdminServer & "\ROOT\CIMV2") Set SecDescClass = Services.Get("Win32_SecurityDescriptor") Set SecDesc = SecDescClass.SpawnInstance_() 'Set Trustee = Services.Get("Win32_Trustee").SpawnInstance_ 'Trustee.Domain = Null 'Trustee.Name = "EVERYONE" 'Trustee.Properties_.Item("SID") = Array(1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0) Set Trustee = SetGroupTrustee("domain", account) 'Replace ACME with your domain name. 'To assign permissions to individual accounts use SetAccountTrustee rather than SetGroupTrustee Set ACE = Services.Get("Win32_Ace").SpawnInstance_ ACE.Properties_.Item("AccessMask") = 1179817 ACE.Properties_.Item("AceFlags") = 3 ACE.Properties_.Item("AceType") = 0 ACE.Properties_.Item("Trustee") = Trustee SecDesc.Properties_.Item("DACL") = Array(ACE) Set Share = Services.Get("Win32_Share") Set InParam = Share.Methods_("Create").InParameters.SpawnInstance_() InParam.Properties_.Item("Access") = SecDesc InParam.Properties_.Item("Description") = "Public Share" InParam.Properties_.Item("Name") = ShareName InParam.Properties_.Item("Path") = FolderName InParam.Properties_.Item("Type") = 0 Share.ExecMethod_ "Create", InParam End Sub Sub sharesec2(Fname,shr,info,account) 'Fname = Folder path, shr = Share name, info = Share Description, account = account or group you are assigning share permissions to Dim FSO Dim Services Dim SecDescClass Dim SecDesc Dim Trustee Dim ACE2 Dim Share Dim InParam Dim Network Dim FolderName Dim AdminServer Dim ShareName FolderName = Fname AdminServer = "\\" & strComputer ShareName = shr Set Services = GetObject("WINMGMTS:{impersonationLevel=impersonate,(Security)}!" & AdminServer & "\ROOT\CIMV2") Set SecDescClass = Services.Get("Win32_SecurityDescriptor") Set SecDesc = SecDescClass.SpawnInstance_() 'Set Trustee = Services.Get("Win32_Trustee").SpawnInstance_ 'Trustee.Domain = Null 'Trustee.Name = "EVERYONE" 'Trustee.Properties_.Item("SID") = Array(1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0) Set Trustee = SetGroupTrustee("domain", account) 'Replace ACME with your domain name. 'To assign permissions to individual accounts use SetAccountTrustee rather than SetGroupTrustee Set ACE2 = Services.Get("Win32_Ace").SpawnInstance_ ACE2.Properties_.Item("AccessMask") = 1179817 ACE2.Properties_.Item("AceFlags") = 3 ACE2.Properties_.Item("AceType") = 0 ACE2.Properties_.Item("Trustee") = Trustee SecDesc.Properties_.Item("DACL") = Array(ACE2) End Sub Function SetAccountTrustee(strDomain, strName) set objTrustee = getObject("Winmgmts: {impersonationlevel=impersonate}!root/cimv2:Win32_Trustee").Spawninstance_ set account = getObject("Winmgmts: {impersonationlevel=impersonate}!root/cimv2:Win32_Account.Name='" & strName & "',Domain='" & strDomain &"'") set accountSID = getObject("Winmgmts: {impersonationlevel=impersonate}!root/cimv2:Win32_SID.SID='" & account.SID &"'") objTrustee.Domain = strDomain objTrustee.Name = strName objTrustee.Properties_.item("SID") = accountSID.BinaryRepresentation set accountSID = nothing set account = nothing set SetAccountTrustee = objTrustee End Function Function SetGroupTrustee(strDomain, strName) Dim objTrustee Dim account Dim accountSID set objTrustee = getObject("Winmgmts: {impersonationlevel=impersonate}!root/cimv2:Win32_Trustee").Spawninstance_ set account = getObject("Winmgmts:{impersonationlevel=impersonate}!root/cimv2:Win32_Group.Name='" & strName & "',Domain='" & strDomain &"'") set accountSID = getObject("Winmgmts: {impersonationlevel=impersonate}!root/cimv2:Win32_SID.SID='" & account.SID &"'") objTrustee.Domain = strDomain objTrustee.Name = strName objTrustee.Properties_.item("SID") = accountSID.BinaryRepresentation set accountSID = nothing set account = nothing set SetGroupTrustee = objTrustee End Function

Read the article
Generating a twitter OAuth access key - the semi-manual way

- by Piet

[UPDATE] Apparently someone at Twitter was listening, or I’m going senile/blind. Let’s call it a combination of both. Instead of following all the steps below, you could just login with the Twitter account you want to use on http://dev.twitter.com, register your application and then click ‘Edit Details’ on the application overview page at http://dev.twitter.com/apps. Next click the ‘Application detail’ button on the right, followed by the ‘My Access Token’ button in order to get your Access Token and Access Token Secret. This makes the old post below rather obsolete. Clearly a case of me thinking everything is a nail and ruby is a hammer (don’t they usually say this about java coders?) [ORIGINAL POST] OAuth is great! OAuth allows your application to use your user’s data without the need to ask for their password. So Twitter made the API much safer for their and your users. Hurray! Free pizza for everyone! Unless of course you’re using the Twitter API for your own needs like running your own bot and don’t need access to other user’s data. In such cases a simple username/password combination is more than enough. I can understand however that the Twitter guys don’t really care that much about these exceptions(?). Most such uses for the API are probably rather spammy in nature. !!! If you have a twitter app that uses the API to access external user’s data: look for another solution. This solution is ONLY meant when you ONLY need access to your own account(s) through the API. Other Solutions Mr Dallas Devries posted a solution here which involves requesting and scraping a one-time PIN. But: I like to minimize the amount of calls I make to twitter’s API or pages to lessen my chances of meeting the fail whale. Also, as soon as the pin isn’t included in a div called ‘oauth_pin’ anymore, this will fail. However, mr Devries’ post was a starting point for my solution, so I’m much obliged to him posting his findings. Authenticating with the Twitter API: old vs new Acessing The Twitter API the old way: require ‘twitter’ httpauth = Twitter::HTTPAuth.new('my_account','my_secret_password') client = Twitter::Base.new(httpauth) client.update(‘Hurray!’) The OAuth way: require 'twitter' oauth = Twitter::OAuth.new('ve4whatafuzzksaMQKjoI', 'KliketyklikspQ6qYALcuNandsomemored8pQ6qYALIG7mbEQY') oauth.authorize_from_access('123-owhfmeyAgfozdyt5hDeprSevsWmPo5rVeroGfsthis', 'fGiinCdqtehMeehiddenymDeAsasaawgGeryye8amh') client = Twitter::Base.new(oauth) client.update(‘Hurray!’) In the above case, ve4whatafuzzksaMQKjoI is the ‘consumer key’ (sometimes also referred to as ‘consumer token’) and KliketyklikspQ6qYALcuNandsomemored8pQ6qYALIG7mbEQY is the ‘consumer secret’. You’ll get these from Twitter when you register your app. 123-owhfmeyAgfozdyt5hDeprSevsWmPo5rVeroGfsthis is the ‘access token’ and fGiinCdqtehMeehiddenymDeAsasaawgGeryye8amh is the ‘access secret’. This combination gives the registered application access to your account. I’ll show you how to obtain these by following the steps below. (Basically you’ll need a bunch of keys and you’ll have to jump a bit through hoops to obtain them for your server/bot. ) How to get these keys 1. Surf to the twitter apps registration page go to http://dev.twitter.com/apps to register your app. Login with your twitter account. 2. Register your application Enter something for Application name, Description, website,… as I said: they make you jump through hoops. If you plan on using the API to post tweets, Your application name and website will be used in the ‘5 minutes ago via…’ line below your tweet. You could use the this to point to a page with info about your bot, or maybe it’s useful for SEO purposes. For application type I choose ‘browser’ and entered http://www.hadermann.be/callback as a ‘Callback URL’. This url returns a 404 error, which is ideal because after giving our account access to our ‘application’ (step 6), it will redirect to this url with an ‘oauth_token’ and ‘oauth_verifier’ in the url. We need to get these from the url. It doesn’t really matter what you enter here though, you could leave it blank because you need to explicitely specify it when generating a request token. You probably want read&write access so set this at ‘Default Access type’. 3. Get your consumer key and consumer secret On the next page, copy/paste your ‘consumer key’ and ‘consumer secret’. You’ll need these later on. You also need these as part of the authentication in your script later on: oauth = Twitter::OAuth.new([consumer key], [consumer secret]) 4. Obtain your request token run the following in IRB to obtain your ‘request token’ Replace my fake consumer key and consumer secret with the one you obtained in step 3. And use something else instead http://www.hadermann.be/callback: although this will only give a 404, you shouldn’t trust me. irb(main):001:0> require 'oauth' irb(main):002:0> c = OAuth::Consumer.new('ve4whatafuzzksaMQKjoI', 'KliketyklikspQ6qYALcuNandsomemored8pQ6qYALIG7mbEQY', {:site => 'http://twitter.com'}) irb(main):003:0> request_token = c.get_request_token(:oauth_callback => 'http://www.hadermann.be/callback') irb(main):004:0> request_token.token => "UrperqaukeWsWt3IAlfbxzyBUFpwWIcWkHP94QH2C1" This (UrperqaukeWsWt3IAlfbxzyBUFpwWIcWkHP94QH2C1) is the request token: Copy/paste this token, you will need this next. 5. Authorize your application surf to https://api.twitter.com/oauth/authorize?oauth_token=[the above token], for example: https://api.twitter.com/oauth/authorize?oauth_token=UrperqaukeWsWt3IAlfbxzyBUFpwWIcWkHP94QH2C1 This will bring you to the ‘An application would like to connect to your account’- screen on Twitter where you can grant access to the app you just registered. If you aren’t still logged in, you need to login first. Click ‘Allow’. Unless you don’t trust yourself. 6. Get your oauth_verifier from the redirected url Your browser will be redirected to your callback url, with an oauth_token and oauth_verifier parameter appended. You’ll need the oauth_verifier. In my case the browser redirected to: http://www.hadermann.be/callback?oauth_token=UrperqaukeWsWt3IAlfbxzyBUFpwWIcWkHP94QH2C1&oauth_verifier=waoOhKo8orpaqvQe6rVi5fti4ejr8hPeZrTewyeag Which returned a 404, giving me the chance to copy/paste my oauth_verifier: waoOhKo8orpaqvQe6rVi5fti4ejr8hPeZrTewyeag 7. Request an access token Back to irb, use the oauth_verifier to request an access token, as follows: irb(main):005:0> at = request_token.get_access_token(:oauth_verifier => 'waoOhKo8orpaqvQe6rVi5fti4ejr8hPeZrTewyeag') irb(main):006:0> at.params[:oauth_token] => "123-owhfmeyAgfozdyt5hDeprSevsWmPo5rVeroGfsthis" irb(main):007:0> at.params[:oauth_token_secret] => "fGiinCdqtehMeehiddenymDeAsasaawgGeryye8amh" We’re there! 123-owhfmeyAgfozdyt5hDeprSevsWmPo5rVeroGfsthis is the access token. fGiinCdqtehMeehiddenymDeAsasaawgGeryye8amh is the access secret. Try it! Try the following to post an update: require 'twitter' oauth = Twitter::OAuth.new('ve4whatafuzzksaMQKjoI', 'KliketyklikspQ6qYALcuNandsomemored8pQ6qYALIG7mbEQY') oauth.authorize_from_access('123-owhfmeyAgfozdyt5hDeprSevsWmPo5rVeroGfsthis', 'fGiinCdqtehMeehiddenymDeAsasaawgGeryye8amh') client = Twitter::Base.new(oauth) client.update(‘Cowabunga!’) Now you can go to your twitter page and delete the tweet if you want to.

Read the article
HttpContext.Items and Server.Transfer/Execute

- by Rick Strahl

A few days ago my buddy Ben Jones pointed out that he ran into a bug in the ScriptContainer control in the West Wind Web and Ajax Toolkit. The problem was basically that when a Server.Transfer call was applied the script container (and also various ClientScriptProxy script embedding routines) would potentially fail to load up the specified scripts. It turns out the problem is due to the fact that the various components in the toolkit use request specific singletons via a Current property. I use a static Current property tied to a Context.Items[] entry to handle this type of operation which looks something like this: /// <summary> /// Current instance of this class which should always be used to /// access this object. There are no public constructors to /// ensure the reference is used as a Singleton to further /// ensure that all scripts are written to the same clientscript /// manager. /// </summary> public static ClientScriptProxy Current { get { if (HttpContext.Current == null) return new ClientScriptProxy(); ClientScriptProxy proxy = null; if (HttpContext.Current.Items.Contains(STR_CONTEXTID)) proxy = HttpContext.Current.Items[STR_CONTEXTID] as ClientScriptProxy; else { proxy = new ClientScriptProxy(); HttpContext.Current.Items[STR_CONTEXTID] = proxy; } return proxy; } } The proxy is attached to a Context.Items[] item which makes the instance Request specific. This works perfectly fine in most situations EXCEPT when you’re dealing with Server.Transfer/Execute requests. Server.Transfer doesn’t cause Context.Items to be cleared so both the current transferred request and the original request’s Context.Items collection apply. For the ClientScriptProxy this causes a problem because script references are tracked on a per request basis in Context.Items to check for script duplication. Once a script is rendered an ID is written into the Context collection and so considered ‘rendered’: // No dupes - ref script include only once if (HttpContext.Current.Items.Contains( STR_SCRIPTITEM_IDENTITIFIER + fileId ) ) return; HttpContext.Current.Items.Add(STR_SCRIPTITEM_IDENTITIFIER + fileId, string.Empty); where the fileId is the script name or unique identifier. The problem is on the Transferred page the item will already exist in Context and so fail to render because it thinks the script has already rendered based on the Context item. Bummer. The workaround for this is simple once you know what’s going on, but in this case it was a bitch to track down because the context items are used in many places throughout this class. The trick is to determine when a request is transferred and then removing the specific keys. The first issue is to determine if a script is in a Trransfer or Execute call: if (HttpContext.Current.CurrentHandler != HttpContext.Current.Handler) Context.Handler is the original handler and CurrentHandler is the actual currently executing handler that is running when a Transfer/Execute is active. You can also use Context.PreviousHandler to get the last handler and chain through the whole list of handlers applied if Transfer calls are nested (dog help us all for the person debugging that). For the ClientScriptProxy the full logic to check for a transfer and remove the code looks like this: /// <summary> /// Clears all the request specific context items which are script references /// and the script placement index. /// </summary> public void ClearContextItemsOnTransfer() { if (HttpContext.Current != null) { // Check for Server.Transfer/Execute calls - we need to clear out Context.Items if (HttpContext.Current.CurrentHandler != HttpContext.Current.Handler) { List<string> Keys = HttpContext.Current.Items.Keys.Cast<string>().Where(s => s.StartsWith(STR_SCRIPTITEM_IDENTITIFIER) || s == STR_ScriptResourceIndex).ToList(); foreach (string key in Keys) { HttpContext.Current.Items.Remove(key); } } } } along with a small update to the Current property getter that sets a global flag to indicate whether the request was transferred: if (!proxy.IsTransferred && HttpContext.Current.Handler != HttpContext.Current.CurrentHandler) { proxy.ClearContextItemsOnTransfer(); proxy.IsTransferred = true; } return proxy; I know this is pretty ugly, but it works and it’s actually minimal fuss without affecting the behavior of the rest of the class. Ben had a different solution that involved explicitly clearing out the Context items and replacing the collection with a manually maintained list of items which also works, but required changes through the code to make this work. In hindsight, it would have been better to use a single object that encapsulates all the ‘persisted’ values and store that object in Context instead of all these individual small morsels. Hindsight is always 20/20 though :-}. If possible use Page.Items ClientScriptProxy is a generic component that can be used from anywhere in ASP.NET, so there are various methods that are not Page specific on this component which is why I used Context.Items, rather than the Page.Items collection.Page.Items would be a better choice since it will sidestep the above Server.Transfer nightmares as the Page is reloaded completely and so any new Page gets a new Items collection. No fuss there. So for the ScriptContainer control, which has to live on the page the behavior is a little different. It is attached to Page.Items (since it’s a control): /// <summary> /// Returns a current instance of this control if an instance /// is already loaded on the page. Otherwise a new instance is /// created, added to the Form and returned. /// /// It's important this function is not called too early in the /// page cycle - it should not be called before Page.OnInit(). /// /// This property is the preferred way to get a reference to a /// ScriptContainer control that is either already on a page /// or needs to be created. Controls in particular should always /// use this property. /// </summary> public static ScriptContainer Current { get { // We need a context for this to work! if (HttpContext.Current == null) return null; Page page = HttpContext.Current.CurrentHandler as Page; if (page == null) throw new InvalidOperationException(Resources.ERROR_ScriptContainer_OnlyWorks_With_PageBasedHandlers); ScriptContainer ctl = null; // Retrieve the current instance ctl = page.Items[STR_CONTEXTID] as ScriptContainer; if (ctl != null) return ctl; ctl = new ScriptContainer(); page.Form.Controls.Add(ctl); return ctl; } } The biggest issue with this approach is that you have to explicitly retrieve the page in the static Current property. Notice again the use of CurrentHandler (rather than Handler which was my original implementation) to ensure you get the latest page including the one that Server.Transfer fired. Server.Transfer and Server.Execute are Evil All that said – this fix is probably for the 2 people who are crazy enough to rely on Server.Transfer/Execute. :-} There are so many weird behavior problems with these commands that I avoid them at all costs. I don’t think I have a single application that uses either of these commands… Related Resources Full source of ClientScriptProxy.cs (repository) Part of the West Wind Web Toolkit Static Singletons for ASP.NET Controls Post © Rick Strahl, West Wind Technologies, 2005-2010Posted in ASP.NET

Read the article
Reporting Services - It's a Wrap!

- by smisner

If you have any experience at all with Reporting Services, you have probably developed a report using the matrix data region. It's handy when you want to generate columns dynamically based on data. If users view a matrix report online, they can scroll horizontally to view all columns and all is well. But if they want to print the report, the experience is completely different and you'll have to decide how you want to handle dynamic columns. By default, when a user prints a matrix report for which the number of columns exceeds the width of the page, Reporting Services determines how many columns can fit on the page and renders one or more separate pages for the additional columns. In this post, I'll explain two techniques for managing dynamic columns. First, I'll show how to use the RepeatRowHeaders property to make it easier to read a report when columns span multiple pages, and then I'll show you how to "wrap" columns so that you can avoid the horizontal page break. Included with this post are the sample RDLs for download. First, let's look at the default behavior of a matrix. A matrix that has too many columns for one printed page (or output to page-based renderer like PDF or Word) will be rendered such that the first page with the row group headers and the inital set of columns, as shown in Figure 1. The second page continues by rendering the next set of columns that can fit on the page, as shown in Figure 2.This pattern continues until all columns are rendered. The problem with the default behavior is that you've lost the context of employee and sales order - the row headers - on the second page. That makes it hard for users to read this report because the layout requires them to flip back and forth between the current page and the first page of the report. You can fix this behavior by finding the RepeatRowHeaders of the tablix report item and changing its value to True. The second (and subsequent pages) of the matrix now look like the image shown in Figure 3. The problem with this approach is that the number of printed pages to flip through is unpredictable when you have a large number of potential columns. What if you want to include all columns on the same page? You can take advantage of the repeating behavior of a tablix and get repeating columns by embedding one tablix inside of another. For this example, I'm using SQL Server 2008 R2 Reporting Services. You can get similar results with SQL Server 2008. (In fact, you could probably do something similar in SQL Server 2005, but I haven't tested it. The steps would be slightly different because you would be working with the old-style matrix as compared to the new-style tablix discussed in this post.) I created a dataset that queries AdventureWorksDW2008 tables: SELECT TOP (100) e.LastName + ', ' + e.FirstName AS EmployeeName, d.FullDateAlternateKey, f.SalesOrderNumber, p.EnglishProductName, sum(SalesAmount) as SalesAmount FROM FactResellerSales AS f INNER JOIN DimProduct AS p ON p.ProductKey = f.ProductKey INNER JOIN DimDate AS d ON d.DateKey = f.OrderDateKey INNER JOIN DimEmployee AS e ON e.EmployeeKey = f.EmployeeKey GROUP BY p.EnglishProductName, d.FullDateAlternateKey, e.LastName + ', ' + e.FirstName, f.SalesOrderNumber ORDER BY EmployeeName, f.SalesOrderNumber, p.EnglishProductName To start the report: Add a matrix to the report body and drag Employee Name to the row header, which also creates a group. Next drag SalesOrderNumber below Employee Name in the Row Groups panel, which creates a second group and a second column in the row header section of the matrix, as shown in Figure 4. Now for some trickiness. Add another column to the row headers. This new column will be associated with the existing EmployeeName group rather than causing BIDS to create a new group. To do this, right-click on the EmployeeName textbox in the bottom row, point to Insert Column, and then click Inside Group-Right. Then add the SalesOrderNumber field to this new column. By doing this, you're creating a report that repeats a set of columns for each EmployeeName/SalesOrderNumber combination that appears in the data. Next, modify the first row group's expression to group on both EmployeeName and SalesOrderNumber. In the Row Groups section, right-click EmployeeName, click Group Properties, click the Add button, and select [SalesOrderNumber]. Now you need to configure the columns to repeat. Rather than use the Columns group of the matrix like you might expect, you're going to use the textbox that belongs to the second group of the tablix as a location for embedding other report items. First, clear out the text that's currently in the third column - SalesOrderNumber - because it's already added as a separate textbox in this report design. Then drag and drop a matrix into that textbox, as shown in Figure 5. Again, you need to do some tricks here to get the appearance and behavior right. We don't really want repeating rows in the embedded matrix, so follow these steps: Click on the Rows label which then displays RowGroup in the Row Groups pane below the report body. Right-click on RowGroup,click Delete Group, and select the option to delete associated rows and columns. As a result, you get a modified matrix which has only a ColumnGroup in it, with a row above a double-dashed line for the column group and a row below the line for the aggregated data. Let's continue: Drag EnglishProductName to the data textbox (below the line). Add a second data row by right-clicking EnglishProductName, pointing to Insert Row, and clicking Below. Add the SalesAmount field to the new data textbox. Now eliminate the column group row without eliminating the group. To do this, right-click the row above the double-dashed line, click Delete Rows, and then select Delete Rows Only in the message box. Now you're ready for the fit and finish phase: Resize the column containing the embedded matrix so that it fits completely. Also, the final column in the matrix is for the column group. You can't delete this column, but you can make it as small as possible. Just click on the matrix to display the row and column handles, and then drag the right edge of the rightmost column to the left to make the column virtually disappear. Next, configure the groups so that the columns of the embedded matrix will wrap. In the Column Groups pane, right-click ColumnGroup1 and click on the expression button (labeled fx) to the right of Group On [EnglishProductName]. Replace the expression with the following: =RowNumber("SalesOrderNumber" ). We use SalesOrderNumber here because that is the name of the group that "contains" the embedded matrix. The next step is to configure the number of columns to display before wrapping. Click any cell in the matrix that is not inside the embedded matrix, and then double-click the second group in the Row Groups pane - SalesOrderNumber. Change the group expression to the following expression: =Ceiling(RowNumber("EmployeeName")/3) The last step is to apply formatting. In my example, I set the SalesAmount textbox's Format property to C2 and also right-aligned the text in both the EnglishProductName and the SalesAmount textboxes. And voila - Figure 6 shows a matrix report with wrapping columns. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
Issue 15: The Benefits of Oracle Exastack

- by rituchhibber

SOLUTIONS FOCUS The Benefits of Oracle Exastack Paul ThompsonDirector, Alliances and Solutions Partner ProgramsOracle EMEA Alliances & Channels RESOURCES -- Oracle PartnerNetwork (OPN) Oracle Exastack Program Oracle Exastack Ready Oracle Exastack Optimized Oracle Exastack Labs and Enablement Resources Oracle Exastack Labs Video Tour SUBSCRIBE FEEDBACK PREVIOUS ISSUES Exastack is a revolutionary programme supporting Oracle independent software vendor partners across the entire Oracle technology stack. Oracle's core strategy is to engineer software and hardware together, and our ISV strategy is the same. At Oracle we design engineered systems that are pre-integrated to reduce the cost and complexity of IT infrastructures while increasing productivity and performance. Oracle innovates and optimises performance at every layer of the stack to simplify business operations, drive down costs and accelerate business innovation. Our engineered systems are optimised to achieve enterprise performance levels that are unmatched in the industry. Faster time to production is achieved by implementing pre-engineered and pre-assembled hardware and software bundles. Our strategy of delivering a single-vendor stack simplifies and reduces costs associated with purchasing, deploying, and supporting IT environments for our customers and partners. In parallel to this core engineered systems strategy, the Oracle Exastack Program enables our Oracle ISV partners to leverage a scalable, integrated infrastructure that delivers their applications tuned, tested and optimised for high-performance. Specifically, the Oracle Exastack Program helps ISVs run their solutions on the Oracle Exadata Database Machine, Oracle Exalogic Elastic Cloud, and Oracle SPARC SuperCluster T4-4 - integrated systems products in which the software and hardware are engineered to work together. These products provide OPN members with a lower cost and high performance infrastructure for database and application workloads across on-premise and cloud based environments. Ready and Optimized Oracle Partners can now leverage our new Oracle Exastack Program to become Oracle Exastack Ready and Oracle Exastack Optimized. Partners can achieve Oracle Exastack Ready status through their support for Oracle Solaris, Oracle Linux, Oracle VM, Oracle Database, Oracle WebLogic Server, Oracle Exadata Database Machine, Oracle Exalogic Elastic Cloud, and Oracle SPARC SuperCluster T4-4. By doing this, partners can demonstrate to their customers that their applications are available on the latest major releases of these products. The Oracle Exastack Ready programme helps customers readily differentiate Oracle partners from lesser software developers, and identify applications that support Oracle engineered systems. Achieving Oracle Exastack Optimized status demonstrates that an OPN member has proven itself against goals for performance and scalability on Oracle integrated systems. This status enables end customers to readily identify Oracle partners that have tested and tuned their solutions for optimum performance on an Oracle Exadata Database Machine, Oracle Exalogic Elastic Cloud, and Oracle SPARC SuperCluster T4-4. These ISVs can display the Oracle Exadata Optimized, Oracle Exalogic Optimized or Oracle SPARC SuperCluster Optimized logos on websites and on all their collateral to show that they have tested and tuned their application for optimum performance. Deliver higher value to customers Oracle's investment in engineered systems enables ISV partners to deliver higher value to customer business processes. New innovations are enabled through extreme performance unachievable through traditional best-of-breed multi-vendor server/software approaches. Core product requirements can be launched faster, enabling ISVs to focus research and development investment on core competencies in order to bring value to market as quickly as possible. Through Exastack, partners no longer have to worry about the underlying product stack, which allows greater focus on the development of intellectual property above the stack. Partners are not burdened by platform issues and can concentrate simply on furthering their applications. The advantage to end customers is that partners can focus all efforts on business functionality, rather than bullet-proofing underlying technologies, and so will inevitably deliver application updates faster. Exastack provides ISVs with a number of flexible deployment options, such as on-premise or Cloud, while maintaining one single code base for applications regardless of customer deployment preference. Customers buying their solutions from Exastack ISVs can therefore be confident in deploying on their own networks, on private clouds or into a public cloud. The underlying platform will support all conceivable deployments, enabling a focus on the ISV's application itself that wouldn't be possible with other vendor partners. It stands to reason that Exastack accelerates time to value as well as lowering implementation costs all round. There is a big competitive advantage in partners being able to offer customers an optimised, pre-configured solution rather than an assortment of components and a suggested fit. Once a customer has decided to buy an Oracle Exastack Ready or Optimized partner solution, it will be up and running without any need for the customer to conduct testing of its own. Operational costs and complexity are also reduced, thanks to streamlined customer support through standardised configurations and pro-active monitoring. 'Engineered to Work Together' is a significant statement of Oracle strategy. It guarantees smoother deployment of a single vendor solution, clear ownership with no finger-pointing and the peace of mind of the Oracle Support Centre underpinning the entire product stack. Next steps Every OPN member with packaged applications must seriously consider taking steps to become Exastack Ready, or Exastack Optimized at the first opportunity. That first step down the track is to talk to an expert on the OPN Portal, at the Oracle Partner Business Center or to discuss the next steps with the closest Oracle account manager. Oracle Exastack lab environments and other technical enablement resources are available for OPN members wishing to further their knowledge of Oracle Exastack and qualify their applications for Oracle Exastack Optimized. New Boot Camps and Guided Learning Paths (GLPs), tailored specifically for ISVs, are available for Oracle Exadata Database Machine, Oracle Exalogic Elastic Cloud, Oracle Linux, Oracle Solaris, Oracle Database, and Oracle WebLogic Server. More information about these GLPs and Boot Camps (including delivery dates and locations) are posted on the OPN Competency Center and corresponding OPN Knowledge Zones. Learn more about Oracle Exastack labs and ISV specific enablement resources. "Oracle Specialized partners are of course front-and-centre, with potential customers clearly directed to those partners and to Exadata Ready partners as a matter of priority." --More OpenWorld 2011 highlights for Oracle partners and customers Oracle Application Testing Suite 9.3 application testing solution for Web, SOA and Oracle Applications Oracle Application Express Release 4.1 improving the development of database-centric Web 2.0 applications and reports Oracle Unified Directory 11g helping customers manage the critical identity information that drives their business applications Oracle SOA Suite for healthcare integration Oracle Enterprise Pack for Eclipse 11g demonstrating continued commitment to the developer and open source communities Oracle Coherence 3.7.1, the latest release of the industry's leading distributed in-memory data grid Oracle Process Accelerators helping to simplify and accelerate time-to-value for customers' business process management initiatives Oracle's JD Edwards EnterpriseOne on the iPad meeting the increasingly mobile demands of today's workforces Oracle CRM On Demand Release 19 Innovation Pack introducing industry-leading hosted call centre and enterprise-marketing capabilities designed to drive further revenue and productivity while reducing costs and improving the customer experience Oracle's Primavera Portfolio Management 9 for businesses delivering on project portfolio goals with increased versatility, transparency and accuracy Oracle's PeopleSoft Human Capital Management (HCM) 9.1 On Demand Standard Edition helping customers manage their long-term investment in enterprise-wide business applications New versions of Oracle FLEXCUBE Universal Banking and Oracle FLEXCUBE Investor Servicing for Financial Institutions, as well as Oracle Financial Services Enterprise Case Management, Oracle Financial Services Pricing Management, Oracle Financial Management Analytics and Oracle Tax Analytics Oracle Utilities Network Management System 1.11 offering new modelling and analysis features to improve distribution-grid management for electric utilities Oracle Communications Network Charging and Control 4.4 helping communications service providers (CSPs) offer their customers more flexible charging options Plus many, many more technology announcements, enhancements, momentum news and community updates -- Oracle OpenWorld 2012 A date has already been set for Oracle OpenWorld 2012. Held once again in San Francisco, exhibitors, partners, customers and Oracle people will gather from 30 September until 4 November to meet, network and learn together with the rest of the global Oracle community. Register now for Oracle OpenWorld 2012 and save $$$! We'll reward your early planning for Oracle OpenWorld 2012 with reduced rates. Super Saver deals are now available! -- Back to the welcome page

Read the article
BNF – how to read syntax?

- by Piotr Rodak

A few days ago I read post of Jen McCown (blog) about her idea of blogging about random articles from Books Online. I think this is a great idea, even if Jen says that it’s not exciting or sexy. I noticed that many of the questions that appear on forums and other media arise from pure fact that people asking questions didn’t bother to read and understand the manual – Books Online. Jen came up with a brilliant, concise acronym that describes very well the category of posts about Books Online – RTFM365. I take liberty of tagging this post with the same acronym. I often come across questions of type – ‘Hey, i am trying to create a table, but I am getting an error’. The error often says that the syntax is invalid. 1 CREATE TABLE dbo.Employees 2 (guid uniqueidentifier CONSTRAINT DEFAULT Guid_Default NEWSEQUENTIALID() ROWGUIDCOL, 3 Employee_Name varchar(60) 4 CONSTRAINT Guid_PK PRIMARY KEY (guid) ); 5 The answer is usually(1), ‘Ok, let me check it out.. Ah yes – you have to put name of the DEFAULT constraint before the type of constraint: 1 CREATE TABLE dbo.Employees 2 (guid uniqueidentifier CONSTRAINT Guid_Default DEFAULT NEWSEQUENTIALID() ROWGUIDCOL, 3 Employee_Name varchar(60) 4 CONSTRAINT Guid_PK PRIMARY KEY (guid) ); Why many people stumble on syntax errors? Is the syntax poorly documented? No, the issue is, that correct syntax of the CREATE TABLE statement is documented very well in Books Online and is.. intimidating. Many people can be taken aback by the rather complex block of code that describes all intricacies of the statement. However, I don’t know better way of defining syntax of the statement or command. The notation that is used to describe syntax in Books Online is a form of Backus-Naur notatiion, called BNF for short sometimes. This is a notation that was invented around 50 years ago, and some say that even earlier, around 400 BC – would you believe? Originally it was used to define syntax of, rather ancient now, ALGOL programming language (in 1950’s, not in ancient India). If you look closer at the definition of the BNF, it turns out that the principles of this syntax are pretty simple. Here are a few bullet points: italic_text is a placeholder for your identifier <italic_text_in_angle_brackets> is a definition which is described further. [everything in square brackets] is optional {everything in curly brackets} is obligatory everything | separated | by | operator is an alternative ::= “assigns” definition to an identifier Yes, it looks like these six simple points give you the key to understand even the most complicated syntax definitions in Books Online. Books Online contain an article about syntax conventions – have you ever read it? Let’s have a look at fragment of the CREATE TABLE statement: 1 CREATE TABLE 2 [ database_name . [ schema_name ] . | schema_name . ] table_name 3 ( { <column_definition> | <computed_column_definition> 4 | <column_set_definition> } 5 [ <table_constraint> ] [ ,...n ] ) 6 [ ON { partition_scheme_name ( partition_column_name ) | filegroup 7 | "default" } ] 8 [ { TEXTIMAGE_ON { filegroup | "default" } ] 9 [ FILESTREAM_ON { partition_scheme_name | filegroup 10 | "default" } ] 11 [ WITH ( <table_option> [ ,...n ] ) ] 12 [ ; ] Let’s look at line 2 of the above snippet: This line uses rules 3 and 5 from the list. So you know that you can create table which has specified one of the following. just name – table will be created in default user schema schema name and table name – table will be created in specified schema database name, schema name and table name – table will be created in specified database, in specified schema database name, .., table name – table will be created in specified database, in default schema of the user. Note that this single line of the notation describes each of the naming schemes in deterministic way. The ‘optionality’ of the schema_name element is nested within database_name.. section. You can use either database_name and optional schema name, or just schema name – this is specified by the pipe character ‘|’. The error that user gets with execution of the first script fragment in this post is as follows: Msg 156, Level 15, State 1, Line 2 Incorrect syntax near the keyword 'DEFAULT'. Ok, let’s have a look how to find out the correct syntax. Line number 3 of the BNF fragment above contains reference to <column_definition>. Since column_definition is in angle brackets, we know that this is a reference to notion described further in the code. And indeed, the very next fragment of BNF contains syntax of the column definition. 1 <column_definition> ::= 2 column_name <data_type> 3 [ FILESTREAM ] 4 [ COLLATE collation_name ] 5 [ NULL | NOT NULL ] 6 [ 7 [ CONSTRAINT constraint_name ] DEFAULT constant_expression ] 8 | [ IDENTITY [ ( seed ,increment ) ] [ NOT FOR REPLICATION ] 9 ] 10 [ ROWGUIDCOL ] [ <column_constraint> [ ...n ] ] 11 [ SPARSE ] Look at line 7 in the above fragment. It says, that the column can have a DEFAULT constraint which, if you want to name it, has to be prepended with [CONSTRAINT constraint_name] sequence. The name of the constraint is optional, but I strongly recommend you to make the effort of coming up with some meaningful name yourself. So the correct syntax of the CREATE TABLE statement from the beginning of the article is like this: 1 CREATE TABLE dbo.Employees 2 (guid uniqueidentifier CONSTRAINT Guid_Default DEFAULT NEWSEQUENTIALID() ROWGUIDCOL, 3 Employee_Name varchar(60) 4 CONSTRAINT Guid_PK PRIMARY KEY (guid) ); That is practically everything you should know about BNF. I encourage you to study the syntax definitions for various statements and commands in Books Online, you can find really interesting things hidden there. Technorati Tags: SQL Server,t-sql,BNF,syntax (1) No, my answer usually is a question – ‘What error message? What does it say?’. You’d be surprised to know how many people think I can go through time and space and look at their screen at the moment they received the error.

Read the article
Now Available – Windows Azure SDK 1.6

- by Shaun

Microsoft has just announced the Windows Azure SDK 1.6 and the Windows Azure Tools for Visual Studio 1.6. Now people can download the latest product through the WebPI. After you downloaded and installed the SDK you will find that The SDK 1.6 can be stayed side by side with the SDK 1.5, which means you can still using the 1.5 assemblies. But the Visual Studio Tools would be upgraded to 1.6. Different from the previous SDK, in this version it includes 4 components: Windows Azure Authoring Tools, Windows Azure Emulators, Windows Azure Libraries for .NET 1.6 and the Windows Azure Tools for Microsoft Visual Studio 2010. There are some significant upgrades in this version, which are Publishing Enhancement: More easily connect to the Windows Azure when publish your application by retrieving a publish setting file. It will let you configure some settings of the deployment, without getting back to the developer portal. Multi-profiles: The publish settings, cloud configuration files, etc. will be stored in one or more MSBuild files. It will be much easier to switch the settings between vary build environments. MSBuild Command-line Build Support. In-Place Upgrade Support. Publishing Enhancement So let’s have a look about the new features of the publishing. Just create a new Windows Azure project in Visual Studio 2010 with a MVC 3 Web Role, and right-click the Windows Azure project node in the solution explorer, then select Publish, we will find the new publish dialog. In this version the first thing we need to do is to connect to our Windows Azure subscription. Click the “Sign in to download credentials” link, we will be navigated to the login page to provide the Live ID. The Windows Azure Tool will generate a certificate file and uploaded to the subscriptions those belong to us. Then we will download a PUBLISHSETTINGS file, which contains the credentials and subscriptions information. The Visual Studio Tool will generate a certificate and deployed to the subscriptions you have as the Management Certificate. The VS Tool will use this certificate to connect to the subscription in the next step. In the next step, I would back to the Visual Studio (the publish dialog should be stilling opened) and click the Import button, select the PUBLISHSETTINGS file I had just downloaded. Then all my subscriptions will be shown in the dropdown list. Select a subscription that I want the application to be published and press the Next button, then we can select the hosted service, environment, build configuration and service configuration shown in the dialog. In this version we can create a new hosted service directly here rather than go back to the developer portal. Just select the <Create New …> item in the hosted service. What we need to do is to provide the hosted service name and the location. Once clicked the OK, after several seconds the hosted service will be established. If we went to the developer portal we will find the new hosted service in my subscription. a) Currently we cannot select the Affinity Group when create a new hosted service through the Visual Studio Publish dialog. b) Although we can specify the hosted service name and DNS prefixing through the developer portal, we cannot do so from the VS Tool, which means the DNS prefixing would be the same as what we specified for the hosted service name. For example, we specified our hosted service name as “Sdk16Demo”, so the public URL would be http://sdk16demo.cloudapp.net/. After created a new hosted service we can select the cloud environment (production or staging), the build configuration (release or debug), and the service configuration (cloud or local). And we can set the Remote Desktop by check the related checkbox as well. One thing should be note is that, in this version when we set the Remote Desktop settings we don’t need to specify a certificate by default. This is because the Visual Studio will generate a new certificate for us by default. But we can still specify an existing certificate for RDC, by clicking the “More Options” button. Visual Studio Tool will create another certificate for the Remote Desktop connection. It will NOT use the certificate that managing the subscription. We also can select the “Advanced Settings” page to specify the deployment label, storage account, IntelliTrace and .NET profiling information, etc.. Press Next button, the dialog will display all settings I had just specified and it will save them as a new profile. The last step is to click the Publish button. Since we enabled the Remote Desktop feature, the first step of publishing was uploading the certificate. And then it will verify the storage account we specified and upload the package, then finally created the website in Windows Azure. Multi-Profiles After published, if we back to the Visual Studio we can find a AZUREPUBXML file under the Profiles folder in the Azure project. It includes all settings we specified before. If we publish this project again, we can just use the current settings (hosted service, environment, RDC, etc.) from this profile without input them again. And this is very useful when we have more than one deployment settings. For example it would be able to have one AZUREPUBXML profile for deploying to testing environment (debug building, less roles with RDC and IntelliTrace) and one for production (release building, more roles but without IntelliTrace). In-Place Upgrade Support Let’s change some codes in the MVC pages and click the Publish menu from the azure project node. No need to specify any settings, here we can use the pervious settings by loading the azure profile file (AZUREPUBXML). After clicked the Publish button the VS Tool brought a dialog to us to indicate that there’s a deployment available in the hosted service environment, and prompt to REPLACE it or not. Notice that in this version, the dialog tool said “replace” rather than “delete”, which means by default the VS Tool will use In-Place Upgrade when we deploy to a hosted service that has a deployment already exist. After click Yes the VS Tool will upload the package and perform the In-Place Upgrade. If we back to the developer portal we can find that the status of the hosted service was turned to “Updating…”. But in the previous SDK, it will try to delete the whole deployment and publish a new one. Summary When the Microsoft announced the features that allows the changing VM size via In-Place Upgrade, they also mentioned that in the next few versions the user experience of publishing the azure application would be improved. The target was trying to accomplish the whole publish experience in Visual Studio, which means no need to touch developer portal any more. In the SDK 1.6 we can see from the new publish dialog, as a developer we can do the whole process, includes creating hosted service, specifying the environment, configuration, remote desktop, etc. values without going back the the developer portal. Hope this helps, Shaun All documents and related graphics, codes are provided "AS IS" without warranty of any kind. Copyright © Shaun Ziyan Xu. This work is licensed under the Creative Commons License.

Read the article
Windows Azure VMs - New "Stopped" VM Options Provide Cost-effective Flexibility for On-Demand Workloads

- by KeithMayer

Originally posted on: http://geekswithblogs.net/KeithMayer/archive/2013/06/22/windows-azure-vms---new-stopped-vm-options-provide-cost-effective.aspxDidn’t make it to TechEd this year? Don’t worry! This month, we’ll be releasing a new article series that highlights the Best of TechEd announcements and technical information for IT Pros. Today’s article focuses on a new, much-heralded enhancement to Windows Azure Infrastructure Services to make it more cost-effective for spinning VMs up and down on-demand on the Windows Azure cloud platform. NEW! VMs that are shutdown from the Windows Azure Management Portal will no longer continue to accumulate compute charges while stopped! Previous to this enhancement being available, the Azure platform maintained fabric resource reservations for VMs, even in a shutdown state, to ensure consistent resource availability when starting those VMs in the future. And, this meant that VMs had to be exported and completely deprovisioned when not in use to avoid compute charges. In this article, I'll provide more details on the scenarios that this enhancement best fits, and I'll also review the new options and considerations that we now have for performing safe shutdowns of Windows Azure VMs. Which scenarios does the new enhancement best fit? Being able to easily shutdown VMs from the Windows Azure Management Portal without continued compute charges is a great enhancement for certain cloud use cases, such as: On-demand dev/test/lab environments - Freely start and stop lab VMs so that they are only accumulating compute charges when being actively used. "Bursting" load-balanced web applications - Provision a number of load-balanced VMs, but keep the minimum number of VMs running to support "normal" loads. Easily start-up the remaining VMs only when needed to support peak loads. Disaster Recovery - Start-up "cold" VMs when needed to recover from disaster scenarios. BUT ... there is a consideration to keep in mind when using the Windows Azure Management Portal to shutdown VMs: although performing a VM shutdown via the Windows Azure Management Portal causes that VM to no longer accumulate compute charges, it also deallocates the VM from fabric resources to which it was previously assigned. These fabric resources include compute resources such as virtual CPU cores and memory, as well as network resources, such as IP addresses. This means that when the VM is later started after being shutdown from the portal, the VM could be assigned a different IP address or placed on a different compute node within the fabric. In some cases, you may want to shutdown VMs using the old approach, where fabric resource assignments are maintained while the VM is in a shutdown state. Specifically, you may wish to do this when temporarily shutting down or restarting a "7x24" VM as part of a maintenance activity. Good news - you can still revert back to the old VM shutdown behavior when necessary by using the alternate VM shutdown approaches listed below. Let's walk through each approach for performing a VM Shutdown action on Windows Azure so that we can understand the benefits and considerations of each... How many ways can I shutdown a VM? In Windows Azure Infrastructure Services, there's three general ways that can be used to safely shutdown VMs: Shutdown VM via Windows Azure Management Portal Shutdown Guest Operating System inside the VM Stop VM via Windows PowerShell using Windows Azure PowerShell Module Although each of these options performs a safe shutdown of the guest operation system and the VM itself, each option handles the VM shutdown end state differently. Shutdown VM via Windows Azure Management Portal When clicking the Shutdown button at the bottom of the Virtual Machines page in the Windows Azure Management Portal, the VM is safely shutdown and "deallocated" from fabric resources. Shutdown button on Virtual Machines page in Windows Azure Management Portal When the shutdown process completes, the VM will be shown on the Virtual Machines page with a "Stopped ( Deallocated )" status as shown in the figure below. Virtual Machine in a "Stopped (Deallocated)" Status "Deallocated" means that the VM configuration is no longer being actively associated with fabric resources, such as virtual CPUs, memory and networks. In this state, the VM will not continue to allocate compute charges, but since fabric resources are deallocated, the VM could receive a different internal IP address ( called "Dynamic IPs" or "DIPs" in Windows Azure ) the next time it is started. TIP: If you are leveraging this shutdown option and consistency of DIPs is important to applications running inside your VMs, you should consider using virtual networks with your VMs. Virtual networks permit you to assign a specific IP Address Space for use with VMs that are assigned to that virtual network. As long as you start VMs in the same order in which they were originally provisioned, each VM should be reassigned the same DIP that it was previously using. What about consistency of External IP Addresses? Great question! External IP addresses ( called "Virtual IPs" or "VIPs" in Windows Azure ) are associated with the cloud service in which one or more Windows Azure VMs are running. As long as at least 1 VM inside a cloud service remains in a "Running" state, the VIP assigned to a cloud service will be preserved. If all VMs inside a cloud service are in a "Stopped ( Deallocated )" status, then the cloud service may receive a different VIP when VMs are next restarted. TIP: If consistency of VIPs is important for the cloud services in which you are running VMs, consider keeping one VM inside each cloud service in the alternate VM shutdown state listed below to preserve the VIP associated with the cloud service. Shutdown Guest Operating System inside the VM When performing a Guest OS shutdown or restart ( ie., a shutdown or restart operation initiated from the Guest OS running inside the VM ), the VM configuration will not be deallocated from fabric resources. In the figure below, the VM has been shutdown from within the Guest OS and is shown with a "Stopped" VM status rather than the "Stopped ( Deallocated )" VM status that was shown in the previous figure. Note that it may require a few minutes for the Windows Azure Management Portal to reflect that the VM is in a "Stopped" state in this scenario, because we are performing an OS shutdown inside the VM rather than through an Azure management endpoint. Virtual Machine in a "Stopped" Status VMs shown in a "Stopped" status will continue to accumulate compute charges, because fabric resources are still being reserved for these VMs. However, this also means that DIPs and VIPs are preserved for VMs in this state, so you don't have to worry about VMs and cloud services getting different IP addresses when they are started in the future. Stop VM via Windows PowerShell In the latest version of the Windows Azure PowerShell Module, a new -StayProvisioned parameter has been added to the Stop-AzureVM cmdlet. This new parameter provides the flexibility to choose the VM configuration end result when stopping VMs using PowerShell: When running the Stop-AzureVM cmdlet without the -StayProvisioned parameter specified, the VM will be safely stopped and deallocated; that is, the VM will be left in a "Stopped ( Deallocated )" status just like the end result when a VM Shutdown operation is performed via the Windows Azure Management Portal. When running the Stop-AzureVM cmdlet with the -StayProvisioned parameter specified, the VM will be safely stopped but fabric resource reservations will be preserved; that is the VM will be left in a "Stopped" status just like the end result when performing a Guest OS shutdown operation. So, with PowerShell, you can choose how Windows Azure should handle VM configuration and fabric resource reservations when stopping VMs on a case-by-case basis. TIP: It's important to note that the -StayProvisioned parameter is only available in the latest version of the Windows Azure PowerShell Module. So, if you've previously downloaded this module, be sure to download and install the latest version to get this new functionality. Want to Learn More about Windows Azure Infrastructure Services? To learn more about Windows Azure Infrastructure Services, be sure to check-out these additional FREE resources: Become our next "Early Expert"! Complete the Early Experts "Cloud Quest" and build a multi-VM lab network in the cloud for FREE! Build some cool scenarios! Check out our list of over 20+ Step-by-Step Lab Guides based on key scenarios that IT Pros are implementing on Windows Azure Infrastructure Services TODAY! Looking forward to seeing you in the Cloud! - Keith Build Your Lab! Download Windows Server 2012 Don’t Have a Lab? Build Your Lab in the Cloud with Windows Azure Virtual Machines Want to Get Certified? Join our Windows Server 2012 "Early Experts" Study Group

Read the article
Best of Breed vs. Suite – Oracle’s SaaS Delivers Both

- by yaldahhakim

Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin;} The debate of which is better: “best of breed” business applications vs. an integrated suite is certainly not a new conversation. This has been argued between IT vendors and CIOs for years. It’s also important to clarify that “best of breed” does not necessarily translate into being the richest functionality; rather it’s often about just having the best fit solution to solve a specific business problem or need. So what does cloud have to do with the niche vs. suite debate? Consuming business applications in a cloud or SaaS deployment model can change the best of breed vs. suite discussion - if the cloud is done right. It’s having your cake and eating it too only better: you don’t have to gather all the ingredients or wait to bake your cake, and you can adjust how big of slice you take. Before you eat, it’s worth pausing to recall much of what we learned about IT over the last decade. These basic IT principles still hold true even though the financial model has changed from buying to renting. In other words, what’s under the technology hood still matters. Architecture and development methodologies like building an application based on open standards so it works with other systems - is still important. Data and information silos, complex integrations, and proprietary technologies that lock you in, are still bad. While some may argue that IT no longer matters with cloud, the opposite is actually true. If anything cloud can help return IT back to its rightful place as key strategic asset vs. a liability on the balance sheet. The “I” in CIO was never meant to stand for “integration” yet it’s amazing how much time and money is poured into these types of initiatives for most organizations each year. Rather the “I” needs to stand for “innovation”. This is where Oracle SaaS can uniquely help. Oracle’s application strategy has not really changed over the years. It’s always been about bringing the best and richest functionality across the enterprise to our customers while leveraging a common, standards-based, and enterprise-grade platform. So not jut best fit, but the best capabilities based on the input of thousands of enterprise customers across the globe. Oracle invests billions in R&D every year to add new capabilities to the broadest cloud portfolio in the industry, spanning across functional pillars like CRM, HCM, ERP, etc. And where it makes sense, Oracle combines key strategic acquisitions to complement organic functionality. The result is best of breed delivered in a suite. Again this is not something new. The game changer now with cloud is that it impacts HOW Oracle customers adopt the richest, most modern applications across the business – and continue on getting it. Consuming oracle applications in the cloud means you can adopt new capabilities and updates very quickly and easily. There’s no hardware to buy or software to manage. Oracle does it for you. Low upfront costs and an OpEx financial model is the easy part. Oracle Cloud Applications take it a big step further. For organizations that demand having the latest and richest functionality and accelerating the time to value from their IT investment, Oracle Cloud is the right path. It’s about holistically changing the “hows” and the “whys” of the organization by leveraging transformational innovations like social, mobile, and big data in a consistent and more powerful way. Not just about sales force automation or talent management. These technologies should impact all parts of the company and Oracle Cloud is the enterprise-grade delivery vehicle. Oracle SaaS helps break down barriers of adoption and is eases the headache of upgrades, investing in new supporting hardware, or adding internal expertise to manage it all. With Oracle Cloud, customers can get best of breed capabilities in either a full suite model or a la carte. And because it’s entirely built on open standards, it’s built to co-exist with existing IT investments. Updates can be automatic or delayed based on a customer’s requirements. And it’s complete – a full suite of cross pillar functionality. Even better, if you don’t like it, need more or less, just turn the dial up or down. Just like your utility bill, you pay for what you use, and can consume more or less power whenever you need it. Lower cost, lower investment risk, without compromising on functionality, security, or performance. Technology still matters in the cloud. So our cloud customers also like that when they adopt our cloud applications, they also get the best underlying technology, from the middleware and database platform down to infrastructure and Oracle’s engineered systems. Therefore it’s not just the greatest and latest in application functionality, but everything underneath that makes it work is also the latest and greatest. The best of breed technology stack powering best of breed business applications, and all delivered in a subscription based model. The best of both worlds. Yep, that’s the idea.

Read the article
Exploring the Excel Services REST API

- by jamiet

Over the last few years Analysis Services guru Chris Webb and I have been on something of a crusade to enable better access to data that is locked up in countless Excel workbooks that litter the hard drives of enterprise PCs. The most prominent manifestation of that crusade up to now has been a forum thread that Chris began on Microsoft Answers entitled Excel Web App API? Chris began that thread with: I was wondering whether there was an API for the Excel Web App? Specifically, I was wondering if it was possible (or if it will be possible in the future) to expose data in a spreadsheet in the Excel Web App as an OData feed, in the way that it is possible with Excel Services? Up to recently the last 10 words of that paragraph "in the way that it is possible with Excel Services" had completely washed over me however a comment on my recent blog post Thoughts on ExcelMashup.com (and a rant) by Josh Booker in which Josh said: Excel Services is a service application built for sharepoint 2010 which exposes a REST API for excel documents. We're looking forward to pros like you giving it a try now that Office365 makes sharepoint more easily accessible. Can't wait for your future blog about using REST API to load data from Excel on Offce 365 in SSIS. made me think that perhaps the Excel Services REST API is something I should be looking into and indeed that is what I have been doing over the past few days. And you know what? I'm rather impressed with some of what Excel Services' REST API has to offer. Unfortunately Excel Services' REST API also has one debilitating aspect that renders this blog post much less useful than it otherwise would be; namely that it is not publicly available from the Excel Web App on SkyDrive. Therefore all I can do in this blog post is show you screenshots of what the REST API provides in Sharepoint rather than linking you directly to those REST resources; that's a great shame because one of the benefits of a REST API is that it is easily and ubiquitously demonstrable from a web browser. Instead I am hosting a workbook on Sharepoint in Office 365 because that does include Excel Services' REST API but, again, all I can do is show you screenshots. N.B. If anyone out there knows how to make Office-365-hosted spreadsheets publicly-accessible (i.e. without requiring a username/password) please do let me know (because knowing which forum on which to ask the question is an exercise in futility). In order to demonstrate Excel Services' REST API I needed some decent data and for that I used the World Tourism Organization Statistics Database and Yearbook - United Nations World Tourism Organization dataset hosted on Azure Datamarket (its free, by the way); this dataset "provides comprehensive information on international tourism worldwide and offers a selection of the latest available statistics on international tourist arrivals, tourism receipts and expenditure" and you can explore the data for yourself here. If you want to play along at home by viewing the data as it exists in Excel then it can be viewed here. Let's dive in. The root of Excel Services' REST API is the model resource which resides at: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model Note that this is true for every workbook hosted in a Sharepoint document library - each Excel workbook is a RESTful resource. (Update: Mark Stacey on Twitter tells me that "It's turned off by default in onpremise Sharepoint (1 tickbox to turn on though)". Thanks Mark!) The data is provided as an ATOM feed but I have Firefox's feed reading ability turned on so you don't see the underlying XML goo. As you can see there are four top level resources, Ranges, Charts, Tables and PivotTables; exploring one of those resources is where things get interesting. Let's take a look at the Tables Resource: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Tables Our workbook contains only one table, called ‘Table1’ (to reiterate, you can explore this table yourself here). Viewing that table via the REST API is pretty easy, we simply append the name of the table onto our previous URI: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Tables('Table1') As you can see, that quite simply gives us a representation of the data in that table. What you cannot see from this screenshot is that this is pure HTML that is being served up; that is all well and good but actually we can do more interesting things. If we specify that the data should be returned not as HTML but as: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Tables('Table1')?$format=image then that data comes back as a pure image and can be used in any web page where you would ordinarily use images. This is the thing that I really like about Excel Services’ REST API – we can embed an image in any web page but instead of being a copy of the data, that image is actually live – if the underlying data in the workbook were to change then hitting refresh will show a new image. Pretty cool, no? The same is true of any Charts or Pivot Tables in your workbook - those can be embedded as images too and if the underlying data changes, boom, the image in your web page changes too. There is a lot of data in the workbook so the image returned by that previous URI is too large to show here so instead let’s take a look at a different resource, this time a range: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Ranges('Data!A1|C15') That URI returns cells A1 to C15 from a worksheet called “Data”: And if we ask for that as an image again: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Ranges('Data!A1|C15')?$format=image Were this image resource not behind a username/password then this would be a live image of the data in the workbook as opposed to one that I had to copy and upload elsewhere. Nonetheless I hope this little wrinkle doesn't detract from the inate value of what I am trying to articulate here; that an existing image in a web page can be changed on-the-fly simply by inserting some data into an Excel workbook. I for one think that that is very cool indeed! I think that's enough in the way of demo for now as this shows what is possible using Excel Services' REST API. Of course, not all features work quite how I would like and here is a bulleted list of some of my more negative feedback: The URIs are pig-ugly. Are "_vti_bin" & "ExcelRest.aspx" really necessary as part of the URI? Would this not be better: http://server/Documents/TourismExpenditureInMillionsOfUSD.xlsx/Model/Tables(‘Table1’) That URI provides the necessary addressability and is a lot easier to remember. Discoverability of these resources is not easy, we essentially have to handcrank a URI ourselves. Take the example of embedding a chart into a blog post - would it not be better if I could browse first through the document library to an Excel workbook and THEN through the workbook to the chart/range/table that I am interested in? Call it a wizard if you like. That would be really cool and would, I am sure, promote this feature and cut down on the copy-and-paste disease that the REST API is meant to alleviate. The resources that I demonstrated can be returned as feeds as well as images or HTML simply by changing the format parameter to ?$format=atom however for some inexplicable reason they don't return OData and no-one on the Excel Services team can tell me why (believe me, I have asked). $format is an OData parameter however other useful parameters such as $top and $filter are not supported. It would be nice if they were. Although I haven't demonstrated it here Excel Services' REST API does provide a makeshift way of altering the data by changing the value of specific cells however what it does not allow you to do is add new data into the workbook. Google Docs allows this and was one of the motivating factors for Chris Webb's forum post that I linked to above. None of this works for Excel workbooks hosted on SkyDrive This blog post is as long as it needs to be for a short introduction so I'll stop now. If you want to know more than I recommend checking out a few links: Excel Services REST API documentation on MSDNSo what does REST on Excel Services look like??? by Shahar PrishExcel Services in SharePoint 2010 REST API Syntax by Christian Stich. Any thoughts? Let's hear them in the comments section below! @Jamiet

Read the article
SQL Server Split() Function

- by HighAltitudeCoder

Title goes here Ever wanted a dbo.Split() function, but not had the time to debug it completely? Let me guess - you are probably working on a stored procedure with 50 or more parameters; two or three of them are parameters of differing types, while the other 47 or so all of the same type (id1, id2, id3, id4, id5...). Worse, you've found several other similar stored procedures with the ONLY DIFFERENCE being the number of like parameters taped to the end of the parameter list. If this is the situation you find yourself in now, you may be wondering, "why am I working with three different copies of what is basically the same stored procedure, and why am I having to maintain changes in three different places? Can't I have one stored procedure that accomplishes the job of all three? My answer to you: YES! Here is the Split() function I've created. /****************************************************************************** Split.sql ******************************************************************************/ /****************************************************************************** Split a delimited string into sub-components and return them as a table. Parameter 1: Input string which is to be split into parts. Parameter 2: Delimiter which determines the split points in input string. Works with space or spaces as delimiter. Split() is apostrophe-safe. SYNTAX: SELECT * FROM Split('Dvorak,Debussy,Chopin,Holst', ',') SELECT * FROM Split('Denver|Seattle|San Diego|New York', '|') SELECT * FROM Split('Denver is the super-awesomest city of them all.', ' ') ******************************************************************************/ USE AdventureWorks GO IF EXISTS (SELECT * FROM sysobjects WHERE xtype = 'TF' AND name = 'Split' ) BEGIN DROP FUNCTION Split END GO CREATE FUNCTION Split ( @InputString VARCHAR(8000), @Delimiter VARCHAR(50) ) RETURNS @Items TABLE ( Item VARCHAR(8000) ) AS BEGIN IF @Delimiter = ' ' BEGIN SET @Delimiter = ',' SET @InputString = REPLACE(@InputString, ' ', @Delimiter) END IF (@Delimiter IS NULL OR @Delimiter = '') SET @Delimiter = ',' --INSERT INTO @Items VALUES (@Delimiter) -- Diagnostic --INSERT INTO @Items VALUES (@InputString) -- Diagnostic DECLARE @Item VARCHAR(8000) DECLARE @ItemList VARCHAR(8000) DECLARE @DelimIndex INT SET @ItemList = @InputString SET @DelimIndex = CHARINDEX(@Delimiter, @ItemList, 0) WHILE (@DelimIndex != 0) BEGIN SET @Item = SUBSTRING(@ItemList, 0, @DelimIndex) INSERT INTO @Items VALUES (@Item) -- Set @ItemList = @ItemList minus one less item SET @ItemList = SUBSTRING(@ItemList, @DelimIndex+1, LEN(@ItemList)-@DelimIndex) SET @DelimIndex = CHARINDEX(@Delimiter, @ItemList, 0) END -- End WHILE IF @Item IS NOT NULL -- At least one delimiter was encountered in @InputString BEGIN SET @Item = @ItemList INSERT INTO @Items VALUES (@Item) END -- No delimiters were encountered in @InputString, so just return @InputString ELSE INSERT INTO @Items VALUES (@InputString) RETURN END -- End Function GO ---- Set Permissions --GRANT SELECT ON Split TO UserRole1 --GRANT SELECT ON Split TO UserRole2 --GO The syntax is basically as follows: SELECT <fields> FROM Table 1 JOIN Table 2 ON ... JOIN Table 3 ON ... WHERE LOGICAL CONDITION A AND LOGICAL CONDITION B AND LOGICAL CONDITION C AND TABLE2.Id IN (SELECT * FROM Split(@IdList, ',')) @IdList is a parameter passed into the stored procedure, and the comma (',') is the delimiter you have chosen to split the parameter list on. You can also use it like this: SELECT <fields> FROM Table 1 JOIN Table 2 ON ... JOIN Table 3 ON ... WHERE LOGICAL CONDITION A AND LOGICAL CONDITION B AND LOGICAL CONDITION C HAVING COUNT(SELECT * FROM Split(@IdList, ',') Similarly, it can be used in other aggregate functions at run-time: SELECT MIN(SELECT * FROM Split(@IdList, ','), <fields> FROM Table 1 JOIN Table 2 ON ... JOIN Table 3 ON ... WHERE LOGICAL CONDITION A AND LOGICAL CONDITION B AND LOGICAL CONDITION C GROUP BY <fields> Now that I've (hopefully effectively) explained the benefits to using this function and implementing it in one or more of your database objects, let me warn you of a caveat that you are likely to encounter. You may have a team member who waits until the right moment to ask you a pointed question: "Doesn't this function just do the same thing as using the IN function? Why didn't you just use that instead? In other words, why bother with this function?" What's happening is, one or more team members has failed to understand the reason for implementing this kind of function in the first place. (Note: this is THE MOST IMPORTANT ASPECT OF THIS POST). Allow me to outline a few pros to implementing this function, so you may effectively parry this question. Touche. 1) Code consolidation. You don't have to maintain what is basically the same code and logic, but with varying numbers of the same parameter in several SQL objects. I'm not going to go into the cons related to using this function, because the afore mentioned team member is probably more than adept at pointing these out. Remember, the real positive contribution is ou are decreasing the liklihood that your team fails to update all (x) duplicate copies of what are basically the same stored procedure, and so on... This is the classic downside to duplicate code. It is a virus, and you should kill it. You might be better off rejecting your team member's question, and responding with your own: "Would you rather maintain the same logic in multiple different stored procedures, and hope that the team doesn't forget to always update all of them at the same time?". In his head, he might be thinking "yes, I would like to maintain several different copies of the same stored procedure", although you probably will not get such a direct response. 2) Added flexibility - you can use the Split function elsewhere, and for splitting your data in different ways. Plus, you can use any kind of delimiter you wish. How can you know today the ways in which you might want to examine your data tomorrow? Segue to my next point. 3) Because the function takes a delimiter parameter, you can split the data in any number of ways. This greatly increases the utility of such a function and enables your team to work with the data in a variety of different ways in the future. You can split on a single char, symbol, word, or group of words. You can split on spaces. (The list goes on... test it out). Finally, you can dynamically define the behavior of a stored procedure (or other SQL object) at run time, through the use of this function. Rather than have several objects that accomplish almost the same thing, why not have only one instead?

Read the article
Fun with Aggregates

- by Paul White

There are interesting things to be learned from even the simplest queries. For example, imagine you are given the task of writing a query to list AdventureWorks product names where the product has at least one entry in the transaction history table, but fewer than ten. One possible query to meet that specification is: SELECT p.Name FROM Production.Product AS p JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID GROUP BY p.ProductID, p.Name HAVING COUNT_BIG(*) < 10; That query correctly returns 23 rows (execution plan and data sample shown below): The execution plan looks a bit different from the written form of the query: the base tables are accessed in reverse order, and the aggregation is performed before the join. The general idea is to read all rows from the history table, compute the count of rows grouped by ProductID, merge join the results to the Product table on ProductID, and finally filter to only return rows where the count is less than ten. This ‘fully-optimized’ plan has an estimated cost of around 0.33 units. The reason for the quote marks there is that this plan is not quite as optimal as it could be – surely it would make sense to push the Filter down past the join too? To answer that, let’s look at some other ways to formulate this query. This being SQL, there are any number of ways to write logically-equivalent query specifications, so we’ll just look at a couple of interesting ones. The first query is an attempt to reverse-engineer T-SQL from the optimized query plan shown above. It joins the result of pre-aggregating the history table to the Product table before filtering: SELECT p.Name FROM ( SELECT th.ProductID, cnt = COUNT_BIG(*) FROM Production.TransactionHistory AS th GROUP BY th.ProductID ) AS q1 JOIN Production.Product AS p ON p.ProductID = q1.ProductID WHERE q1.cnt < 10; Perhaps a little surprisingly, we get a slightly different execution plan: The results are the same (23 rows) but this time the Filter is pushed below the join! The optimizer chooses nested loops for the join, because the cardinality estimate for rows passing the Filter is a bit low (estimate 1 versus 23 actual), though you can force a merge join with a hint and the Filter still appears below the join. In yet another variation, the < 10 predicate can be ‘manually pushed’ by specifying it in a HAVING clause in the “q1” sub-query instead of in the WHERE clause as written above. The reason this predicate can be pushed past the join in this query form, but not in the original formulation is simply an optimizer limitation – it does make efforts (primarily during the simplification phase) to encourage logically-equivalent query specifications to produce the same execution plan, but the implementation is not completely comprehensive. Moving on to a second example, the following query specification results from phrasing the requirement as “list the products where there exists fewer than ten correlated rows in the history table”: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) < 10 ); Unfortunately, this query produces an incorrect result (86 rows): The problem is that it lists products with no history rows, though the reasons are interesting. The COUNT_BIG(*) in the EXISTS clause is a scalar aggregate (meaning there is no GROUP BY clause) and scalar aggregates always produce a value, even when the input is an empty set. In the case of the COUNT aggregate, the result of aggregating the empty set is zero (the other standard aggregates produce a NULL). To make the point really clear, let’s look at product 709, which happens to be one for which no history rows exist: -- Scalar aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709; -- Vector aggregate SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = 709 GROUP BY th.ProductID; The estimated execution plans for these two statements are almost identical: You might expect the Stream Aggregate to have a Group By for the second statement, but this is not the case. The query includes an equality comparison to a constant value (709), so all qualified rows are guaranteed to have the same value for ProductID and the Group By is optimized away. In fact there are some minor differences between the two plans (the first is auto-parameterized and qualifies for trivial plan, whereas the second is not auto-parameterized and requires cost-based optimization), but there is nothing to indicate that one is a scalar aggregate and the other is a vector aggregate. This is something I would like to see exposed in show plan so I suggested it on Connect. Anyway, the results of running the two queries show the difference at runtime: The scalar aggregate (no GROUP BY) returns a result of zero, whereas the vector aggregate (with a GROUP BY clause) returns nothing at all. Returning to our EXISTS query, we could ‘fix’ it by changing the HAVING clause to reject rows where the scalar aggregate returns zero: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID HAVING COUNT_BIG(*) BETWEEN 1 AND 9 ); The query now returns the correct 23 rows: Unfortunately, the execution plan is less efficient now – it has an estimated cost of 0.78 compared to 0.33 for the earlier plans. Let’s try adding a redundant GROUP BY instead of changing the HAVING clause: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY th.ProductID HAVING COUNT_BIG(*) < 10 ); Not only do we now get correct results (23 rows), this is the execution plan: I like to compare that plan to quantum physics: if you don’t find it shocking, you haven’t understood it properly :) The simple addition of a redundant GROUP BY has resulted in the EXISTS form of the query being transformed into exactly the same optimal plan we found earlier. What’s more, in SQL Server 2008 and later, we can replace the odd-looking GROUP BY with an explicit GROUP BY on the empty set: SELECT p.Name FROM Production.Product AS p WHERE EXISTS ( SELECT * FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ); I offer that as an alternative because some people find it more intuitive (and it perhaps has more geek value too). Whichever way you prefer, it’s rather satisfying to note that the result of the sub-query does not exist for a particular correlated value where a vector aggregate is used (the scalar COUNT aggregate always returns a value, even if zero, so it always ‘EXISTS’ regardless which ProductID is logically being evaluated). The following query forms also produce the optimal plan and correct results, so long as a vector aggregate is used (you can probably find more equivalent query forms): WHERE Clause SELECT p.Name FROM Production.Product AS p WHERE ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) < 10; APPLY SELECT p.Name FROM Production.Product AS p CROSS APPLY ( SELECT NULL FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () HAVING COUNT_BIG(*) < 10 ) AS ca (dummy); FROM Clause SELECT q1.Name FROM ( SELECT p.Name, cnt = ( SELECT COUNT_BIG(*) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID GROUP BY () ) FROM Production.Product AS p ) AS q1 WHERE q1.cnt < 10; This last example uses SUM(1) instead of COUNT and does not require a vector aggregate…you should be able to work out why :) SELECT q.Name FROM ( SELECT p.Name, cnt = ( SELECT SUM(1) FROM Production.TransactionHistory AS th WHERE th.ProductID = p.ProductID ) FROM Production.Product AS p ) AS q WHERE q.cnt < 10; The semantics of SQL aggregates are rather odd in places. It definitely pays to get to know the rules, and to be careful to check whether your queries are using scalar or vector aggregates. As we have seen, query plans do not show in which ‘mode’ an aggregate is running and getting it wrong can cause poor performance, wrong results, or both. © 2012 Paul White Twitter: @SQL_Kiwi email: [email protected]

Read the article
Exploring the Excel Services REST API

- by jamiet

Over the last few years Analysis Services guru Chris Webb and I have been on something of a crusade to enable better access to data that is locked up in countless Excel workbooks that litter the hard drives of enterprise PCs. The most prominent manifestation of that crusade up to now has been a forum thread that Chris began on Microsoft Answers entitled Excel Web App API? Chris began that thread with: I was wondering whether there was an API for the Excel Web App? Specifically, I was wondering if it was possible (or if it will be possible in the future) to expose data in a spreadsheet in the Excel Web App as an OData feed, in the way that it is possible with Excel Services? Up to recently the last 10 words of that paragraph "in the way that it is possible with Excel Services" had completely washed over me however a comment on my recent blog post Thoughts on ExcelMashup.com (and a rant) by Josh Booker in which Josh said: Excel Services is a service application built for sharepoint 2010 which exposes a REST API for excel documents. We're looking forward to pros like you giving it a try now that Office365 makes sharepoint more easily accessible. Can't wait for your future blog about using REST API to load data from Excel on Offce 365 in SSIS. made me think that perhaps the Excel Services REST API is something I should be looking into and indeed that is what I have been doing over the past few days. And you know what? I'm rather impressed with some of what Excel Services' REST API has to offer. Unfortunately Excel Services' REST API also has one debilitating aspect that renders this blog post much less useful than it otherwise would be; namely that it is not publicly available from the Excel Web App on SkyDrive. Therefore all I can do in this blog post is show you screenshots of what the REST API provides in Sharepoint rather than linking you directly to those REST resources; that's a great shame because one of the benefits of a REST API is that it is easily and ubiquitously demonstrable from a web browser. Instead I am hosting a workbook on Sharepoint in Office 365 because that does include Excel Services' REST API but, again, all I can do is show you screenshots. N.B. If anyone out there knows how to make Office-365-hosted spreadsheets publicly-accessible (i.e. without requiring a username/password) please do let me know (because knowing which forum on which to ask the question is an exercise in futility). In order to demonstrate Excel Services' REST API I needed some decent data and for that I used the World Tourism Organization Statistics Database and Yearbook - United Nations World Tourism Organization dataset hosted on Azure Datamarket (its free, by the way); this dataset "provides comprehensive information on international tourism worldwide and offers a selection of the latest available statistics on international tourist arrivals, tourism receipts and expenditure" and you can explore the data for yourself here. If you want to play along at home by viewing the data as it exists in Excel then it can be viewed here. Let's dive in. The root of Excel Services' REST API is the model resource which resides at: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model Note that this is true for every workbook hosted in a Sharepoint document library - each Excel workbook is a RESTful resource. (Update: Mark Stacey on Twitter tells me that "It's turned off by default in onpremise Sharepoint (1 tickbox to turn on though)". Thanks Mark!) The data is provided as an ATOM feed but I have Firefox's feed reading ability turned on so you don't see the underlying XML goo. As you can see there are four top level resources, Ranges, Charts, Tables and PivotTables; exploring one of those resources is where things get interesting. Let's take a look at the Tables Resource: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Tables Our workbook contains only one table, called ‘Table1’ (to reiterate, you can explore this table yourself here). Viewing that table via the REST API is pretty easy, we simply append the name of the table onto our previous URI: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Tables('Table1') As you can see, that quite simply gives us a representation of the data in that table. What you cannot see from this screenshot is that this is pure HTML that is being served up; that is all well and good but actually we can do more interesting things. If we specify that the data should be returned not as HTML but as: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Tables('Table1')?$format=image then that data comes back as a pure image and can be used in any web page where you would ordinarily use images. This is the thing that I really like about Excel Services’ REST API – we can embed an image in any web page but instead of being a copy of the data, that image is actually live – if the underlying data in the workbook were to change then hitting refresh will show a new image. Pretty cool, no? The same is true of any Charts or Pivot Tables in your workbook - those can be embedded as images too and if the underlying data changes, boom, the image in your web page changes too. There is a lot of data in the workbook so the image returned by that previous URI is too large to show here so instead let’s take a look at a different resource, this time a range: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Ranges('Data!A1|C15') That URI returns cells A1 to C15 from a worksheet called “Data”: And if we ask for that as an image again: http://server/_vti_bin/ExcelRest.aspx/Documents/TourismExpenditureInMillionsOfUSD.xlsx/model/Ranges('Data!A1|C15')?$format=image Were this image resource not behind a username/password then this would be a live image of the data in the workbook as opposed to one that I had to copy and upload elsewhere. Nonetheless I hope this little wrinkle doesn't detract from the inate value of what I am trying to articulate here; that an existing image in a web page can be changed on-the-fly simply by inserting some data into an Excel workbook. I for one think that that is very cool indeed! I think that's enough in the way of demo for now as this shows what is possible using Excel Services' REST API. Of course, not all features work quite how I would like and here is a bulleted list of some of my more negative feedback: The URIs are pig-ugly. Are "_vti_bin" & "ExcelRest.aspx" really necessary as part of the URI? Would this not be better: http://server/Documents/TourismExpenditureInMillionsOfUSD.xlsx/Model/Tables(‘Table1’) That URI provides the necessary addressability and is a lot easier to remember. Discoverability of these resources is not easy, we essentially have to handcrank a URI ourselves. Take the example of embedding a chart into a blog post - would it not be better if I could browse first through the document library to an Excel workbook and THEN through the workbook to the chart/range/table that I am interested in? Call it a wizard if you like. That would be really cool and would, I am sure, promote this feature and cut down on the copy-and-paste disease that the REST API is meant to alleviate. The resources that I demonstrated can be returned as feeds as well as images or HTML simply by changing the format parameter to ?$format=atom however for some inexplicable reason they don't return OData and no-one on the Excel Services team can tell me why (believe me, I have asked). $format is an OData parameter however other useful parameters such as $top and $filter are not supported. It would be nice if they were. Although I haven't demonstrated it here Excel Services' REST API does provide a makeshift way of altering the data by changing the value of specific cells however what it does not allow you to do is add new data into the workbook. Google Docs allows this and was one of the motivating factors for Chris Webb's forum post that I linked to above. None of this works for Excel workbooks hosted on SkyDrive This blog post is as long as it needs to be for a short introduction so I'll stop now. If you want to know more than I recommend checking out a few links: Excel Services REST API documentation on MSDNSo what does REST on Excel Services look like??? by Shahar PrishExcel Services in SharePoint 2010 REST API Syntax by Christian Stich. Any thoughts? Let's hear them in the comments section below! @Jamiet

Read the article
Physical Directories vs. MVC View Paths

- by Rick Strahl

This post falls into the bucket of operator error on my part, but I want to share this anyway because it describes an issue that has bitten me a few times now and writing it down might keep it a little stronger in my mind. I've been working on an MVC project the last few days, and at the end of a long day I accidentally moved one of my View folders from the MVC Root Folder to the project root. It must have been at the very end of the day before shutting down because tests and manual site navigation worked fine just before I quit for the night. I checked in changes and called it a night. Next day I came back, started running the app and had a lot of breaks with certain views. Oddly custom routes to these controllers/views worked, but stock /{controller}/{action} routes would not. After a bit of spelunking I realized that "Hey one of my View Folders is missing", which made some sense given the error messages I got. I looked in the recycle bin - nothing there, so rather than try to figure out what the hell happened, just restored from my last SVN checkin. At this point the folders are back… but… view access still ends up breaking for this set of views. Specifically I'm getting the Yellow Screen of Death with: CS0103: The name 'model' does not exist in the current context Here's the full error: Server Error in '/ClassifiedsWeb' Application. Compilation ErrorDescription: An error occurred during the compilation of a resource required to service this request. Please review the following specific error details and modify your source code appropriately.Compiler Error Message: CS0103: The name 'model' does not exist in the current contextSource Error: Line 1: @model ClassifiedsWeb.EntryViewModel Line 2: @{ Line 3: ViewBag.Title = Model.Entry.Title + " - " + ClassifiedsBusiness.App.Configuration.ApplicationName; Source File: c:\Projects2010\Clients\GorgeNet\Classifieds\ClassifiedsWeb\Classifieds\Show.cshtml Line: 1 Compiler Warning Messages: Show Detailed Compiler Output: Show Complete Compilation Source: Version Information: Microsoft .NET Framework Version:4.0.30319; ASP.NET Version:4.0.30319.272 Here's what's really odd about this error: The views now do exist in the /Views/Classifieds folder of the project, but it appears like MVC is trying to execute the views directly. This is getting pretty weird, man! So I hook up some break points in my controllers to see if my controller actions are getting fired - and sure enough it turns out they are not - but only for those views that were previously 'lost' and then restored from SVN. WTF? At this point I'm thinking that I must have messed up one of the config files, but after some more spelunking and realizing that all the other Controller views work, I give up that idea. Config's gotta be OK if other controllers and views are working. Root Folders and MVC Views don't mix As I mentioned the problem was the fact that I inadvertantly managed to drag my View folder to the root folder of the project. Here's what this looks like in my FUBAR'd project structure after I copied back /Views/Classifieds folder from SVN: There's the actual root folder in the /Views folder and the accidental copy that sits of the root. I of course did not notice the /Classifieds folder at the root because it was excluded and didn't show up in the project. Now, before you call me a complete idiot remember that this happened by accident - an accidental drag probably just before shutting down for the night. :-) So why does this break? MVC should be happy with views in the /Views/Classifieds folder right? While MVC might be happy, IIS is not. The fact that there is a physical folder on disk takes precedence over MVC's routing. In other words if a URL exists that matches a route the pysical path is accessed first. What happens here is that essentially IIS is trying to execute the .cshtml pages directly without ever routing to the Controller methods. In the error page I showed above my clue should have been that the view was served as: c:\Projects2010\Clients\GorgeNet\Classifieds\ClassifiedsWeb\Classifieds\Show.cshtml rather than c:\Projects2010\Clients\GorgeNet\Classifieds\ClassifiedsWeb\Views\Classifieds\Show.cshtml But of course I didn't notice that right away, just skimming to the end and looking at the file name. The reason that /classifieds/list actually fires that file is that the ASP.NET Web Pages engine looks for physical files on disk that match a path. IOW, when calling Web Pages you drop the .cshtml off the Razor page and IIS will serve that just fine. So: /classifieds/list looks and tries to find /classifieds/list.cshtml and executes that script. And that is exactly what's happening. Web Pages is trying to execute the .cshtml file and it fails because Web Pages knows nothing about the @model tag which is an MVC specific template extension. This is why my breakpoints in the controller methods didn't fire and it also explains why the error mentions that the @model key word is invalid (@model is an MVC provided template enhancement to the Razor Engine). The solution of course is super simple: Delete the accidentally created root folder and the problem is solved. Routing and Physical Paths I've run into problems with this before actually. In the past I've had a number of applications that had a physical /Admin folder which also would conflict with an MVC Admin controller. More than once I ended up wondering why the index route (/Admin/) was not working properly. If a physical /Admin folder exists /Admin will not route to the Index action (or whatever default action you have set up, but instead try to list the directory or show the default document in the folder. The only way to force the index page through MVC is to explicitly use /Admin/Index. Makes perfect sense once you realize the physical folder is there, but that's easy to forget in an MVC application. As you might imagine after a few times of running into this I gave up on the Admin folder and moved everything into MVC views to handle those operations. Still it's one of those things that can easily bite you, because the behavior and error messages seem to point at completely different problems. Moral of the story is: If you see routing problems where routes are not reaching obvious controller methods, always check to make sure there's isn't a physical path being mapped by IIS instead. That way you won't feel stupid like I did after trying a million things for about an hour before discovering my sloppy mousing behavior :-)© Rick Strahl, West Wind Technologies, 2005-2012Posted in MVC IIS7 Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article
Thread placement policies on NUMA systems - update

- by Dave

In a prior blog entry I noted that Solaris used a "maximum dispersal" placement policy to assign nascent threads to their initial processors. The general idea is that threads should be placed as far away from each other as possible in the resource topology in order to reduce resource contention between concurrently running threads. This policy assumes that resource contention -- pipelines, memory channel contention, destructive interference in the shared caches, etc -- will likely outweigh (a) any potential communication benefits we might achieve by packing our threads more densely onto a subset of the NUMA nodes, and (b) benefits of NUMA affinity between memory allocated by one thread and accessed by other threads. We want our threads spread widely over the system and not packed together. Conceptually, when placing a new thread, the kernel picks the least loaded node NUMA node (the node with lowest aggregate load average), and then the least loaded core on that node, etc. Furthermore, the kernel places threads onto resources -- sockets, cores, pipelines, etc -- without regard to the thread's process membership. That is, initial placement is process-agnostic. Keep reading, though. This description is incorrect. On Solaris 10 on a SPARC T5440 with 4 x T2+ NUMA nodes, if the system is otherwise unloaded and we launch a process that creates 20 compute-bound concurrent threads, then typically we'll see a perfect balance with 5 threads on each node. We see similar behavior on an 8-node x86 x4800 system, where each node has 8 cores and each core is 2-way hyperthreaded. So far so good; this behavior seems in agreement with the policy I described in the 1st paragraph. I recently tried the same experiment on a 4-node T4-4 running Solaris 11. Both the T5440 and T4-4 are 4-node systems that expose 256 logical thread contexts. To my surprise, all 20 threads were placed onto just one NUMA node while the other 3 nodes remained completely idle. I checked the usual suspects such as processor sets inadvertently left around by colleagues, processors left offline, and power management policies, but the system was configured normally. I then launched multiple concurrent instances of the process, and, interestingly, all the threads from the 1st process landed on one node, all the threads from the 2nd process landed on another node, and so on. This happened even if I interleaved thread creating between the processes, so I was relatively sure the effect didn't related to thread creation time, but rather that placement was a function of process membership. I this point I consulted the Solaris sources and talked with folks in the Solaris group. The new Solaris 11 behavior is intentional. The kernel is no longer using a simple maximum dispersal policy, and thread placement is process membership-aware. Now, even if other nodes are completely unloaded, the kernel will still try to pack new threads onto the home lgroup (socket) of the primordial thread until the load average of that node reaches 50%, after which it will pick the next least loaded node as the process's new favorite node for placement. On the T4-4 we have 64 logical thread contexts (strands) per socket (lgroup), so if we launch 48 concurrent threads we will find 32 placed on one node and 16 on some other node. If we launch 64 threads we'll find 32 and 32. That means we can end up with our threads clustered on a small subset of the nodes in a way that's quite different that what we've seen on Solaris 10. So we have a policy that allows process-aware packing but reverts to spreading threads onto other nodes if a node becomes too saturated. It turns out this policy was enabled in Solaris 10, but certain bugs suppressed the mixed packing/spreading behavior. There are configuration variables in /etc/system that allow us to dial the affinity between nascent threads and their primordial thread up and down: see lgrp_expand_proc_thresh, specifically. In the OpenSolaris source code the key routine is mpo_update_tunables(). This method reads the /etc/system variables and sets up some global variables that will subsequently be used by the dispatcher, which calls lgrp_choose() in lgrp.c to place nascent threads. Lgrp_expand_proc_thresh controls how loaded an lgroup must be before we'll consider homing a process's threads to another lgroup. Tune this value lower to have it spread your process's threads out more. To recap, the 'new' policy is as follows. Threads from the same process are packed onto a subset of the strands of a socket (50% for T-series). Once that socket reaches the 50% threshold the kernel then picks another preferred socket for that process. Threads from unrelated processes are spread across sockets. More precisely, different processes may have different preferred sockets (lgroups). Beware that I've simplified and elided details for the purposes of explication. The truth is in the code. Remarks: It's worth noting that initial thread placement is just that. If there's a gross imbalance between the load on different nodes then the kernel will migrate threads to achieve a better and more even distribution over the set of available nodes. Once a thread runs and gains some affinity for a node, however, it becomes "stickier" under the assumption that the thread has residual cache residency on that node, and that memory allocated by that thread resides on that node given the default "first-touch" page-level NUMA allocation policy. Exactly how the various policies interact and which have precedence under what circumstances could the topic of a future blog entry. The scheduler is work-conserving. The x4800 mentioned above is an interesting system. Each of the 8 sockets houses an Intel 7500-series processor. Each processor has 3 coherent QPI links and the system is arranged as a glueless 8-socket twisted ladder "mobius" topology. Nodes are either 1 or 2 hops distant over the QPI links. As an aside the mapping of logical CPUIDs to physical resources is rather interesting on Solaris/x4800. On SPARC/Solaris the CPUID layout is strictly geographic, with the highest order bits identifying the socket, the next lower bits identifying the core within that socket, following by the pipeline (if present) and finally the logical thread context ("strand") on the core. But on Solaris on the x4800 the CPUID layout is as follows. [6:6] identifies the hyperthread on a core; bits [5:3] identify the socket, or package in Intel terminology; bits [2:0] identify the core within a socket. Such low-level details should be of interest only if you're binding threads -- a bad idea, the kernel typically handles placement best -- or if you're writing NUMA-aware code that's aware of the ambient placement and makes decisions accordingly. Solaris introduced the so-called critical-threads mechanism, which is expressed by putting a thread into the FX scheduling class at priority 60. The critical-threads mechanism applies to placement on cores, not on sockets, however. That is, it's an intra-socket policy, not an inter-socket policy. Solaris 11 introduces the Power Aware Dispatcher (PAD) which packs threads instead of spreading them out in an attempt to be able to keep sockets or cores at lower power levels. Maximum dispersal may be good for performance but is anathema to power management. PAD is off by default, but power management polices constitute yet another confounding factor with respect to scheduling and dispatching. If your threads communicate heavily -- one thread reads cache lines last written by some other thread -- then the new dense packing policy may improve performance by reducing traffic on the coherent interconnect. On the other hand if your threads in your process communicate rarely, then it's possible the new packing policy might result on contention on shared computing resources. Unfortunately there's no simple litmus test that says whether packing or spreading is optimal in a given situation. The answer varies by system load, application, number of threads, and platform hardware characteristics. Currently we don't have the necessary tools and sensoria to decide at runtime, so we're reduced to an empirical approach where we run trials and try to decide on a placement policy. The situation is quite frustrating. Relatedly, it's often hard to determine just the right level of concurrency to optimize throughput. (Understanding constructive vs destructive interference in the shared caches would be a good start. We could augment the lines with a small tag field indicating which strand last installed or accessed a line. Given that, we could augment the CPU with performance counters for misses where a thread evicts a line it installed vs misses where a thread displaces a line installed by some other thread.)

Read the article
Parent Objects

- by Ali Bahrami

Support for Parent Objects was added in Solaris 11 Update 1. The following material is adapted from the PSARC arc case, and the Solaris Linker and Libraries Manual. A "plugin" is a shared object, usually loaded via dlopen(), that is used by a program in order to allow the end user to add functionality to the program. Examples of plugins include those used by web browsers (flash, acrobat, etc), as well as mdb and elfedit modules. The object that loads the plugin at runtime is called the "parent object". Unlike most object dependencies, the parent is not identified by name, but by its status as the object doing the load. Historically, building a good plugin is has been more complicated than it should be: A parent and its plugin usually share a 2-way dependency: The plugin provides one or more routines for the parent to call, and the parent supplies support routines for use by the plugin for things like memory allocation and error reporting. It is a best practice to build all objects, including plugins, with the -z defs option, in order to ensure that the object specifies all of its dependencies, and is self contained. However: The parent is usually an executable, which cannot be linked to via the usual library mechanisms provided by the link editor. Even if the parent is a shared object, which could be a normal library dependency to the plugin, it may be desirable to build plugins that can be used by more than one parent, in which case embedding a dependency NEEDED entry for one of the parents is undesirable. The usual way to build a high quality plugin with -z defs uses a special mapfile provided by the parent. This mapfile defines the parent routines, specifying the PARENT attribute (see example below). This works, but is inconvenient, and error prone. The symbol table in the parent already describes what it makes available to plugins — ideally the plugin would obtain that information directly rather than from a separate mapfile. The new -z parent option to ld allows a plugin to link to the parent and access the parent symbol table. This differs from a typical dependency: No NEEDED record is created. The relationship is recorded as a logical connection to the parent, rather than as an explicit object name However, it operates in the same manner as any other dependency in terms of making symbols available to the plugin. When the -z parent option is used, the link-editor records the basename of the parent object in the dynamic section, using the new tag DT_SUNW_PARENT. This is an informational tag, which is not used by the runtime linker to locate the parent, but which is available for diagnostic purposes. The ld(1) manpage documentation for the -z parent option is: -z parent=object Specifies a "parent object", which can be an executable or shared object, against which to link the output object. This option is typically used when creating "plugin" shared objects intended to be loaded by an executable at runtime via the dlopen() function. The symbol table from the parent object is used to satisfy references from the plugin object. The use of the -z parent option makes symbols from the object calling dlopen() available to the plugin. Example For this example, we use a main program, and a plugin. The parent provides a function named parent_callback() for the plugin to call. The plugin provides a function named plugin_func() to the parent: % cat main.c #include <stdio.h> #include <dlfcn.h> #include <link.h> void parent_callback(void) { printf("plugin_func() has called parent_callback()\n"); } int main(int argc, char **argv) { typedef void plugin_func_t(void); void *hdl; plugin_func_t *plugin_func; if (argc != 2) { fprintf(stderr, "usage: main plugin\n"); return (1); } if ((hdl = dlopen(argv[1], RTLD_LAZY)) == NULL) { fprintf(stderr, "unable to load plugin: %s\n", dlerror()); return (1); } plugin_func = (plugin_func_t *) dlsym(hdl, "plugin_func"); if (plugin_func == NULL) { fprintf(stderr, "unable to find plugin_func: %s\n", dlerror()); return (1); } (*plugin_func)(); return (0); } % cat plugin.c #include <stdio.h> extern void parent_callback(void); void plugin_func(void) { printf("parent has called plugin_func() from plugin.so\n"); parent_callback(); } Building this in the traditional manner, without -zdefs: % cc -o main main.c % cc -G -o plugin.so plugin.c % ./main ./plugin.so parent has called plugin_func() from plugin.so plugin_func() has called parent_callback() As noted above, when building any shared object, the -z defs option is recommended, in order to ensure that the object is self contained and specifies all of its dependencies. However, the use of -z defs prevents the plugin object from linking due to the unsatisfied symbol from the parent object: % cc -zdefs -G -o plugin.so plugin.c Undefined first referenced symbol in file parent_callback plugin.o ld: fatal: symbol referencing errors. No output written to plugin.so A mapfile can be used to specify to ld that the parent_callback symbol is supplied by the parent object. % cat plugin.mapfile $mapfile_version 2 SYMBOL_SCOPE { global: parent_callback { FLAGS = PARENT }; }; % cc -zdefs -Mplugin.mapfile -G -o plugin.so plugin.c However, the -z parent option to ld is the most direct solution to this problem, allowing the plugin to actually link against the parent object, and obtain the available symbols from it. An added benefit of using -z parent instead of a mapfile, is that the name of the parent object is recorded in the dynamic section of the plugin, and can be displayed by the file utility: % cc -zdefs -zparent=main -G -o plugin.so plugin.c % elfdump -d plugin.so | grep PARENT [0] SUNW_PARENT 0xcc main % file plugin.so plugin.so: ELF 32-bit LSB dynamic lib 80386 Version 1, parent main, dynamically linked, not stripped % ./main ./plugin.so parent has called plugin_func() from plugin.so plugin_func() has called parent_callback() We can also observe this in elfedit plugins on Solaris systems running Solaris 11 Update 1 or newer: % file /usr/lib/elfedit/dyn.so /usr/lib/elfedit/dyn.so: ELF 32-bit LSB dynamic lib 80386 Version 1, parent elfedit, dynamically linked, not stripped, no debugging information available Related Other Work The GNU ld has an option named --just-symbols that can be used in a similar manner: --just-symbols=filename Read symbol names and their addresses from filename, but do not relocate it or include it in the output. This allows your output file to refer symbolically to absolute locations of memory defined in other programs. You may use this option more than once. -z parent is a higher level operation aimed specifically at simplifying the construction of high quality plugins. Although it employs the same operation, it differs from --just symbols in 2 significant ways: There can only be one parent. The parent is recorded in the created object, and can be displayed by 'file', or other similar tools.

Read the article
Anatomy of a .NET Assembly - Custom attribute encoding

- by Simon Cooper

In my previous post, I covered how field, method, and other types of signatures are encoded in a .NET assembly. Custom attribute signatures differ quite a bit from these, which consequently affects attribute specifications in C#. Custom attribute specifications In C#, you can apply a custom attribute to a type or type member, specifying a constructor as well as the values of fields or properties on the attribute type: public class ExampleAttribute : Attribute { public ExampleAttribute(int ctorArg1, string ctorArg2) { ... } public Type ExampleType { get; set; } } [Example(5, "6", ExampleType = typeof(string))] public class C { ... } How does this specification actually get encoded and stored in an assembly? Specification blob values Custom attribute specification signatures use the same building blocks as other types of signatures; the ELEMENT_TYPE structure. However, they significantly differ from other types of signatures, in that the actual parameter values need to be stored along with type information. There are two types of specification arguments in a signature blob; fixed args and named args. Fixed args are the arguments to the attribute type constructor, named arguments are specified after the constructor arguments to provide a value to a field or property on the constructed attribute type (PropertyName = propValue) Values in an attribute blob are limited to one of the basic types (one of the number types, character, or boolean), a reference to a type, an enum (which, in .NET, has to use one of the integer types as a base representation), or arrays of any of those. Enums and the basic types are easy to store in a blob - you simply store the binary representation. Strings are stored starting with a compressed integer indicating the length of the string, followed by the UTF8 characters. Array values start with an integer indicating the number of elements in the array, then the item values concatentated together. Rather than using a coded token, Type values are stored using a string representing the type name and fully qualified assembly name (for example, MyNs.MyType, MyAssembly, Version=1.0.0.0, Culture=neutral, PublicKeyToken=0123456789abcdef). If the type is in the current assembly or mscorlib then just the type name can be used. This is probably done to prevent direct references between assemblies solely because of attribute specification arguments; assemblies can be loaded in the reflection-only context and attribute arguments still processed, without loading the entire assembly. Fixed and named arguments Each entry in the CustomAttribute metadata table contains a reference to the object the attribute is applied to, the attribute constructor, and the specification blob. The number and type of arguments to the constructor (the fixed args) can be worked out by the method signature referenced by the attribute constructor, and so the fixed args can simply be concatenated together in the blob without any extra type information. Named args are different. These specify the value to assign to a field or property once the attribute type has been constructed. In the CLR, fields and properties can be overloaded just on their type; different fields and properties can have the same name. Therefore, to uniquely identify a field or property you need: Whether it's a field or property (indicated using byte values 0x53 and 0x54, respectively) The field or property type The field or property name After the fixed arg values is a 2-byte number specifying the number of named args in the blob. Each named argument has the above information concatenated together, mostly using the basic ELEMENT_TYPE values, in the same way as a method or field signature. A Type argument is represented using the byte 0x50, and an enum argument is represented using the byte 0x55 followed by a string specifying the name and assembly of the enum type. The named argument property information is followed by the argument value, using the same encoding as fixed args. Boxed objects This would be all very well, were it not for object and object[]. Arguments and properties of type object allow a value of any allowed argument type to be specified. As a result, more information needs to be specified in the blob to interpret the argument bytes as the correct type. So, the argument value is simple prepended with the type of the value by specifying the ELEMENT_TYPE or name of the enum the value represents. For named arguments, a field or property of type object is represented using the byte 0x51, with the actual type specified in the argument value. Some examples... All property signatures start with the 2-byte value 0x0001. Similar to my previous post in the series, names in capitals correspond to a particular byte value in the ELEMENT_TYPE structure. For strings, I'll simply give the string value, rather than the length and UTF8 encoding in the actual blob. I'll be using the following enum and attribute types to demonstrate specification encodings: class AttrAttribute : Attribute { public AttrAttribute() {} public AttrAttribute(Type[] tArray) {} public AttrAttribute(object o) {} public AttrAttribute(MyEnum e) {} public AttrAttribute(ushort x, int y) {} public AttrAttribute(string str, Type type1, Type type2) {} public int Prop1 { get; set; } public object Prop2 { get; set; } public object[] ObjectArray; } enum MyEnum : int { Val1 = 1, Val2 = 2 } Now, some examples: Here, the the specification binds to the (ushort, int) attribute constructor, with fixed args only. The specification blob starts off with a prolog, followed by the two constructor arguments, then the number of named arguments (zero): [Attr(42, 84)] 0x0001 0x002a 0x00000054 0x0000 An example of string and type encoding: [Attr("MyString", typeof(Array), typeof(System.Windows.Forms.Form))] 0x0001 "MyString" "System.Array" "System.Windows.Forms.Form, System.Windows.Forms, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" 0x0000 As you can see, the full assembly specification of a type is only needed if the type isn't in the current assembly or mscorlib. Note, however, that the C# compiler currently chooses to fully-qualify mscorlib types anyway. An object argument (this binds to the object attribute constructor), and two named arguments (a null string is represented by 0xff and the empty string by 0x00) [Attr((ushort)40, Prop1 = 12, Prop2 = "")] 0x0001 U2 0x0028 0x0002 0x54 I4 "Prop1" 0x0000000c 0x54 0x51 "Prop2" STRING 0x00 Right, more complicated now. A type array as a fixed argument: [Attr(new[] { typeof(string), typeof(object) })] 0x0001 0x00000002 // the number of elements "System.String" "System.Object" 0x0000 An enum value, which is simply represented using the underlying value. The CLR works out that it's an enum using information in the attribute constructor signature: [Attr(MyEnum.Val1)] 0x0001 0x00000001 0x0000 And finally, a null array, and an object array as a named argument: [Attr((Type[])null, ObjectArray = new object[] { (byte)2, typeof(decimal), null, MyEnum.Val2 })] 0x0001 0xffffffff 0x0001 0x53 SZARRAY 0x51 "ObjectArray" 0x00000004 U1 0x02 0x50 "System.Decimal" STRING 0xff 0x55 "MyEnum" 0x00000002 As you'll notice, a null object is encoded as a null string value, and a null array is represented using a length of -1 (0xffffffff). How does this affect C#? So, we can now explain why the limits on attribute arguments are so strict in C#. Attribute specification blobs are limited to basic numbers, enums, types, and arrays. As you can see, this is because the raw CLR encoding can only accommodate those types. Special byte patterns have to be used to indicate object, string, Type, or enum values in named arguments; you can't specify an arbitary object type, as there isn't a generalised way of encoding the resulting value in the specification blob. In particular, decimal values can't be encoded, as it isn't a 'built-in' CLR type that has a native representation (you'll notice that decimal constants in C# programs are compiled as several integer arguments to DecimalConstantAttribute). Jagged arrays also aren't natively supported, although you can get around it by using an array as a value to an object argument: [Attr(new object[] { new object[] { new Type[] { typeof(string) } }, 42 })] Finally... Phew! That was a bit longer than I thought it would be. Custom attribute encodings are complicated! Hopefully this series has been an informative look at what exactly goes on inside a .NET assembly. In the next blog posts, I'll be carrying on with the 'Inside Red Gate' series.

Read the article
Advanced Continuous Delivery to Azure from TFS, Part 1: Good Enough Is Not Great

- by jasont

The folks over on the TFS / Visual Studio team have been working hard at releasing a steady stream of new features for their new hosted Team Foundation Service in the cloud. One of the most significant features released was simple continuous delivery of your solution into your Azure deployments. The original announcement from Brian Harry can be found here. Team Foundation Service is a great platform for .Net developers who are used to working with TFS on-premises. I’ve been using it since it became available at the //BUILD conference in 2011, and when I recently came to work at Stackify, it was one of the first changes I made. Managing work items is much easier than the tool we were using previously, although there are some limitations (more on that in another blog post). However, when continuous deployment was made available, it blew my mind. It was the killer feature I didn’t know I needed. Not to say that I wasn’t previously an advocate for continuous delivery; just that it was always a pain to set up and configure. Having it hosted - and a one-click setup – well, that’s just the best thing since sliced bread. It made perfect sense: my source code is in the cloud, and my deployment is in the cloud. Great! I can queue up a build from my iPad or phone and just let it go! I quickly tore through the quick setup and saw it all work… sort of. This will be the first in a three part series on how to take the building block of Team Foundation Service continuous delivery and build a CD model that will actually work for any team deploying something more advanced than a “Hello World” example. Part 1: Good Enough Is Not Great Part 2: A Model That Works: Branching and Multiple Deployment Environments Part 3: Other Considerations: SQL, Custom Tasks, Etc Good Enough Is Not Great There. I’ve said it. I certainly hope no one on the TFS team is offended, but it’s the truth. Let’s take a look under the hood and understand how it works, and then why it’s not enough to handle real world CD as-is. How it works. (note that I’ve skipped a couple of steps; I already have my accounts set up and something deployed to Azure) The first step is to establish some oAuth magic between your Azure management portal and your TFS Instance. You do this via the management portal. Once it’s done, you have a new build process template in your TFS instance. (Image lifted from the documentation) From here, you’ll get the usual prompts for security, allowing access, etc. But you’ll also get to pick which Solution in your source control to build. Here’s what the bulk of the build definition looks like. All I’ve had to do is add in the solution to build (notice that mine is from a specific branch – Release – more on that later) and I’ve changed the configuration. I trigger the build, and voila! I have an Azure deployment a few minutes later. The beauty of this is that it’s all in the cloud and I’m not waiting for my machine to compile and upload the package. (I also had to enable the build definition first – by default it is created in disabled state, probably a good thing since it will trigger on every.single.checkin by default.) I get to see a history of deployments from the Azure portal, and can link into TFS to see the associated changesets and work items. You’ll notice also that this build definition also automatically put my code in the Staging slot of my Azure deployment – more on this soon. For now, I can VIP swap and be in production. (P.S. I hate VIP swap and “production” and “staging” in Azure. More on that later too.) That’s it. That’s the default out-of-box experience. Easy, right? But it’s full of room for improvement, so let’s get into that…. The Problems Nothing is perfect (except my code – it’s always perfect), and neither is Continuous Deployment without a bit of work to help it fit your dev team’s process. So what are the issues? Issue 1: Staging vs QA vs Prod vs whatever other environments your team may have. This, for me, is the big hairy one. Remember how this automatically deployed to staging rather than prod for us? There are a couple of issues with this model: If I want to deliver to prod, it requires intervention on my part after deployment (via a VIP swap). If I truly want to promote between environments (i.e. Nightly Build –> Stable QA –> Production) I likely have configuration changes between each environment such as database connection strings and this process (and the VIP swap) doesn’t account for this. Yet. Issue 2: Branching and delivering on every check-in. As I mentioned above, I have set this up to target a specific branch – Release – of my code. For the purposes of this example, I have adopted the “basic” branching strategy as defined by the ALM Rangers. This basically establishes a “Main” trunk where you branch off Dev and Release branches. Granted, the Release branch is usually the only thing you will deploy to production, but you certainly don’t want to roll to production automatically when you merge to the Release branch and check-in (unless you like the thrill of it, and in that case, I like your style, cowboy….). Rather, you have nightly build and QA environments, or if you’ve adopted the feature-branch model you have environments for those. Those are the environments you want to continuously deploy to. But that takes us back to Issue 1: we currently have a 1:1 solution to Azure deployment target. Issue 3: SQL and other custom tasks. Let’s be honest and address the elephant in the room: I need to get some sleep because I see an elephant in the room. But seriously, I can’t think of an application I have touched in the last 10 years that doesn’t need to consider SQL changes when deploying code and upgrading an environment. Microsoft seems perfectly content to ignore this elephant for now: yes, they’ve added Data Tier Applications. But let’s be honest with ourselves again: no one really uses it, and it’s not suitable for anything more complex than a Hello World sample project database. Why? Because it doesn’t fit well into a great source control story. Developers make stored procedure and table changes all day long while coding complex applications, and if someone forgets to go update the DACPAC before the automated deployment, you have a broken build until it’s completed. Developers – not just DBAs – also like to work with SQL in SQL tools, not in Visual Studio. I’m really picking on SQL because that’s generally the biggest concern that I hear. But we need to account for any custom tasks as well in the build process. The Solutions… ? We’ve taken a look at how this all works, and addressed the shortcomings. In my next post (which I promise will be very, very soon), I will detail how I’ve overcome these shortcomings and used this foundation to create a mature, flexible model for deploying my app – any version, any time, to any environment.

Read the article
Curing the Database-Application mismatch

- by Phil Factor

If an application requires access to a database, then you have to be able to deploy it so as to be version-compatible with the database, in phase. If you can deploy both together, then the application and database must normally be deployed at the same version in which they, together, passed integration and functional testing. When a single database supports more than one application, then the problem gets more interesting. I’ll need to be more precise here. It is actually the application-interface definition of the database that needs to be in a compatible ‘version’. Most databases that get into production have no separate application-interface; in other words they are ‘close-coupled’. For this vast majority, the whole database is the application-interface, and applications are free to wander through the bowels of the database scot-free. If you’ve spurned the perceived wisdom of application architects to have a defined application-interface within the database that is based on views and stored procedures, any version-mismatch will be as sensitive as a kitten. A team that creates an application that makes direct access to base tables in a database will have to put a lot of energy into keeping Database and Application in sync, to say nothing of having to tackle issues such as security and audit. It is not the obvious route to development nirvana. I’ve been in countless tense meetings with application developers who initially bridle instinctively at the apparent restrictions of being ‘banned’ from the base tables or routines of a database. There is no good technical reason for needing that sort of access that I’ve ever come across. Everything that the application wants can be delivered via a set of views and procedures, and with far less pain for all concerned: This is the application-interface. If more than zero developers are creating a database-driven application, then the project will benefit from the loose-coupling that an application interface brings. What is important here is that the database development role is separated from the application development role, even if it is the same developer performing both roles. The idea of an application-interface with a database is as old as I can remember. The big corporate or government databases generally supported several applications, and there was little option. When a new application wanted access to an existing corporate database, the developers, and myself as technical architect, would have to meet with hatchet-faced DBAs and production staff to work out an interface. Sure, they would talk up the effort involved for budgetary reasons, but it was routine work, because it decoupled the database from its supporting applications. We’d be given our own stored procedures. One of them, I still remember, had ninety-two parameters. All database access was encapsulated in one application-module. If you have a stable defined application-interface with the database (Yes, one for each application usually) you need to keep the external definitions of the components of this interface in version control, linked with the application source, and carefully track and negotiate any changes between database developers and application developers. Essentially, the application development team owns the interface definition, and the onus is on the Database developers to implement it and maintain it, in conformance. Internally, the database can then make all sorts of changes and refactoring, as long as source control is maintained. If the application interface passes all the comprehensive integration and functional tests for the particular version they were designed for, nothing is broken. Your performance-testing can ‘hang’ on the same interface, since databases are judged on the performance of the application, not an ‘internal’ database process. The database developers have responsibility for maintaining the application-interface, but not its definition, as they refactor the database. This is easily tested on a daily basis since the tests are normally automated. In this setting, the deployment can proceed if the more stable application-interface, rather than the continuously-changing database, passes all tests for the version of the application. Normally, if all goes well, a database with a well-designed application interface can evolve gracefully without changing the external appearance of the interface, and this is confirmed by integration tests that check the interface, and which hopefully don’t need to be altered at all often. If the application is rapidly changing its ‘domain model’ in the light of an increased understanding of the application domain, then it can change the interface definitions and the database developers need only implement the interface rather than refactor the underlying database. The test team will also have to redo the functional and integration tests which are, of course ‘written to’ the definition. The Database developers will find it easier if these tests are done before their re-wiring job to implement the new interface. If, at the other extreme, an application receives no further development work but survives unchanged, the database can continue to change and develop to keep pace with the requirements of the other applications it supports, and needs only to take care that the application interface is never broken. Testing is easy since your automated scripts to test the interface do not need to change. The database developers will, of course, maintain their own source control for the database, and will be likely to maintain versions for all major releases. However, this will not need to be shared with the applications that the database servers. On the other hand, the definition of the application interfaces should be within the application source. Changes in it have to be subject to change-control procedures, as they will require a chain of tests. Once you allow, instead of an application-interface, an intimate relationship between application and database, we are in the realms of impedance mismatch, over and above the obvious security problems. Part of this impedance problem is a difference in development practices. Whereas the application has to be regularly built and integrated, this isn’t necessarily the case with the database. An RDBMS is inherently multi-user and self-integrating. If the developers work together on the database, then a subsequent integration of the database on a staging server doesn’t often bring nasty surprises. A separate database-integration process is only needed if the database is deliberately built in a way that mimics the application development process, but which hampers the normal database-development techniques. This process is like demanding a official walking with a red flag in front of a motor car. In order to closely coordinate databases with applications, entire databases have to be ‘versioned’, so that an application version can be matched with a database version to produce a working build without errors. There is no natural process to ‘version’ databases. Each development project will have to define a system for maintaining the version level. A curious paradox occurs in development when there is no formal application-interface. When the strains and cracks happen, the extra meetings, bureaucracy, and activity required to maintain accurate deployments looks to IT management like work. They see activity, and it looks good. Work means progress. Management then smile on the design choices made. In IT, good design work doesn’t necessarily look good, and vice versa.

Read the article

Search Results

Search found 11897 results on 476 pages for 'dean rather'.

Page 148/476 | < Previous Page | 144 145 146 147 148 149 150 151 152 153 154 155 | Next Page >

- by Rick Strahl

- by Frans

- by normstorm

- by Tom Wijsman

- by purefusion

- by Hubert Kario

- by Calvin Piche

- by Piet

- by Rick Strahl

- by smisner

- by rituchhibber

- by Piotr Rodak

- by Shaun

- by KeithMayer

- by yaldahhakim

- by jamiet

- by HighAltitudeCoder

- by Paul White

- by jamiet

- by Rick Strahl

- by Dave

- by Ali Bahrami

- by Simon Cooper

- by jasont

- by Phil Factor

< Previous Page | 144 145 146 147 148 149 150 151 152 153 154 155 | Next Page >