Search Results

Search found 8275 results on 331 pages for 'bad appz'.

Page 326/331 | < Previous Page | 322 323 324 325 326 327 328 329 330 331 | Next Page >

how to click on a button in python

- by Ciobanu Alexandru

Trying to build some bot for clicking "skip-ad" button on a page. So far, i manage to use Mechanize to load a web-driver browser and to connect to some page but Mechanize module do not support js directly so now i need something like Selenium if i understand correct. I am also a beginner in programming so please be specific. How can i use Selenium or if there is any other solution, please explain details. This is the inner html code for the button: <a id="skip-ad" class="btn btn-inverse" onclick="open_url('http://imgur.com/gallery/tDK9V68', 'go'); return false;" style="font-weight: bold; " target="_blank" href="http://imgur.com/gallery/tDK9V68"> … </a> And this is my source so far: #!/usr/bin/python # FILENAME: test.py import mechanize import os, time from random import choice, randrange prox_list = [] #list of common UAS to apply to each connection attempt to impersonate browsers user_agent_strings = [ 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1468.0 Safari/537.36', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1', 'Opera/9.80 (Windows NT 6.0) Presto/2.12.388 Version/12.14', 'Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; fr) Presto/2.9.168 Version/11.52', 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:23.0) Gecko/20131011 Firefox/23.0', 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; Media Center PC 6.0; InfoPath.3; MS-RTC LM 8; Zune 4.7', 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0; .NET CLR 2.0.50727; SLCC2; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; Zune 4.0; Tablet PC 2.0; InfoPath.3; .NET4.0C; .NET4.0E)', 'Mozilla/5.0 (compatible; MSIE 10.6; Windows NT 6.1; Trident/5.0; InfoPath.2; SLCC1; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET CLR 2.0.50727) 3gpp-gba UNTRUSTED/1.0', 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0; chromeframe/11.0.696.57)', 'Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; InfoPath.1; SV1; .NET CLR 3.8.36217; WOW64; en-US)', 'Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; GTB7.4; InfoPath.2; SV1; .NET CLR 3.3.69573; WOW64; en-US)' ] def load_proxy_list(target): #loads and parses the proxy list file = open(target, 'r') count = 0 for line in file: prox_list.append(line) count += 1 print "Loaded " + str(count) + " proxies!" load_proxy_list('proxies.txt') #for i in range(1,(len(prox_list) - 1)): # depreceated for overloading for i in range(1,30): br = mechanize.Browser() #pick a random UAS to add some extra cover to the bot br.addheaders = [('User-agent', choice(user_agent_strings))] print "----------------------------------------------------" #This is bad internet ethics br.set_handle_robots(False) #choose a proxy proxy = choice(prox_list) br.set_proxies({"http": proxy}) br.set_debug_http(True) try: print "Trying connection with: " + str(proxy) #currently using: BTC CoinURL - Grooveshark Broadcast br.open("http://cur.lv/4czwj") print "Opened successfully!" #act like a nice little drone and view the ads sleep_time_on_link = randrange(17.0,34.0) time.sleep(sleep_time_on_link) except mechanize.HTTPError, e: print "Oops Request threw " + str(e.code) #future versions will handle codes properly, 404 most likely means # the ad-linker has noticed bot-traffic and removed the link # or the used proxy is terrible. We will either geo-locate # proxies beforehand and pick good hosts, or ditch the link # which is worse case scenario, account is closed because of botting except mechanize.URLError, e: print "Oops! Request was refused, blacklisting proxy!" + str(e) prox_list.remove(proxy) del br #close browser entirely #wait between 5-30 seconds like a good little human sleep_time = randrange(5.0, 30.0) print "Waiting for %.1f seconds like a good bot." % (sleep_time) time.sleep(sleep_time)

Read the article
Why this code is not working on linux server ?

- by user488001

Hello Experts, I am new in Zend Framework, and this code is use for downloading contents. This code is working in localhost but when i tried to execute in linux server it shows error file not found. public function downloadAnnouncementsAction() { $file= $this-_getParam('file'); $file = str_replace("%2F","/",$this-_getParam('file')); // Allow direct file download (hotlinking)? // Empty - allow hotlinking // If set to nonempty value (Example: example.com) will only allow downloads when referrer contains this text define('ALLOWED_REFERRER', ''); // Download folder, i.e. folder where you keep all files for download. // MUST end with slash (i.e. "/" ) define('BASE_DIR','file_upload'); // log downloads? true/false define('LOG_DOWNLOADS',true); // log file name define('LOG_FILE','downloads.log'); // Allowed extensions list in format 'extension' => 'mime type' // If myme type is set to empty string then script will try to detect mime type // itself, which would only work if you have Mimetype or Fileinfo extensions // installed on server. $allowed_ext = array ( // audio 'mp3' => 'audio/mpeg', 'wav' => 'audio/x-wav', // video 'mpeg' => 'video/mpeg', 'mpg' => 'video/mpeg', 'mpe' => 'video/mpeg', 'mov' => 'video/quicktime', 'avi' => 'video/x-msvideo' ); // If hotlinking not allowed then make hackers think there are some server problems if (ALLOWED_REFERRER !== '' && (!isset($_SERVER['HTTP_REFERER']) || strpos(strtoupper($_SERVER['HTTP_REFERER']),strtoupper(ALLOWED_REFERRER)) === false) ) { die("Internal server error. Please contact system administrator."); } // Make sure program execution doesn't time out // Set maximum script execution time in seconds (0 means no limit) set_time_limit(0); if (!isset($file) || empty($file)) { die("Please specify file name for download."); } // Nullbyte hack fix if (strpos($file, "\0") !== FALSE) die(''); // Get real file name. // Remove any path info to avoid hacking by adding relative path, etc. $fname = basename($file); // Check if the file exists // Check in subfolders too function find_file ($dirname, $fname, &$file_path) { $dir = opendir($dirname); while ($file = readdir($dir)) { if (empty($file_path) && $file != '.' && $file != '..') { if (is_dir($dirname.'/'.$file)) { find_file($dirname.'/'.$file, $fname, $file_path); } else { if (file_exists($dirname.'/'.$fname)) { $file_path = $dirname.'/'.$fname; return; } } } } } // find_file // get full file path (including subfolders) $file_path = ''; find_file(BASE_DIR, $fname, $file_path); if (!is_file($file_path)) { die("File does not exist. Make sure you specified correct file name."); } // file size in bytes $fsize = filesize($file_path); // file extension $fext = strtolower(substr(strrchr($fname,"."),1)); // check if allowed extension if (!array_key_exists($fext, $allowed_ext)) { die("Not allowed file type."); } // get mime type if ($allowed_ext[$fext] == '') { $mtype = ''; // mime type is not set, get from server settings if (function_exists('mime_content_type')) { $mtype = mime_content_type($file_path); } else if (function_exists('finfo_file')) { $finfo = finfo_open(FILEINFO_MIME); // return mime type $mtype = finfo_file($finfo, $file_path); finfo_close($finfo); } if ($mtype == '') { $mtype = "application/force-download"; } } else { // get mime type defined by admin $mtype = $allowed_ext[$fext]; } // Browser will try to save file with this filename, regardless original filename. // You can override it if needed. if (!isset($_GET['fc']) || empty($_GET['fc'])) { $asfname = $fname; } else { // remove some bad chars $asfname = str_replace(array('"',"'",'\\','/'), '', $_GET['fc']); if ($asfname === '') $asfname = 'NoName'; } // set headers header("Pragma: public"); header("Expires: 0"); header("Cache-Control: must-revalidate, post-check=0, pre-check=0"); header("Cache-Control: public"); header("Content-Description: File Transfer"); header("Content-Type: $mtype"); header("Content-Disposition: attachment; filename=\"$asfname\""); header("Content-Transfer-Encoding: binary"); header("Content-Length: " . $fsize); // download // @readfile($file_path); $file = @fopen($file_path,"rb"); if ($file) { while(!feof($file)) { print(fread($file, 1024*8)); flush(); if (connection_status()!=0) { @fclose($file); die(); } } @fclose($file); } // log downloads if (!LOG_DOWNLOADS) die(); $f = @fopen(LOG_FILE, 'a+'); if ($f) { @fputs($f, date("m.d.Y g:ia")." ".$_SERVER['REMOTE_ADDR']." ".$fname."\n"); @fclose($f); } } please Help...

Read the article
Integrating JavaScript Unit Tests with Visual Studio

- by Stephen Walther

Modern ASP.NET web applications take full advantage of client-side JavaScript to provide better interactivity and responsiveness. If you are building an ASP.NET application in the right way, you quickly end up with lots and lots of JavaScript code. When writing server code, you should be writing unit tests. One big advantage of unit tests is that they provide you with a safety net that enable you to safely modify your existing code – for example, fix bugs, add new features, and make performance enhancements -- without breaking your existing code. Every time you modify your code, you can execute your unit tests to verify that you have not broken anything. For the same reason that you should write unit tests for your server code, you should write unit tests for your client code. JavaScript is just as susceptible to bugs as C#. There is no shortage of unit testing frameworks for JavaScript. Each of the major JavaScript libraries has its own unit testing framework. For example, jQuery has QUnit, Prototype has UnitTestJS, YUI has YUI Test, and Dojo has Dojo Objective Harness (DOH). The challenge is integrating a JavaScript unit testing framework with Visual Studio. Visual Studio and Visual Studio ALM provide fantastic support for server-side unit tests. You can easily view the results of running your unit tests in the Visual Studio Test Results window. You can set up a check-in policy which requires that all unit tests pass before your source code can be committed to the source code repository. In addition, you can set up Team Build to execute your unit tests automatically. Unfortunately, Visual Studio does not provide “out-of-the-box” support for JavaScript unit tests. MS Test, the unit testing framework included in Visual Studio, does not support JavaScript unit tests. As soon as you leave the server world, you are left on your own. The goal of this blog entry is to describe one approach to integrating JavaScript unit tests with MS Test so that you can execute your JavaScript unit tests side-by-side with your C# unit tests. The goal is to enable you to execute JavaScript unit tests in exactly the same way as server-side unit tests. You can download the source code described by this project by scrolling to the end of this blog entry. Rejected Approach: Browser Launchers One popular approach to executing JavaScript unit tests is to use a browser as a test-driver. When you use a browser as a test-driver, you open up a browser window to execute and view the results of executing your JavaScript unit tests. For example, QUnit – the unit testing framework for jQuery – takes this approach. The following HTML page illustrates how you can use QUnit to create a unit test for a function named addNumbers(). <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html> <head> <title>Using QUnit</title> <link rel="stylesheet" href="http://github.com/jquery/qunit/raw/master/qunit/qunit.css" type="text/css" /> </head> <body> <h1 id="qunit-header">QUnit example</h1> <h2 id="qunit-banner"></h2> <div id="qunit-testrunner-toolbar"></div> <h2 id="qunit-userAgent"></h2> <ol id="qunit-tests"></ol> <div id="qunit-fixture">test markup, will be hidden</div> <script type="text/javascript" src="http://code.jquery.com/jquery-latest.js"></script> <script type="text/javascript" src="http://github.com/jquery/qunit/raw/master/qunit/qunit.js"></script> <script type="text/javascript"> // The function to test function addNumbers(a, b) { return a+b; } // The unit test test("Test of addNumbers", function () { equals(4, addNumbers(1,3), "1+3 should be 4"); }); </script> </body> </html> This test verifies that calling addNumbers(1,3) returns the expected value 4. When you open this page in a browser, you can see that this test does, in fact, pass. The idea is that you can quickly refresh this QUnit HTML JavaScript test driver page in your browser whenever you modify your JavaScript code. In other words, you can keep a browser window open and keep refreshing it over and over while you are developing your application. That way, you can know very quickly whenever you have broken your JavaScript code. While easy to setup, there are several big disadvantages to this approach to executing JavaScript unit tests: You must view your JavaScript unit test results in a different location than your server unit test results. The JavaScript unit test results appear in the browser and the server unit test results appear in the Visual Studio Test Results window. Because all of your unit test results don’t appear in a single location, you are more likely to introduce bugs into your code without noticing it. Because your unit tests are not integrated with Visual Studio – in particular, MS Test -- you cannot easily include your JavaScript unit tests when setting up check-in policies or when performing automated builds with Team Build. A more sophisticated approach to using a browser as a test-driver is to automate the web browser. Instead of launching the browser and loading the test code yourself, you use a framework to automate this process. There are several different testing frameworks that support this approach: · Selenium – Selenium is a very powerful framework for automating browser tests. You can create your tests by recording a Firefox session or by writing the test driver code in server code such as C#. You can learn more about Selenium at http://seleniumhq.org/. LTAF – The ASP.NET team uses the Lightweight Test Automation Framework to test JavaScript code in the ASP.NET framework. You can learn more about LTAF by visiting the project home at CodePlex: http://aspnet.codeplex.com/releases/view/35501 jsTestDriver – This framework uses Java to automate the browser. jsTestDriver creates a server which can be used to automate multiple browsers simultaneously. This project is located at http://code.google.com/p/js-test-driver/ TestSwam – This framework, created by John Resig, uses PHP to automate the browser. Like jsTestDriver, the framework creates a test server. You can open multiple browsers that are automated by the test server. Learn more about TestSwarm by visiting the following address: https://github.com/jeresig/testswarm/wiki Yeti – This is the framework introduced by Yahoo for automating browser tests. Yeti uses server-side JavaScript and depends on Node.js. Learn more about Yeti at http://www.yuiblog.com/blog/2010/08/25/introducing-yeti-the-yui-easy-testing-interface/ All of these frameworks are great for integration tests – however, they are not the best frameworks to use for unit tests. In one way or another, all of these frameworks depend on executing tests within the context of a “living and breathing” browser. If you create an ASP.NET Unit Test then Visual Studio will launch a web server before executing the unit test. Why is launching a web server so bad? It is not the worst thing in the world. However, it does introduce dependencies that prevent your code from being tested in isolation. One of the defining features of a unit test -- versus an integration test – is that a unit test tests code in isolation. Another problem with launching a web server when performing unit tests is that launching a web server can be slow. If you cannot execute your unit tests quickly, you are less likely to execute your unit tests each and every time you make a code change. You are much more likely to fall into the pit of failure. Launching a browser when performing a JavaScript unit test has all of the same disadvantages as launching a web server when performing an ASP.NET unit test. Instead of testing a unit of JavaScript code in isolation, you are testing JavaScript code within the context of a particular browser. Using the frameworks listed above for integration tests makes perfect sense. However, I want to consider a different approach for creating unit tests for JavaScript code. Using Server-Side JavaScript for JavaScript Unit Tests A completely different approach to executing JavaScript unit tests is to perform the tests outside of any browser. If you really want to test JavaScript then you should test JavaScript and leave the browser out of the testing process. There are several ways that you can execute JavaScript on the server outside the context of any browser: Rhino – Rhino is an implementation of JavaScript written in Java. The Rhino project is maintained by the Mozilla project. Learn more about Rhino at http://www.mozilla.org/rhino/ V8 – V8 is the open-source Google JavaScript engine written in C++. This is the JavaScript engine used by the Chrome web browser. You can download V8 and embed it in your project by visiting http://code.google.com/p/v8/ JScript – JScript is the JavaScript Script Engine used by Internet Explorer (up to but not including Internet Explorer 9), Windows Script Host, and Active Server Pages. Internet Explorer is still the most popular web browser. Therefore, I decided to focus on using the JScript Script Engine to execute JavaScript unit tests. Using the Microsoft Script Control There are two basic ways that you can pass JavaScript to the JScript Script Engine and execute the code: use the Microsoft Windows Script Interfaces or use the Microsoft Script Control. The difficult and proper way to execute JavaScript using the JScript Script Engine is to use the Microsoft Windows Script Interfaces. You can learn more about the Script Interfaces by visiting http://msdn.microsoft.com/en-us/library/t9d4xf28(VS.85).aspx The main disadvantage of using the Script Interfaces is that they are difficult to use from .NET. There is a great series of articles on using the Script Interfaces from C# located at http://www.drdobbs.com/184406028. I picked the easier alternative and used the Microsoft Script Control. The Microsoft Script Control is an ActiveX control that provides a higher level abstraction over the Window Script Interfaces. You can download the Microsoft Script Control from here: http://www.microsoft.com/downloads/en/details.aspx?FamilyID=d7e31492-2595-49e6-8c02-1426fec693ac After you download the Microsoft Script Control, you need to add a reference to it to your project. Select the Visual Studio menu option Project, Add Reference to open the Add Reference dialog. Select the COM tab and add the Microsoft Script Control 1.0. Using the Script Control is easy. You call the Script Control AddCode() method to add JavaScript code to the Script Engine. Next, you call the Script Control Run() method to run a particular JavaScript function. The reference documentation for the Microsoft Script Control is located at the MSDN website: http://msdn.microsoft.com/en-us/library/aa227633%28v=vs.60%29.aspx Creating the JavaScript Code to Test To keep things simple, let’s imagine that you want to test the following JavaScript function named addNumbers() which simply adds two numbers together: MvcApplication1\Scripts\Math.js function addNumbers(a, b) { return 5; } Notice that the addNumbers() method always returns the value 5. Right-now, it will not pass a good unit test. Create this file and save it in your project with the name Math.js in your MVC project’s Scripts folder (Save the file in your actual MVC application and not your MVC test application). Creating the JavaScript Test Helper Class To make it easier to use the Microsoft Script Control in unit tests, we can create a helper class. This class contains two methods: LoadFile() – Loads a JavaScript file. Use this method to load the JavaScript file being tested or the JavaScript file containing the unit tests. ExecuteTest() – Executes the JavaScript code. Use this method to execute a JavaScript unit test. Here’s the code for the JavaScriptTestHelper class: JavaScriptTestHelper.cs using System; using System.IO; using Microsoft.VisualStudio.TestTools.UnitTesting; using MSScriptControl; namespace MvcApplication1.Tests { public class JavaScriptTestHelper : IDisposable { private ScriptControl _sc; private TestContext _context; /// <summary> /// You need to use this helper with Unit Tests and not /// Basic Unit Tests because you need a Test Context /// </summary> /// <param name="testContext">Unit Test Test Context</param> public JavaScriptTestHelper(TestContext testContext) { if (testContext == null) { throw new ArgumentNullException("TestContext"); } _context = testContext; _sc = new ScriptControl(); _sc.Language = "JScript"; _sc.AllowUI = false; } /// <summary> /// Load the contents of a JavaScript file into the /// Script Engine. /// </summary> /// <param name="path">Path to JavaScript file</param> public void LoadFile(string path) { var fileContents = File.ReadAllText(path); _sc.AddCode(fileContents); } /// <summary> /// Pass the path of the test that you want to execute. /// </summary> /// <param name="testMethodName">JavaScript function name</param> public void ExecuteTest(string testMethodName) { dynamic result = null; try { result = _sc.Run(testMethodName, new object[] { }); } catch { var error = ((IScriptControl)_sc).Error; if (error != null) { var description = error.Description; var line = error.Line; var column = error.Column; var text = error.Text; var source = error.Source; if (_context != null) { var details = String.Format("{0} \r\nLine: {1} Column: {2}", source, line, column); _context.WriteLine(details); } } throw new AssertFailedException(error.Description); } } public void Dispose() { _sc = null; } } } Notice that the JavaScriptTestHelper class requires a Test Context to be instantiated. For this reason, you can use the JavaScriptTestHelper only with a Visual Studio Unit Test and not a Basic Unit Test (These are two different types of Visual Studio project items). Add the JavaScriptTestHelper file to your MVC test application (for example, MvcApplication1.Tests). Creating the JavaScript Unit Test Next, we need to create the JavaScript unit test function that we will use to test the addNumbers() function. Create a folder in your MVC test project named JavaScriptTests and add the following JavaScript file to this folder: MvcApplication1.Tests\JavaScriptTests\MathTest.js /// <reference path="JavaScriptUnitTestFramework.js"/> function testAddNumbers() { // Act var result = addNumbers(1, 3); // Assert assert.areEqual(4, result, "addNumbers did not return right value!"); } The testAddNumbers() function takes advantage of another JavaScript library named JavaScriptUnitTestFramework.js. This library contains all of the code necessary to make assertions. Add the following JavaScriptnitTestFramework.js to the same folder as the MathTest.js file: MvcApplication1.Tests\JavaScriptTests\JavaScriptUnitTestFramework.js var assert = { areEqual: function (expected, actual, message) { if (expected !== actual) { throw new Error("Expected value " + expected + " is not equal to " + actual + ". " + message); } } }; There is only one type of assertion supported by this file: the areEqual() assertion. Most likely, you would want to add additional types of assertions to this file to make it easier to write your JavaScript unit tests. Deploying the JavaScript Test Files This step is non-intuitive. When you use Visual Studio to run unit tests, Visual Studio creates a new folder and executes a copy of the files in your project. After you run your unit tests, your Visual Studio Solution will contain a new folder named TestResults that includes a subfolder for each test run. You need to configure Visual Studio to deploy your JavaScript files to the test run folder or Visual Studio won’t be able to find your JavaScript files when you execute your unit tests. You will get an error that looks something like this when you attempt to execute your unit tests: You can configure Visual Studio to deploy your JavaScript files by adding a Test Settings file to your Visual Studio Solution. It is important to understand that you need to add this file to your Visual Studio Solution and not a particular Visual Studio project. Right-click your Solution in the Solution Explorer window and select the menu option Add, New Item. Select the Test Settings item and click the Add button. After you create a Test Settings file for your solution, you can indicate that you want a particular folder to be deployed whenever you perform a test run. Select the menu option Test, Edit Test Settings to edit your test configuration file. Select the Deployment tab and select your MVC test project’s JavaScriptTest folder to deploy. Click the Apply button and the Close button to save the changes and close the dialog. Creating the Visual Studio Unit Test The very last step is to create the Visual Studio unit test (the MS Test unit test). Add a new unit test to your MVC test project by selecting the menu option Add New Item and selecting the Unit Test project item (Do not select the Basic Unit Test project item): The difference between a Basic Unit Test and a Unit Test is that a Unit Test includes a Test Context. We need this Test Context to use the JavaScriptTestHelper class that we created earlier. Enter the following test method for the new unit test: [TestMethod] public void TestAddNumbers() { var jsHelper = new JavaScriptTestHelper(this.TestContext); // Load JavaScript files jsHelper.LoadFile("JavaScriptUnitTestFramework.js"); jsHelper.LoadFile(@"..\..\..\MvcApplication1\Scripts\Math.js"); jsHelper.LoadFile("MathTest.js"); // Execute JavaScript Test jsHelper.ExecuteTest("testAddNumbers"); } This code uses the JavaScriptTestHelper to load three files: JavaScripUnitTestFramework.js – Contains the assert functions. Math.js – Contains the addNumbers() function from your MVC application which is being tested. MathTest.js – Contains the JavaScript unit test function. Next, the test method calls the JavaScriptTestHelper ExecuteTest() method to execute the testAddNumbers() JavaScript function. Running the Visual Studio JavaScript Unit Test After you complete all of the steps described above, you can execute the JavaScript unit test just like any other unit test. You can use the keyboard combination CTRL-R, CTRL-A to run all of the tests in the current Visual Studio Solution. Alternatively, you can use the buttons in the Visual Studio toolbar to run the tests: (Unfortunately, the Run All Impacted Tests button won’t work correctly because Visual Studio won’t detect that your JavaScript code has changed. Therefore, you should use either the Run Tests in Current Context or Run All Tests in Solution options instead.) The results of running the JavaScript tests appear side-by-side with the results of running the server tests in the Test Results window. For example, if you Run All Tests in Solution then you will get the following results: Notice that the TestAddNumbers() JavaScript test has failed. That is good because our addNumbers() function is hard-coded to always return the value 5. If you double-click the failing JavaScript test, you can view additional details such as the JavaScript error message and the line number of the JavaScript code that failed: Summary The goal of this blog entry was to explain an approach to creating JavaScript unit tests that can be easily integrated with Visual Studio and Visual Studio ALM. I described how you can use the Microsoft Script Control to execute JavaScript on the server. By taking advantage of the Microsoft Script Control, we were able to execute our JavaScript unit tests side-by-side with all of our other unit tests and view the results in the standard Visual Studio Test Results window. You can download the code discussed in this blog entry from here: http://StephenWalther.com/downloads/Blog/JavaScriptUnitTesting/JavaScriptUnitTests.zip Before running this code, you need to first install the Microsoft Script Control which you can download from here: http://www.microsoft.com/downloads/en/details.aspx?FamilyID=d7e31492-2595-49e6-8c02-1426fec693ac

Read the article
Microsoft and jQuery

- by Rick Strahl

The jQuery JavaScript library has been steadily getting more popular and with recent developments from Microsoft, jQuery is also getting ever more exposure on the ASP.NET platform including now directly from Microsoft. jQuery is a light weight, open source DOM manipulation library for JavaScript that has changed how many developers think about JavaScript. You can download it and find more information on jQuery on www.jquery.com. For me jQuery has had a huge impact on how I develop Web applications and was probably the main reason I went from dreading to do JavaScript development to actually looking forward to implementing client side JavaScript functionality. It has also had a profound impact on my JavaScript skill level for me by seeing how the library accomplishes things (and often reviewing the terse but excellent source code). jQuery made an uncomfortable development platform (JavaScript + DOM) a joy to work on. Although jQuery is by no means the only JavaScript library out there, its ease of use, small size, huge community of plug-ins and pure usefulness has made it easily the most popular JavaScript library available today. As a long time jQuery user, I’ve been excited to see the developments from Microsoft that are bringing jQuery to more ASP.NET developers and providing more integration with jQuery for ASP.NET’s core features rather than relying on the ASP.NET AJAX library. Microsoft and jQuery – making Friends jQuery is an open source project but in the last couple of years Microsoft has really thrown its weight behind supporting this open source library as a supported component on the Microsoft platform. When I say supported I literally mean supported: Microsoft now offers actual tech support for jQuery as part of their Product Support Services (PSS) as jQuery integration has become part of several of the ASP.NET toolkits and ships in several of the default Web project templates in Visual Studio 2010. The ASP.NET MVC 3 framework (still in Beta) also uses jQuery for a variety of client side support features including client side validation and we can look forward toward more integration of client side functionality via jQuery in both MVC and WebForms in the future. In other words jQuery is becoming an optional but included component of the ASP.NET platform. PSS support means that support staff will answer jQuery related support questions as part of any support incidents related to ASP.NET which provides some piece of mind to some corporate development shops that require end to end support from Microsoft. In addition to including jQuery and supporting it, Microsoft has also been getting involved in providing development resources for extending jQuery’s functionality via plug-ins. Microsoft’s last version of the Microsoft Ajax Library – which is the successor to the native ASP.NET AJAX Library – included some really cool functionality for client templates, databinding and localization. As it turns out Microsoft has rebuilt most of that functionality using jQuery as the base API and provided jQuery plug-ins of these components. Very recently these three plug-ins were submitted and have been approved for inclusion in the official jQuery plug-in repository and been taken over by the jQuery team for further improvements and maintenance. Even more surprising: The jQuery-templates component has actually been approved for inclusion in the next major update of the jQuery core in jQuery V1.5, which means it will become a native feature that doesn’t require additional script files to be loaded. Imagine this – an open source contribution from Microsoft that has been accepted into a major open source project for a core feature improvement. Microsoft has come a long way indeed! What the Microsoft Involvement with jQuery means to you For Microsoft jQuery support is a strategic decision that affects their direction in client side development, but nothing stopped you from using jQuery in your applications prior to Microsoft’s official backing and in fact a large chunk of developers did so readily prior to Microsoft’s announcement. Official support from Microsoft brings a few benefits to developers however. jQuery support in Visual Studio 2010 means built-in support for jQuery IntelliSense, automatically added jQuery scripts in many projects types and a common base for client side functionality that actually uses what most developers are already using. If you have already been using jQuery and were worried about straying from the Microsoft line and their internal Microsoft Ajax Library – worry no more. With official support and the change in direction towards jQuery Microsoft is now following along what most in the ASP.NET community had already been doing by using jQuery, which is likely the reason for Microsoft’s shift in direction in the first place. ASP.NET AJAX and the Microsoft AJAX Library weren’t bad technology – there was tons of useful functionality buried in these libraries. However, these libraries never got off the ground, mainly because early incarnations were squarely aimed at control/component developers rather than application developers. For all the functionality that these controls provided for control developers they lacked in useful and easily usable application developer functionality that was easily accessible in day to day client side development. The result was that even though Microsoft shipped support for these tools in the box (in .NET 3.5 and 4.0), other than for the internal support in ASP.NET for things like the UpdatePanel and the ASP.NET AJAX Control Toolkit as well as some third party vendors, the Microsoft client libraries were largely ignored by the developer community opening the door for other client side solutions. Microsoft seems to be acknowledging developer choice in this case: Many more developers were going down the jQuery path rather than using the Microsoft built libraries and there seems to be little sense in continuing development of a technology that largely goes unused by the majority of developers. Kudos for Microsoft for recognizing this and gracefully changing directions. Note that even though there will be no further development in the Microsoft client libraries they will continue to be supported so if you’re using them in your applications there’s no reason to start running for the exit in a panic and start re-writing everything with jQuery. Although that might be a reasonable choice in some cases, jQuery and the Microsoft libraries work well side by side so that you can leave existing solutions untouched even as you enhance them with jQuery. The Microsoft jQuery Plug-ins – Solid Core Features One of the most interesting developments in Microsoft’s embracing of jQuery is that Microsoft has started contributing to jQuery via standard mechanism set for jQuery developers: By submitting plug-ins. Microsoft took some of the nicest new features of the unpublished Microsoft Ajax Client Library and re-wrote these components for jQuery and then submitted them as plug-ins to the jQuery plug-in repository. Accepted plug-ins get taken over by the jQuery team and that’s exactly what happened with the three plug-ins submitted by Microsoft with the templating plug-in even getting slated to be published as part of the jQuery core in the next major release (1.5). The following plug-ins are provided by Microsoft: jQuery Templates – a client side template rendering engine jQuery Data Link – a client side databinder that can synchronize changes without code jQuery Globalization – provides formatting and conversion features for dates and numbers The first two are ports of functionality that was slated for the Microsoft Ajax Library while functionality for the globalization library provides functionality that was already found in the original ASP.NET AJAX library. To me all three plug-ins address a pressing need in client side applications and provide functionality I’ve previously used in other incarnations, but with more complete implementations. Let’s take a close look at these plug-ins. jQuery Templates http://api.jquery.com/category/plugins/templates/ Client side templating is a key component for building rich JavaScript applications in the browser. Templating on the client lets you avoid from manually creating markup by creating DOM nodes and injecting them individually into the document via code. Rather you can create markup templates – similar to the way you create classic ASP server markup – and merge data into these templates to render HTML which you can then inject into the document or replace existing content with. Output from templates are rendered as a jQuery matched set and can then be easily inserted into the document as needed. Templating is key to minimize client side code and reduce repeated code for rendering logic. Instead a single template can be used in many places for updating and adding content to existing pages. Further if you build pure AJAX interfaces that rely entirely on client rendering of the initial page content, templates allow you to a use a single markup template to handle all rendering of each specific HTML section/element. I’ve used a number of different client rendering template engines with jQuery in the past including jTemplates (a PHP style templating engine) and a modified version of John Resig’s MicroTemplating engine which I built into my own set of libraries because it’s such a commonly used feature in my client side applications. jQuery templates adds a much richer templating model that allows for sub-templates and access to the data items. Like John Resig’s original Micro Template engine, the core basics of the templating engine create JavaScript code which means that templates can include JavaScript code. To give you a basic idea of how templates work imagine I have an application that downloads a set of stock quotes based on a symbol list then displays them in the document. To do this you can create an ‘item’ template that describes how each of the quotes is renderd as a template inside of the document: <script id="stockTemplate" type="text/x-jquery-tmpl"> <div id="divStockQuote" class="errordisplay" style="width: 500px;"> <div class="label">Company:</div><div><b>${Company}(${Symbol})</b></div> <div class="label">Last Price:</div><div>${LastPrice}</div> <div class="label">Net Change:</div><div> {{if NetChange > 0}} <b style="color:green" >${NetChange}</b> {{else}} <b style="color:red" >${NetChange}</b> {{/if}} </div> <div class="label">Last Update:</div><div>${LastQuoteTimeString}</div> </div> </script> The ‘template’ is little more than HTML with some markup expressions inside of it that define the template language. Notice the embedded ${} expressions which reference data from the quote objects returned from an AJAX call on the server. You can embed any JavaScript or value expression in these template expressions. There are also a number of structural commands like {{if}} and {{each}} that provide for rudimentary logic inside of your templates as well as commands ({{tmpl}} and {{wrap}}) for nesting templates. You can find more about the full set of markup expressions available in the documentation. To load up this data you can use code like the following: <script type="text/javascript"> //var Proxy = new ServiceProxy("../PageMethods/PageMethodsService.asmx/"); $(document).ready(function () { $("#btnGetQuotes").click(GetQuotes); }); function GetQuotes() { var symbols = $("#txtSymbols").val().split(","); $.ajax({ url: "../PageMethods/PageMethodsService.asmx/GetStockQuotes", data: JSON.stringify({ symbols: symbols }), // parameter map type: "POST", // data has to be POSTed contentType: "application/json", timeout: 10000, dataType: "json", success: function (result) { var quotes = result.d; var jEl = $("#stockTemplate").tmpl(quotes); $("#quoteDisplay").empty().append(jEl); }, error: function (xhr, status) { alert(status + "\r\n" + xhr.responseText); } }); }; </script> In this case an ASMX AJAX service is called to retrieve the stock quotes. The service returns an array of quote objects. The result is returned as an object with the .d property (in Microsoft service style) that returns the actual array of quotes. The template is applied with: var jEl = $("#stockTemplate").tmpl(quotes); which selects the template script tag and uses the .tmpl() function to apply the data to it. The result is a jQuery matched set of elements that can then be appended to the quote display element in the page. The template is merged against an array in this example. When the result is an array the template is automatically applied to each each array item. If you pass a single data item – like say a stock quote – the template works exactly the same way but is applied only once. Templates also have access to a $data item which provides the current data item and information about the tempalte that is currently executing. This makes it possible to keep context within the context of the template itself and also to pass context from a parent template to a child template which is very powerful. Templates can be evaluated by using the template selector and calling the .tmpl() function on the jQuery matched set as shown above or you can use the static $.tmpl() function to provide a template as a string. This allows you to dynamically create templates in code or – more likely – to load templates from the server via AJAX calls. In short there are options The above shows off some of the basics, but there’s much for functionality available in the template engine. Check the documentation link for more information and links to additional examples. The plug-in download also comes with a number of examples that demonstrate functionality. jQuery templates will become a native component in jQuery Core 1.5, so it’s definitely worthwhile checking out the engine today and get familiar with this interface. As much as I’m stoked about templating becoming part of the jQuery core because it’s such an integral part of many applications, there are also a couple shortcomings in the current incarnation: Lack of Error Handling Currently if you embed an expression that is invalid it’s simply not rendered. There’s no error rendered into the template nor do the various template functions throw errors which leaves finding of bugs as a runtime exercise. I would like some mechanism – optional if possible – to be able to get error info of what is failing in a template when it’s rendered. No String Output Templates are always rendered into a jQuery matched set and there’s no way that I can see to directly render to a string. String output can be useful for debugging as well as opening up templating for creating non-HTML string output. Limited JavaScript Access Unlike John Resig’s original MicroTemplating Engine which was entirely based on JavaScript code generation these templates are limited to a few structured commands that can ‘execute’. There’s no code execution inside of script code which means you’re limited to calling expressions available in global objects or the data item passed in. This may or may not be a big deal depending on the complexity of your template logic. Error handling has been discussed quite a bit and it’s likely there will be some solution to that particualar issue by the time jQuery templates ship. The others are relatively minor issues but something to think about anyway. jQuery Data Link http://api.jquery.com/category/plugins/data-link/ jQuery Data Link provides the ability to do two-way data binding between input controls and an underlying object’s properties. The typical scenario is linking a textbox to a property of an object and have the object updated when the text in the textbox is changed and have the textbox change when the value in the object or the entire object changes. The plug-in also supports converter functions that can be applied to provide the conversion logic from string to some other value typically necessary for mapping things like textbox string input to say a number property and potentially applying additional formatting and calculations. In theory this sounds great, however in reality this plug-in has some serious usability issues. Using the plug-in you can do things like the following to bind data: person = { firstName: "rick", lastName: "strahl"}; $(document).ready( function() { // provide for two-way linking of inputs $("form").link(person); // bind to non-input elements explicitly $("#objFirst").link(person, { firstName: { name: "objFirst", convertBack: function (value, source, target) { $(target).text(value); } } }); $("#objLast").link(person, { lastName: { name: "objLast", convertBack: function (value, source, target) { $(target).text(value); } } }); }); This code hooks up two-way linking between a couple of textboxes on the page and the person object. The first line in the .ready() handler provides mapping of object to form field with the same field names as properties on the object. Note that .link() does NOT bind items into the textboxes when you call .link() – changes are mapped only when values change and you move out of the field. Strike one. The two following commands allow manual binding of values to specific DOM elements which is effectively a one-way bind. You specify the object and a then an explicit mapping where name is an ID in the document. The converter is required to explicitly assign the value to the element. Strike two. You can also detect changes to the underlying object and cause updates to the input elements bound. Unfortunately the syntax to do this is not very natural as you have to rely on the jQuery data object. To update an object’s properties and get change notification looks like this: function updateFirstName() { $(person).data("firstName", person.firstName + " (code updated)"); } This works fine in causing any linked fields to be updated. In the bindings above both the firstName input field and objFirst DOM element gets updated. But the syntax requires you to use a jQuery .data() call for each property change to ensure that the changes are tracked properly. Really? Sure you’re binding through multiple layers of abstraction now but how is that better than just manually assigning values? The code savings (if any) are going to be minimal. As much as I would like to have a WPF/Silverlight/Observable-like binding mechanism in client script, this plug-in doesn’t help much towards that goal in its current incarnation. While you can bind values, the ‘binder’ is too limited to be really useful. If initial values can’t be assigned from the mappings you’re going to end up duplicating work loading the data using some other mechanism. There’s no easy way to re-bind data with a different object altogether since updates trigger only through the .data members. Finally, any non-input elements have to be bound via code that’s fairly verbose and frankly may be more voluminous than what you might write by hand for manual binding and unbinding. Two way binding can be very useful but it has to be easy and most importantly natural. If it’s more work to hook up a binding than writing a couple of lines to do binding/unbinding this sort of thing helps very little in most scenarios. In talking to some of the developers the feature set for Data Link is not complete and they are still soliciting input for features and functionality. If you have ideas on how you want this feature to be more useful get involved and post your recommendations. As it stands, it looks to me like this component needs a lot of love to become useful. For this component to really provide value, bindings need to be able to be refreshed easily and work at the object level, not just the property level. It seems to me we would be much better served by a model binder object that can perform these binding/unbinding tasks in bulk rather than a tool where each link has to be mapped first. I also find the choice of creating a jQuery plug-in questionable – it seems a standalone object – albeit one that relies on the jQuery library – would provide a more intuitive interface than the current forcing of options onto a plug-in style interface. Out of the three Microsoft created components this is by far the least useful and least polished implementation at this point. jQuery Globalization http://github.com/jquery/jquery-global Globalization in JavaScript applications often gets short shrift and part of the reason for this is that natively in JavaScript there’s little support for formatting and parsing of numbers and dates. There are a number of JavaScript libraries out there that provide some support for globalization, but most are limited to a particular portion of globalization. As .NET developers we’re fairly spoiled by the richness of APIs provided in the framework and when dealing with client development one really notices the lack of these features. While you may not necessarily need to localize your application the globalization plug-in also helps with some basic tasks for non-localized applications: Dealing with formatting and parsing of dates and time values. Dates in particular are problematic in JavaScript as there are no formatters whatsoever except the .toString() method which outputs a verbose and next to useless long string. With the globalization plug-in you get a good chunk of the formatting and parsing functionality that the .NET framework provides on the server. You can write code like the following for example to format numbers and dates: var date = new Date(); var output = $.format(date, "MMM. dd, yy") + "\r\n" + $.format(date, "d") + "\r\n" + // 10/25/2010 $.format(1222.32213, "N2") + "\r\n" + $.format(1222.33, "c") + "\r\n"; alert(output); This becomes even more useful if you combine it with templates which can also include any JavaScript expressions. Assuming the globalization plug-in is loaded you can create template expressions that use the $.format function. Here’s the template I used earlier for the stock quote again with a couple of formats applied: <script id="stockTemplate" type="text/x-jquery-tmpl"> <div id="divStockQuote" class="errordisplay" style="width: 500px;"> <div class="label">Company:</div><div><b>${Company}(${Symbol})</b></div> <div class="label">Last Price:</div> <div>${$.format(LastPrice,"N2")}</div> <div class="label">Net Change:</div><div> {{if NetChange > 0}} <b style="color:green" >${NetChange}</b> {{else}} <b style="color:red" >${NetChange}</b> {{/if}} </div> <div class="label">Last Update:</div> <div>${$.format(LastQuoteTime,"MMM dd, yyyy")}</div> </div> </script> There are also parsing methods that can parse dates and numbers from strings into numbers easily: alert($.parseDate("25.10.2010")); alert($.parseInt("12.222")); // de-DE uses . for thousands separators As you can see culture specific options are taken into account when parsing. The globalization plugin provides rich support for a variety of locales: Get a list of all available cultures Query cultures for culture items (like currency symbol, separators etc.) Localized string names for all calendar related items (days of week, months) Generated off of .NET’s supported locales In short you get much of the same functionality that you already might be using in .NET on the server side. The plugin includes a huge number of locales and an Globalization.all.min.js file that contains the text defaults for each of these locales as well as small locale specific script files that define each of the locale specific settings. It’s highly recommended that you NOT use the huge globalization file that includes all locales, but rather add script references to only those languages you explicitly care about. Overall this plug-in is a welcome helper. Even if you use it with a single locale (like en-US) and do no other localization, you’ll gain solid support for number and date formatting which is a vital feature of many applications. Changes for Microsoft It’s good to see Microsoft coming out of its shell and away from the ‘not-built-here’ mentality that has been so pervasive in the past. It’s especially good to see it applied to jQuery – a technology that has stood in drastic contrast to Microsoft’s own internal efforts in terms of design, usage model and… popularity. It’s great to see that Microsoft is paying attention to what customers prefer to use and supporting the customer sentiment – even if it meant drastically changing course of policy and moving into a more open and sharing environment in the process. The additional jQuery support that has been introduced in the last two years certainly has made lives easier for many developers on the ASP.NET platform. It’s also nice to see Microsoft submitting proposals through the standard jQuery process of plug-ins and getting accepted for various very useful projects. Certainly the jQuery Templates plug-in is going to be very useful to many especially since it will be baked into the jQuery core in jQuery 1.5. I hope we see more of this type of involvement from Microsoft in the future. Kudos!© Rick Strahl, West Wind Technologies, 2005-2010Posted in jQuery ASP.NET

Read the article
.NET HTML Sanitation for rich HTML Input

- by Rick Strahl

Recently I was working on updating a legacy application to MVC 4 that included free form text input. When I set up the new site my initial approach was to not allow any rich HTML input, only simple text formatting that would respect a few simple HTML commands for bold, lists etc. and automatically handles line break processing for new lines and paragraphs. This is typical for what I do with most multi-line text input in my apps and it works very well with very little development effort involved. Then the client sprung another note: Oh by the way we have a bunch of customers (real estate agents) who need to post complete HTML documents. Oh uh! There goes the simple theory. After some discussion and pleading on my part (<snicker>) to try and avoid this type of raw HTML input because of potential XSS issues, the client decided to go ahead and allow raw HTML input anyway. There has been lots of discussions on this subject on StackOverFlow (and here and here) but to after reading through some of the solutions I didn't really find anything that would work even closely for what I needed. Specifically we need to be able to allow just about any HTML markup, with the exception of script code. Remote CSS and Images need to be loaded, links need to work and so. While the 'legit' HTML posted by these agents is basic in nature it does span most of the full gamut of HTML (4). Most of the solutions XSS prevention/sanitizer solutions I found were way to aggressive and rendered the posted output unusable mostly because they tend to strip any externally loaded content. In short I needed a custom solution. I thought the best solution to this would be to use an HTML parser - in this case the Html Agility Pack - and then to run through all the HTML markup provided and remove any of the blacklisted tags and a number of attributes that are prone to JavaScript injection. There's much discussion on whether to use blacklists vs. whitelists in the discussions mentioned above, but I found that whitelists can make sense in simple scenarios where you might allow manual HTML input, but when you need to allow a larger array of HTML functionality a blacklist is probably easier to manage as the vast majority of elements and attributes could be allowed. Also white listing gets a bit more complex with HTML5 and the new proliferation of new HTML tags and most new tags generally don't affect XSS issues directly. Pure whitelisting based on elements and attributes also doesn't capture many edge cases (see some of the XSS cheat sheets listed below) so even with a white list, custom logic is still required to handle many of those edge cases. The Microsoft Web Protection Library (AntiXSS) My first thought was to check out the Microsoft AntiXSS library. Microsoft has an HTML Encoding and Sanitation library in the Microsoft Web Protection Library (formerly AntiXSS Library) on CodePlex, which provides stricter functions for whitelist encoding and sanitation. Initially I thought the Sanitation class and its static members would do the trick for me,but I found that this library is way too restrictive for my needs. Specifically the Sanitation class strips out images and links which rendered the full HTML from our real estate clients completely useless. I didn't spend much time with it, but apparently I'm not alone if feeling this library is not really useful without some way to configure operation. To give you an example of what didn't work for me with the library here's a small and simple HTML fragment that includes script, img and anchor tags. I would expect the script to be stripped and everything else to be left intact. Here's the original HTML:var value = "<b>Here</b> <script>alert('hello')</script> we go. Visit the " + "<a href='http://west-wind.com'>West Wind</a> site. " + "<img src='http://west-wind.com/images/new.gif' /> " ; and the code to sanitize it with the AntiXSS Sanitize class:@Html.Raw(Microsoft.Security.Application.Sanitizer.GetSafeHtmlFragment(value)) This produced a not so useful sanitized string: Here we go. Visit the <a>West Wind</a> site. While it removed the <script> tag (good) it also removed the href from the link and the image tag altogether (bad). In some situations this might be useful, but for most tasks I doubt this is the desired behavior. While links can contain javascript: references and images can 'broadcast' information to a server, without configuration to tell the library what to restrict this becomes useless to me. I couldn't find any way to customize the white list, nor is there code available in this 'open source' library on CodePlex. Using Html Agility Pack for HTML Parsing The WPL library wasn't going to cut it. After doing a bit of research I decided the best approach for a custom solution would be to use an HTML parser and inspect the HTML fragment/document I'm trying to import. I've used the HTML Agility Pack before for a number of apps where I needed an HTML parser without requiring an instance of a full browser like the Internet Explorer Application object which is inadequate in Web apps. In case you haven't checked out the Html Agility Pack before, it's a powerful HTML parser library that you can use from your .NET code. It provides a simple, parsable HTML DOM model to full HTML documents or HTML fragments that let you walk through each of the elements in your document. If you've used the HTML or XML DOM in a browser before you'll feel right at home with the Agility Pack. Blacklist based HTML Parsing to strip XSS Code For my purposes of HTML sanitation, the process involved is to walk the HTML document one element at a time and then check each element and attribute against a blacklist. There's quite a bit of argument of what's better: A whitelist of allowed items or a blacklist of denied items. While whitelists tend to be more secure, they also require a lot more configuration. In the case of HTML5 a whitelist could be very extensive. For what I need, I only want to ensure that no JavaScript is executed, so a blacklist includes the obvious <script> tag plus any tag that allows loading of external content including <iframe>, <object>, <embed> and <link> etc. <form> is also excluded to avoid posting content to a different location. I also disallow <head> and <meta> tags in particular for my case, since I'm only allowing posting of HTML fragments. There is also some internal logic to exclude some attributes or attributes that include references to JavaScript or CSS expressions. The default tag blacklist reflects my use case, but is customizable and can be added to. Here's my HtmlSanitizer implementation:using System.Collections.Generic; using System.IO; using System.Xml; using HtmlAgilityPack; namespace Westwind.Web.Utilities { public class HtmlSanitizer { public HashSet<string> BlackList = new HashSet<string>() { { "script" }, { "iframe" }, { "form" }, { "object" }, { "embed" }, { "link" }, { "head" }, { "meta" } }; /// <summary> /// Cleans up an HTML string and removes HTML tags in blacklist /// </summary> /// <param name="html"></param> /// <returns></returns> public static string SanitizeHtml(string html, params string[] blackList) { var sanitizer = new HtmlSanitizer(); if (blackList != null && blackList.Length > 0) { sanitizer.BlackList.Clear(); foreach (string item in blackList) sanitizer.BlackList.Add(item); } return sanitizer.Sanitize(html); } /// <summary> /// Cleans up an HTML string by removing elements /// on the blacklist and all elements that start /// with onXXX . /// </summary> /// <param name="html"></param> /// <returns></returns> public string Sanitize(string html) { var doc = new HtmlDocument(); doc.LoadHtml(html); SanitizeHtmlNode(doc.DocumentNode); //return doc.DocumentNode.WriteTo(); string output = null; // Use an XmlTextWriter to create self-closing tags using (StringWriter sw = new StringWriter()) { XmlWriter writer = new XmlTextWriter(sw); doc.DocumentNode.WriteTo(writer); output = sw.ToString(); // strip off XML doc header if (!string.IsNullOrEmpty(output)) { int at = output.IndexOf("?>"); output = output.Substring(at + 2); } writer.Close(); } doc = null; return output; } private void SanitizeHtmlNode(HtmlNode node) { if (node.NodeType == HtmlNodeType.Element) { // check for blacklist items and remove if (BlackList.Contains(node.Name)) { node.Remove(); return; } // remove CSS Expressions and embedded script links if (node.Name == "style") { if (string.IsNullOrEmpty(node.InnerText)) { if (node.InnerHtml.Contains("expression") || node.InnerHtml.Contains("javascript:")) node.ParentNode.RemoveChild(node); } } // remove script attributes if (node.HasAttributes) { for (int i = node.Attributes.Count - 1; i >= 0; i--) { HtmlAttribute currentAttribute = node.Attributes[i]; var attr = currentAttribute.Name.ToLower(); var val = currentAttribute.Value.ToLower(); span style="background: white; color: green">// remove event handlers if (attr.StartsWith("on")) node.Attributes.Remove(currentAttribute); // remove script links else if ( //(attr == "href" || attr== "src" || attr == "dynsrc" || attr == "lowsrc") && val != null && val.Contains("javascript:")) node.Attributes.Remove(currentAttribute); // Remove CSS Expressions else if (attr == "style" && val != null && val.Contains("expression") || val.Contains("javascript:") || val.Contains("vbscript:")) node.Attributes.Remove(currentAttribute); } } } // Look through child nodes recursively if (node.HasChildNodes) { for (int i = node.ChildNodes.Count - 1; i >= 0; i--) { SanitizeHtmlNode(node.ChildNodes[i]); } } } } } Please note: Use this as a starting point only for your own parsing and review the code for your specific use case! If your needs are less lenient than mine were you can you can make this much stricter by not allowing src and href attributes or CSS links if your HTML doesn't allow it. You can also check links for external URLs and disallow those - lots of options. The code is simple enough to make it easy to extend to fit your use cases more specifically. It's also quite easy to make this code work using a WhiteList approach if you want to go that route. The code above is semi-generic for allowing full featured HTML fragments that only disallow script related content. The Sanitize method walks through each node of the document and then recursively drills into all of its children until the entire document has been traversed. Note that the code here uses an XmlTextWriter to write output - this is done to preserve XHTML style self-closing tags which are otherwise left as non-self-closing tags. The sanitizer code scans for blacklist elements and removes those elements not allowed. Note that the blacklist is configurable either in the instance class as a property or in the static method via the string parameter list. Additionally the code goes through each element's attributes and looks for a host of rules gleaned from some of the XSS cheat sheets listed at the end of the post. Clearly there are a lot more XSS vulnerabilities, but a lot of them apply to ancient browsers (IE6 and versions of Netscape) - many of these glaring holes (like CSS expressions - WTF IE?) have been removed in modern browsers. What a Pain To be honest this is NOT a piece of code that I wanted to write. I think building anything related to XSS is better left to people who have far more knowledge of the topic than I do. Unfortunately, I was unable to find a tool that worked even closely for me, or even provided a working base. For the project I was working on I had no choice and I'm sharing the code here merely as a base line to start with and potentially expand on for specific needs. It's sad that Microsoft Web Protection Library is currently such a train wreck - this is really something that should come from Microsoft as the systems vendor or possibly a third party that provides security tools. Luckily for my application we are dealing with a authenticated and validated users so the user base is fairly well known, and relatively small - this is not a wide open Internet application that's directly public facing. As I mentioned earlier in the post, if I had my way I would simply not allow this type of raw HTML input in the first place, and instead rely on a more controlled HTML input mechanism like MarkDown or even a good HTML Edit control that can provide some limits on what types of input are allowed. Alas in this case I was overridden and we had to go forward and allow *any* raw HTML posted. Sometimes I really feel sad that it's come this far - how many good applications and tools have been thwarted by fear of XSS (or worse) attacks? So many things that could be done *if* we had a more secure browser experience and didn't have to deal with every little script twerp trying to hack into Web pages and obscure browser bugs. So much time wasted building secure apps, so much time wasted by others trying to hack apps… We're a funny species - no other species manages to waste as much time, effort and resources as we humans do :-) Resources Code on GitHub Html Agility Pack XSS Cheat Sheet XSS Prevention Cheat Sheet Microsoft Web Protection Library (AntiXss) StackOverflow Links: http://stackoverflow.com/questions/341872/html-sanitizer-for-net http://blog.stackoverflow.com/2008/06/safe-html-and-xss/ http://code.google.com/p/subsonicforums/source/browse/trunk/SubSonic.Forums.Data/HtmlScrubber.cs?r=61© Rick Strahl, West Wind Technologies, 2005-2012Posted in Security HTML ASP.NET JavaScript Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article
UPDATE FOR BI PUBLISHER ENTERPRISE 10.1.3.4.1 MARCH 2010

- by Tim Dexter

Latest roll up patch for 10.1.3.4.1 is now out in the wild. Yep, there are bug fixes but the guys have implemented some great enhancements. I'll be covering some of them over the coming weeks, from collapsing bookmarks in your PDFs to better MS AD support to 'true' Excel templates, yes you read that correctly! Patch is available from Oracle's support site. Just search for patch 9546699. Here's the contents and readme, apologies for the big list but at least you can search against it for a particular fix. This patch contains backports of following bugs for BI Publisher Enterprise 10.1.3.4.0 and 10.1.3.4.1. 6193342 - REG:SAMPLE DATA FILE FOR PDF FORM MAPPING IS NOT VALIDATED 6261875 - ERRONEOUS PRECISION VALIDATION ON ONLINE ANALYZER 6439437 - NULL POINTER EXCEPTION WHEN PROCESSING TABLE OF CONTENT 6460974 - BACS EFT PAYMENT INSTRUCTION OUTPUT FILE IS EMPTY 6939721 - BIP: REPORT BUSTING DELIVERY KEY VALUES CANNOT CONTAIN SEVERAL SPECIAL CHARACTER 6996069 - USING XML DB FOR BI REPOSITORY FAILS WITH RESOURCENOTFOUNDEXCEPTION 7207434 - TIMEZONE:SHOULD NOT DO TIMEZONE CONVERSION AGAINST CANONICAL DATE YYYY-MM-DD 7371531 - SUPPORT FOR CSV OUTPUT FOR STRUCTURED XML AND NON SQL DATA SOURCES 7596148 - ER: LDAP FOR MS AD TO SEARCH FROM AD ROOT 7646139 - WEBSERVICES ERROR 7829516 - BIP STANDALONE FAILS TO BURST USING XSL-FO TEMPLATES 8219848 - PDF TEMPLATE REPORT NOT PERFORMING PAGE BREAK 8232116 - PARAMETER VALUE IS PASSED AS NULL,IF IT CONTAINS 'AND' WITHIN THE STRING 8250690 - NOT ABLE TO UPLOAD TEMPLATE VIA BIP API 8288459 - ER: QUERY BUILDER OPTION TO NOT INCLUDE TABLENAME. PREFIX IN SQL 8289600 - REPORT TITLE AND DESCRIPTION CAN'T SUPPORT MULTIPLE LANGUAGES 8327080 - CAN NOT CONFIGURE ORACLE EBUSINESS SUITE SECURITY MODEL WITH ORACLE RAC 8332164 - AN XDO PROPERTY TO ENABLE DEBUG LOGGING 8333289 - WEB SERVICE JOBS FAIL AFTER BIP STARTED UP 8340239 - HTTP NOTIFY IS MISSING IN SCHEDULEREPORTREQUEST 8360933 - UNABLE TO USE LOGGED IN BI USER AS THE WSSECURITY USERNAME IN A VARIABLE FORMAT 8400744 - ADMINISTRATOR USER DOES NOT HAVE FULL ADMINISTRATOR RIGHTS 8402436 - CRASH CAUSED BY UNDETERMINED ATTRKEY ERROR IN MULTI-THREADED 8403779 - IMPOSSIBLE TO CONFIGURE PARAMETER FOR A REPORT 8412259 - PDF, RTF OUTPUT NOT HANDLING THE TABLE BORDER AND CONTENT OVERFLOWS TO NEXT PAGE 8483919 - DYNAMIC DATASOURCE WEBSERVICE SHOULD WORK WITH SERVERSIDE CONNECTIONS 8444382 - ID ATTRIBUTE IN TITLE-PAGE DOES NOT WORK WITH SELECTACTION PROPERTY 8446681 - UI LANGUAGE IS NOT REFLECTED AT THE FIRST LOG IN 8449884 - PUBLICREPORTSERVICE FAILS ON EMAIL DELIVERY USING BIP 10.1.3.4.0D+ - NPE 8454858 - DB: XMLP_ADMIN CAN SEE ALL THE FOLDERS BUT ONLY HAS VIEW PERMISSIONS 8458818 - PDFBOOKBINDER FAILS WITH OUTOFMEMORY ERROR WHEN TRYING TO BIND > 1500 PDFS 8463992 - INCORRECT IMPLEMENTATION OF XLIFF SPECIFICATION 8468777 - BI PUBLISHER QUERY BUILDER NOT LOADING SCHEMA OBJECTS 8477310 - QUERY BUILDER NOT WORK WITH SSL ON STANDALONE OC4J 8506701 - POSITIVE PAY FILE WITH OPTIONS NOT CREATING FILE CHECKS OVER 2500 8506761 - PERFORMANCE: PDFBOOKBINDER CLASS TAKES 4 HOURS TO BIND 4000 PAGES 8535604 - NPE WHEN CLICKING "ANALYZER FOR EXCEL" BUTTON IN ALL_* REPORTS 8536246 - REMOVE-PDF-FIELDS DOES NOT WORK WITH CHECKBOXES WITH OPT ARRAY 8541792 - NULLPOINTER EXCEPTION WHILE USING SFTP PROTOCOL 8554443 - LOGGING TIME STAMP IN 10G: THE HOUR PART IS WRONG 8558007 - UNABLE TO LOGIN BIP WITH UNPRIVILEGED USER WHEN XDB IS USED FOR REPORSITORY STOR 8565758 - NEED TO CONNECT IMPERSONATION TO DATA SOURCE WITH PL/SQL FUNCTION 8567235 - EFTPROCESSOR AND XDO DEBUG ENABLED CAUSES ORG.XML.SAX.SAXPARSEEXCEPTION 8572216 - EFTPROCESSOR NOT THREAD SAFE - CAUSING CORRUPTED REPORTS TO BE GENERATED 8575776 - LANDSCAPE REPORT ORIENTAION NOT SELECTED WHEN REPORT IS PRINTED WITH PS 8588330 - XLIFF GENERATING WITH WRONG MAXWIDTH ATTRIBUTE IN SOME TRANS-UNITS 8584446 - EFTGENERATOR DOES NOT USE XSLT SCALABILITY - JAVA.LANG.OUTOFMEMORY EXCEPTION 8594954 - ENG: BIP NOTIFY MESSAGE BECOMES ENGLISH 8599646 - ER:EXTRA SPACE ADDED BELOW IMAGE IN A TABLE CELL OF TEMPLATE IN FIREFOX 8605110 - PDFSIGNATURE API ENCOUNTERS JAVA.LANG.NULLPOINTEREXCEPTION ON PDF WITH WATERMARK 8660915 - BURSTING WITH DATA TEMPLATE NOT WORKING WITH OPTION: VALUE=FALSE 8660920 - ER: EXTRACT XHTML DATA USING XDODTEXE IN XHTML FORMAT 8667150 - PROBLEM WITH 3RD APPLICATION ABOUT PDF GENERATED WITH BI PUBLISHER 8683547 - "CLICK VIEW REPORT BUTTON TO GENERATE THE REPORT" MESSAGE IS DISPLAYED 8713080 - SEARCH" PARAMETER IS NOT SHOWING NON ENGLISH DATA IN INTERNET EXPLORER 8724778 - EXCEL ANALYZER PARAMETERS DO NOT WORK WITH EXCEL 2007 8725450 - UIX 2.3.6.6 UPTAKE FOR 10.1.3.4.1 8728807 - DYNAMIC JDBC DATA SOURCE WITH PRE-PROCESS FUNCTION BASED ON EXISTING DATA SOURCE 8759558 - XDO TEMPLATE SHOWS CURRENCY IN WRONG FORMAT FOR DUNNING 8792894 - EFTPROCESSOR DOES NOT SUPPORT XSL TEMPLATE AS INPUTSTREAM 8793550 - BIP GENERATES CSV REPORTS OUTPUT FORMAT WITH EXTENTION .OUT NOT .CSV IN EMAIL 8819869 - PERIOD CLOSE VALUE SUMMARY REPORT (XML) RUNNING INTO WARNING 8825732 - MY FOLDERS LINK BROKEN WITH USER NAME THAT INCLUDES A SLASH (/) OBIEE SECURITY 8831948 - TRYING TO GENERATE A SCATTER PLOT USING THE CHART WIZARD 8842299 - SEEDED QUERY ALWAYS RETURNS RESULTS BASED ON FIRST COLUMN 8858027 - NODE.GETTEXTCONTEXT() NOT AVAILABLE IN 10G UNDER OC4J 8859957 - REPORT TITLE ALIGNMENT GOES BAD FOR REPORTS WITH XLIFF FILE ATTACHED 8860957 - ER: IMPROVE PERFORMANCE OF ANSWERS PARAMETERS 8891537 - GETREPORTPARAMETERS WEB SERVICE API ISSUES WITH OAAM REPORTS 8891558 - GETTING SQLEXCEPTION IN GENERATEREPORT WEB SERVICE API ON OAAM REPORTS 8927796 - ER: DYANAMIC DATA SOURCE SUPPORT BY DATA SOURCE NAME 8969898 - BI PUBLISHER WEB SERVICE GETREPORTPARAMETERS DOES NOT TRANSLATE PARAMETER LABEL 8998967 - MULTIPLE XSL PREDICATES ELEMENT[A='A'] [B='B'] CAUSES XML-22019 ERROR 9012511 - SCALABLE MODE IS NOT WORKING IN XMLPUBLISHER 10.1.3.4 9016976 - ER: PRINT XSL-T AND FOPROCESSING TIMING INFORMATION 9018580 - WEB SERVICE CALL FAILS WHEN REPORT INCLUDES SEARCH TYPE 9018657 - JOB FAILS WHEN LOV QUERY CONTAINS BIND VARIABLES :XDO_USER_UI_LOCALE 9021224 - PERFORMANCE ISSUE TO VIEW DASHBOARD PAGE WITH BIP REPORT LINKS 9022440 - ER: SUPPORT "COMB OF N CHARACTERS" FEATURE PDF FORM TEXT FIELDS 9026236 - XPATH DOES NOT WORK CORRECTLY IN 10.1.3.4.1 9051652 - FILE EXTENSION OF CSV OUTPUT IS TXT WHEN IT IS EXPORTED FROM REPORT VIEWER 9053770 - WHEN SENDING CSV REPORT OUTPUT BY EMAIL SOMETIMES IT IS SENT WITHOUT EXTENSION 9066483 - PDFBOOKBINDER LEAVE SOME TEMPORARY FILES AFTER MERGING TITLE PAGE OR TOC 9102420 - USE RELATIVE PATHS IN HYPERLINKS 9127185 - CHECKBOX NOT WORK ON SUB TEMPLATE 9149679 - BASE URL IS NOT PASSED CORRECTLY 9149691 - PROVIDE A WAY TO DISABLE THE ABILITY TO CREATE SCHEDULED REPORT JOB "PUBLIC" 9167822 - NOTIFICATION URL BREAKS ON FOLDER NAMES WITH SPACES 9167913 - CHARTS ARE MISSING IN PDF OUTPUTS WHEN THE DEFAULT OUTPUT FORMAT IS NOT A PDF 9217965 - REPORT HISTORY TAKES LONG TIME TO RENDER THE PAGE 9236674 - BI PUBLISHER PARAMETERS DO NOT CASCADE REFRESH AFTER SECOND PARAMETER 9283933 - OPTION TO COLLAPSE PDF OUTPUT BOOKMARKS BY DEFAULT 9287245 - SAVE COMPLETED SCHEDULED REPORTS IN ITS REPORT NAME AND NOT IN A GENERIC NAME 9348862 - ADD FEATURE TO DISABLE THE XSLT1.0-COMPATIBILITY IN RTF TEMPLATE 9355897 - ER: NEED A SAFE DIVIDE FUNCTION 9364169 - UIX 2.3.6.6 PATCH UPTAKE FOR 10.1.3.4.1 9365153 - LEADING WHITESPACE CHARACTERS IN A FIELD TRIMMED WHEN RUN VIEW OR EXPORT TO .CSV 9389039 - LONG TEXT IS NOT WRAPPED PROPERLY IN THE AUTOSHAPE ON RTF TEMPLATE 9475697 - ENH: SUB-TEMPLATE:DYNAMIC VARIABLE WITH PARAMETER VALUE IN CALL-TEMPLATE CLAUSE 9484549 - CHANGE DEFAULT FOR "XSLT1.0-COMPATIBILITY" TO FALSE FOR 10G 9508499 - UNABLE TO READ EXCEL FILE IF MORE THAN 1800 ROWS GENERATED 9546078 - EMAIL DELIVERY INFORMATION SHOULD NOT BE SAVED AND AUTO-FED IN JOB SUBMISSION 9546101 - EXCEPTION OCCURS WHEN SFTP/FTP REMOTE FILENAME DOSE NOT CONTAIN A SLASH '/' 9546117 - SFTP REPORT DELIVERY FAILS WITH NO CLASS DEF FOUND EXCEPTION ON WEBLOGIC 9.2 Following bugs are included in 10.1.3.4.1 and they are only applied to 10.1.3.4.0. 4612604 - FROM EDGE ATTRIBUTE OF HEADER AND FOOTER IS NOT PRESERVED 6621006 - PARAMNAMEVALUE ELEMENT DEFINITION SHOULD HAVE PARAMETER TYPE 6811967 - DATE PARAMETER NOT HANDLING DATE OFFSET WHEN PASSED UPPERCASE Z FOR OFFSET 6864451 - WHEN BIP REPORTS TIMEOUT, THE PROCESS TO LOG BACK IN IS NOT USER FRIENDLY 6869887 - FUSION CURRENCY BRD:4.1.4/4.1.6 OVERRIDINDG MASK /W XSLT._XDOCURMASKS /W SYMBOL 6959078 - "TEXT FIELD CONTAINS COMMA-SEPARATED VALUES" DOESN'T WORK IN CASE OF STRING 6994647 - GETTING ERROR MESSAGE SAYING JOB FAILED EVEN THOUGH WORKS OK IN BI PUBLISHER 7133143 - ENABLE USER TO ENTER 'TODAY' AS VALUE TO DATE PARAMETER IN SCHEDULE REPORT UI 7165117 - QA_BIP_FUNC:-CLOSED LIFE TIME REPORT ERROR MESSAGE IN CMD 7167068 - LEADER-LENGTH OR RULE-THICKNESS PROPRTY IS TOO LARGE 7219517 - NEED EXTENSION FUNCTIONS TO URL ENCODE TEXT STRING. 7269228 - TEMPLATEHELPER PRODUCES A GARBLED OUTPUT WHEN INVOKED BY MULTIPLE THREADS 7276813 - GETREPORTPARAMETERSRETURN ELEMENT SHOULD HAVE DEFAULT VALUE 7279046 - SCHDEULER:UNABLE TO DELETE A JOB USING API 7280336 - ER: BI PUBLISHER - SITEMINDER SUPPORT - GENERIC NON-ORACLE SSO SUPPORT 7281468 - MODIFY SQL SERVER PROPERTIES TO USE HYP DATA DIRECT IN JDBCDEFAULTS.XML 7281495 - PLEASE ADD SUPPORTED DBS TO JDBCDEFAULT.XML AND LIST EACH DB VERSION SEPARATELY 7282456 - FUSION CURRENCY BRD 4.1.9.2: CURRENCY AMOUTS SHOULD NOT BE WRAPPED. MINUS SIGN 7282507 - FUSION CURRENCY BRD4.1.2.5:DISPLAY CURRENCY AND LOCALE DERIVED CURRENCY SYMBOL 7284780 - FUSION CURRENCY BRD 4.1.12.4 CORRECTLY ALIGN NEGATIVE CURRENCY AMOUNTS 7306874 - OPP ERROR - JAVA.LANG.OUTOFMEMORYERROR: ZIP002:OUTOFMEMORYERROR, MEM_ERROR 7309596 - SIEBELCRM: BIP ENHANCEMENT REQUEST FOR SIEBEL PARAMETERIZATION 7337173 - UI LOCALE IS ALWAYS REWRITTEN TO EN WHEN MOVE FROM DASHBOARD 7338349 - REG:ANALYZER REPORT WITH AVERAGE FUNCTION FAIL TO RUN FOR NON INTERACTIVE FORMAT 7343757 - OUTPUT FORMAT OF TEMPLATES IS NOT SAVING 7345989 - SET XDK REPLACEILLEGALCHARS AND ENHANCE XSLTWRAPPER WARNING 7354775 - UNEXPECTED BEHAVIOR OF LAYOUT TEMPLATE PARAMETER OF RUNREPORT WEBSERVICES API 7354798 - SEQUENCE ORDER OF PARAMETERS FOR THE RUNREPORT WEBSERVICES API 7358973 - PARALLEL SFTP DELIVERY FAILS DUE TO SSHEXCEPTION: CORRUPT MAC ON INPUT 7370110 - REGN:FAIL WHEN USE JNDI TO XMLDB REPORT REPOSITORY 7375859 - NEW WEBSERVICE REQUIRED FOR RUNREPORT 7375892 - REQUIRE NEW WEBSERVICE TO CHECK IF REPORTFOLDER EXISTS 7377686 - TEXT-ALIGN NOT APPLIED IN PDF IN HEBREW LOCALE 7413722 - RUNREPORT API DOES NOT PASS BACK ANY GENERATED EXCEPTIONS TO SCHEDULEREPORT 7435420 - FUSION CURRENCY: SUPPORT MICROSOFT(JAVA) FORMAT MASK WITH CURRENCY 7441486 - ER: ADD PARAMETER FOR SFTP TO BURSTING QUERY 7458169 - SSO WITH OID LDAP COULD NOT FETCH OID ROLES 7461161 - EMAIL DELIVERY FAILS - DELIVERYEXCEPTION: 0 BYTE AVAILABLE IN THE GIVEN INPU 7580715 - INCORRECT FORMATTING OF DATES IN TIMEZONE GMT+13 7582694 - INVALID MAXWIDTH VALUE CAUSES NLS FAILURES 7583693 - JAVA.LANG.NULLPOINTEREXCEPTION RAISED WHEN GENERATING HRMS BENEFITS PDF REPORT 7587998 - NEWLY CREATED USERS IN OID CANT ACCESS REPORTS UNTILL BI PUBLISHER IS RESTARTED 7588317 - TABLE OF CONTENT ALWAYS IN THE SAME FONT 7590084 - REMOVING THE BIP ENTERPRISE BANNER BUT KEEPING THE REPORTS & SCHEDULES TAB 7590112 - SOMEONE NOT PRIVILEGED ACCESS BIP DIRECTLY SHOULD GET A CUSTOM PAGE 7590125 - AUTOMATING CREATION OF USERS AND ROLES 7597902 - TIMEZONE SUPPORT IN RUNREPORT WEBSERVICE API 7599031 - XML PUBLISHER SUM(CURRENT-GROUP()) FAILS 7609178 - ISSUE WITH TAGS EXTRACTED FROM RTF TEMPLATE 7613024 - HEADER/FOOTER SETTINGS OF RTF TEMPLATE ARE NOT RETAINING IN THE RTF OUTPUT 7623988 - ADD XSLT FUNCTION TO PRINT XDO PROPERTIES 7625975 - RETRIEVING PARAMETER LOV FROM RTF TEMPLATE 7629445 - SPELL OUT A NUMBER INTO WORDS 7641827 - ANALYTICS FROZED AFTER PAGE TAB WHICH INCLUDES [BI PUBLISHER REPORT] WERE CLICKE 7645504 - BIP REPORT FROZED AFTER THE SAME DASHBOARD BIP REPORTS WERE CLICKED SIMULTANEOUS 7649561 - RECEIVE 'TO MANY OPEN FILE HANDLES' ERROR CAUSING BI TO CRASH 7654155 - BIP REMOVES THE FIRST FILE SEPARATOR WHEN RE-ENTER REPOSITORY LOCATION IN ADMIN 7656834 - NEED AN OPTION TO NOT APPEND SCHEMA NAME IN GENERATED QUERY 7660292 - ER: XDOPARSER UPGRADE TO XDK 11G 7687862 - BIP DATA EXTRACTING ENHANCEMENT FOR SIEBEL BIP INTEGRATION 7694875 - ADMINISTRATOR IS SUPER USER WHETHER CONFIGURED MANDATORY_USER_ROLE OR NOT 7697592 - BI PUBLISHER STRINGINDEXOUTOFBOUNDSEXCEPTION WHEN PRINTING LABEL FROM SIM 7702372 - ARABIC/ENGLISH NUMBER/DATE PROBLEM, TOTAL PAGE NUMBER NOT RENDERED IN ENGLISH 7707987 - OUTOFMEMORY BURSTING A BI PUBLISHER REPORT BI SERVER DATA SOURCE 7712026 - ER: CHANGE CHART OUTPUT FORMAT TO PNG IN HTML OUTPUT 7833732 - THE 'SEARCH' PARAMETER TYPE CANNOT BE USED IN IE6 UNDER WINDOWS 8214839 - ER: INCREASE COLUMN SIZE IN SCHEDULER TABLE XMLP_SCHED_JOB 8218271 - ISSUES WHILE CONVERTING EXCEL TO XML 8218452 - BI PUBLISHER STANDALONE : GRAPHICS WITHOUT COLORS IF MORE THAN 33 PAGES 8250980 - USER WITH XMLP_ADMIN RESPONSIBILITY IS NOT ABLE TO EDIT REPORT IN BIP 8262410 - IMPOSSIBLE TO PRINT PDF CREATED BY BI PUBLISHER VIA 3RD PARTY PDF APPLICATION 8274369 - QA: CANNOT DELETE EMAIL SERVER UNDER DELIVERY CONFIGURATION 8284173 - FO:VISIBILITY="HIDDEN" DOESN'T WORK WITH FO:PAGE-NUMBER-CITATION 8288421 - THE VALUE OF VIEW BY GO BACK TO MY HISTORY IN SCHEDULES TAB 8299212 - REG: THE SPECIFICAL BI USER DIDN'T GET THE CORRECT REPORT HISTORY 8301767 - ORA-01795 ERROR OCCURED AFTER ACCESSING DASHBOARD PAGE WHICH INCLUDES BIP 8304944 - ADD SIEBEL SECURITY MODEL IN BI PUBLISHER 10.1.3.4.1 8312814 - QA:HOT:OBI SERVER JDBC DRIVER BIJDBC14.JAR IN XMLPSERVER.WAR IS INCORRECT 8323679 - BI PUBLISHER SENDS HTML REPORT TO OUTLOOK CLIENT AS ATTACHMENT NOT INLINE 8370794 - HISTORY OF COMPLETED SCHEDULER JOBS STILL SHOW ONE AS RUNNING ON CLUSTER ENV 8390970 - OUT OF MEMORY EXCEPTION RAISED, WHILE SAVING THE DATA 8393681 - CHECKBOX IS SHOWING UP AS CHECKED WHEN DATA IS NOT CHECKED VALUE 8725450 - UIX 2.3.6.6 UPTAKE FOR 10.1.3.4.1 UIX fixes: 6866363 - SUPPORT FOR JAVA DATE FORMAT AS PER JDK 1.4 AND ABOVE 6829124 - DATE PARAMETER NOT HANDLING DATE OFFSET AS PER JAVA STANDARDS ---------------------------- INSTALLATION FOR ENTERPRISE ---------------------------- Upgrade from 10.1.3.4.0d (patch 8284524, 8398280) and 10.1.3.4.1 does not require step 8 and step 9. 1 - Make a backup copy of the xmlp-server-config.xml file located in <application installation>/WEB-INF/ directory, where your application server unpacked the BI Publisher war or ear file. Example: In an Oracle AS/OC4J 10.1.3 deployment, the location is <ORACLE_HOME>/j2ee/home/applications/xmlpserver/xmlpserver/WEB-INF/xmlp-server-config.xml 2 - Back up all the directories under the BI Publisher repository (for example: {Oracle_Home}/xmlp/XMLP). 3 - If you are using Scheduling, back up your existing BI Publisher Scheduler schema. 4 - Shut down BI Publisher. 5 - Undeploy the BI Publisher application ("xmlpserver") from your J2EE application server. See your application server documentation for instructions how to undeploy an application. 6 - Deploy the 10.1.3.4 xmlpserver.ear or xmlpserver.war to your application server. See "Manually Installing BI Publisher to Your J2EE Application Server" secition of BI Publisher Installation Guide for guidelines for your application server type. 7 - Copy the saved backup copy of the xmlp-server-config.xml file from step 1 to the newly created BI Publisher <application installation>/WEB-INF/ directory, where your application server unpacked the BI Publisher war or ear file. Example: In an Oracle AS/OC4J 10.1.3 deployment, the location is <ORACLE_HOME>/j2ee/home/applications/xmlpserver/xmlpserver/WEB-INF/xmlp-server-config.xml 8 - Copy ssodefaults.xml to the following directory. And replace [host]:[port] with your server's information. Default values for other properties can be updated depending on your configuration. <Existing Repository>\XMLP\Admin\Security 9 - Copy database-config.xml to the following directory. <Existing Repository>\XMLP\Admin\Scheduler 10 - Restart xmlpserver application or Application Server ---------------------------------- IBM WEBSPHERE 6.1 DEPLOYMENT NOTE ---------------------------------- When users fail to log on to BI Publisher with "HTTP 500 Internal Server Error" on WebSphere 6.1, you must change Class Loader configuration to avoid the error. (bug7506253 - XMLPSERVER WON'T START AFTER DEPLOYMENT TO WEBSPHERE 6.1) SystemErr.log: java.lang.VerifyError: class loading constraint violated (class: oracle/xml/parser/v2/XMLNode method: xdkSetQxName(Loracle/xml/util/QxName;)V) at pc: 0 .... Class Loader Configuration Steps: 1 - Login to WebSphere Admin console. Click Enterprise Applications under Applications menu 2 - Click xmlpserver application name from the list 3 - Select "Class loading and update detection" 4 - Update class loader configuration as follows in Class Loader -> General Properties * Polling interval for updated files: [0] Seconds * Class loader order: [x] Classes loaded with application class loader first * WAR class loader policy: [x] Single class loader for application 5 - Apply this change and save the new configuration. 6 - Restart xmlpserver application Please refer to WebSphere 6.1 documentation for more details. "http://publib.boulder.ibm.com/infocenter/wasinfo/v6r1/index.jsp?topic=/com.ibm.websphere.base.doc/info/aes/ae/trun_classload_entapp.html"> http://publib.boulder.ibm.com/infocenter/wasinfo/v6r1/index.jsp?topic=/com.ibm.websphere.base.doc/info/aes/ae/trun_classload_entapp.html ------------------------------------------------------- Oracle WebLogic Server 11g R1 (10.3.1) Deployment NOTE ------------------------------------------------------- If you are deploying BI Publisher to WebLogic Server 10.3.1, you must add the following setting at startup for the domain that contains the BI Publisher server in the /weblogic_home/user_projects/domains/base_domain/bin/startWebLogic.sh script : -Dtoplink.xml.platform=oracle.toplink.platform.xml.jaxp.JAXPPlatform This setting is required to enable BI Publisher to find the TopLink JAR files to create the Scheduler tables.

Read the article
Error 0x800f0922 installing .NET 3.5 on Windows 8

- by Benjamin Nolan

I'm trying to install .NET 3.5 on my Windows 8 box and it keeps throwing Error 0x800f0922 at me. From what I've read on answers.microsoft.com and StackOverflow I gather the easiest way to fix this is to perform a system refresh, however this will remove all software I've installed from discs. I've just moved house, so I'd rather not do that as I don't know where all the installation media actually are for a lot of my software, so if possible I'd prefer to track down where the problem is actually occurring. (Also, I have a LOT of software installed. It'd take me a long time to reinstall it all, and I unfortunately haven't got that time.) The on-demand error screen sends me to KB2734782 (can't link it as I'm <10 rep), which doesn't help much. When I run this DISM line from the StackOverflow post: Dism.exe /online /enable-feature /featurename:NetFX3 /All /Source:C:\Windows\WinSxS /LimitAccess I get the following output on the terminal: Microsoft Windows [Version 6.2.9200] (c) 2012 Microsoft Corporation. All rights reserved. C:\Windows\system32>Dism.exe /online /enable-feature /featurename:NetFX3 /All /Source:C:\Windows\WinSxS /LimitAccess Deployment Image Servicing and Management tool Version: 6.2.9200.16384 Image Version: 6.2.9200.16384 Enabling feature(s) [==========================100.0%==========================] Error: 0x800f0922 DISM failed. No operation was performed. For more information, review the log file. The DISM log file can be found at C:\Windows\Logs\DISM\dism.log C:\Windows\system32> Incidentally, it jumps straight from 0 to 100% and then sits on that line for about 5 minutes before the error line occurs. dism.log contains the following lines around that time: (Link to full logs is at bottom of post) 2013-07-02 00:56:58, Info DISM DISM.EXE: Succesfully registered commands for the provider: Edition Manager. 2013-07-02 00:56:58, Info DISM DISM Provider Store: PID=5768 TID=5780 Getting Provider DISM Package Manager - CDISMProviderStore::GetProvider 2013-07-02 00:56:58, Info DISM DISM Provider Store: PID=5768 TID=5780 Provider has previously been initialized. Returning the existing instance. - CDISMProviderStore::Internal_GetProvider 2013-07-02 00:56:58, Info DISM DISM Package Manager: PID=5768 TID=5780 Processing the top level command token(enable-feature). - CPackageManagerCLIHandler::Private_ValidateCmdLine 2013-07-02 00:56:58, Info DISM DISM Package Manager: PID=5768 TID=5780 Attempting to route to appropriate command handler. - CPackageManagerCLIHandler::ExecuteCmdLine 2013-07-02 00:56:58, Info DISM DISM Package Manager: PID=5768 TID=5780 Routing the command... - CPackageManagerCLIHandler::ExecuteCmdLine 2013-07-02 00:56:58, Info DISM DISM Package Manager: PID=5768 TID=5780 Encountered the option "featurename" with value "NetFX3" - CPackageManagerCLIHandler::Private_GetPackagesFromCommandLine 2013-07-02 00:56:58, Info DISM DISM Package Manager: PID=5768 TID=5780 Encountered an unknown option "featurename" with value "NetFX3" - CPackageManagerCLIHandler::Private_GetPackagesFromCommandLine 2013-07-02 00:56:58, Info DISM DISM Package Manager: PID=5768 TID=5780 Encountered the option "source" with value "C:\Windows\WinSxS" - CPackageManagerCLIHandler::Private_GetPackagesFromCommandLine 2013-07-02 00:56:58, Info DISM DISM Package Manager: PID=5768 TID=5780 Encountered an unknown option "source" with value "C:\Windows\WinSxS" - CPackageManagerCLIHandler::Private_GetPackagesFromCommandLine 2013-07-02 00:56:59, Info DISM DISM Package Manager: PID=5768 TID=5780 Initiating Changes on Package with values: 5, 7 - CDISMPackage::Internal_ChangePackageState 2013-07-02 00:56:59, Info DISM DISM Package Manager: PID=5768 TID=5780 CBS session options=0x20100! - CDISMPackageManager::Internal_Finalize 2013-07-02 01:00:27, Info DISM DISM Package Manager: PID=5768 TID=2420 Error in operation: (null) (CBS HRESULT=0x800f0922) - CCbsConUIHandler::Error 2013-07-02 01:00:27, Error DISM DISM Package Manager: PID=5768 TID=5780 Failed finalizing changes. - CDISMPackageManager::Internal_Finalize(hr:0x800f0922) 2013-07-02 01:00:27, Error DISM DISM Package Manager: PID=5768 TID=5780 Failed processing package changes with session options - CDISMPackageManager::ProcessChangesWithOptions(hr:0x800f0922) 2013-07-02 01:00:27, Error DISM DISM Package Manager: PID=5768 TID=5780 Failed ProcessChanges. - CPackageManagerCLIHandler::Private_ProcessFeatureChange(hr:0x800f0922) 2013-07-02 01:00:27, Error DISM DISM Package Manager: PID=5768 TID=5780 Failed while processing command enable-feature. - CPackageManagerCLIHandler::ExecuteCmdLine(hr:0x800f0922) 2013-07-02 01:00:27, Info DISM DISM Package Manager: PID=5768 TID=5780 Further logs for online package and feature related operations can be found at %WINDIR%\logs\CBS\cbs.log - CPackageManagerCLIHandler::ExecuteCmdLine 2013-07-02 01:00:27, Error DISM DISM.EXE: DISM Package Manager processed the command line but failed. HRESULT=800F0922 cbs.log has the following chunks around then which could be relevant: 2013-07-02 00:55:06, Info CBS Exec: This is a PSF Package. Job has been saved and we are returning to client. 2013-07-02 00:55:06, Info CSI 0000042d@2013/7/1:23:55:06.203 CSI Transaction @0xe2f5e59500 destroyed 2013-07-02 00:55:06, Info CBS Exec: DPX job state saved for one or more packages, aborting the staging and install of execution. 2013-07-02 00:55:06, Info CSI 0000042e@2013/7/1:23:55:06.207 CSI Transaction @0xe2f5e58480 destroyed 2013-07-02 00:55:06, Info CBS Perf: Stage chain complete. 2013-07-02 00:55:06, Info CBS Failed to stage execution chain. [HRESULT = 0x800f0816 - CBS_E_DPX_JOB_STATE_SAVED] 2013-07-02 00:55:06, Info CBS Failed to process single phase execution. [HRESULT = 0x800f0816 - CBS_E_DPX_JOB_STATE_SAVED] 2013-07-02 00:55:06, Info CBS WER: Failure is not worth reporting [HRESULT = 0x800f0816 - CBS_E_DPX_JOB_STATE_SAVED] 2013-07-02 00:55:06, Info CBS Reboot mark cleared and further down: 2013-07-02 00:59:19, Info CSI 000004e6 Begin executing advanced installer phase 38 (0x00000026) index 253 (0x00000000000000fd) (sequence 289) Old component: [l:0]"" New component: [ml:306{153},l:304{152}]"NetFx35CDF-CDF_GenericCommands, Culture=neutral, Version=6.2.9200.16384, PublicKeyToken=31bf3856ad364e35, ProcessorArchitecture=x86, versionScope=NonSxS" Install mode: install Installer ID: {81a34a10-4256-436a-89d6-794b97ca407c} Installer name: [15]"Generic Command" 2013-07-02 00:59:19, Info CSI 000004e7 Performing 1 operations; 1 are not lock/unlock and follow: (0) LockComponentPath (10): flags: 0 comp: {l:16 b:19fc6600b776ce01c91f0000fc07a816} pathid: {l:16 b:19fc6600b776ce01ca1f0000fc07a816} path: [l:214{107}]"\SystemRoot\WinSxS\x86_netfx35cdf-cdf_genericcommands_31bf3856ad364e35_6.2.9200.16384_none_0cec490be12fb858" pid: 7fc starttime: 130171962799582915 (0x01ce76b5e2626ec3) 2013-07-02 00:59:19, Info CSI 000004e8 Performing 1 operations; 1 are not lock/unlock and follow: (0) LockComponentPath (10): flags: 0 comp: {l:16 b:27236700b776ce01cb1f0000fc07a816} pathid: {l:16 b:27236700b776ce01cc1f0000fc07a816} path: [l:210{105}]"\SystemRoot\WinSxS\x86_netfx35cdf-csd_cdf_installer_31bf3856ad364e35_6.2.9200.16384_none_55072425fd5c3716" pid: 7fc starttime: 130171962799582915 (0x01ce76b5e2626ec3) 2013-07-02 00:59:19, Info CSI 000004e9 Calling generic command executable (sequence 1): [122]"C:\Windows\WinSxS\x86_netfx35cdf-csd_cdf_installer_31bf3856ad364e35_6.2.9200.16384_none_55072425fd5c3716\WFServicesReg.exe" CmdLine: [139]""C:\Windows\WinSxS\x86_netfx35cdf-csd_cdf_installer_31bf3856ad364e35_6.2.9200.16384_none_55072425fd5c3716\WFServicesReg.exe" /c /b /v /m /i" 2013-07-02 00:59:20, Info CSI 000004ea Performing 1 operations; 1 are not lock/unlock and follow: (0) LockComponentPath (10): flags: 0 comp: {l:16 b:bd790401b776ce01cd1f0000fc07a816} pathid: {l:16 b:bd790401b776ce01ce1f0000fc07a816} path: [l:234{117}]"\SystemRoot\WinSxS\x86_microsoft.windows.s..ation.badcomponents_31bf3856ad364e35_6.2.9200.16384_none_353ccb4c94858655" pid: 7fc starttime: 130171962799582915 (0x01ce76b5e2626ec3) 2013-07-02 00:59:20, Info CSI 000004eb Creating NT transaction (seq 27), objectname [6]"(null)" 2013-07-02 00:59:20, Info CSI 000004ec Created NT transaction (seq 27) result 0x00000000, handle @0x24b8 2013-07-02 00:59:20, Info CSI 000004ed@2013/7/1:23:59:20.933 Beginning NT transaction commit... 2013-07-02 00:59:22, Info CSI 000004ee@2013/7/1:23:59:22.065 CSI perf trace: CSIPERF:TXCOMMIT;1387723 2013-07-02 00:59:22, Error CSI 000004ef (F) Done with generic command 1; CreateProcess returned 0, CPAW returned S_OK Process exit code 255 (0x000000ff) resulted in success? FALSE Process output: [l:28479 [4096]"DDSet_Entry: WFServicesReg.exe DDSet_Status: CFxInstaller::CopyConfigFilesToTemp is64bit=0 DDSet_Status: CFileHelper::CopyConfigFilesToTempLocation DDSet_Status: CFxInstaller::SetupBaseComponents isInstall=1 DDSet_Status: CFxInstaller::SetupBaseComponents Calling SetupExtensions. isInstall=1 (0x000000FF -- The extended attributes are inconsistent. ??) And a bit further down: 2013-07-02 00:59:22, Error [0x018007] CSI 000004f0 (F) Failed execution of queue item Installer: Generic Command ({81a34a10-4256-436a-89d6-794b97ca407c}) with HRESULT HRESULT_FROM_WIN32(14109). Failure will not be ignored: A rollback will be initiated after all the operations in the installer queue are completed; installer is reliable (2)[gle=0x80004005] [...snip...] 2013-07-02 00:59:22, Info CBS Not able to add pending.xml.bad to Windows Error Report. [HRESULT = 0x80070002 - ERROR_FILE_NOT_FOUND] 2013-07-02 00:59:28, Info CSI 000004f1@2013/7/1:23:59:28.467 CSI Advanced installer perf trace: CSIPERF:AIDONE;{81a34a10-4256-436a-89d6-794b97ca407c};NetFx35CDF-CDF_GenericCommands, Version = 6.2.9200.16384, pA = PROCESSOR_ARCHITECTURE_INTEL (0), Culture neutral, VersionScope = 1 nonSxS, PublicKeyToken = {l:8 b:31bf3856ad364e35}, Type neutral, TypeName neutral, PublicKey neutral;10609242us 2013-07-02 00:59:28, Info CSI 000004f2 End executing advanced installer (sequence 289) Completion status: HRESULT_FROM_WIN32(ERROR_ADVANCED_INSTALLER_FAILED) [...snip...] 2013-07-02 01:00:26, Info CBS Exec: Cancelled pending transactions after rollback. [HRESULT = 0x00000000 - S_OK] 2013-07-02 01:00:26, Error CBS Exec: An error occurred while committing the transaction, the transaction could not be rolled back. [HRESULT = 0x800f0922 - CBS_E_INSTALLERS_FAILED] The full DISM and CBS logs are at http://ben.mu/files/dotnet35_dism_cbs.zip as the CBS log is nearly 167MB uncompressed. o.o dism.log gives the timeframe of where its errors occur--00:56:20ish to 01:00:22. Does anyone have any ideas what's actually causing the installation to fail, and if so how I can fix it? Please don't just say "Refresh the OS". :)

Read the article
Using jQuery to Insert a New Database Record

- by Stephen Walther

The goal of this blog entry is to explore the easiest way of inserting a new record into a database using jQuery and .NET. I’m going to explore two approaches: using Generic Handlers and using a WCF service (In a future blog entry I’ll take a look at OData and WCF Data Services). Create the ASP.NET Project I’ll start by creating a new empty ASP.NET application with Visual Studio 2010. Select the menu option File, New Project and select the ASP.NET Empty Web Application project template. Setup the Database and Data Model I’ll use my standard MoviesDB.mdf movies database. This database contains one table named Movies that looks like this: I’ll use the ADO.NET Entity Framework to represent my database data: Select the menu option Project, Add New Item and select the ADO.NET Entity Data Model project item. Name the data model MoviesDB.edmx and click the Add button. In the Choose Model Contents step, select Generate from database and click the Next button. In the Choose Your Data Connection step, leave all of the defaults and click the Next button. In the Choose Your Data Objects step, select the Movies table and click the Finish button. Unfortunately, Visual Studio 2010 cannot spell movie correctly :) You need to click on Movy and change the name of the class to Movie. In the Properties window, change the Entity Set Name to Movies. Using a Generic Handler In this section, we’ll use jQuery with an ASP.NET generic handler to insert a new record into the database. A generic handler is similar to an ASP.NET page, but it does not have any of the overhead. It consists of one method named ProcessRequest(). Select the menu option Project, Add New Item and select the Generic Handler project item. Name your new generic handler InsertMovie.ashx and click the Add button. Modify your handler so it looks like Listing 1: Listing 1 – InsertMovie.ashx using System.Web; namespace WebApplication1 { /// <summary> /// Inserts a new movie into the database /// </summary> public class InsertMovie : IHttpHandler { private MoviesDBEntities _dataContext = new MoviesDBEntities(); public void ProcessRequest(HttpContext context) { context.Response.ContentType = "text/plain"; // Extract form fields var title = context.Request["title"]; var director = context.Request["director"]; // Create movie to insert var movieToInsert = new Movie { Title = title, Director = director }; // Save new movie to DB _dataContext.AddToMovies(movieToInsert); _dataContext.SaveChanges(); // Return success context.Response.Write("success"); } public bool IsReusable { get { return true; } } } } In Listing 1, the ProcessRequest() method is used to retrieve a title and director from form parameters. Next, a new Movie is created with the form values. Finally, the new movie is saved to the database and the string “success” is returned. Using jQuery with the Generic Handler We can call the InsertMovie.ashx generic handler from jQuery by using the standard jQuery post() method. The following HTML page illustrates how you can retrieve form field values and post the values to the generic handler: Listing 2 – Default.htm <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Add Movie</title> <script src="http://ajax.microsoft.com/ajax/jquery/jquery-1.4.2.js" type="text/javascript"></script> </head> <body> <form> <label>Title:</label> <input name="title" /> <br /> <label>Director:</label> <input name="director" /> </form> <button id="btnAdd">Add Movie</button> <script type="text/javascript"> $("#btnAdd").click(function () { $.post("InsertMovie.ashx", $("form").serialize(), insertCallback); }); function insertCallback(result) { if (result == "success") { alert("Movie added!"); } else { alert("Could not add movie!"); } } </script> </body> </html> When you open the page in Listing 2 in a web browser, you get a simple HTML form: Notice that the page in Listing 2 includes the jQuery library. The jQuery library is included with the following SCRIPT tag: <script src="http://ajax.microsoft.com/ajax/jquery/jquery-1.4.2.js" type="text/javascript"></script> The jQuery library is included on the Microsoft Ajax CDN so you can always easily include the jQuery library in your applications. You can learn more about the CDN at this website: http://www.asp.net/ajaxLibrary/cdn.ashx When you click the Add Movie button, the jQuery post() method is called to post the form data to the InsertMovie.ashx generic handler. Notice that the form values are serialized into a URL encoded string by calling the jQuery serialize() method. The serialize() method uses the name attribute of form fields and not the id attribute. Notes on this Approach This is a very low-level approach to interacting with .NET through jQuery – but it is simple and it works! And, you don’t need to use any JavaScript libraries in addition to the jQuery library to use this approach. The signature for the jQuery post() callback method looks like this: callback(data, textStatus, XmlHttpRequest) The second parameter, textStatus, returns the HTTP status code from the server. I tried returning different status codes from the generic handler with an eye towards implementing server validation by returning a status code such as 400 Bad Request when validation fails (see http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html ). I finally figured out that the callback is not invoked when the textStatus has any value other than “success”. Using a WCF Service As an alternative to posting to a generic handler, you can create a WCF service. You create a new WCF service by selecting the menu option Project, Add New Item and selecting the Ajax-enabled WCF Service project item. Name your WCF service InsertMovie.svc and click the Add button. Modify the WCF service so that it looks like Listing 3: Listing 3 – InsertMovie.svc using System.ServiceModel; using System.ServiceModel.Activation; namespace WebApplication1 { [ServiceBehavior(IncludeExceptionDetailInFaults=true)] [ServiceContract(Namespace = "")] [AspNetCompatibilityRequirements(RequirementsMode = AspNetCompatibilityRequirementsMode.Allowed)] public class MovieService { private MoviesDBEntities _dataContext = new MoviesDBEntities(); [OperationContract] public bool Insert(string title, string director) { // Create movie to insert var movieToInsert = new Movie { Title = title, Director = director }; // Save new movie to DB _dataContext.AddToMovies(movieToInsert); _dataContext.SaveChanges(); // Return movie (with primary key) return true; } } } The WCF service in Listing 3 uses the Entity Framework to insert a record into the Movies database table. The service always returns the value true. Notice that the service in Listing 3 includes the following attribute: [ServiceBehavior(IncludeExceptionDetailInFaults=true)] You need to include this attribute if you want to get detailed error information back to the client. When you are building an application, you should always include this attribute. When you are ready to release your application, you should remove this attribute for security reasons. Using jQuery with the WCF Service Calling a WCF service from jQuery requires a little more work than calling a generic handler from jQuery. Here are some good blog posts on some of the issues with using jQuery with WCF: http://encosia.com/2008/06/05/3-mistakes-to-avoid-when-using-jquery-with-aspnet-ajax/ http://encosia.com/2008/03/27/using-jquery-to-consume-aspnet-json-web-services/ http://weblogs.asp.net/scottgu/archive/2007/04/04/json-hijacking-and-how-asp-net-ajax-1-0-mitigates-these-attacks.aspx http://www.west-wind.com/Weblog/posts/896411.aspx http://www.west-wind.com/weblog/posts/324917.aspx http://professionalaspnet.com/archive/tags/WCF/default.aspx The primary requirement when calling WCF from jQuery is that the request use JSON: The request must include a content-type:application/json header. Any parameters included with the request must be JSON encoded. Unfortunately, jQuery does not include a method for serializing JSON (Although, oddly, jQuery does include a parseJSON() method for deserializing JSON). Therefore, we need to use an additional library to handle the JSON serialization. The page in Listing 4 illustrates how you can call a WCF service from jQuery. Listing 4 – Default2.aspx <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Add Movie</title> <script src="http://ajax.microsoft.com/ajax/jquery/jquery-1.4.2.js" type="text/javascript"></script> <script src="Scripts/json2.js" type="text/javascript"></script> </head> <body> <form> <label>Title:</label> <input id="title" /> <br /> <label>Director:</label> <input id="director" /> </form> <button id="btnAdd">Add Movie</button> <script type="text/javascript"> $("#btnAdd").click(function () { // Convert the form into an object var data = { title: $("#title").val(), director: $("#director").val() }; // JSONify the data data = JSON.stringify(data); // Post it $.ajax({ type: "POST", contentType: "application/json; charset=utf-8", url: "MovieService.svc/Insert", data: data, dataType: "json", success: insertCallback }); }); function insertCallback(result) { // unwrap result result = result["d"]; if (result === true) { alert("Movie added!"); } else { alert("Could not add movie!"); } } </script> </body> </html> There are several things to notice about Listing 4. First, notice that the page includes both the jQuery library and Douglas Crockford’s JSON2 library: <script src="Scripts/json2.js" type="text/javascript"></script> You need to include the JSON2 library to serialize the form values into JSON. You can download the JSON2 library from the following location: http://www.json.org/js.html When you click the button to submit the form, the form data is converted into a JavaScript object: // Convert the form into an object var data = { title: $("#title").val(), director: $("#director").val() }; Next, the data is serialized into JSON using the JSON2 library: // JSONify the data var data = JSON.stringify(data); Finally, the form data is posted to the WCF service by calling the jQuery ajax() method: // Post it $.ajax({ type: "POST", contentType: "application/json; charset=utf-8", url: "MovieService.svc/Insert", data: data, dataType: "json", success: insertCallback }); You can’t use the standard jQuery post() method because you must set the content-type of the request to be application/json. Otherwise, the WCF service will reject the request for security reasons. For details, see the Scott Guthrie blog post: http://weblogs.asp.net/scottgu/archive/2007/04/04/json-hijacking-and-how-asp-net-ajax-1-0-mitigates-these-attacks.aspx The insertCallback() method is called when the WCF service returns a response. This method looks like this: function insertCallback(result) { // unwrap result result = result["d"]; if (result === true) { alert("Movie added!"); } else { alert("Could not add movie!"); } } When we called the jQuery ajax() method, we set the dataType to JSON. That causes the jQuery ajax() method to deserialize the response from the WCF service from JSON into a JavaScript object automatically. The following value is passed to the insertCallback method: {"d":true} For security reasons, a WCF service always returns a response with a “d” wrapper. The following line of code removes the “d” wrapper: // unwrap result result = result["d"]; To learn more about the “d” wrapper, I recommend that you read the following blog posts: http://encosia.com/2009/02/10/a-breaking-change-between-versions-of-aspnet-ajax/ http://encosia.com/2009/06/29/never-worry-about-asp-net-ajaxs-d-again/ Summary In this blog entry, I explored two methods of inserting a database record using jQuery and .NET. First, we created a generic handler and called the handler from jQuery. This is a very low-level approach. However, it is a simple approach that works. Next, we looked at how you can call a WCF service using jQuery. This approach required a little more work because you need to serialize objects into JSON. We used the JSON2 library to perform the serialization. In the next blog post, I want to explore how you can use jQuery with OData and WCF Data Services.

Read the article
Guide to reduce TFS database growth using the Test Attachment Cleaner

- by terje

Recently there has been several reports on TFS databases growing too fast and growing too big. Notable this has been observed when one has started to use more features of the Testing system. Also, the TFS 2010 handles test results differently from TFS 2008, and this leads to more data stored in the TFS databases. As a consequence of this there has been released some tools to remove unneeded data in the database, and also some fixes to correct for bugs which has been found and corrected during this process. Further some preventive practices and maintenance rules should be adopted. A lot of people have blogged about this, among these are: Anu’s very important blog post here describes both the problem and solutions to handle it. She describes both the Test Attachment Cleaner tool, and also some QFE/CU releases to fix some underlying bugs which prevented the tool from being fully effective. Brian Harry’s blog post here describes the problem too This forum thread describes the problem with some solution hints. Ravi Shanker’s blog post here describes best practices on solving this (TBP) Grant Holidays blogpost here describes strategies to use the Test Attachment Cleaner both to detect space problems and how to rectify them. The problem can be divided into the following areas: Publishing of test results from builds Publishing of manual test results and their attachments in particular Publishing of deployment binaries for use during a test run Bugs in SQL server preventing total cleanup of data (All the published data above is published into the TFS database as attachments.) The test results will include all data being collected during the run. Some of this data can grow rather large, like IntelliTrace logs and video recordings. Also the pushing of binaries which happen for automated test runs, including tests run during a build using code coverage which will include all the files in the deployment folder, contributes a lot to the size of the attached data. In order to handle this systematically, I have set up a 3-stage process: Find out if you have a database space issue Set up your TFS server to minimize potential database issues If you have the “problem”, clean up the database and otherwise keep it clean Analyze the data Are your database( s) growing ? Are unused test results growing out of proportion ? To find out about this you need to query your TFS database for some of the information, and use the Test Attachment Cleaner (TAC) to obtain some more detailed information. If you don’t have too many databases you can use the SQL Server reports from within the Management Studio to analyze the database and table sizes. Or, you can use a set of queries . I find queries often faster to use because I can tweak them the way I want them. But be aware that these queries are non-documented and non-supported and may change when the product team wants to change them. If you have multiple Project Collections, find out which might have problems: (Disclaimer: The queries below work on TFS 2010. They will not work on Dev-11, since the table structure have been changed. I will try to update them for Dev-11 when it is released.) Open a SQL Management Studio session onto the SQL Server where you have your TFS Databases. Use the query below to find the Project Collection databases and their sizes, in descending size order. use master select DB_NAME(database_id) AS DBName, (size/128) SizeInMB FROM sys.master_files where type=0 and substring(db_name(database_id),1,4)='Tfs_' and DB_NAME(database_id)<>'Tfs_Configuration' order by size desc Doing this on one of our SQL servers gives the following results: It is pretty easy to see on which collection to start the work Find out which tables are possibly too large Keep a special watch out for the Tfs_Attachment table. Use the script at the bottom of Grant’s blog to find the table sizes in descending size order. In our case we got this result: From Grant’s blog we learnt that the tbl_Content is in the Version Control category, so the major only big issue we have here is the tbl_AttachmentContent. Find out which team projects have possibly too large attachments In order to use the TAC to find and eventually delete attachment data we need to find out which team projects have these attachments. The team project is a required parameter to the TAC. Use the following query to find this, replace the collection database name with whatever applies in your case: use Tfs_DefaultCollection select p.projectname, sum(a.compressedlength)/1024/1024 as sizeInMB from dbo.tbl_Attachment as a inner join tbl_testrun as tr on a.testrunid=tr.testrunid inner join tbl_project as p on p.projectid=tr.projectid group by p.projectname order by sum(a.compressedlength) desc In our case we got this result (had to remove some names), out of more than 100 team projects accumulated over quite some years: As can be seen here it is pretty obvious the “Byggtjeneste – Projects” are the main team project to take care of, with the ones on lines 2-4 as the next ones. Check which attachment types takes up the most space It can be nice to know which attachment types takes up the space, so run the following query: use Tfs_DefaultCollection select a.attachmenttype, sum(a.compressedlength)/1024/1024 as sizeInMB from dbo.tbl_Attachment as a inner join tbl_testrun as tr on a.testrunid=tr.testrunid inner join tbl_project as p on p.projectid=tr.projectid group by a.attachmenttype order by sum(a.compressedlength) desc We then got this result: From this it is pretty obvious that the problem here is the binary files, as also mentioned in Anu’s blog. Check which file types, by their extension, takes up the most space Run the following query use Tfs_DefaultCollection select SUBSTRING(filename,len(filename)-CHARINDEX('.',REVERSE(filename))+2,999)as Extension, sum(compressedlength)/1024 as SizeInKB from tbl_Attachment group by SUBSTRING(filename,len(filename)-CHARINDEX('.',REVERSE(filename))+2,999) order by sum(compressedlength) desc This gives a result like this: Now you should have collected enough information to tell you what to do – if you got to do something, and some of the information you need in order to set up your TAC settings file, both for a cleanup and for scheduled maintenance later. Get your TFS server and environment properly set up Even if you have got the problem or if have yet not got the problem, you should ensure the TFS server is set up so that the risk of getting into this problem is minimized. To ensure this you should install the following set of updates and components. The assumption is that your TFS Server is at SP1 level. Install the QFE for KB2608743 – which also contains detailed instructions on its use, download from here. The QFE changes the default settings to not upload deployed binaries, which are used in automated test runs. Binaries will still be uploaded if: Code coverage is enabled in the test settings. You change the UploadDeploymentItem to true in the testsettings file. Be aware that this might be reset back to false by another user which haven't installed this QFE. The hotfix should be installed to The build servers (the build agents) The machine hosting the Test Controller Local development computers (Visual Studio) Local test computers (MTM) It is not required to install it to the TFS Server, test agents or the build controller – it has no effect on these programs. If you use the SQL Server 2008 R2 you should also install the CU 10 (or later). This CU fixes a potential problem of hanging “ghost” files. This seems to happen only in certain trigger situations, but to ensure it doesn’t bite you, it is better to make sure this CU is installed. There is no such CU for SQL Server 2008 pre-R2 Work around: If you suspect hanging ghost files, they can be – with some mental effort, deduced from the ghost counters using the following SQL query: use master SELECT DB_NAME(database_id) as 'database',OBJECT_NAME(object_id) as 'objectname', index_type_desc,ghost_record_count,version_ghost_record_count,record_count,avg_record_size_in_bytes FROM sys.dm_db_index_physical_stats (DB_ID(N'<DatabaseName>'), OBJECT_ID(N'<TableName>'), NULL, NULL , 'DETAILED') The problem is a stalled ghost cleanup process. Restarting the SQL server after having stopped all components that depends on it, like the TFS Server and SPS services – that is all applications that connect to the SQL server. Then restart the SQL server, and finally start up all dependent processes again. (I would guess a complete server reboot would do the trick too.) After this the ghost cleanup process will run properly again. The fix will come in the next CU cycle for SQL Server R2 SP1. The R2 pre-SP1 and R2 SP1 have separate maintenance cycles, and are maintained individually. Each have its own set of CU’s. When it comes I will add the link here to that CU. The "hanging ghost file” issue came up after one have run the TAC, and deleted enourmes amount of data. The SQL Server can get into this hanging state (without the QFE) in certain cases due to this. And of course, install and set up the Test Attachment Cleaner command line power tool. This should be done following some guidelines from Ravi Shanker: “When you run TAC, ensure that you are deleting small chunks of data at regular intervals (say run TAC every night at 3AM to delete data that is between age 730 to 731 days) – this will ensure that small amounts of data are being deleted and SQL ghosted record cleanup can catch up with the number of deletes performed. “ This rule minimizes the risk of the ghosted hang problem to occur, and further makes it easier for the SQL server ghosting process to work smoothly. “Run DBCC SHRINKDB post the ghosted records are cleaned up to physically reclaim the space on the file system” This is the last step in a 3 step process of removing SQL server data. First they are logically deleted. Then they are cleaned out by the ghosting process, and finally removed using the shrinkdb command. Cleaning out the attachments The TAC is run from the command line using a set of parameters and controlled by a settingsfile. The parameters point out a server uri including the team project collection and also point at a specific team project. So in order to run this for multiple team projects regularly one has to set up a script to run the TAC multiple times, once for each team project. When you install the TAC there is a very useful readme file in the same directory. When the deployment binaries are published to the TFS server, ALL items are published up from the deployment folder. That often means much more files than you would assume are necessary. This is a brute force technique. It works, but you need to take care when cleaning up. Grant has shown how their settings file looks in his blog post, removing all attachments older than 180 days , as long as there are no active workitems connected to them. This setting can be useful to clean out all items, both in a clean-up once operation, and in a general There are two scenarios we need to consider: Cleaning up an existing overgrown database Maintaining a server to avoid an overgrown database using scheduled TAC 1. Cleaning up a database which has grown too big due to these attachments. This job is a “Once” job. We do this once and then move on to make sure it won’t happen again, by taking the actions in 2) below. In this scenario you should only consider the large files. Your goal should be to simply reduce the size, and don’t bother about the smaller stuff. That can be left a scheduled TAC cleanup ( 2 below). Here you can use a very general settings file, and just remove the large attachments, or you can choose to remove any old items. Grant’s settings file is an example of the last one. A settings file to remove only large attachments could look like this:  <DeletionCriteria> <TestRun /> <Attachment> <SizeInMB GreaterThan="10" /> </Attachment> </DeletionCriteria> Or like this: If you want only to remove dll’s and pdb’s about that size, add an Extensions-section. Without that section, all extensions will be deleted.  <DeletionCriteria> <TestRun /> <Attachment> <SizeInMB GreaterThan="10" /> <Extensions> <Include value="dll" /> <Include value="pdb" /> </Extensions> </Attachment> </DeletionCriteria> Before you start up your scheduled maintenance, you should clear out all older items. 2. Scheduled maintenance using the TAC If you run a schedule every night, and remove old items, and also remove them in small batches. It is important to run this often, like every night, in order to keep the number of deleted items low. That way the SQL ghost process works better. One approach could be to delete all items older than some number of days, let’s say 180 days. This could be combined with restricting it to keep attachments with active or resolved bugs. Doing this every night ensures that only small amounts of data is deleted.  <DeletionCriteria> <TestRun> <AgeInDays OlderThan="180" /> </TestRun> <Attachment /> <LinkedBugs> <Exclude state="Active" /> <Exclude state="Resolved"/> </LinkedBugs> </DeletionCriteria> In my experience there are projects which are left with active or resolved workitems, akthough no further work is done. It can be wise to have a cleanup process with no restrictions on linked bugs at all. Note that you then have to remove the whole LinkedBugs section. A approach which could work better here is to do a two step approach, use the schedule above to with no LinkedBugs as a sweeper cleaning task taking away all data older than you could care about. Then have another scheduled TAC task to take out more specifically attachments that you are not likely to use. This task could be much more specific, and based on your analysis clean out what you know is troublesome data.  <DeletionCriteria> <TestRun > <AgeInDays OlderThan="30" /> </TestRun> <Attachment> <SizeInMB GreaterThan="10" /> <Extensions> <Include value="iTrace"/> <Include value="dll"/> <Include value="pdb"/> <Include value="wmv"/> </Extensions> </Attachment> <LinkedBugs> <Exclude state="Active" /> <Exclude state="Resolved" /> </LinkedBugs> </DeletionCriteria> The readme document for the TAC says that it recognizes “internal” extensions, but it does recognize any extension. To run the tool do the following command: tcmpt attachmentcleanup /collection:your_tfs_collection_url /teamproject:your_team_project /settingsfile:path_to_settingsfile /outputfile:%temp%/teamproject.tcmpt.log /mode:delete Shrinking the database You could run a shrink database command after the TAC has run in cases where there are a lot of data being deleted. In this case you SHOULD do it, to free up all that space. But, after the shrink operation you should do a rebuild indexes, since the shrink operation will leave the database in a very fragmented state, which will reduce performance. Note that you need to rebuild indexes, reorganizing is not enough. For smaller amounts of data you should NOT shrink the database, since the data will be reused by the SQL server when it need to add more records. In fact, it is regarded as a bad practice to shrink the database regularly. So on a daily maintenance schedule you should NOT shrink the database. To shrink the database you do a DBCC SHRINKDATABASE command, and then follow up with a DBCC INDEXDEFRAG afterwards. I find the easiest way to do this is to create a SQL Maintenance plan including the Shrink Database Task and the Rebuild Index Task and just execute it when you need to do this.

Read the article
Creating a dynamic proxy generator – Part 1 – Creating the Assembly builder, Module builder and cach

- by SeanMcAlinden

I’ve recently started a project with a few mates to learn the ins and outs of Dependency Injection, AOP and a number of other pretty crucial patterns of development as we’ve all been using these patterns for a while but have relied totally on third part solutions to do the magic. We thought it would be interesting to really get into the details by rolling our own IoC container and hopefully learn a lot on the way, and you never know, we might even create an excellent framework. The open source project is called Rapid IoC and is hosted at http://rapidioc.codeplex.com/ One of the most interesting tasks for me is creating the dynamic proxy generator for enabling Aspect Orientated Programming (AOP). In this series of articles, I’m going to track each step I take for creating the dynamic proxy generator and I’ll try my best to explain what everything means - mainly as I’ll be using Reflection.Emit to emit a fair amount of intermediate language code (IL) to create the proxy types at runtime which can be a little taxing to read. It’s worth noting that building the proxy is without a doubt going to be slightly painful so I imagine there will be plenty of areas I’ll need to change along the way. Anyway lets get started… Part 1 - Creating the Assembly builder, Module builder and caching mechanism Part 1 is going to be a really nice simple start, I’m just going to start by creating the assembly, module and type caches. The reason we need to create caches for the assembly, module and types is simply to save the overhead of recreating proxy types that have already been generated, this will be one of the important steps to ensure that the framework is fast… kind of important as we’re calling the IoC container ‘Rapid’ – will be a little bit embarrassing if we manage to create the slowest framework. The Assembly builder The assembly builder is what is used to create an assembly at runtime, we’re going to have two overloads, one will be for the actual use of the proxy generator, the other will be mainly for testing purposes as it will also save the assembly so we can use Reflector to examine the code that has been created. Here’s the code: DynamicAssemblyBuilder using System; using System.Reflection; using System.Reflection.Emit; namespace Rapid.DynamicProxy.Assembly { /// <summary> /// Class for creating an assembly builder. /// </summary> internal static class DynamicAssemblyBuilder { #region Create /// <summary> /// Creates an assembly builder. /// </summary> /// <param name="assemblyName">Name of the assembly.</param> public static AssemblyBuilder Create(string assemblyName) { AssemblyName name = new AssemblyName(assemblyName); AssemblyBuilder assembly = AppDomain.CurrentDomain.DefineDynamicAssembly( name, AssemblyBuilderAccess.Run); DynamicAssemblyCache.Add(assembly); return assembly; } /// <summary> /// Creates an assembly builder and saves the assembly to the passed in location. /// </summary> /// <param name="assemblyName">Name of the assembly.</param> /// <param name="filePath">The file path.</param> public static AssemblyBuilder Create(string assemblyName, string filePath) { AssemblyName name = new AssemblyName(assemblyName); AssemblyBuilder assembly = AppDomain.CurrentDomain.DefineDynamicAssembly( name, AssemblyBuilderAccess.RunAndSave, filePath); DynamicAssemblyCache.Add(assembly); return assembly; } #endregion } } So hopefully the above class is fairly explanatory, an AssemblyName is created using the passed in string for the actual name of the assembly. An AssemblyBuilder is then constructed with the current AppDomain and depending on the overload used, it is either just run in the current context or it is set up ready for saving. It is then added to the cache. DynamicAssemblyCache using System.Reflection.Emit; using Rapid.DynamicProxy.Exceptions; using Rapid.DynamicProxy.Resources.Exceptions; namespace Rapid.DynamicProxy.Assembly { /// <summary> /// Cache for storing the dynamic assembly builder. /// </summary> internal static class DynamicAssemblyCache { #region Declarations private static object syncRoot = new object(); internal static AssemblyBuilder Cache = null; #endregion #region Adds a dynamic assembly to the cache. /// <summary> /// Adds a dynamic assembly builder to the cache. /// </summary> /// <param name="assemblyBuilder">The assembly builder.</param> public static void Add(AssemblyBuilder assemblyBuilder) { lock (syncRoot) { Cache = assemblyBuilder; } } #endregion #region Gets the cached assembly /// <summary> /// Gets the cached assembly builder. /// </summary> /// <returns></returns> public static AssemblyBuilder Get { get { lock (syncRoot) { if (Cache != null) { return Cache; } } throw new RapidDynamicProxyAssertionException(AssertionResources.NoAssemblyInCache); } } #endregion } } The cache is simply a static property that will store the AssemblyBuilder (I know it’s a little weird that I’ve made it public, this is for testing purposes, I know that’s a bad excuse but hey…) There are two methods for using the cache – Add and Get, these just provide thread safe access to the cache. The Module Builder The module builder is required as the create proxy classes will need to live inside a module within the assembly. Here’s the code: DynamicModuleBuilder using System.Reflection.Emit; using Rapid.DynamicProxy.Assembly; namespace Rapid.DynamicProxy.Module { /// <summary> /// Class for creating a module builder. /// </summary> internal static class DynamicModuleBuilder { /// <summary> /// Creates a module builder using the cached assembly. /// </summary> public static ModuleBuilder Create() { string assemblyName = DynamicAssemblyCache.Get.GetName().Name; ModuleBuilder moduleBuilder = DynamicAssemblyCache.Get.DefineDynamicModule (assemblyName, string.Format("{0}.dll", assemblyName)); DynamicModuleCache.Add(moduleBuilder); return moduleBuilder; } } } As you can see, the module builder is created on the assembly that lives in the DynamicAssemblyCache, the module is given the assembly name and also a string representing the filename if the assembly is to be saved. It is then added to the DynamicModuleCache. DynamicModuleCache using System.Reflection.Emit; using Rapid.DynamicProxy.Exceptions; using Rapid.DynamicProxy.Resources.Exceptions; namespace Rapid.DynamicProxy.Module { /// <summary> /// Class for storing the module builder. /// </summary> internal static class DynamicModuleCache { #region Declarations private static object syncRoot = new object(); internal static ModuleBuilder Cache = null; #endregion #region Add /// <summary> /// Adds a dynamic module builder to the cache. /// </summary> /// <param name="moduleBuilder">The module builder.</param> public static void Add(ModuleBuilder moduleBuilder) { lock (syncRoot) { Cache = moduleBuilder; } } #endregion #region Get /// <summary> /// Gets the cached module builder. /// </summary> /// <returns></returns> public static ModuleBuilder Get { get { lock (syncRoot) { if (Cache != null) { return Cache; } } throw new RapidDynamicProxyAssertionException(AssertionResources.NoModuleInCache); } } #endregion } } The DynamicModuleCache is very similar to the assembly cache, it is simply a statically stored module with thread safe Add and Get methods. The DynamicTypeCache To end off this post, I’m going to create the cache for storing the generated proxy classes. I’ve spent a fair amount of time thinking about the type of collection I should use to store the types and have finally decided that for the time being I’m going to use a generic dictionary. This may change when I can actually performance test the proxy generator but the time being I think it makes good sense in theory, mainly as it pretty much maintains it’s performance with varying numbers of items – almost constant (0)1. Plus I won’t ever need to loop through the items which is not the dictionaries strong point. Here’s the code as it currently stands: DynamicTypeCache using System; using System.Collections.Generic; using System.Security.Cryptography; using System.Text; namespace Rapid.DynamicProxy.Types { /// <summary> /// Cache for storing proxy types. /// </summary> internal static class DynamicTypeCache { #region Declarations static object syncRoot = new object(); public static Dictionary<string, Type> Cache = new Dictionary<string, Type>(); #endregion /// <summary> /// Adds a proxy to the type cache. /// </summary> /// <param name="type">The type.</param> /// <param name="proxy">The proxy.</param> public static void AddProxyForType(Type type, Type proxy) { lock (syncRoot) { Cache.Add(GetHashCode(type.AssemblyQualifiedName), proxy); } } /// <summary> /// Tries the type of the get proxy for. /// </summary> /// <param name="type">The type.</param> /// <returns></returns> public static Type TryGetProxyForType(Type type) { lock (syncRoot) { Type proxyType; Cache.TryGetValue(GetHashCode(type.AssemblyQualifiedName), out proxyType); return proxyType; } } #region Private Methods private static string GetHashCode(string fullName) { SHA1CryptoServiceProvider provider = new SHA1CryptoServiceProvider(); Byte[] buffer = Encoding.UTF8.GetBytes(fullName); Byte[] hash = provider.ComputeHash(buffer, 0, buffer.Length); return Convert.ToBase64String(hash); } #endregion } } As you can see, there are two public methods, one for adding to the cache and one for getting from the cache. Hopefully they should be clear enough, the Get is a TryGet as I do not want the dictionary to throw an exception if a proxy doesn’t exist within the cache. Other than that I’ve decided to create a key using the SHA1CryptoServiceProvider, this may change but my initial though is the SHA1 algorithm is pretty fast to put together using the provider and it is also very unlikely to have any hashing collisions. (there are some maths behind how unlikely this is – here’s the wiki if you’re interested http://en.wikipedia.org/wiki/SHA_hash_functions) Anyway, that’s the end of part 1 – although I haven’t started any of the fun stuff (by fun I mean hairpulling, teeth grating Relfection.Emit style fun), I’ve got the basis of the DynamicProxy in place so all we have to worry about now is creating the types, interceptor classes, method invocation information classes and finally a really nice fluent interface that will abstract all of the hard-core craziness away and leave us with a lightning fast, easy to use AOP framework. Hope you find the series interesting. All of the source code can be viewed and/or downloaded at our codeplex site - http://rapidioc.codeplex.com/ Kind Regards, Sean.

Read the article
Visual Studio 2010 and .NET 4 Released

- by ScottGu

The final release of Visual Studio 2010 and .NET 4 is now available. Download and Install Today MSDN subscribers, as well as WebsiteSpark/BizSpark/DreamSpark members, can now download the final releases of Visual Studio 2010 and TFS 2010 through the MSDN subscribers download center. If you are not an MSDN Subscriber, you can download free 90-day trial editions of Visual Studio 2010. Or you can can download the free Visual Studio express editions of Visual Web Developer 2010, Visual Basic 2010, Visual C# 2010 and Visual C++. These express editions are available completely for free (and never time out). If you are looking for an easy way to setup a new machine for web-development you can automate installing ASP.NET 4, ASP.NET MVC 2, IIS, SQL Server Express and Visual Web Developer 2010 Express really quickly with the Microsoft Web Platform Installer (just click the install button on the page). What is new with VS 2010 and .NET 4 Today’s release is a big one – and brings with it a ton of new feature and capabilities. One of the things we tried hard to focus on with this release was to invest heavily in making existing applications, projects and developer experiences better. What this means is that you don’t need to read 1000+ page books or spend time learning major new concepts in order to take advantage of the release. There are literally thousands of improvements (both big and small) that make you more productive and successful without having to learn big new concepts in order to start using them. Below is just a small sampling of some of the improvements with this release: Visual Studio 2010 IDE Visual Studio 2010 now supports multiple-monitors (enabling much better use of screen real-estate). It has new code Intellisense support that makes it easier to find and use classes and methods. It has improved code navigation support for searching code-bases and seeing how code is called and used. It has new code visualization support that allows you to see the relationships across projects and classes within projects, as well as to automatically generate sequence diagrams to chart execution flow. The editor now supports HTML and JavaScript snippet support as well as improved JavaScript intellisense. The VS 2010 Debugger and Profiling support is now much, much richer and enables new features like Intellitrace (aka Historical Debugging), debugging of Crash/Dump files, and better parallel debugging. VS 2010’s multi-targeting support is now much richer, and enables you to use VS 2010 to target .NET 2, .NET 3, .NET 3.5 and .NET 4 applications. And the infamous Add Reference dialog now loads much faster. TFS 2010 is now easy to setup (you can now install the server in under 10 minutes) and enables great source-control, bug/work-item tracking, and continuous integration support. Testing (both automated and manual) is now much, much richer. And VS 2010 Premium and Ultimate provide much richer architecture and design tooling support. VB and C# Language Features VB and C# in VS 2010 both contain a bunch of new features and capabilities. VB adds new support for automatic properties, collection initializers, and implicit line continuation support among many other features. C# adds support for optional parameters and named arguments, a new dynamic keyword, co-variance and contra-variance, and among many other features. ASP.NET 4 and ASP.NET MVC 2 With ASP.NET 4, Web Forms controls now render clean, semantically correct, and CSS friendly HTML markup. Built-in URL routing functionality allows you to expose clean, search engine friendly, URLs and increase the traffic to your Website. ViewState within applications can now be more easily controlled and made smaller. ASP.NET Dynamic Data support has been expanded. More controls, including rich charting and data controls, are now built-into ASP.NET 4 and enable you to build applications even faster. New starter project templates now make it easier to get going with new projects. SEO enhancements make it easier to drive traffic to your public facing sites. And web.config files are now clean and simple. ASP.NET MVC 2 is now built-into VS 2010 and ASP.NET 4, and provides a great way to build web sites and applications using a model-view-controller based pattern. ASP.NET MVC 2 adds features to easily enable client and server validation logic, provides new strongly-typed HTML and UI-scaffolding helper methods. It also enables more modular/reusable applications. The new <%: %> syntax in ASP.NET makes it easier to HTML encode output. Visual Studio 2010 also now includes better tooling support for unit testing and TDD. In particular, “Consume first intellisense” and “generate from usage" support within VS 2010 make it easier to write your unit tests first, and then drive your implementation from them. Deploying ASP.NET applications gets a lot easier with this release. You can now publish your Websites and applications to a staging or production server from within Visual Studio itself. Visual Studio 2010 makes it easy to transfer all your files, code, configuration, database schema and data in one complete package. VS 2010 also makes it easy to manage separate web.config configuration files settings depending upon whether you are in debug, release, staging or production modes. WPF 4 and Silverlight 4 WPF 4 includes a ton of new improvements and capabilities including more built-in controls, richer graphics features (cached composition, pixel shader 3 support, layoutrounding, and animation easing functions), a much improved text stack (with crisper text rendering, custom dictionary support, and selection and caret brush options). WPF 4 also includes a bunch of support to enable you to take advantage of new Windows 7 features – including multi-touch and Windows 7 shell integration. Silverlight 4 will launch this week as well. You can watch my Silverlight 4 launch keynote streamed live Tuesday (April 13th) at 8am Pacific Time. Silverlight 4 includes a ton of new capabilities – including a bunch for making it possible to build great business applications and out of the browser applications. I’ll be doing a separate blog post later this week (once it is live on the web) that talks more about its capabilities. Visual Studio 2010 now includes great tooling support for both WPF and Silverlight. The new VS 2010 WPF and Silverlight designer makes it much easier to build client applications as well as build great line of business solutions, as well as integrate and bind with data. Tooling support for Silverlight 4 with the final release of Visual Studio 2010 will be available when Silverlight 4 releases to the web this week. SharePoint and Azure Visual Studio 2010 now includes built-in support for building SharePoint applications. You can now create, edit, build, and debug SharePoint applications directly within Visual Studio 2010. You can also now use SharePoint with TFS 2010. Support for creating Azure-hosted applications is also now included with VS 2010 – allowing you to build ASP.NET and WCF based applications and host them within the cloud. Data Access Data access has a lot of improvements coming to it with .NET 4. Entity Framework 4 includes a ton of new features and capabilities – including support for model first and POCO development, default support for lazy loading, built-in support for pluralization/singularization of table/property names within the VS 2010 designer, full support for all the LINQ operators, the ability to optionally expose foreign keys on model objects (useful for some stateless web scenarios), disconnected API support to better handle N-Tier and stateless web scenarios, and T4 template customization support within VS 2010 to allow you to customize and automate how code is generated for you by the data designer. In addition to improvements with the Entity Framework, LINQ to SQL with .NET 4 also includes a bunch of nice improvements. WCF and Workflow WCF includes a bunch of great new capabilities – including better REST, activation and configuration support. WCF Data Services (formerly known as Astoria) and WCF RIA Services also now enable you to easily expose and work with data from remote clients. Windows Workflow is now much faster, includes flowchart services, and now makes it easier to make custom services than before. More details can be found here. CLR and Core .NET Library Improvements .NET 4 includes the new CLR 4 engine – which includes a lot of nice performance and feature improvements. CLR 4 engine now runs side-by-side in-process with older versions of the CLR – allowing you to use two different versions of .NET within the same process. It also includes improved COM interop support. The .NET 4 base class libraries (BCL) include a bunch of nice additions and refinements. In particular, the .NET 4 BCL now includes new parallel programming support that makes it much easier to build applications that take advantage of multiple CPUs and cores on a computer. This work dove-tails nicely with the new VS 2010 parallel debugger (making it much easier to debug parallel applications), as well as the new F# functional language support now included in the VS 2010 IDE. .NET 4 also now also has the Dynamic Language Runtime (DLR) library built-in – which makes it easier to use dynamic language functionality with .NET. MEF – a really cool library that enables rich extensibility – is also now built-into .NET 4 and included as part of the base class libraries. .NET 4 Client Profile The download size of the .NET 4 redist is now much smaller than it was before (the x86 full .NET 4 package is about 36MB). We also now have a .NET 4 Client Profile package which is a pure sub-set of the full .NET that can be used to streamline client application installs. C++ VS 2010 includes a bunch of great improvements for C++ development. This includes better C++ Intellisense support, MSBuild support for projects, improved parallel debugging and profiler support, MFC improvements, and a number of language features and compiler optimizations. My VS 2010 and .NET 4 Blog Series I’ve been cranking away on a blog series the last few months that highlights many of the new VS 2010 and .NET 4 improvements. The good news is that I have about 20 in-depth posts already written. The bad news (for me) is that I have about 200 more to go until I’m done! I’m going to try and keep adding a few more each week over the next few months to discuss the new improvements and how best to take advantage of them. Below is a list of the already written ones that you can check out today: Clean Web.Config Files Starter Project Templates Multi-targeting Multiple Monitor Support New Code Focused Web Profile Option HTML / ASP.NET / JavaScript Code Snippets Auto-Start ASP.NET Applications URL Routing with ASP.NET 4 Web Forms Searching and Navigating Code in VS 2010 VS 2010 Code Intellisense Improvements WPF 4 Add Reference Dialog Improvements SEO Improvements with ASP.NET 4 Output Cache Extensibility with ASP.NET 4 Built-in Charting Controls for ASP.NET and Windows Forms Cleaner HTML Markup with ASP.NET 4 - Client IDs Optional Parameters and Named Arguments in C# 4 - and a cool scenarios with ASP.NET MVC 2 Automatic Properties, Collection Initializers and Implicit Line Continuation Support with VB 2010 New <%: %> Syntax for HTML Encoding Output using ASP.NET 4 JavaScript Intellisense Improvements with VS 2010 Stay tuned to my blog as I post more. Also check out this page which links to a bunch of great articles and videos done by others. VS 2010 Installation Notes If you have installed a previous version of VS 2010 on your machine (either the beta or the RC) you must first uninstall it before installing the final VS 2010 release. I also recommend uninstalling .NET 4 betas (including both the client and full .NET 4 installs) as well as the other installs that come with VS 2010 (e.g. ASP.NET MVC 2 preview builds, etc). The uninstalls of the betas/RCs will clean up all the old state on your machine – after which you can install the final VS 2010 version and should have everything just work (this is what I’ve done on all of my machines and I haven’t had any problems). The VS 2010 and .NET 4 installs add a bunch of new managed assemblies to your machine. Some of these will be “NGEN’d” to native code during the actual install process (making them run fast). To avoid adding too much time to VS setup, though, we don’t NGEN all assemblies immediately – and instead will NGEN the rest in the background when your machine is idle. Until it finishes NGENing the assemblies they will be JIT’d to native code the first time they are used in a process – which for large assemblies can sometimes cause a slight performance hit. If you run into this you can manually force all assemblies to be NGEN’d to native code immediately (and not just wait till the machine is idle) by launching the Visual Studio command line prompt from the Windows Start Menu (Microsoft Visual Studio 2010->Visual Studio Tools->Visual Studio Command Prompt). Within the command prompt type “Ngen executequeueditems” – this will cause everything to be NGEN’d immediately. How to Buy Visual Studio 2010 You can can download and use the free Visual Studio express editions of Visual Web Developer 2010, Visual Basic 2010, Visual C# 2010 and Visual C++. These express editions are available completely for free (and never time out). You can buy a new copy of VS 2010 Professional that includes a 1 year subscription to MSDN Essentials for $799. MSDN Essentials includes a developer license of Windows 7 Ultimate, Windows Server 2008 R2 Enterprise, SQL Server 2008 DataCenter R2, and 20 hours of Azure hosting time. Subscribers also have access to MSDN’s Online Concierge, and Priority Support in MSDN Forums. Upgrade prices from previous releases of Visual Studio are also available. Existing Visual Studio 2005/2008 Standard customers can upgrade to Visual Studio 2010 Professional for a special $299 retail price until October. You can take advantage of this VS Standard->Professional upgrade promotion here. Web developers who build applications for others, and who are either independent developers or who work for companies with less than 10 employees, can also optionally take advantage of the Microsoft WebSiteSpark program. This program gives you three copies of Visual Studio 2010 Professional, 1 copy of Expression Studio, and 4 CPU licenses of both Windows 2008 R2 Web Server and SQL 2008 Web Edition that you can use to both develop and deploy applications with at no cost for 3 years. At the end of the 3 years there is no obligation to buy anything. You can sign-up for WebSiteSpark today in under 5 minutes – and immediately have access to the products to download. Summary Today’s release is a big one – and has a bunch of improvements for pretty much every developer. Thank you everyone who provided feedback, suggestions and reported bugs throughout the development process – we couldn’t have delivered it without you. Hope this helps, Scott P.S. In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me at: twitter.com/scottgu

Read the article
IIS 7 Authentication: Certain users can't authenticate, while almost all others can.

- by user35335

I'm using IIS 7 Digest authentication to control access to a certain directory containing files. Users access the files through a department website from inside our network and outside. I've set NTFS permissions on the directory to allow a certain AD group to view the files. When I click a link to one of those files on the website I get prompted for a username and password. With most users everything works fine, but with a few of them it prompts for a password 3 times and then get: 401 - Unauthorized: Access is denied due to invalid credentials. But other users that are in the group can get in without a problem. If I switch it over to Windows Authentication, then the trouble users can log in fine. That directory is also shared, and users that can't log in through the website are able to browse to the share and view files in it, so I know that the permissions are ok. Here's the portion of the IIS log where I tried to download the file (/assets/files/secure/WWGNL.pdf): 2010-02-19 19:47:20 xxx.xxx.xxx.xxx GET /assets/images/bullet.gif - 80 - 10.5.16.138 Mozilla/5.0+(Windows;+U;+Windows+NT+6.1;+en-US)+AppleWebKit/532.5+(KHTML,+like+Gecko)+Chrome/4.0.249.89+Safari/532.5 200 0 0 218 2010-02-19 19:47:20 xxx.xxx.xxx.xxx GET /assets/images/bgOFF.gif - 80 - 10.5.16.138 Mozilla/5.0+(Windows;+U;+Windows+NT+6.1;+en-US)+AppleWebKit/532.5+(KHTML,+like+Gecko)+Chrome/4.0.249.89+Safari/532.5 200 0 0 218 2010-02-19 19:47:21 xxx.xxx.xxx.xxx GET /assets/files/secure/WWGNL.pdf - 80 - 10.5.16.138 Mozilla/5.0+(Windows;+U;+Windows+NT+6.1;+en-US)+AppleWebKit/532.5+(KHTML,+like+Gecko)+Chrome/4.0.249.89+Safari/532.5 401 2 5 0 2010-02-19 19:47:36 xxx.xxx.xxx.xxx GET /assets/files/secure/WWGNL.pdf - 80 - 10.5.16.138 Mozilla/5.0+(Windows;+U;+Windows+NT+6.1;+en-US)+AppleWebKit/532.5+(KHTML,+like+Gecko)+Chrome/4.0.249.89+Safari/532.5 401 1 2148074252 0 2010-02-19 19:47:43 xxx.xxx.xxx.xxx GET /assets/files/secure/WWGNL.pdf - 80 - 10.5.16.138 Mozilla/5.0+(Windows;+U;+Windows+NT+6.1;+en-US)+AppleWebKit/532.5+(KHTML,+like+Gecko)+Chrome/4.0.249.89+Safari/532.5 401 1 2148074252 15 2010-02-19 19:47:46 xxx.xxx.xxx.xxx GET /manager/media/script/_session.gif 0.19665693119168282 80 - 10.5.16.138 Mozilla/5.0+(Windows;+U;+Windows+NT+6.1;+en-US)+AppleWebKit/532.5+(KHTML,+like+Gecko)+Chrome/4.0.249.89+Safari/532.5 200 0 0 203 2010-02-19 19:47:46 xxx.xxx.xxx.xxx POST /manager/index.php - 80 - 10.5.16.138 Mozilla/5.0+(Windows;+U;+Windows+NT+6.1;+en-US)+AppleWebKit/532.5+(KHTML,+like+Gecko)+Chrome/4.0.249.89+Safari/532.5 200 0 0 296 2010-02-19 19:47:56 xxx.xxx.xxx.xxx GET /assets/files/secure/WWGNL.pdf - 80 - 10.5.16.138 Mozilla/5.0+(Windows;+U;+Windows+NT+6.1;+en-US)+AppleWebKit/532.5+(KHTML,+like+Gecko)+Chrome/4.0.249.89+Safari/532.5 401 1 2148074252 15 2010-02-19 19:47:59 xxx.xxx.xxx.xxx GET /favicon.ico - 80 - 10.5.16.138 Mozilla/5.0+(Windows;+U;+Windows+NT+6.1;+en-US)+AppleWebKit/532.5+(KHTML,+like+Gecko)+Chrome/4.0.249.89+Safari/532.5 404 0 2 0 Here's the Failed Logon attempt in the Security Log: Log Name: Security Source: Microsoft-Windows-Security-Auditing Date: 2/19/2010 11:47:43 AM Event ID: 4625 Task Category: Logon Level: Information Keywords: Audit Failure User: N/A Computer: WEB4.net.domain.org Description: An account failed to log on. Subject: Security ID: NULL SID Account Name: - Account Domain: - Logon ID: 0x0 Logon Type: 3 Account For Which Logon Failed: Security ID: NULL SID Account Name: jim.lastname Account Domain: net.domain.org Failure Information: Failure Reason: Unknown user name or bad password. Status: 0xc000006d Sub Status: 0xc000006a Process Information: Caller Process ID: 0x0 Caller Process Name: - Network Information: Workstation Name: - Source Network Address: 10.5.16.138 Source Port: 50065 Detailed Authentication Information: Logon Process: WDIGEST Authentication Package: WDigest Transited Services: - Package Name (NTLM only): - Key Length: 0 This event is generated when a logon request fails. It is generated on the computer where access was attempted. The Subject fields indicate the account on the local system which requested the logon. This is most commonly a service such as the Server service, or a local process such as Winlogon.exe or Services.exe. The Logon Type field indicates the kind of logon that was requested. The most common types are 2 (interactive) and 3 (network). The Process Information fields indicate which account and process on the system requested the logon. The Network Information fields indicate where a remote logon request originated. Workstation name is not always available and may be left blank in some cases. The authentication information fields provide detailed information about this specific logon request. - Transited services indicate which intermediate services have participated in this logon request. - Package name indicates which sub-protocol was used among the NTLM protocols. - Key length indicates the length of the generated session key. This will be 0 if no session key was requested. Event Xml: <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event"> <System> <Provider Name="Microsoft-Windows-Security-Auditing" Guid="{54849625-5478-4994-a5ba-3e3b0328c30d}" /> <EventID>4625</EventID> <Version>0</Version> <Level>0</Level> <Task>12544</Task> <Opcode>0</Opcode> <Keywords>0x8010000000000000</Keywords> <TimeCreated SystemTime="2010-02-19T19:47:43.890Z" /> <EventRecordID>2276316</EventRecordID> <Correlation /> <Execution ProcessID="612" ThreadID="692" /> <Channel>Security</Channel> <Computer>WEB4.net.domain.org</Computer> <Security /> </System> <EventData> <Data Name="SubjectUserSid">S-1-0-0</Data> <Data Name="SubjectUserName">-</Data> <Data Name="SubjectDomainName">-</Data> <Data Name="SubjectLogonId">0x0</Data> <Data Name="TargetUserSid">S-1-0-0</Data> <Data Name="TargetUserName">jim.lastname</Data> <Data Name="TargetDomainName">net.domain.org</Data> <Data Name="Status">0xc000006d</Data> <Data Name="FailureReason">%%2313</Data> <Data Name="SubStatus">0xc000006a</Data> <Data Name="LogonType">3</Data> <Data Name="LogonProcessName">WDIGEST</Data> <Data Name="AuthenticationPackageName">WDigest</Data> <Data Name="WorkstationName">-</Data> <Data Name="TransmittedServices">-</Data> <Data Name="LmPackageName">-</Data> <Data Name="KeyLength">0</Data> <Data Name="ProcessId">0x0</Data> <Data Name="ProcessName">-</Data> <Data Name="IpAddress">10.5.16.138</Data> <Data Name="IpPort">50065</Data> </EventData> </Event>

Read the article
Guidance: A Branching strategy for Scrum Teams

- by Martin Hinshelwood

Having a good branching strategy will save your bacon, or at least your code. Be careful when deviating from your branching strategy because if you do, you may be worse off than when you started! This is one possible branching strategy for Scrum teams and I will not be going in depth with Scrum but you can find out more about Scrum by reading the Scrum Guide and you can even assess your Scrum knowledge by having a go at the Scrum Open Assessment. You can also read SSW’s Rules to Better Scrum using TFS which have been developed during our own Scrum implementations. Acknowledgements Bill Heys – Bill offered some good feedback on this post and helped soften the language. Note: Bill is a VS ALM Ranger and co-wrote the Branching Guidance for TFS 2010 Willy-Peter Schaub – Willy-Peter is an ex Visual Studio ALM MVP turned blue badge and has been involved in most of the guidance including the Branching Guidance for TFS 2010 Chris Birmele – Chris wrote some of the early TFS Branching and Merging Guidance. Dr Paul Neumeyer, Ph.D Parallel Processes, ScrumMaster and SSW Solution Architect – Paul wanted to have feature branches coming from the release branch as well. We agreed that this is really a spin-off that needs own project, backlog, budget and Team. Scenario: A product is developed RTM 1.0 is released and gets great sales. Extra features are demanded but the new version will have double to price to pay to recover costs, work is approved by the guys with budget and a few sprints later RTM 2.0 is released. Sales a very low due to the pricing strategy. There are lots of clients on RTM 1.0 calling out for patches. As I keep getting Reverse Integration and Forward Integration mixed up and Bill keeps slapping my wrists I thought I should have a reminder: You still seemed to use reverse and/or forward integration in the wrong context. I would recommend reviewing your document at the end to ensure that it agrees with the common understanding of these terms merge (forward integration) from parent to child (same direction as the branch), and merge (reverse integration) from child to parent (the reverse direction of the branch). - one of my many slaps on the wrist from Bill Heys. As I mentioned previously we are using a single feature branching strategy in our current project. The single biggest mistake developers make is developing against the “Main” or “Trunk” line. This ultimately leads to messy code as things are added and never finished. Your only alternative is to NEVER check in unless your code is 100%, but this does not work in practice, even with a single developer. Your ADD will kick in and your half-finished code will be finished enough to pass the build and the tests. You do use builds don’t you? Sadly, this is a very common scenario and I have had people argue that branching merely adds complexity. Then again I have seen the other side of the universe ... branching structures from he... We should somehow convince everyone that there is a happy between no-branching and too-much-branching. - Willy-Peter Schaub, VS ALM Ranger, Microsoft A key benefit of branching for development is to isolate changes from the stable Main branch. Branching adds sanity more than it adds complexity. We do try to stress in our guidance that it is important to justify a branch, by doing a cost benefit analysis. The primary cost is the effort to do merges and resolve conflicts. A key benefit is that you have a stable code base in Main and accept changes into Main only after they pass quality gates, etc. - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft The second biggest mistake developers make is branching anything other than the WHOLE “Main” line. If you branch parts of your code and not others it gets out of sync and can make integration a nightmare. You should have your Source, Assets, Build scripts deployment scripts and dependencies inside the “Main” folder and branch the whole thing. Some departments within MSFT even go as far as to add the environments used to develop the product in there as well; although I would not recommend that unless you have a massive SQL cluster to house your source code. We tried the “add environment” back in South-Africa and while it was “phenomenal”, especially when having to switch between environments, the disk storage and processing requirements killed us. We opted for virtualization to skin this cat of keeping a ready-to-go environment handy. - Willy-Peter Schaub, VS ALM Ranger, Microsoft I think people often think that you should have separate branches for separate environments (e.g. Dev, Test, Integration Test, QA, etc.). I prefer to think of deploying to environments (such as from Main to QA) rather than branching for QA). - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft You can read about SSW’s Rules to better Source Control for some additional information on what Source Control to use and how to use it. There are also a number of branching Anti-Patterns that should be avoided at all costs: You know you are on the wrong track if you experience one or more of the following symptoms in your development environment: Merge Paranoia—avoiding merging at all cost, usually because of a fear of the consequences. Merge Mania—spending too much time merging software assets instead of developing them. Big Bang Merge—deferring branch merging to the end of the development effort and attempting to merge all branches simultaneously. Never-Ending Merge—continuous merging activity because there is always more to merge. Wrong-Way Merge—merging a software asset version with an earlier version. Branch Mania—creating many branches for no apparent reason. Cascading Branches—branching but never merging back to the main line. Mysterious Branches—branching for no apparent reason. Temporary Branches—branching for changing reasons, so the branch becomes a permanent temporary workspace. Volatile Branches—branching with unstable software assets shared by other branches or merged into another branch. Note Branches are volatile most of the time while they exist as independent branches. That is the point of having them. The difference is that you should not share or merge branches while they are in an unstable state. Development Freeze—stopping all development activities while branching, merging, and building new base lines. Berlin Wall—using branches to divide the development team members, instead of dividing the work they are performing. -Branching and Merging Primer by Chris Birmele - Developer Tools Technical Specialist at Microsoft Pty Ltd in Australia In fact, this can result in a merge exercise no-one wants to be involved in, merging hundreds of thousands of change sets and trying to get a consolidated build. Again, we need to find a happy medium. - Willy-Peter Schaub on Merge Paranoia Merge conflicts are generally the result of making changes to the same file in both the target and source branch. If you create merge conflicts, you will eventually need to resolve them. Often the resolution is manual. Merging more frequently allows you to resolve these conflicts close to when they happen, making the resolution clearer. Waiting weeks or months to resolve them, the Big Bang approach, means you are more likely to resolve conflicts incorrectly. - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft Figure: Main line, this is where your stable code lives and where any build has known entities, always passes and has a happy test that passes as well? Many development projects consist of, a single “Main” line of source and artifacts. This is good; at least there is source control . There are however a couple of issues that need to be considered. What happens if: you and your team are working on a new set of features and the customer wants a change to his current version? you are working on two features and the customer decides to abandon one of them? you have two teams working on different feature sets and their changes start interfering with each other? I just use labels instead of branches? That's a lot of “what if’s”, but there is a simple way of preventing this. Branching… In TFS, labels are not immutable. This does not mean they are not useful. But labels do not provide a very good development isolation mechanism. Branching allows separate code sets to evolve separately (e.g. Current with hotfixes, and vNext with new development). I don’t see how labels work here. - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft Figure: Creating a single feature branch means you can isolate the development work on that branch. Its standard practice for large projects with lots of developers to use Feature branching and you can check the Branching Guidance for the latest recommendations from the Visual Studio ALM Rangers for other methods. In the diagram above you can see my recommendation for branching when using Scrum development with TFS 2010. It consists of a single Sprint branch to contain all the changes for the current sprint. The main branch has the permissions changes so contributors to the project can only Branch and Merge with “Main”. This will prevent accidental check-ins or checkouts of the “Main” line that would contaminate the code. The developers continue to develop on sprint one until the completion of the sprint. Note: In the real world, starting a new Greenfield project, this process starts at Sprint 2 as at the start of Sprint 1 you would have artifacts in version control and no need for isolation. Figure: Once the sprint is complete the Sprint 1 code can then be merged back into the Main line. There are always good practices to follow, and one is to always do a Forward Integration from Main into Sprint 1 before you do a Reverse Integration from Sprint 1 back into Main. In this case it may seem superfluous, but this builds good muscle memory into your developer’s work ethic and means that no bad habits are learned that would interfere with additional Scrum Teams being added to the Product. The process of completing your sprint development: The Team completes their work according to their definition of done. Merge from “Main” into “Sprint1” (Forward Integration) Stabilize your code with any changes coming from other Scrum Teams working on the same product. If you have one Scrum Team this should be quick, but there may have been bug fixes in the Release branches. (we will talk about release branches later) Merge from “Sprint1” into “Main” to commit your changes. (Reverse Integration) Check-in Delete the Sprint1 branch Note: The Sprint 1 branch is no longer required as its useful life has been concluded. Check-in Done But you are not yet done with the Sprint. The goal in Scrum is to have a “potentially shippable product” at the end of every Sprint, and we do not have that yet, we only have finished code. Figure: With Sprint 1 merged you can create a Release branch and run your final packaging and testing In 99% of all projects I have been involved in or watched, a “shippable product” only happens towards the end of the overall lifecycle, especially when sprints are short. The in-between releases are great demonstration releases, but not shippable. Perhaps it comes from my 80’s brain washing that we only ship when we reach the agreed quality and business feature bar. - Willy-Peter Schaub, VS ALM Ranger, Microsoft Although you should have been testing and packaging your code all the way through your Sprint 1 development, preferably using an automated process, you still need to test and package with stable unchanging code. This is where you do what at SSW we call a “Test Please”. This is first an internal test of the product to make sure it meets the needs of the customer and you generally use a resource external to your Team. Then a “Test Please” is conducted with the Product Owner to make sure he is happy with the output. You can read about how to conduct a Test Please on our Rules to Successful Projects: Do you conduct an internal "test please" prior to releasing a version to a client? Figure: If you find a deviation from the expected result you fix it on the Release branch. If during your final testing or your “Test Please” you find there are issues or bugs then you should fix them on the release branch. If you can’t fix them within the time box of your Sprint, then you will need to create a Bug and put it onto the backlog for prioritization by the Product owner. Make sure you leave plenty of time between your merge from the development branch to find and fix any problems that are uncovered. This process is commonly called Stabilization and should always be conducted once you have completed all of your User Stories and integrated all of your branches. Even once you have stabilized and released, you should not delete the release branch as you would with the Sprint branch. It has a usefulness for servicing that may extend well beyond the limited life you expect of it. Note: Don't get forced by the business into adding features into a Release branch instead that indicates the unspoken requirement is that they are asking for a product spin-off. In this case you can create a new Team Project and branch from the required Release branch to create a new Main branch for that product. And you create a whole new backlog to work from. Figure: When the Team decides it is happy with the product you can create a RTM branch. Once you have fixed all the bugs you can, and added any you can’t to the Product Backlog, and you Team is happy with the result you can create a Release. This would consist of doing the final Build and Packaging it up ready for your Sprint Review meeting. You would then create a read-only branch that represents the code you “shipped”. This is really an Audit trail branch that is optional, but is good practice. You could use a Label, but Labels are not Auditable and if a dispute was raised by the customer you can produce a verifiable version of the source code for an independent party to check. Rare I know, but you do not want to be at the wrong end of a legal battle. Like the Release branch the RTM branch should never be deleted, or only deleted according to your companies legal policy, which in the UK is usually 7 years. Figure: If you have made any changes in the Release you will need to merge back up to Main in order to finalise the changes. Nothing is really ever done until it is in Main. The same rules apply when merging any fixes in the Release branch back into Main and you should do a reverse merge before a forward merge, again for the muscle memory more than necessity at this stage. Your Sprint is now nearly complete, and you can have a Sprint Review meeting knowing that you have made every effort and taken every precaution to protect your customer’s investment. Note: In order to really achieve protection for both you and your client you would add Automated Builds, Automated Tests, Automated Acceptance tests, Acceptance test tracking, Unit Tests, Load tests, Web test and all the other good engineering practices that help produce reliable software. Figure: After the Sprint Planning meeting the process begins again. Where the Sprint Review and Retrospective meetings mark the end of the Sprint, the Sprint Planning meeting marks the beginning. After you have completed your Sprint Planning and you know what you are trying to achieve in Sprint 2 you can create your new Branch to develop in. How do we handle a bug(s) in production that can’t wait? Although in Scrum the only work done should be on the backlog there should be a little buffer added to the Sprint Planning for contingencies. One of these contingencies is a bug in the current release that can’t wait for the Sprint to finish. But how do you handle that? Willy-Peter Schaub asked an excellent question on the release activities: In reality Sprint 2 starts when sprint 1 ends + weekend. Should we not cater for a possible parallelism between Sprint 2 and the release activities of sprint 1? It would introduce FI’s from main to sprint 2, I guess. Your “Figure: Merging print 2 back into Main.” covers, what I tend to believe to be reality in most cases. - Willy-Peter Schaub, VS ALM Ranger, Microsoft I agree, and if you have a single Scrum team then your resources are limited. The Scrum Team is responsible for packaging and release, so at least one run at stabilization, package and release should be included in the Sprint time box. If more are needed on the current production release during the Sprint 2 time box then resource needs to be pulled from Sprint 2. The Product Owner and the Team have four choices (in order of disruption/cost): Backlog: Add the bug to the backlog and fix it in the next Sprint Buffer Time: Use any buffer time included in the current Sprint to fix the bug quickly Make time: Remove a Story from the current Sprint that is of equal value to the time lost fixing the bug(s) and releasing. Note: The Team must agree that it can still meet the Sprint Goal. Cancel Sprint: Cancel the sprint and concentrate all resource on fixing the bug(s) Note: This can be a very costly if the current sprint has already had a lot of work completed as it will be lost. The choice will depend on the complexity and severity of the bug(s) and both the Product Owner and the Team need to agree. In this case we will go with option #2 or #3 as they are uncomplicated but severe bugs. Figure: Real world issue where a bug needs fixed in the current release. If the bug(s) is urgent enough then then your only option is to fix it in place. You can edit the release branch to find and fix the bug, hopefully creating a test so it can’t happen again. Follow the prior process and conduct an internal and customer “Test Please” before releasing. You can read about how to conduct a Test Please on our Rules to Successful Projects: Do you conduct an internal "test please" prior to releasing a version to a client? Figure: After you have fixed the bug you need to ship again. You then need to again create an RTM branch to hold the version of the code you released in escrow. Figure: Main is now out of sync with your Release. We now need to get these new changes back up into the Main branch. Do a reverse and then forward merge again to get the new code into Main. But what about the branch, are developers not working on Sprint 2? Does Sprint 2 now have changes that are not in Main and Main now have changes that are not in Sprint 2? Well, yes… and this is part of the hit you take doing branching. But would this scenario even have been possible without branching? Figure: Getting the changes in Main into Sprint 2 is very important. The Team now needs to do a Forward Integration merge into their Sprint and resolve any conflicts that occur. Maybe the bug has already been fixed in Sprint 2, maybe the bug no longer exists! This needs to be identified and resolved by the developers before they continue to get further out of Sync with Main. Note: Avoid the “Big bang merge” at all costs. Figure: Merging Sprint 2 back into Main, the Forward Integration, and R0 terminates. Sprint 2 now merges (Reverse Integration) back into Main following the procedures we have already established. Figure: The logical conclusion. This then allows the creation of the next release. By now you should be getting the big picture and hopefully you learned something useful from this post. I know I have enjoyed writing it as I find these exploratory posts coupled with real world experience really help harden my understanding. Branching is a tool; it is not a silver bullet. Don’t over use it, and avoid “Anti-Patterns” where possible. Although the diagram above looks complicated I hope showing you how it is formed simplifies it as much as possible. Technorati Tags: Branching,Scrum,VS ALM,TFS 2010,VS2010

Read the article
Another "Windows 7 entry missing from Grub2" Question

- by 4x10

Like many before me had the following problem that after installing Ubuntu (with windows 7 already installed), the grub boot loader wouldnt show windows 7 as a boot option, though i can boot fine if I use the "Choose Boot Device" options on the x220. The difference is that I try using UEFI only so many answers didn't really fit my problem, though i tried several stuffs: after running boot repair it destroyed the ubuntu boot loader custom entry in /etc/grub.d/40_custom for windows which doesnt show up many update-grub and reboots trying windows repair recovery thing while being there i also did bootrec.exe /FixBoot and update-grub and reboot again and finaly because it was so much fun, i installed linux all over again, while formatting and deleting everything linux related before that. Now that i think of it, Ubuntu also didn't notice Windows being there during the Setup and it still doesnt according to the Boot Info from Boot Repair. Boot Info Script 0.61-git-patched [23 April 2012] ============================= Boot Info Summary: =============================== => No boot loader is installed in the MBR of /dev/sda. sda1: __________________________________________________________________________ File system: vfat Boot sector type: Windows 7: FAT32 Boot sector info: No errors found in the Boot Parameter Block. Operating System: Boot files: /efi/Boot/bootx64.efi /efi/ubuntu/grubx64.efi sda2: __________________________________________________________________________ File system: Boot sector type: - Boot sector info: Mounting failed: mount: unknown filesystem type '' sda3: __________________________________________________________________________ File system: ntfs Boot sector type: Windows Vista/7: NTFS Boot sector info: No errors found in the Boot Parameter Block. Operating System: Windows 7 Boot files: /Windows/System32/winload.exe sda4: __________________________________________________________________________ File system: ext4 Boot sector type: - Boot sector info: Operating System: Ubuntu precise (development branch) Boot files: /boot/grub/grub.cfg /etc/fstab sda5: __________________________________________________________________________ File system: ext4 Boot sector type: - Boot sector info: Operating System: Boot files: sda6: __________________________________________________________________________ File system: swap Boot sector type: - Boot sector info: ============================ Drive/Partition Info: ============================= Drive: sda _____________________________________________________________________ Disk /dev/sda: 320.1 GB, 320072933376 bytes 255 heads, 63 sectors/track, 38913 cylinders, total 625142448 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes Partition Boot Start Sector End Sector # of Sectors Id System /dev/sda1 1 625,142,447 625,142,447 ee GPT GUID Partition Table detected. Partition Start Sector End Sector # of Sectors System /dev/sda1 2,048 206,847 204,800 EFI System partition /dev/sda2 206,848 468,991 262,144 Microsoft Reserved Partition (Windows) /dev/sda3 468,992 170,338,303 169,869,312 Data partition (Windows/Linux) /dev/sda4 170,338,304 330,338,304 160,000,001 Data partition (Windows/Linux) /dev/sda5 330,338,305 617,141,039 286,802,735 Data partition (Windows/Linux) /dev/sda6 617,141,040 625,141,040 8,000,001 Swap partition (Linux) "blkid" output: ________________________________________________________________ Device UUID TYPE LABEL /dev/sda1 885C-ED1B vfat /dev/sda3 EE06CC0506CBCCB1 ntfs /dev/sda4 604dd3b2-64ca-4200-b8fb-820e8d0ca899 ext4 /dev/sda5 d62515fd-8120-4a74-b17b-0bdf244124a3 ext4 /dev/sda6 7078b649-fb2a-4c59-bd03-fd31ef440d37 swap ================================ Mount points: ================================= Device Mount_Point Type Options /dev/sda1 /boot/efi vfat (rw) /dev/sda4 / ext4 (rw,errors=remount-ro) /dev/sda5 /home ext4 (rw) =========================== sda4/boot/grub/grub.cfg: =========================== -------------------------------------------------------------------------------- # # DO NOT EDIT THIS FILE # # It is automatically generated by grub-mkconfig using templates # from /etc/grub.d and settings from /etc/default/grub # ### BEGIN /etc/grub.d/00_header ### if [ -s $prefix/grubenv ]; then set have_grubenv=true load_env fi set default="0" if [ "${prev_saved_entry}" ]; then set saved_entry="${prev_saved_entry}" save_env saved_entry set prev_saved_entry= save_env prev_saved_entry set boot_once=true fi function savedefault { if [ -z "${boot_once}" ]; then saved_entry="${chosen}" save_env saved_entry fi } function recordfail { set recordfail=1 if [ -n "${have_grubenv}" ]; then if [ -z "${boot_once}" ]; then save_env recordfail; fi; fi } function load_video { insmod efi_gop insmod efi_uga insmod video_bochs insmod video_cirrus } insmod part_gpt insmod ext2 set root='(hd0,gpt4)' search --no-floppy --fs-uuid --set=root 604dd3b2-64ca-4200-b8fb-820e8d0ca899 if loadfont /usr/share/grub/unicode.pf2 ; then set gfxmode=auto load_video insmod gfxterm insmod part_gpt insmod ext2 set root='(hd0,gpt4)' search --no-floppy --fs-uuid --set=root 604dd3b2-64ca-4200-b8fb-820e8d0ca899 set locale_dir=($root)/boot/grub/locale set lang=en_US insmod gettext fi terminal_output gfxterm if [ "${recordfail}" = 1 ]; then set timeout=-1 else set timeout=10 fi ### END /etc/grub.d/00_header ### ### BEGIN /etc/grub.d/05_debian_theme ### set menu_color_normal=white/black set menu_color_highlight=black/light-gray if background_color 44,0,30; then clear fi ### END /etc/grub.d/05_debian_theme ### ### BEGIN /etc/grub.d/10_linux ### function gfxmode { set gfxpayload="$1" if [ "$1" = "keep" ]; then set vt_handoff=vt.handoff=7 else set vt_handoff= fi } if [ ${recordfail} != 1 ]; then if [ -e ${prefix}/gfxblacklist.txt ]; then if hwmatch ${prefix}/gfxblacklist.txt 3; then if [ ${match} = 0 ]; then set linux_gfx_mode=keep else set linux_gfx_mode=text fi else set linux_gfx_mode=text fi else set linux_gfx_mode=keep fi else set linux_gfx_mode=text fi export linux_gfx_mode if [ "$linux_gfx_mode" != "text" ]; then load_video; fi menuentry 'Ubuntu, with Linux 3.2.0-20-generic' --class ubuntu --class gnu-linux --class gnu --class os { recordfail gfxmode $linux_gfx_mode insmod gzio insmod part_gpt insmod ext2 set root='(hd0,gpt4)' search --no-floppy --fs-uuid --set=root 604dd3b2-64ca-4200-b8fb-820e8d0ca899 linux /boot/vmlinuz-3.2.0-20-generic root=UUID=604dd3b2-64ca-4200-b8fb-820e8d0ca899 ro quiet splash $vt_handoff initrd /boot/initrd.img-3.2.0-20-generic } menuentry 'Ubuntu, with Linux 3.2.0-20-generic (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os { recordfail insmod gzio insmod part_gpt insmod ext2 set root='(hd0,gpt4)' search --no-floppy --fs-uuid --set=root 604dd3b2-64ca-4200-b8fb-820e8d0ca899 echo 'Loading Linux 3.2.0-20-generic ...' linux /boot/vmlinuz-3.2.0-20-generic root=UUID=604dd3b2-64ca-4200-b8fb-820e8d0ca899 ro recovery nomodeset echo 'Loading initial ramdisk ...' initrd /boot/initrd.img-3.2.0-20-generic } ### END /etc/grub.d/10_linux ### ### BEGIN /etc/grub.d/20_linux_xen ### ### END /etc/grub.d/20_linux_xen ### ### BEGIN /etc/grub.d/20_memtest86+ ### menuentry "Memory test (memtest86+)" { insmod part_gpt insmod ext2 set root='(hd0,gpt4)' search --no-floppy --fs-uuid --set=root 604dd3b2-64ca-4200-b8fb-820e8d0ca899 linux16 /boot/memtest86+.bin } menuentry "Memory test (memtest86+, serial console 115200)" { insmod part_gpt insmod ext2 set root='(hd0,gpt4)' search --no-floppy --fs-uuid --set=root 604dd3b2-64ca-4200-b8fb-820e8d0ca899 linux16 /boot/memtest86+.bin console=ttyS0,115200n8 } ### END /etc/grub.d/20_memtest86+ ### ### BEGIN /etc/grub.d/30_os-prober ### ### END /etc/grub.d/30_os-prober ### ### BEGIN /etc/grub.d/40_custom ### # This file provides an easy way to add custom menu entries. Simply type the # menu entries you want to add after this comment. Be careful not to change # the 'exec tail' line above. ### END /etc/grub.d/40_custom ### ### BEGIN /etc/grub.d/41_custom ### if [ -f $prefix/custom.cfg ]; then source $prefix/custom.cfg; fi ### END /etc/grub.d/41_custom ### -------------------------------------------------------------------------------- =============================== sda4/etc/fstab: ================================ -------------------------------------------------------------------------------- # /etc/fstab: static file system information. # # Use 'blkid' to print the universally unique identifier for a # device; this may be used with UUID= as a more robust way to name devices # that works even if disks are added and removed. See fstab(5). # # <file system> <mount point> <type> <options> <dump> <pass> proc /proc proc nodev,noexec,nosuid 0 0 # / was on /dev/sda4 during installation UUID=604dd3b2-64ca-4200-b8fb-820e8d0ca899 / ext4 errors=remount-ro 0 1 # /boot/efi was on /dev/sda1 during installation UUID=885C-ED1B /boot/efi vfat defaults 0 1 # /home was on /dev/sda5 during installation UUID=d62515fd-8120-4a74-b17b-0bdf244124a3 /home ext4 defaults 0 2 # swap was on /dev/sda6 during installation UUID=7078b649-fb2a-4c59-bd03-fd31ef440d37 none swap sw 0 0 -------------------------------------------------------------------------------- =================== sda4: Location of files loaded by Grub: ==================== GiB - GB File Fragment(s) 129.422874451 = 138.966753280 boot/grub/grub.cfg 1 83.059570312 = 89.184534528 boot/initrd.img-3.2.0-20-generic 2 101.393131256 = 108.870045696 boot/vmlinuz-3.2.0-20-generic 1 83.059570312 = 89.184534528 initrd.img 2 101.393131256 = 108.870045696 vmlinuz 1 ADDITIONAL INFORMATION : =================== log of boot-repair 2012-04-25__23h40 =================== boot-repair version : 3.18-0ppa3~precise boot-sav version : 3.18-0ppa4~precise glade2script version : 0.3.2.1-0ppa7~precise internet: connected python-software-properties version : 0.82.7 0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 591 not upgraded. dpkg-preconfigure: unable to re-open stdin: No such file or directory boot-repair is executed in installed-session (Ubuntu precise (development branch) , precise , Ubuntu , x86_64) WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util fdisk doesn't support GPT. Use GNU Parted. =================== OSPROBER: /dev/sda4:The OS now in use - Ubuntu precise (development branch) CurrentSession:linux =================== BLKID: /dev/sda3: UUID="EE06CC0506CBCCB1" TYPE="ntfs" /dev/sda1: UUID="885C-ED1B" TYPE="vfat" /dev/sda4: UUID="604dd3b2-64ca-4200-b8fb-820e8d0ca899" TYPE="ext4" /dev/sda5: UUID="d62515fd-8120-4a74-b17b-0bdf244124a3" TYPE="ext4" /dev/sda6: UUID="7078b649-fb2a-4c59-bd03-fd31ef440d37" TYPE="swap" 1 disks with OS, 1 OS : 1 Linux, 0 MacOS, 0 Windows, 0 unknown type OS. WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util sfdisk doesn't support GPT. Use GNU Parted. =================== /etc/default/grub : # If you change this file, run 'update-grub' afterwards to update # /boot/grub/grub.cfg. # For full documentation of the options in this file, see: # info -f grub -n 'Simple configuration' GRUB_DEFAULT=0 #GRUB_HIDDEN_TIMEOUT=0 #GRUB_HIDDEN_TIMEOUT_QUIET=true GRUB_TIMEOUT=10 GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian` GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" GRUB_CMDLINE_LINUX="" # Uncomment to enable BadRAM filtering, modify to suit your needs # This works with Linux (no patch required) and with any kernel that obtains # the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...) #GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef" # Uncomment to disable graphical terminal (grub-pc only) #GRUB_TERMINAL=console # The resolution used on graphical terminal # note that you can use only modes which your graphic card supports via VBE # you can see them in real GRUB with the command `vbeinfo' #GRUB_GFXMODE=640x480 # Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux #GRUB_DISABLE_LINUX_UUID=true # Uncomment to disable generation of recovery mode menu entries #GRUB_DISABLE_RECOVERY="true" # Uncomment to get a beep at grub start #GRUB_INIT_TUNE="480 440 1" EFI_OF_PART[1] (, ) =================== dmesg | grep EFI : [ 0.000000] EFI v2.00 by Lenovo [ 0.000000] Kernel-defined memdesc doesn't match the one from EFI! [ 0.000000] EFI: mem00: type=3, attr=0xf, range=[0x0000000000000000-0x0000000000001000) (0MB) [ 0.000000] EFI: mem01: type=7, attr=0xf, range=[0x0000000000001000-0x000000000004e000) (0MB) [ 0.000000] EFI: mem02: type=3, attr=0xf, range=[0x000000000004e000-0x0000000000058000) (0MB) [ 0.000000] EFI: mem03: type=10, attr=0xf, range=[0x0000000000058000-0x0000000000059000) (0MB) [ 0.000000] EFI: mem04: type=7, attr=0xf, range=[0x0000000000059000-0x000000000005e000) (0MB) [ 0.000000] EFI: mem05: type=4, attr=0xf, range=[0x000000000005e000-0x000000000005f000) (0MB) [ 0.000000] EFI: mem06: type=3, attr=0xf, range=[0x000000000005f000-0x00000000000a0000) (0MB) [ 0.000000] EFI: mem07: type=2, attr=0xf, range=[0x0000000000100000-0x00000000005b9000) (4MB) [ 0.000000] EFI: mem08: type=7, attr=0xf, range=[0x00000000005b9000-0x0000000020000000) (506MB) [ 0.000000] EFI: mem09: type=0, attr=0xf, range=[0x0000000020000000-0x0000000020200000) (2MB) [ 0.000000] EFI: mem10: type=7, attr=0xf, range=[0x0000000020200000-0x00000000364e4000) (354MB) [ 0.000000] EFI: mem11: type=2, attr=0xf, range=[0x00000000364e4000-0x000000003726a000) (13MB) [ 0.000000] EFI: mem12: type=7, attr=0xf, range=[0x000000003726a000-0x0000000040000000) (141MB) [ 0.000000] EFI: mem13: type=0, attr=0xf, range=[0x0000000040000000-0x0000000040200000) (2MB) [ 0.000000] EFI: mem14: type=7, attr=0xf, range=[0x0000000040200000-0x000000009df35000) (1501MB) [ 0.000000] EFI: mem15: type=2, attr=0xf, range=[0x000000009df35000-0x00000000d39a0000) (858MB) [ 0.000000] EFI: mem16: type=4, attr=0xf, range=[0x00000000d39a0000-0x00000000d39c0000) (0MB) [ 0.000000] EFI: mem17: type=7, attr=0xf, range=[0x00000000d39c0000-0x00000000d5df5000) (36MB) [ 0.000000] EFI: mem18: type=4, attr=0xf, range=[0x00000000d5df5000-0x00000000d6990000) (11MB) [ 0.000000] EFI: mem19: type=7, attr=0xf, range=[0x00000000d6990000-0x00000000d6b82000) (1MB) [ 0.000000] EFI: mem20: type=1, attr=0xf, range=[0x00000000d6b82000-0x00000000d6b9f000) (0MB) [ 0.000000] EFI: mem21: type=7, attr=0xf, range=[0x00000000d6b9f000-0x00000000d77b0000) (12MB) [ 0.000000] EFI: mem22: type=4, attr=0xf, range=[0x00000000d77b0000-0x00000000d780a000) (0MB) [ 0.000000] EFI: mem23: type=7, attr=0xf, range=[0x00000000d780a000-0x00000000d7826000) (0MB) [ 0.000000] EFI: mem24: type=4, attr=0xf, range=[0x00000000d7826000-0x00000000d7868000) (0MB) [ 0.000000] EFI: mem25: type=7, attr=0xf, range=[0x00000000d7868000-0x00000000d7869000) (0MB) [ 0.000000] EFI: mem26: type=4, attr=0xf, range=[0x00000000d7869000-0x00000000d786a000) (0MB) [ 0.000000] EFI: mem27: type=7, attr=0xf, range=[0x00000000d786a000-0x00000000d786b000) (0MB) [ 0.000000] EFI: mem28: type=4, attr=0xf, range=[0x00000000d786b000-0x00000000d786c000) (0MB) [ 0.000000] EFI: mem29: type=7, attr=0xf, range=[0x00000000d786c000-0x00000000d786d000) (0MB) [ 0.000000] EFI: mem30: type=4, attr=0xf, range=[0x00000000d786d000-0x00000000d825f000) (9MB) [ 0.000000] EFI: mem31: type=7, attr=0xf, range=[0x00000000d825f000-0x00000000d8261000) (0MB) [ 0.000000] EFI: mem32: type=4, attr=0xf, range=[0x00000000d8261000-0x00000000d82f7000) (0MB) [ 0.000000] EFI: mem33: type=7, attr=0xf, range=[0x00000000d82f7000-0x00000000d82f8000) (0MB) [ 0.000000] EFI: mem34: type=4, attr=0xf, range=[0x00000000d82f8000-0x00000000d8705000) (4MB) [ 0.000000] EFI: mem35: type=7, attr=0xf, range=[0x00000000d8705000-0x00000000d8706000) (0MB) [ 0.000000] EFI: mem36: type=4, attr=0xf, range=[0x00000000d8706000-0x00000000d8761000) (0MB) [ 0.000000] EFI: mem37: type=7, attr=0xf, range=[0x00000000d8761000-0x00000000d8768000) (0MB) [ 0.000000] EFI: mem38: type=4, attr=0xf, range=[0x00000000d8768000-0x00000000d9b9f000) (20MB) [ 0.000000] EFI: mem39: type=7, attr=0xf, range=[0x00000000d9b9f000-0x00000000d9e4c000) (2MB) [ 0.000000] EFI: mem40: type=2, attr=0xf, range=[0x00000000d9e4c000-0x00000000d9e52000) (0MB) [ 0.000000] EFI: mem41: type=3, attr=0xf, range=[0x00000000d9e52000-0x00000000da59f000) (7MB) [ 0.000000] EFI: mem42: type=5, attr=0x800000000000000f, range=[0x00000000da59f000-0x00000000da6c3000) (1MB) [ 0.000000] EFI: mem43: type=5, attr=0x800000000000000f, range=[0x00000000da6c3000-0x00000000da79f000) (0MB) [ 0.000000] EFI: mem44: type=6, attr=0x800000000000000f, range=[0x00000000da79f000-0x00000000da8b1000) (1MB) [ 0.000000] EFI: mem45: type=6, attr=0x800000000000000f, range=[0x00000000da8b1000-0x00000000da99f000) (0MB) [ 0.000000] EFI: mem46: type=0, attr=0xf, range=[0x00000000da99f000-0x00000000daa22000) (0MB) [ 0.000000] EFI: mem47: type=0, attr=0xf, range=[0x00000000daa22000-0x00000000daa9b000) (0MB) [ 0.000000] EFI: mem48: type=0, attr=0xf, range=[0x00000000daa9b000-0x00000000daa9c000) (0MB) [ 0.000000] EFI: mem49: type=0, attr=0xf, range=[0x00000000daa9c000-0x00000000daa9f000) (0MB) [ 0.000000] EFI: mem50: type=10, attr=0xf, range=[0x00000000daa9f000-0x00000000daadd000) (0MB) [ 0.000000] EFI: mem51: type=10, attr=0xf, range=[0x00000000daadd000-0x00000000dab9f000) (0MB) [ 0.000000] EFI: mem52: type=9, attr=0xf, range=[0x00000000dab9f000-0x00000000dabdc000) (0MB) [ 0.000000] EFI: mem53: type=9, attr=0xf, range=[0x00000000dabdc000-0x00000000dabff000) (0MB) [ 0.000000] EFI: mem54: type=4, attr=0xf, range=[0x00000000dabff000-0x00000000dac00000) (0MB) [ 0.000000] EFI: mem55: type=7, attr=0xf, range=[0x0000000100000000-0x000000021e600000) (4582MB) [ 0.000000] EFI: mem56: type=11, attr=0x8000000000000001, range=[0x00000000f80f8000-0x00000000f80f9000) (0MB) [ 0.000000] EFI: mem57: type=11, attr=0x8000000000000001, range=[0x00000000fed1c000-0x00000000fed20000) (0MB) [ 0.000000] ACPI: UEFI 00000000dabde000 0003E (v01 LENOVO TP-8D 00001280 PTL 00000002) [ 0.000000] ACPI: UEFI 00000000dabdd000 00042 (v01 PTL COMBUF 00000001 PTL 00000001) [ 0.000000] ACPI: UEFI 00000000dabdc000 00292 (v01 LENOVO TP-8D 00001280 PTL 00000002) [ 0.795807] fb0: EFI VGA frame buffer device [ 1.057243] EFI Variables Facility v0.08 2004-May-17 [ 9.122104] fb: conflicting fb hw usage inteldrmfb vs EFI VGA - removing generic driver ReadEFI: /dev/sda , N 128 , 0 , , PRStart 1024 , PRSize 128 WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util fdisk doesn't support GPT. Use GNU Parted. =================== PARTITIONS & DISKS: sda4 : sda, not-sepboot, grubenv-ok grub2, grub-efi, update-grub, 64, with-boot, is-os, gpt-but-not-EFI, fstab-has-bad-efi, no-nt, no-winload, no-recov-nor-hid, no-bmgr, no-grldr, no-b-bcd, apt-get, grub-install, . sda3 : sda, maybesepboot, no-grubenv nogrub, no-docgrub, no-update-grub, 32, no-boot, no-os, gpt-but-not-EFI, part-has-no-fstab, no-nt, haswinload, no-recov-nor-hid, no-bmgr, no-grldr, no-b-bcd, nopakmgr, nogrubinstall, /mnt/boot-sav/sda3. sda1 : sda, maybesepboot, no-grubenv nogrub, no-docgrub, no-update-grub, 32, no-boot, no-os, is-correct-EFI, part-has-no-fstab, no-nt, no-winload, no-recov-nor-hid, no-bmgr, no-grldr, no-b-bcd, nopakmgr, nogrubinstall, /boot/efi. sda5 : sda, maybesepboot, no-grubenv nogrub, no-docgrub, no-update-grub, 32, no-boot, no-os, gpt-but-not-EFI, part-has-no-fstab, no-nt, no-winload, no-recov-nor-hid, no-bmgr, no-grldr, no-b-bcd, nopakmgr, nogrubinstall, /home. sda : GPT-BIS, GPT, no-BIOS_boot, has-correctEFI, 2048 sectors * 512 bytes =================== PARTED: Model: ATA HITACHI HTS72323 (scsi) Disk /dev/sda: 320GB Sector size (logical/physical): 512B/512B Partition Table: gpt Number Start End Size File system Name Flags 1 1049kB 106MB 105MB fat32 EFI system partition boot 2 106MB 240MB 134MB Microsoft reserved partition msftres 3 240MB 87.2GB 87.0GB ntfs Basic data partition 4 87.2GB 169GB 81.9GB ext4 5 169GB 316GB 147GB ext4 6 316GB 320GB 4096MB linux-swap(v1) =================== MOUNT: /dev/sda4 on / type ext4 (rw,errors=remount-ro) proc on /proc type proc (rw,noexec,nosuid,nodev) sysfs on /sys type sysfs (rw,noexec,nosuid,nodev) none on /sys/fs/fuse/connections type fusectl (rw) none on /sys/kernel/debug type debugfs (rw) none on /sys/kernel/security type securityfs (rw) udev on /dev type devtmpfs (rw,mode=0755) devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620) tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755) none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880) none on /run/shm type tmpfs (rw,nosuid,nodev) /dev/sda1 on /boot/efi type vfat (rw) /dev/sda5 on /home type ext4 (rw) gvfs-fuse-daemon on /home/vierlex/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev,user=vierlex) /dev/sda3 on /mnt/boot-sav/sda3 type fuseblk (rw,nosuid,nodev,allow_other,blksize=4096) /sys/block/sda: alignment_offset bdi capability dev device discard_alignment events events_async events_poll_msecs ext_range holders inflight power queue range removable ro sda1 sda2 sda3 sda4 sda5 sda6 size slaves stat subsystem trace uevent /dev: agpgart autofs block bsg btrfs-control bus char console core cpu cpu_dma_latency disk dri ecryptfs fb0 fd full fuse hpet input kmsg log mapper mcelog mei mem net network_latency network_throughput null oldmem port ppp psaux ptmx pts random rfkill rtc rtc0 sda sda1 sda2 sda3 sda4 sda5 sda6 sg0 shm snapshot snd stderr stdin stdout tpm0 uinput urandom usbmon0 usbmon1 usbmon2 v4l vga_arbiter video0 watchdog zero /dev/mapper: control /boot/efi: EFI /boot/efi/EFI: Boot Microsoft ubuntu /boot/efi/efi: Boot Microsoft ubuntu /boot/efi/efi/Boot: bootx64.efi /boot/efi/efi/ubuntu: grubx64.efi WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util fdisk doesn't support GPT. Use GNU Parted. =================== DF: Filesystem Type Size Used Avail Use% Mounted on /dev/sda4 ext4 77G 4.1G 69G 6% / udev devtmpfs 3.9G 12K 3.9G 1% /dev tmpfs tmpfs 1.6G 864K 1.6G 1% /run none tmpfs 5.0M 0 5.0M 0% /run/lock none tmpfs 3.9G 152K 3.9G 1% /run/shm /dev/sda1 vfat 96M 18M 79M 19% /boot/efi /dev/sda5 ext4 137G 2.2G 128G 2% /home /dev/sda3 fuseblk 81G 30G 52G 37% /mnt/boot-sav/sda3 =================== FDISK: Disk /dev/sda: 320.1 GB, 320072933376 bytes 255 heads, 63 sectors/track, 38913 cylinders, total 625142448 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0xf34fe538 Device Boot Start End Blocks Id System /dev/sda1 1 625142447 312571223+ ee GPT =================== Before mainwindow FSCK no PASTEBIN yes WUBI no WINBOOT yes recommendedrepair, purge, QTY_OF_PART_FOR_REINSTAL 1 no-kernel-purge UNHIDEBOOT_ACTION yes (10s), noflag () PART_TO_REINSTALL_GRUB sda4, FORCE_GRUB no (sda) REMOVABLEDISK no USE_SEPARATEBOOTPART no (sda3) grub2 () UNCOMMENT_GFXMODE no ATA ADD_KERNEL_OPTION no (acpi=off) MBR_TO_RESTORE ( ) EFI detected. Please check the options. =================== Actions FSCK no PASTEBIN yes WUBI no WINBOOT no bootinfo, nombraction, QTY_OF_PART_FOR_REINSTAL 1 no-kernel-purge UNHIDEBOOT_ACTION no (10s), noflag () PART_TO_REINSTALL_GRUB sda4, FORCE_GRUB no (sda) REMOVABLEDISK no USE_SEPARATEBOOTPART no (sda3) grub2 () UNCOMMENT_GFXMODE no ATA ADD_KERNEL_OPTION no (acpi=off) MBR_TO_RESTORE ( ) No change has been performed on your computer. See you soon! internet: connected Thanks for your time and attention. EDIT: additional Info Request =No boot loader is installed in the MBR of /dev/sda. But maybe this is how it is supposed to work? yea this is ok. boot stuff seems to be on a seperate partition, in my case sda1. I'm very new to this UEFI thing too. missing files like bootmgr i don't really have a clue :D but yea, maybe thats how it suppose to be? Instead and whats not shown in the log for some reason: There is additional microsoft bootfiles on sda1 under /efi/microsoft/ [much stuff] I remember also doing some kind of hack to make a UEFI windows 7 usb stick. http://jake.io/b/2011/installing-windows-7-with-uefi-boot-on-an-x220-from-usb/ In short: creating and placing bootx64.efi on the stick so it can be booted in UEFI mode. boot order i decide that in my BIOS. i read somwhere that the thinkpad x220 (essential part of the serial number: 4921 http://www.lenovo.com/shop/americas/content/user_guides/x220_x220i_x220tablet_x220itablet_ug_en.pdf) doesnt really have UEFI interface or something, still, these 2 options are listed with all the other usual devices you can give a boot priority to. Right now it looks like this: Boot Priority Order 1. ubuntu 2. Windows Boot Manager 3. USB FDD 4. USB HDD 5. ATA HDD0 HITACHI [random string]

Read the article
Improving Partitioned Table Join Performance

- by Paul White

The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing. Test Data The two tables in this example use a common partitioning partition scheme. The partition function uses 41 equal-size partitions: CREATE PARTITION FUNCTION PFT (integer) AS RANGE RIGHT FOR VALUES ( 125000, 250000, 375000, 500000, 625000, 750000, 875000, 1000000, 1125000, 1250000, 1375000, 1500000, 1625000, 1750000, 1875000, 2000000, 2125000, 2250000, 2375000, 2500000, 2625000, 2750000, 2875000, 3000000, 3125000, 3250000, 3375000, 3500000, 3625000, 3750000, 3875000, 4000000, 4125000, 4250000, 4375000, 4500000, 4625000, 4750000, 4875000, 5000000 ); GO CREATE PARTITION SCHEME PST AS PARTITION PFT ALL TO ([PRIMARY]); There two tables are: CREATE TABLE dbo.T1 ( TID integer NOT NULL IDENTITY(0,1), Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T1 PRIMARY KEY CLUSTERED (TID) ON PST (TID) ); CREATE TABLE dbo.T2 ( TID integer NOT NULL, Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T2 PRIMARY KEY CLUSTERED (TID, Column1) ON PST (TID) ); The next script loads 5 million rows into T1 with a pseudo-random value between 1 and 5 for Column1. The table is partitioned on the IDENTITY column TID: INSERT dbo.T1 WITH (TABLOCKX) (Column1) SELECT (ABS(CHECKSUM(NEWID())) % 5) + 1 FROM dbo.Numbers AS N WHERE n BETWEEN 1 AND 5000000; In case you don’t already have an auxiliary table of numbers lying around, here’s a script to create one with 10 million rows: CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY); WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5) INSERT dbo.Numbers WITH (TABLOCKX) SELECT TOP (10000000) n FROM Nums ORDER BY n OPTION (MAXDOP 1); Table T1 contains data like this: Next we load data into table T2. The relationship between the two tables is that table 2 contains ‘n’ rows for each row in table 1, where ‘n’ is determined by the value in Column1 of table T1. There is nothing particularly special about the data or distribution, by the way. INSERT dbo.T2 WITH (TABLOCKX) (TID, Column1) SELECT T.TID, N.n FROM dbo.T1 AS T JOIN dbo.Numbers AS N ON N.n >= 1 AND N.n <= T.Column1; Table T2 ends up containing about 15 million rows: The primary key for table T2 is a combination of TID and Column1. The data is partitioned according to the value in column TID alone. Partition Distribution The following query shows the number of rows in each partition of table T1: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T1 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are 40 partitions containing 125,000 rows (40 * 125k = 5m rows). The rightmost partition remains empty. The next query shows the distribution for table 2: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T2 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are roughly 375,000 rows in each partition (the rightmost partition is also empty): Ok, that’s the test data done. Test Query and Execution Plan The task is to count the rows resulting from joining tables 1 and 2 on the TID column: SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; The optimizer chooses a plan using parallel hash join, and partial aggregation: The Plan Explorer plan tree view shows accurate cardinality estimates and an even distribution of rows across threads (click to enlarge the image): With a warm data cache, the STATISTICS IO output shows that no physical I/O was needed, and all 41 partitions were touched: Running the query without actual execution plan or STATISTICS IO information for maximum performance, the query returns in around 2600ms. Execution Plan Analysis The first step toward improving on the execution plan produced by the query optimizer is to understand how it works, at least in outline. The two parallel Clustered Index Scans use multiple threads to read rows from tables T1 and T2. Parallel scan uses a demand-based scheme where threads are given page(s) to scan from the table as needed. This arrangement has certain important advantages, but does result in an unpredictable distribution of rows amongst threads. The point is that multiple threads cooperate to scan the whole table, but it is impossible to predict which rows end up on which threads. For correct results from the parallel hash join, the execution plan has to ensure that rows from T1 and T2 that might join are processed on the same thread. For example, if a row from T1 with join key value ‘1234’ is placed in thread 5’s hash table, the execution plan must guarantee that any rows from T2 that also have join key value ‘1234’ probe thread 5’s hash table for matches. The way this guarantee is enforced in this parallel hash join plan is by repartitioning rows to threads after each parallel scan. The two repartitioning exchanges route rows to threads using a hash function over the hash join keys. The two repartitioning exchanges use the same hash function so rows from T1 and T2 with the same join key must end up on the same hash join thread. Expensive Exchanges This business of repartitioning rows between threads can be very expensive, especially if a large number of rows is involved. The execution plan selected by the optimizer moves 5 million rows through one repartitioning exchange and around 15 million across the other. As a first step toward removing these exchanges, consider the execution plan selected by the optimizer if we join just one partition from each table, disallowing parallelism: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = 1 AND $PARTITION.PFT(T2.TID) = 1 OPTION (MAXDOP 1); The optimizer has chosen a (one-to-many) merge join instead of a hash join. The single-partition query completes in around 100ms. If everything scaled linearly, we would expect that extending this strategy to all 40 populated partitions would result in an execution time around 4000ms. Using parallelism could reduce that further, perhaps to be competitive with the parallel hash join chosen by the optimizer. This raises a question. If the most efficient way to join one partition from each of the tables is to use a merge join, why does the optimizer not choose a merge join for the full query? Forcing a Merge Join Let’s force the optimizer to use a merge join on the test query using a hint: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN); This is the execution plan selected by the optimizer: This plan results in the same number of logical reads reported previously, but instead of 2600ms the query takes 5000ms. The natural explanation for this drop in performance is that the merge join plan is only using a single thread, whereas the parallel hash join plan could use multiple threads. Parallel Merge Join We can get a parallel merge join plan using the same query hint as before, and adding trace flag 8649: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN, QUERYTRACEON 8649); The execution plan is: This looks promising. It uses a similar strategy to distribute work across threads as seen for the parallel hash join. In practice though, performance is disappointing. On a typical run, the parallel merge plan runs for around 8400ms; slower than the single-threaded merge join plan (5000ms) and much worse than the 2600ms for the parallel hash join. We seem to be going backwards! The logical reads for the parallel merge are still exactly the same as before, with no physical IOs. The cardinality estimates and thread distribution are also still very good (click to enlarge): A big clue to the reason for the poor performance is shown in the wait statistics (captured by Plan Explorer Pro): CXPACKET waits require careful interpretation, and are most often benign, but in this case excessive waiting occurs at the repartitioning exchanges. Unlike the parallel hash join, the repartitioning exchanges in this plan are order-preserving ‘merging’ exchanges (because merge join requires ordered inputs): Parallelism works best when threads can just grab any available unit of work and get on with processing it. Preserving order introduces inter-thread dependencies that can easily lead to significant waits occurring. In extreme cases, these dependencies can result in an intra-query deadlock, though the details of that will have to wait for another time to explore in detail. The potential for waits and deadlocks leads the query optimizer to cost parallel merge join relatively highly, especially as the degree of parallelism (DOP) increases. This high costing resulted in the optimizer choosing a serial merge join rather than parallel in this case. The test results certainly confirm its reasoning. Collocated Joins In SQL Server 2008 and later, the optimizer has another available strategy when joining tables that share a common partition scheme. This strategy is a collocated join, also known as as a per-partition join. It can be applied in both serial and parallel execution plans, though it is limited to 2-way joins in the current optimizer. Whether the optimizer chooses a collocated join or not depends on cost estimation. The primary benefits of a collocated join are that it eliminates an exchange and requires less memory, as we will see next. Costing and Plan Selection The query optimizer did consider a collocated join for our original query, but it was rejected on cost grounds. The parallel hash join with repartitioning exchanges appeared to be a cheaper option. There is no query hint to force a collocated join, so we have to mess with the costing framework to produce one for our test query. Pretending that IOs cost 50 times more than usual is enough to convince the optimizer to use collocated join with our test query: -- Pretend IOs are 50x cost temporarily DBCC SETIOWEIGHT(50); -- Co-located hash join SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (RECOMPILE); -- Reset IO costing DBCC SETIOWEIGHT(1); Collocated Join Plan The estimated execution plan for the collocated join is: The Constant Scan contains one row for each partition of the shared partitioning scheme, from 1 to 41. The hash repartitioning exchanges seen previously are replaced by a single Distribute Streams exchange using Demand partitioning. Demand partitioning means that the next partition id is given to the next parallel thread that asks for one. My test machine has eight logical processors, and all are available for SQL Server to use. As a result, there are eight threads in the single parallel branch in this plan, each processing one partition from each table at a time. Once a thread finishes processing a partition, it grabs a new partition number from the Distribute Streams exchange…and so on until all partitions have been processed. It is important to understand that the parallel scans in this plan are different from the parallel hash join plan. Although the scans have the same parallelism icon, tables T1 and T2 are not being co-operatively scanned by multiple threads in the same way. Each thread reads a single partition of T1 and performs a hash match join with the same partition from table T2. The properties of the two Clustered Index Scans show a Seek Predicate (unusual for a scan!) limiting the rows to a single partition: The crucial point is that the join between T1 and T2 is on TID, and TID is the partitioning column for both tables. A thread that processes partition ‘n’ is guaranteed to see all rows that can possibly join on TID for that partition. In addition, no other thread will see rows from that partition, so this removes the need for repartitioning exchanges. CPU and Memory Efficiency Improvements The collocated join has removed two expensive repartitioning exchanges and added a single exchange processing 41 rows (one for each partition id). Remember, the parallel hash join plan exchanges had to process 5 million and 15 million rows. The amount of processor time spent on exchanges will be much lower in the collocated join plan. In addition, the collocated join plan has a maximum of 8 threads processing single partitions at any one time. The 41 partitions will all be processed eventually, but a new partition is not started until a thread asks for it. Threads can reuse hash table memory for the new partition. The parallel hash join plan also had 8 hash tables, but with all 5,000,000 build rows loaded at the same time. The collocated plan needs memory for only 8 * 125,000 = 1,000,000 rows at any one time. Collocated Hash Join Performance The collated join plan has disappointing performance in this case. The query runs for around 25,300ms despite the same IO statistics as usual. This is much the worst result so far, so what went wrong? It turns out that cardinality estimation for the single partition scans of table T1 is slightly low. The properties of the Clustered Index Scan of T1 (graphic immediately above) show the estimation was for 121,951 rows. This is a small shortfall compared with the 125,000 rows actually encountered, but it was enough to cause the hash join to spill to physical tempdb: A level 1 spill doesn’t sound too bad, until you realize that the spill to tempdb probably occurs for each of the 41 partitions. As a side note, the cardinality estimation error is a little surprising because the system tables accurately show there are 125,000 rows in every partition of T1. Unfortunately, the optimizer uses regular column and index statistics to derive cardinality estimates here rather than system table information (e.g. sys.partitions). Collocated Merge Join We will never know how well the collocated parallel hash join plan might have worked without the cardinality estimation error (and the resulting 41 spills to tempdb) but we do know: Merge join does not require a memory grant; and Merge join was the optimizer’s preferred join option for a single partition join Putting this all together, what we would really like to see is the same collocated join strategy, but using merge join instead of hash join. Unfortunately, the current query optimizer cannot produce a collocated merge join; it only knows how to do collocated hash join. So where does this leave us? CROSS APPLY sys.partitions We can try to write our own collocated join query. We can use sys.partitions to find the partition numbers, and CROSS APPLY to get a count per partition, with a final step to sum the partial counts. The following query implements this idea: SELECT row_count = SUM(Subtotals.cnt) FROM ( -- Partition numbers SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1 ) AS P CROSS APPLY ( -- Count per collocated join SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; The estimated plan is: The cardinality estimates aren’t all that good here, especially the estimate for the scan of the system table underlying the sys.partitions view. Nevertheless, the plan shape is heading toward where we would like to be. Each partition number from the system table results in a per-partition scan of T1 and T2, a one-to-many Merge Join, and a Stream Aggregate to compute the partial counts. The final Stream Aggregate just sums the partial counts. Execution time for this query is around 3,500ms, with the same IO statistics as always. This compares favourably with 5,000ms for the serial plan produced by the optimizer with the OPTION (MERGE JOIN) hint. This is another case of the sum of the parts being less than the whole – summing 41 partial counts from 41 single-partition merge joins is faster than a single merge join and count over all partitions. Even so, this single-threaded collocated merge join is not as quick as the original parallel hash join plan, which executed in 2,600ms. On the positive side, our collocated merge join uses only one logical processor and requires no memory grant. The parallel hash join plan used 16 threads and reserved 569 MB of memory: Using a Temporary Table Our collocated merge join plan should benefit from parallelism. The reason parallelism is not being used is that the query references a system table. We can work around that by writing the partition numbers to a temporary table (or table variable): SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); CREATE TABLE #P ( partition_number integer PRIMARY KEY); INSERT #P (partition_number) SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1; SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; DROP TABLE #P; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; Using the temporary table adds a few logical reads, but the overall execution time is still around 3500ms, indistinguishable from the same query without the temporary table. The problem is that the query optimizer still doesn’t choose a parallel plan for this query, though the removal of the system table reference means that it could if it chose to: In fact the optimizer did enter the parallel plan phase of query optimization (running search 1 for a second time): Unfortunately, the parallel plan found seemed to be more expensive than the serial plan. This is a crazy result, caused by the optimizer’s cost model not reducing operator CPU costs on the inner side of a nested loops join. Don’t get me started on that, we’ll be here all night. In this plan, everything expensive happens on the inner side of a nested loops join. Without a CPU cost reduction to compensate for the added cost of exchange operators, candidate parallel plans always look more expensive to the optimizer than the equivalent serial plan. Parallel Collocated Merge Join We can produce the desired parallel plan using trace flag 8649 again: SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: One difference between this plan and the collocated hash join plan is that a Repartition Streams exchange operator is used instead of Distribute Streams. The effect is similar, though not quite identical. The Repartition uses round-robin partitioning, meaning the next partition id is pushed to the next thread in sequence. The Distribute Streams exchange seen earlier used Demand partitioning, meaning the next partition id is pulled across the exchange by the next thread that is ready for more work. There are subtle performance implications for each partitioning option, but going into that would again take us too far off the main point of this post. Performance The important thing is the performance of this parallel collocated merge join – just 1350ms on a typical run. The list below shows all the alternatives from this post (all timings include creation, population, and deletion of the temporary table where appropriate) from quickest to slowest: Collocated parallel merge join: 1350ms Parallel hash join: 2600ms Collocated serial merge join: 3500ms Serial merge join: 5000ms Parallel merge join: 8400ms Collated parallel hash join: 25,300ms (hash spill per partition) The parallel collocated merge join requires no memory grant (aside from a paltry 1.2MB used for exchange buffers). This plan uses 16 threads at DOP 8; but 8 of those are (rather pointlessly) allocated to the parallel scan of the temporary table. These are minor concerns, but it turns out there is a way to address them if it bothers you. Parallel Collocated Merge Join with Demand Partitioning This final tweak replaces the temporary table with a hard-coded list of partition ids (dynamic SQL could be used to generate this query from sys.partitions): SELECT row_count = SUM(Subtotals.cnt) FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12),(13),(14),(15),(16),(17),(18),(19),(20), (21),(22),(23),(24),(25),(26),(27),(28),(29),(30), (31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41) ) AS P (partition_number) CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: The parallel collocated hash join plan is reproduced below for comparison: The manual rewrite has another advantage that has not been mentioned so far: the partial counts (per partition) can be computed earlier than the partial counts (per thread) in the optimizer’s collocated join plan. The earlier aggregation is performed by the extra Stream Aggregate under the nested loops join. The performance of the parallel collocated merge join is unchanged at around 1350ms. Final Words It is a shame that the current query optimizer does not consider a collocated merge join (Connect item closed as Won’t Fix). The example used in this post showed an improvement in execution time from 2600ms to 1350ms using a modestly-sized data set and limited parallelism. In addition, the memory requirement for the query was almost completely eliminated – down from 569MB to 1.2MB. The problem with the parallel hash join selected by the optimizer is that it attempts to process the full data set all at once (albeit using eight threads). It requires a large memory grant to hold all 5 million rows from table T1 across the eight hash tables, and does not take advantage of the divide-and-conquer opportunity offered by the common partitioning. The great thing about the collocated join strategies is that each parallel thread works on a single partition from both tables, reading rows, performing the join, and computing a per-partition subtotal, before moving on to a new partition. From a thread’s point of view… If you have trouble visualizing what is happening from just looking at the parallel collocated merge join execution plan, let’s look at it again, but from the point of view of just one thread operating between the two Parallelism (exchange) operators. Our thread picks up a single partition id from the Distribute Streams exchange, and starts a merge join using ordered rows from partition 1 of table T1 and partition 1 of table T2. By definition, this is all happening on a single thread. As rows join, they are added to a (per-partition) count in the Stream Aggregate immediately above the Merge Join. Eventually, either T1 (partition 1) or T2 (partition 1) runs out of rows and the merge join stops. The per-partition count from the aggregate passes on through the Nested Loops join to another Stream Aggregate, which is maintaining a per-thread subtotal. Our same thread now picks up a new partition id from the exchange (say it gets id 9 this time). The count in the per-partition aggregate is reset to zero, and the processing of partition 9 of both tables proceeds just as it did for partition 1, and on the same thread. Each thread picks up a single partition id and processes all the data for that partition, completely independently from other threads working on other partitions. One thread might eventually process partitions (1, 9, 17, 25, 33, 41) while another is concurrently processing partitions (2, 10, 18, 26, 34) and so on for the other six threads at DOP 8. The point is that all 8 threads can execute independently and concurrently, continuing to process new partitions until the wider job (of which the thread has no knowledge!) is done. This divide-and-conquer technique can be much more efficient than simply splitting the entire workload across eight threads all at once. Related Reading Understanding and Using Parallelism in SQL Server Parallel Execution Plans Suck © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

Read the article
Toorcon 15 (2013)

- by danx

The Toorcon gang (senior staff): h1kari (founder), nfiltr8, and Geo Introduction to Toorcon 15 (2013) A Tale of One Software Bypass of MS Windows 8 Secure Boot Breaching SSL, One Byte at a Time Running at 99%: Surviving an Application DoS Security Response in the Age of Mass Customized Attacks x86 Rewriting: Defeating RoP and other Shinanighans Clowntown Express: interesting bugs and running a bug bounty program Active Fingerprinting of Encrypted VPNs Making Attacks Go Backwards Mask Your Checksums—The Gorry Details Adventures with weird machines thirty years after "Reflections on Trusting Trust" Introduction to Toorcon 15 (2013) Toorcon 15 is the 15th annual security conference held in San Diego. I've attended about a third of them and blogged about previous conferences I attended here starting in 2003. As always, I've only summarized the talks I attended and interested me enough to write about them. Be aware that I may have misrepresented the speaker's remarks and that they are not my remarks or opinion, or those of my employer, so don't quote me or them. Those seeking further details may contact the speakers directly or use The Google. For some talks, I have a URL for further information. A Tale of One Software Bypass of MS Windows 8 Secure Boot Andrew Furtak and Oleksandr Bazhaniuk Yuri Bulygin, Oleksandr ("Alex") Bazhaniuk, and (not present) Andrew Furtak Yuri and Alex talked about UEFI and Bootkits and bypassing MS Windows 8 Secure Boot, with vendor recommendations. They previously gave this talk at the BlackHat 2013 conference. MS Windows 8 Secure Boot Overview UEFI (Unified Extensible Firmware Interface) is interface between hardware and OS. UEFI is processor and architecture independent. Malware can replace bootloader (bootx64.efi, bootmgfw.efi). Once replaced can modify kernel. Trivial to replace bootloader. Today many legacy bootkits—UEFI replaces them most of them. MS Windows 8 Secure Boot verifies everything you load, either through signatures or hashes. UEFI firmware relies on secure update (with signed update). You would think Secure Boot would rely on ROM (such as used for phones0, but you can't do that for PCs—PCs use writable memory with signatures DXE core verifies the UEFI boat loader(s) OS Loader (winload.efi, winresume.efi) verifies the OS kernel A chain of trust is established with a root key (Platform Key, PK), which is a cert belonging to the platform vendor. Key Exchange Keys (KEKs) verify an "authorized" database (db), and "forbidden" database (dbx). X.509 certs with SHA-1/SHA-256 hashes. Keys are stored in non-volatile (NV) flash-based NVRAM. Boot Services (BS) allow adding/deleting keys (can't be accessed once OS starts—which uses Run-Time (RT)). Root cert uses RSA-2048 public keys and PKCS#7 format signatures. SecureBoot — enable disable image signature checks SetupMode — update keys, self-signed keys, and secure boot variables CustomMode — allows updating keys Secure Boot policy settings are: always execute, never execute, allow execute on security violation, defer execute on security violation, deny execute on security violation, query user on security violation Attacking MS Windows 8 Secure Boot Secure Boot does NOT protect from physical access. Can disable from console. Each BIOS vendor implements Secure Boot differently. There are several platform and BIOS vendors. It becomes a "zoo" of implementations—which can be taken advantage of. Secure Boot is secure only when all vendors implement it correctly. Allow only UEFI firmware signed updates protect UEFI firmware from direct modification in flash memory protect FW update components program SPI controller securely protect secure boot policy settings in nvram protect runtime api disable compatibility support module which allows unsigned legacy Can corrupt the Platform Key (PK) EFI root certificate variable in SPI flash. If PK is not found, FW enters setup mode wich secure boot turned off. Can also exploit TPM in a similar manner. One is not supposed to be able to directly modify the PK in SPI flash from the OS though. But they found a bug that they can exploit from User Mode (undisclosed) and demoed the exploit. It loaded and ran their own bootkit. The exploit requires a reboot. Multiple vendors are vulnerable. They will disclose this exploit to vendors in the future. Recommendations: allow only signed updates protect UEFI fw in ROM protect EFI variable store in ROM Breaching SSL, One Byte at a Time Yoel Gluck and Angelo Prado Angelo Prado and Yoel Gluck, Salesforce.com CRIME is software that performs a "compression oracle attack." This is possible because the SSL protocol doesn't hide length, and because SSL compresses the header. CRIME requests with every possible character and measures the ciphertext length. Look for the plaintext which compresses the most and looks for the cookie one byte-at-a-time. SSL Compression uses LZ77 to reduce redundancy. Huffman coding replaces common byte sequences with shorter codes. US CERT thinks the SSL compression problem is fixed, but it isn't. They convinced CERT that it wasn't fixed and they issued a CVE. BREACH, breachattrack.com BREACH exploits the SSL response body (Accept-Encoding response, Content-Encoding). It takes advantage of the fact that the response is not compressed. BREACH uses gzip and needs fairly "stable" pages that are static for ~30 seconds. It needs attacker-supplied content (say from a web form or added to a URL parameter). BREACH listens to a session's requests and responses, then inserts extra requests and responses. Eventually, BREACH guesses a session's secret key. Can use compression to guess contents one byte at-a-time. For example, "Supersecret SupersecreX" (a wrong guess) compresses 10 bytes, and "Supersecret Supersecret" (a correct guess) compresses 11 bytes, so it can find each character by guessing every character. To start the guess, BREACH needs at least three known initial characters in the response sequence. Compression length then "leaks" information. Some roadblocks include no winners (all guesses wrong) or too many winners (multiple possibilities that compress the same). The solutions include: lookahead (guess 2 or 3 characters at-a-time instead of 1 character). Expensive rollback to last known conflict check compression ratio can brute-force first 3 "bootstrap" characters, if needed (expensive) block ciphers hide exact plain text length. Solution is to align response in advance to block size Mitigations length: use variable padding secrets: dynamic CSRF tokens per request secret: change over time separate secret to input-less servlets Future work eiter understand DEFLATE/GZIP HTTPS extensions Running at 99%: Surviving an Application DoS Ryan Huber Ryan Huber, Risk I/O Ryan first discussed various ways to do a denial of service (DoS) attack against web services. One usual method is to find a slow web page and do several wgets. Or download large files. Apache is not well suited at handling a large number of connections, but one can put something in front of it Can use Apache alternatives, such as nginx How to identify malicious hosts short, sudden web requests user-agent is obvious (curl, python) same url requested repeatedly no web page referer (not normal) hidden links. hide a link and see if a bot gets it restricted access if not your geo IP (unless the website is global) missing common headers in request regular timing first seen IP at beginning of attack count requests per hosts (usually a very large number) Use of captcha can mitigate attacks, but you'll lose a lot of genuine users. Bouncer, goo.gl/c2vyEc and www.github.com/rawdigits/Bouncer Bouncer is software written by Ryan in netflow. Bouncer has a small, unobtrusive footprint and detects DoS attempts. It closes blacklisted sockets immediately (not nice about it, no proper close connection). Aggregator collects requests and controls your web proxies. Need NTP on the front end web servers for clean data for use by bouncer. Bouncer is also useful for a popularity storm ("Slashdotting") and scraper storms. Future features: gzip collection data, documentation, consumer library, multitask, logging destroyed connections. Takeaways: DoS mitigation is easier with a complete picture Bouncer designed to make it easier to detect and defend DoS—not a complete cure Security Response in the Age of Mass Customized Attacks Peleus Uhley and Karthik Raman Peleus Uhley and Karthik Raman, Adobe ASSET, blogs.adobe.com/asset/ Peleus and Karthik talked about response to mass-customized exploits. Attackers behave much like a business. "Mass customization" refers to concept discussed in the book Future Perfect by Stan Davis of Harvard Business School. Mass customization is differentiating a product for an individual customer, but at a mass production price. For example, the same individual with a debit card receives basically the same customized ATM experience around the world. Or designing your own PC from commodity parts. Exploit kits are another example of mass customization. The kits support multiple browsers and plugins, allows new modules. Exploit kits are cheap and customizable. Organized gangs use exploit kits. A group at Berkeley looked at 77,000 malicious websites (Grier et al., "Manufacturing Compromise: The Emergence of Exploit-as-a-Service", 2012). They found 10,000 distinct binaries among them, but derived from only a dozen or so exploit kits. Characteristics of Mass Malware: potent, resilient, relatively low cost Technical characteristics: multiple OS, multipe payloads, multiple scenarios, multiple languages, obfuscation Response time for 0-day exploits has gone down from ~40 days 5 years ago to about ~10 days now. So the drive with malware is towards mass customized exploits, to avoid detection There's plenty of evicence that exploit development has Project Manager bureaucracy. They infer from the malware edicts to: support all versions of reader support all versions of windows support all versions of flash support all browsers write large complex, difficult to main code (8750 lines of JavaScript for example Exploits have "loose coupling" of multipe versions of software (adobe), OS, and browser. This allows specific attacks against specific versions of multiple pieces of software. Also allows exploits of more obscure software/OS/browsers and obscure versions. Gave examples of exploits that exploited 2, 3, 6, or 14 separate bugs. However, these complete exploits are more likely to be buggy or fragile in themselves and easier to defeat. Future research includes normalizing malware and Javascript. Conclusion: The coming trend is that mass-malware with mass zero-day attacks will result in mass customization of attacks. x86 Rewriting: Defeating RoP and other Shinanighans Richard Wartell Richard Wartell The attack vector we are addressing here is: First some malware causes a buffer overflow. The malware has no program access, but input access and buffer overflow code onto stack Later the stack became non-executable. The workaround malware used was to write a bogus return address to the stack jumping to malware Later came ASLR (Address Space Layout Randomization) to randomize memory layout and make addresses non-deterministic. The workaround malware used was to jump t existing code segments in the program that can be used in bad ways "RoP" is Return-oriented Programming attacks. RoP attacks use your own code and write return address on stack to (existing) expoitable code found in program ("gadgets"). Pinkie Pie was paid $60K last year for a RoP attack. One solution is using anti-RoP compilers that compile source code with NO return instructions. ASLR does not randomize address space, just "gadgets". IPR/ILR ("Instruction Location Randomization") randomizes each instruction with a virtual machine. Richard's goal was to randomize a binary with no source code access. He created "STIR" (Self-Transofrming Instruction Relocation). STIR disassembles binary and operates on "basic blocks" of code. The STIR disassembler is conservative in what to disassemble. Each basic block is moved to a random location in memory. Next, STIR writes new code sections with copies of "basic blocks" of code in randomized locations. The old code is copied and rewritten with jumps to new code. the original code sections in the file is marked non-executible. STIR has better entropy than ASLR in location of code. Makes brute force attacks much harder. STIR runs on MS Windows (PEM) and Linux (ELF). It eliminated 99.96% or more "gadgets" (i.e., moved the address). Overhead usually 5-10% on MS Windows, about 1.5-4% on Linux (but some code actually runs faster!). The unique thing about STIR is it requires no source access and the modified binary fully works! Current work is to rewrite code to enforce security policies. For example, don't create a *.{exe,msi,bat} file. Or don't connect to the network after reading from the disk. Clowntown Express: interesting bugs and running a bug bounty program Collin Greene Collin Greene, Facebook Collin talked about Facebook's bug bounty program. Background at FB: FB has good security frameworks, such as security teams, external audits, and cc'ing on diffs. But there's lots of "deep, dark, forgotten" parts of legacy FB code. Collin gave several examples of bountied bugs. Some bounty submissions were on software purchased from a third-party (but bounty claimers don't know and don't care). We use security questions, as does everyone else, but they are basically insecure (often easily discoverable). Collin didn't expect many bugs from the bounty program, but they ended getting 20+ good bugs in first 24 hours and good submissions continue to come in. Bug bounties bring people in with different perspectives, and are paid only for success. Bug bounty is a better use of a fixed amount of time and money versus just code review or static code analysis. The Bounty program started July 2011 and paid out $1.5 million to date. 14% of the submissions have been high priority problems that needed to be fixed immediately. The best bugs come from a small % of submitters (as with everything else)—the top paid submitters are paid 6 figures a year. Spammers like to backstab competitors. The youngest sumitter was 13. Some submitters have been hired. Bug bounties also allows to see bugs that were missed by tools or reviews, allowing improvement in the process. Bug bounties might not work for traditional software companies where the product has release cycle or is not on Internet. Active Fingerprinting of Encrypted VPNs Anna Shubina Anna Shubina, Dartmouth Institute for Security, Technology, and Society (I missed the start of her talk because another track went overtime. But I have the DVD of the talk, so I'll expand later) IPsec leaves fingerprints. Using netcat, one can easily visually distinguish various crypto chaining modes just from packet timing on a chart (example, DES-CBC versus AES-CBC) One can tell a lot about VPNs just from ping roundtrips (such as what router is used) Delayed packets are not informative about a network, especially if far away from the network More needed to explore about how TCP works in real life with respect to timing Making Attacks Go Backwards Fuzzynop FuzzyNop, Mandiant This talk is not about threat attribution (finding who), product solutions, politics, or sales pitches. But who are making these malware threats? It's not a single person or group—they have diverse skill levels. There's a lot of fat-fingered fumblers out there. Always look for low-hanging fruit first: "hiding" malware in the temp, recycle, or root directories creation of unnamed scheduled tasks obvious names of files and syscalls ("ClearEventLog") uncleared event logs. Clearing event log in itself, and time of clearing, is a red flag and good first clue to look for on a suspect system Reverse engineering is hard. Disassembler use takes practice and skill. A popular tool is IDA Pro, but it takes multiple interactive iterations to get a clean disassembly. Key loggers are used a lot in targeted attacks. They are typically custom code or built in a backdoor. A big tip-off is that non-printable characters need to be printed out (such as "[Ctrl]" "[RightShift]") or time stamp printf strings. Look for these in files. Presence is not proof they are used. Absence is not proof they are not used. Java exploits. Can parse jar file with idxparser.py and decomile Java file. Java typially used to target tech companies. Backdoors are the main persistence mechanism (provided externally) for malware. Also malware typically needs command and control. Application of Artificial Intelligence in Ad-Hoc Static Code Analysis John Ashaman John Ashaman, Security Innovation Initially John tried to analyze open source files with open source static analysis tools, but these showed thousands of false positives. Also tried using grep, but tis fails to find anything even mildly complex. So next John decided to write his own tool. His approach was to first generate a call graph then analyze the graph. However, the problem is that making a call graph is really hard. For example, one problem is "evil" coding techniques, such as passing function pointer. First the tool generated an Abstract Syntax Tree (AST) with the nodes created from method declarations and edges created from method use. Then the tool generated a control flow graph with the goal to find a path through the AST (a maze) from source to sink. The algorithm is to look at adjacent nodes to see if any are "scary" (a vulnerability), using heuristics for search order. The tool, called "Scat" (Static Code Analysis Tool), currently looks for C# vulnerabilities and some simple PHP. Later, he plans to add more PHP, then JSP and Java. For more information see his posts in Security Innovation blog and NRefactory on GitHub. Mask Your Checksums—The Gorry Details Eric (XlogicX) Davisson Eric (XlogicX) Davisson Sometimes in emailing or posting TCP/IP packets to analyze problems, you may want to mask the IP address. But to do this correctly, you need to mask the checksum too, or you'll leak information about the IP. Problem reports found in stackoverflow.com, sans.org, and pastebin.org are usually not masked, but a few companies do care. If only the IP is masked, the IP may be guessed from checksum (that is, it leaks data). Other parts of packet may leak more data about the IP. TCP and IP checksums both refer to the same data, so can get more bits of information out of using both checksums than just using one checksum. Also, one can usually determine the OS from the TTL field and ports in a packet header. If we get hundreds of possible results (16x each masked nibble that is unknown), one can do other things to narrow the results, such as look at packet contents for domain or geo information. With hundreds of results, can import as CSV format into a spreadsheet. Can corelate with geo data and see where each possibility is located. Eric then demoed a real email report with a masked IP packet attached. Was able to find the exact IP address, given the geo and university of the sender. Point is if you're going to mask a packet, do it right. Eric wouldn't usually bother, but do it correctly if at all, to not create a false impression of security. Adventures with weird machines thirty years after "Reflections on Trusting Trust" Sergey Bratus Sergey Bratus, Dartmouth College (and Julian Bangert and Rebecca Shapiro, not present) "Reflections on Trusting Trust" refers to Ken Thompson's classic 1984 paper. "You can't trust code that you did not totally create yourself." There's invisible links in the chain-of-trust, such as "well-installed microcode bugs" or in the compiler, and other planted bugs. Thompson showed how a compiler can introduce and propagate bugs in unmodified source. But suppose if there's no bugs and you trust the author, can you trust the code? Hell No! There's too many factors—it's Babylonian in nature. Why not? Well, Input is not well-defined/recognized (code's assumptions about "checked" input will be violated (bug/vunerabiliy). For example, HTML is recursive, but Regex checking is not recursive. Input well-formed but so complex there's no telling what it does For example, ELF file parsing is complex and has multiple ways of parsing. Input is seen differently by different pieces of program or toolchain Any Input is a program input executes on input handlers (drives state changes & transitions) only a well-defined execution model can be trusted (regex/DFA, PDA, CFG) Input handler either is a "recognizer" for the inputs as a well-defined language (see langsec.org) or it's a "virtual machine" for inputs to drive into pwn-age ELF ABI (UNIX/Linux executible file format) case study. Problems can arise from these steps (without planting bugs): compiler linker loader ld.so/rtld relocator DWARF (debugger info) exceptions The problem is you can't really automatically analyze code (it's the "halting problem" and undecidable). Only solution is to freeze code and sign it. But you can't freeze everything! Can't freeze ASLR or loading—must have tables and metadata. Any sufficiently complex input data is the same as VM byte code Example, ELF relocation entries + dynamic symbols == a Turing Complete Machine (TM). @bxsays created a Turing machine in Linux from relocation data (not code) in an ELF file. For more information, see Rebecca "bx" Shapiro's presentation from last year's Toorcon, "Programming Weird Machines with ELF Metadata" @bxsays did same thing with Mach-O bytecode Or a DWARF exception handling data .eh_frame + glibc == Turning Machine X86 MMU (IDT, GDT, TSS): used address translation to create a Turning Machine. Page handler reads and writes (on page fault) memory. Uses a page table, which can be used as Turning Machine byte code. Example on Github using this TM that will fly a glider across the screen Next Sergey talked about "Parser Differentials". That having one input format, but two parsers, will create confusion and opportunity for exploitation. For example, CSRs are parsed during creation by cert requestor and again by another parser at the CA. Another example is ELF—several parsers in OS tool chain, which are all different. Can have two different Program Headers (PHDRs) because ld.so parses multiple PHDRs. The second PHDR can completely transform the executable. This is described in paper in the first issue of International Journal of PoC. Conclusions trusting computers not only about bugs! Bugs are part of a problem, but no by far all of it complex data formats means bugs no "chain of trust" in Babylon! (that is, with parser differentials) we need to squeeze complexity out of data until data stops being "code equivalent" Further information See and langsec.org. USENIX WOOT 2013 (Workshop on Offensive Technologies) for "weird machines" papers and videos.

Read the article
Network Logon Issues with Group Policy and Network

- by bobloki

I am gravely in need of your help and assistance. We have a problem with our logon and startup to our Windows 7 Enterprise system. We have more than 3000 Windows Desktops situated in roughly 20+ buildings around campus. Almost every computer on campus has the problem that I will be describing. I have spent over one month peering over etl files from Windows Performance Analyzer (A great product) and hundreds of thousands of event logs. I come to you today humbled that I could not figure this out. The problem as simply put our logon times are extremely long. An average first time logon is roughly 2-10 minutes depending on the software installed. All computers are Windows 7, the oldest computers being 5 years old. Startup times on various computers range from good (1-2 minutes) to very bad (5-60). Our second time logons range from 30 seconds to 4 minutes. We have a gigabit connection between each computer on the network. We have 5 domain controllers which also double as our DNS servers. Initial testing led us to believe that this was a software problem. So I spent a few days testing machines only to find inconsistent results from the etl files from xperfview. Each subset of computers on campus had a different subset of software issues, none seeming to interfere with logon just startup. So I started looking at our group policy and located some very interesting event ID’s. Group Policy 1129: The processing of Group Policy failed because of lack of network connectivity to a domain controller. Group Policy 1055: The processing of Group Policy failed. Windows could not resolve the computer name. This could be caused by one of more of the following: a) Name Resolution failure on the current domain controller. b) Active Directory Replication Latency (an account created on another domain controller has not replicated to the current domain controller). NETLOGON 5719 : This computer was not able to set up a secure session with a domain controller in domain OURDOMAIN due to the following: There are currently no logon servers available to service the logon request. This may lead to authentication problems. Make sure that this computer is connected to the network. If the problem persists, please contact your domain administrator. E1kexpress 27: Intel®82567LM-3 Gigabit Network Connection – Network link is disconnected. NetBT 4300 – The driver could not be created. WMI 10 - Event filter with query "SELECT * FROM __InstanceModificationEvent WITHIN 60 WHERE TargetInstance ISA "Win32_Processor" AND TargetInstance.LoadPercentage 99" could not be reactivated in namespace "//./root/CIMV2" because of error 0x80041003. Events cannot be delivered through this filter until the problem is corrected. More or less with timestamps it becomes apparent that the network maybe the issue. 1:25:57 - Group Policy is trying to discover the domain controller information 1:25:57 - The network link has been disconnected 1:25:58 - The processing of Group Policy failed because of lack of network connectivity to a domain controller. This may be a transient condition. A success message would be generated once the machine gets connected to the domain controller and Group Policy has successfully processed. If you do not see a success message for several hours, then contact your administrator. 1:25:58 - Making LDAP calls to connect and bind to active directory. DC1.ourdomain.edu 1:25:58 - Call failed after 0 milliseconds. 1:25:58 - Forcing rediscovery of domain controller details. 1:25:58 - Group policy failed to discover the domain controller in 1030 milliseconds 1:25:58 - Periodic policy processing failed for computer OURDOMAIN\%name%$ in 1 seconds. 1:25:59 - A network link has been established at 1Gbps at full duplex 1:26:00 - The network link has been disconnected 1:26:02 - NtpClient was unable to set a domain peer to use as a time source because of discovery error. NtpClient will try again in 3473457 minutes and DOUBLE THE REATTEMPT INTERVAL thereafter. 1:26:05 - A network link has been established at 1Gbps at full duplex 1:26:08 - Name resolution for the name %Name% timed out after none of the configured DNS servers responded. 1:26:10 – The TCP/IP NetBIOS Helper service entered the running state. 1:26:11 - The time provider NtpClient is currently receiving valid time data at dc4.ourdomain.edu 1:26:14 – User Logon Notification for Customer Experience Improvement Program 1:26:15 - Group Policy received the notification Logon from Winlogon for session 1. 1:26:15 - Making LDAP calls to connect and bind to Active Directory. dc4.ourdomain.edu 1:26:18 - The LDAP call to connect and bind to Active Directory completed. dc4. ourdomain.edu. The call completed in 2309 milliseconds. 1:26:18 - Group Policy successfully discovered the Domain Controller in 2918 milliseconds. 1:26:18 - Computer details: Computer role : 2 Network name : (Blank) 1:26:18 - The LDAP call to connect and bind to Active Directory completed. dc4.ourdomain.edu. The call completed in 2309 milliseconds. 1:26:18 - Group Policy successfully discovered the Domain Controller in 2918 milliseconds. 1:26:19 - The WinHTTP Web Proxy Auto-Discovery Service service entered the running state. 1:26:46 - The Network Connections service entered the running state. 1:27:10 – Retrieved account information 1:27:10 – The system call to get account information completed. 1:27:10 - Starting policy processing due to network state change for computer OURDOMAIN\%name%$ 1:27:10 – Network state change detected 1:27:10 - Making system call to get account information. 1:27:11 - Making LDAP calls to connect and bind to Active Directory. dc4.ourdomain.edu 1:27:13 - Computer details: Computer role : 2 Network name : ourdomain.edu (Now not blank) 1:27:13 - Group Policy successfully discovered the Domain Controller in 2886 milliseconds. 1:27:13 - The LDAP call to connect and bind to Active Directory completed. dc4.ourdomain.edu The call completed in 2371 milliseconds. 1:27:15 - Estimated network bandwidth on one of the connections: 0 kbps. 1:27:15 - Estimated network bandwidth on one of the connections: 8545 kbps. 1:27:15 - A fast link was detected. The Estimated bandwidth is 8545 kbps. The slow link threshold is 500 kbps. 1:27:17 – Powershell - Engine state is changed from Available to Stopped. 1:27:20 - Completed Group Policy Local Users and Groups Extension Processing in 4539 milliseconds. 1:27:25 - Completed Group Policy Scheduled Tasks Extension Processing in 5210 milliseconds. 1:27:27 - Completed Group Policy Registry Extension Processing in 1529 milliseconds. 1:27:27 - Completed policy processing due to network state change for computer OURDOMAIN\%name%$ in 16 seconds. 1:27:27 – The Group Policy settings for the computer were processed successfully. There were no changes detected since the last successful processing of Group Policy. Any help would be appreciated. Please ask for any relevant information and it will be provided as soon as possible.

Read the article
10 Essential Tools for building ASP.NET Websites

- by Stephen Walther

I recently put together a simple public website created with ASP.NET for my company at Superexpert.com. I was surprised by the number of free tools that I ended up using to put together the website. Therefore, I thought it would be interesting to create a list of essential tools for building ASP.NET websites. These tools work equally well with both ASP.NET Web Forms and ASP.NET MVC. Performance Tools After reading Steve Souders two (very excellent) books on front-end website performance High Performance Web Sites and Even Faster Web Sites, I have been super sensitive to front-end website performance. According to Souders’ Performance Golden Rule: “Optimize front-end performance first, that's where 80% or more of the end-user response time is spent” You can use the tools below to reduce the size of the images, JavaScript files, and CSS files used by an ASP.NET application. 1. Sprite and Image Optimization Framework CSS sprites were first described in an article written for A List Apart entitled CSS sprites: Image Slicing’s Kiss of Death. When you use sprites, you combine multiple images used by a website into a single image. Next, you use CSS trickery to display particular sub-images from the combined image in a webpage. The primary advantage of sprites is that they reduce the number of requests required to display a webpage. Requesting a single large image is faster than requesting multiple small images. In general, the more resources – images, JavaScript files, CSS files – that must be moved across the wire, the slower your website. However, most people avoid using sprites because they require a lot of work. You need to combine all of the images and write just the right CSS rules to display the sub-images. The Microsoft Sprite and Image Optimization Framework enables you to avoid all of this work. The framework combines the images for you automatically. Furthermore, the framework includes an ASP.NET Web Forms control and an ASP.NET MVC helper that makes it easy to display the sub-images. You can download the Sprite and Image Optimization Framework from CodePlex at http://aspnet.codeplex.com/releases/view/50869. The Sprite and Image Optimization Framework was written by Morgan McClean who worked in the office next to mine at Microsoft. Morgan was a scary smart Intern from Canada and we discussed the Framework while he was building it (I was really excited to learn that he was working on it). Morgan added some great advanced features to this framework. For example, the Sprite and Image Optimization Framework supports something called image inlining. When you use image inlining, the actual image is stored in the CSS file. Here’s an example of what image inlining looks like: .Home_StephenWalther_small-jpg { width:75px; height:100px; background: url(data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAEsAAABkCAIAAABB1lpeAAAAB GdBTUEAALGOfPtRkwAAACBjSFJNAACHDwAAjA8AAP1SAACBQAAAfXkAAOmLAAA85QAAGcxzPIV3AAAKL s+zNfREAAAAASUVORK5CYII=) no-repeat 0% 0%; } The actual image (in this case a picture of me that is displayed on the home page of the Superexpert.com website) is stored in the CSS file. If you visit the Superexpert.com website then very few separate images are downloaded. For example, all of the images with a red border in the screenshot below take advantage of CSS sprites: Unfortunately, there are some significant Gotchas that you need to be aware of when using the Sprite and Image Optimization Framework. There are workarounds for these Gotchas. I plan to write about these Gotchas and workarounds in a future blog entry. 2. Microsoft Ajax Minifier Whenever possible you should combine, minify, compress, and cache with a far future header all of your JavaScript and CSS files. The Microsoft Ajax Minifier makes it easy to minify JavaScript and CSS files. Don’t confuse minification and compression. You need to do both. According to Souders, you can reduce the size of a JavaScript file by an additional 20% (on average) by minifying a JavaScript file after you compress the file. When you minify a JavaScript or CSS file, you use various tricks to reduce the size of the file before you compress the file. For example, you can minify a JavaScript file by replacing long JavaScript variables names with short variables names and removing unnecessary white space and comments. You can minify a CSS file by doing such things as replacing long color names such as #ffffff with shorter equivalents such as #fff. The Microsoft Ajax Minifier was created by Microsoft employee Ron Logan. Internally, this tool was being used by several large Microsoft websites. We also used the tool heavily on the ASP.NET team. I convinced Ron to publish the tool on CodePlex so that everyone in the world could take advantage of it. You can download the tool from the ASP.NET Ajax website and read documentation for the tool here. I created the installer for the Microsoft Ajax Minifier. When creating the installer, I also created a Visual Studio build task to make it easy to minify all of your JavaScript and CSS files whenever you do a build within Visual Studio automatically. Read the Ajax Minifier Quick Start to learn how to configure the build task. 3. ySlow The ySlow tool is a free add-on for Firefox created by Yahoo that enables you to test the front-end of your website. For example, here are the current test results for the Superexpert.com website: The Superexpert.com website has an overall score of B (not perfect but not bad). The ySlow tool is not perfect. For example, the Superexpert.com website received a failing grade of F for not using a Content Delivery Network even though the website using the Microsoft Ajax Content Delivery Network for JavaScript files such as jQuery. Uptime After publishing a website live to the world, you want to ensure that the website does not encounter any issues and that it stays live. I use the following tools to monitor the Superexpert.com website now that it is live. 4. ELMAH ELMAH stands for Error Logging Modules and Handlers for ASP.NET. ELMAH enables you to record any errors that happen at your website so you can review them in the future. You can download ELMAH for free from the ELMAH project website. ELMAH works great with both ASP.NET Web Forms and ASP.NET MVC. You can configure ELMAH to store errors in a number of different stores including XML files, the Event Log, an Access database, a SQL database, an Oracle database, or in computer RAM. You also can configure ELMAH to email error messages to you when they happen. By default, you can access ELMAH by requesting the elmah.axd page from a website with ELMAH installed. Here’s what the elmah page looks like from the Superexpert.com website (this page is password-protected because secret information can be revealed in an error message): If you click on a particular error message, you can view the original Yellow Screen ASP.NET error message (even when the error message was never displayed to the actual user). I installed ELMAH by taking advantage of the new package manager for ASP.NET named NuGet (originally named NuPack). You can read the details about NuGet in the following blog entry by Scott Guthrie. You can download NuGet from CodePlex. 5. Pingdom I use Pingdom to verify that the Superexpert.com website is always up. You can sign up for Pingdom by visiting Pingdom.com. You can use Pingdom to monitor a single website for free. At the Pingdom website, you configure the frequency that your website gets pinged. I verify that the Superexpert.com website is up every 5 minutes. I have the Pingdom service verify that it can retrieve the string “Contact Us” from the website homepage. If your website goes down, you can configure Pingdom so that it sends an email, Twitter, SMS, or iPhone alert. I use the Pingdom iPhone app which looks like this: 6. Host Tracker If your website does go down then you need some way of determining whether it is a problem with your local network or if your website is down for everyone. I use a website named Host-Tracker.com to check how badly a website is down. Here’s what the Host-Tracker website displays for the Superexpert.com website when the website can be successfully pinged from everywhere in the world: Notice that Host-Tracker pinged the Superexpert.com website from 68 locations including Roubaix, France and Scranton, PA. Debugging I mean debugging in the broadest possible sense. I use the following tools when building a website to verify that I have not made a mistake. 7. HTML Spell Checker Why doesn’t Visual Studio have a built-in spell checker? Don’t know – I’ve always found this mysterious. Fortunately, however, a former member of the ASP.NET team wrote a free spell checker that you can use with your ASP.NET pages. I find a spell checker indispensible. It is easy to delude yourself that you are capable of perfect spelling. I’m always super embarrassed when I actually run the spell checking tool and discover all of my spelling mistakes. The fastest way to add the HTML Spell Checker extension to Visual Studio is to select the menu option Tools, Extension Manager within Visual Studio. Click on Online Gallery and search for HTML Spell Checker: 8. IIS SEO Toolkit If people cannot find your website through Google then you should not even bother to create it. Microsoft has a great extension for IIS named the IIS Search Engine Optimization Toolkit that you can use to identify issue with your website that would hurt its page rank. You also can use this tool to quickly create a sitemap for your website that you can submit to Google or Bing. You can even generate the sitemap for an ASP.NET MVC website. Here’s what the report overview for the Superexpert.com website looks like: Notice that the Sueprexpert.com website had plenty of violations. For example, there are 65 cases in which a page has a broken hyperlink. You can drill into these violations to identity the exact page and location where these violations occur. 9. LinqPad If your ASP.NET website accesses a database then you should be using LINQ to Entities with the Entity Framework. Using LINQ involves some magic. LINQ queries written in C# get converted into SQL queries for you. If you are not careful about how you write your LINQ queries, you could unintentionally build a really badly performing website. LinqPad is a free tool that enables you to experiment with your LINQ queries. It even works with Microsoft SQL CE 4 and Azure. You can use LinqPad to execute a LINQ to Entities query and see the results. You also can use it to see the resulting SQL that gets executed against the database: 10. .NET Reflector I use .NET Reflector daily. The .NET Reflector tool enables you to take any assembly and disassemble the assembly into C# or VB.NET code. You can use .NET Reflector to see the “Source Code” of an assembly even when you do not have the actual source code. You can download a free version of .NET Reflector from the Redgate website. I use .NET Reflector primarily to help me understand what code is doing internally. For example, I used .NET Reflector with the Sprite and Image Optimization Framework to better understand how the MVC Image helper works. Here’s part of the disassembled code from the Image helper class: Summary In this blog entry, I’ve discussed several of the tools that I used to create the Superexpert.com website. These are tools that I use to improve the performance, improve the SEO, verify the uptime, or debug the Superexpert.com website. All of the tools discussed in this blog entry are free. Furthermore, all of these tools work with both ASP.NET Web Forms and ASP.NET MVC. Let me know if there are any tools that you use daily when building ASP.NET websites.

Read the article
Building The Right SharePoint Team For Your Organization

- by Mark Rackley

I see the question posted fairly often asking what kind SharePoint team an organization should have. How many people do I need? What roles do I need to fill? What is best for my organization? Well, just like every other answer in SharePoint, the correct answer is “it depends”. Do you ever get sick of hearing that??? I know I do… So, let me give you my thoughts and opinions based upon my experience and what I’ve seen and let you come to your own conclusions. What are the possible SharePoint roles? I guess the first thing you need to understand are the different roles that exist in SharePoint (and their are LOTS). Remember, SharePoint is a massive beast and you will NOT find one person who can do it all. If you are hoping to find that person you will be sorely disappointed. For the most part this is true in SharePoint 2007 and 2010. However, generally things are improved in 2010 and easier for junior individuals to grasp. SharePoint Administrator The absolutely positively only role that you should not be without no matter the size of your organization or SharePoint deployment is a SharePoint administrator. These guys are essential to keeping things running and figuring out what’s wrong when things aren’t running well. These unsung heroes do more before 10 am than I do all day. The bad thing is, when these guys are awesome, you don’t even know they exist because everything is running so smoothly. You should definitely invest some time and money here to make sure you have some competent if not rockstar help. You need an admin who truly loves SharePoint and will go that extra mile when necessary. Let me give you a real world example of what I’m talking about: We have a rockstar admin… and I’m sure she’s sick of my throwing her name around so she’ll just have to live with remaining anonymous in this post… sorry Lori… Anyway! A couple of weeks ago our Server teams came to us and said Hi Lori, I’m finalizing the MOSS servers and doing updates that require a restart; can I restart them? Seems like a harmless request from your server team does it not? Sure, go ahead and apply the patches and reboot during our scheduled maintenance window. No problem? right? Sounded fair to me… but no…. not to our fearless SharePoint admin… I need a complete list of patches that will be applied. There is an update that is out there that will break SharePoint… KB973917 is the patch that has been shown to cause issues. What? You mean Microsoft released a patch that would actually adversely affect SharePoint? If we did NOT have a rockstar admin, our server team would have applied these patches and then when some problem occurred in SharePoint we’d have to go through the fun task of tracking down exactly what caused the issue and resolve it. How much time would that have taken? If you have a junior SharePoint admin or an admin who’s not out there staying on top of what’s going on you could have spent days tracking down something so simple as applying a patch you should not have applied. I will even go as far to say the only SharePoint rockstar you NEED in your organization is a SharePoint admin. You can always outsource really complicated development projects or bring in a rockstar contractor every now and then to make sure you aren’t way off track in other areas. For your day-to-day sanity and to keep SharePoint running smoothly, you need an awesome Admin. Some rockstars in this category are: Ben Curry, Mike Watson, Joel Oleson, Todd Klindt, Shane Young, John Ferringer, Sean McDonough, and of course Lori Gowin. SharePoint Developer Another essential role for your SharePoint deployment is a SharePoint developer. Things do start to get a little hazy here and there are many flavors of “developers”. Are you writing custom code? using SharePoint Designer? What about SharePoint Branding? Are all of these considered developers? I would say yes. Are they interchangeable? I’d say no. Development in SharePoint is such a large beast in itself. I would say that it’s not so large that you can’t know it all well, but it is so large that there are many people who specialize in one particular category. If you are lucky enough to have someone on staff who knows it all well, you better make sure they are well taken care of because those guys are ready-made to move over to a consulting role and charge you 3 times what you are probably paying them. :) Some of the all-around rockstars are Eric Shupps, Andrew Connell (go Razorbacks), Rob Foster, Paul Schaeflein, and Todd Bleeker SharePoint Power User/No-Code Solutions Developer These SharePoint Swiss Army Knives are essential for quick wins in your organization. These people can twist the out-of-the-box functionality to make it do things you would not even imagine. Give these guys SharePoint Designer, jQuery, InfoPath, and a little time and they will create views, dashboards, and KPI’s that will blow your mind away and give your execs the “wow” they are looking for. Not only can they deliver that wow factor, but they can mashup, merge, and really help make your SharePoint application usable and deliver an overall better user experience. Before you hand off a project to your SharePoint Custom Code developer, let one of these rockstars look at it and show you what they can do (in probably less time). I would say the second most important role you can fill in your organization is one of these guys. Rockstars in this category are Christina Wheeler, Laura Rogers, Jennifer Mason, and Mark Miller SharePoint Developer – Custom Code If you want to really integrate SharePoint into your legacy systems, or really twist it and make it bend to your will, you are going to have to open up Visual Studio and write some custom code. Remember, SharePoint is essentially just a big, huge, ginormous .NET application, so you CAN write code to make it do ANYTHING, but do you really want to spend the time and effort to do so? At some point with every other form of SharePoint development you are going to run into SOME limitation (SPD Workflows is the big one that comes to mind). If you truly want to knock down all the walls then custom development is the way to go. PLEASE keep in mind when you are looking for a custom code developer that a .NET developer does NOT equal a SharePoint developer. Just SOME of the things these guys write are: Custom Workflows Custom Web Parts Web Service functionality Import data from legacy systems Export data to legacy systems Custom Actions Event Receivers Service Applications (2010) These guys are also the ones generally responsible for packaging everything up into solution packages (you are doing that, right?). Rockstars in this category are Phil Wicklund, Christina Wheeler, Geoff Varosky, and Brian Jackett. SharePoint Branding “But it LOOKS like SharePoint!” Somebody call the WAAAAAAAAAAAAHMbulance… Themes, Master Pages, Page Layouts, Zones, and over 2000 styles in CSS.. these guys not only have to be comfortable with all of SharePoint’s quirks and pain points when branding, but they have to know it TWICE for publishing and non-publishing sites. Not only that, but these guys really need to have an eye for graphic design and be able to translate the ramblings of business into something visually stunning. They also have to be comfortable with XSLT, XML, and be able to hand off what they do to your custom developers for them to package as solutions (which you are doing, right?). These rockstars include Heater Waterman, Cathy Dew, and Marcy Kellar SharePoint Architect SharePoint Architects are generally SharePoint Admins or Developers who have moved into more of a BA role? Is that fair to say? These guys really have a grasp and understanding for what SharePoint IS and what it can do. These guys help you structure your farms to meet your needs and help you design your applications the correct way. It’s always a good idea to bring in a rockstar SharePoint Architect to do a sanity check and make sure you aren’t doing anything stupid. Most organizations probably do not have a rockstar architect on staff. These guys are generally brought in at the deployment of a farm, upgrade of a farm, or for large development projects. I personally also find architects very useful for sitting down with the business to translate their needs into what SharePoint can do. A good architect will be able to pick out what can be done out-of-the-box and what has to be custom built and hand those requirements to the development Staff. Architects can generally fill in as an admin or a developer when needed. Some rockstar architects are Rick Taylor, Dan Usher, Bill English, Spence Harbar, Neil Hodgkins, Eric Harlan, and Bjørn Furuknap. Other Roles / Specialties On top of all these other roles you also get these people who specialize in things like Reporting, BDC (BCS in 2010), Search, Performance, Security, Project Management, etc... etc... etc... Again, most organizations will not have one of these gurus on staff, they’ll just pay out the nose for them when they need them. :) SharePoint End User Everyone else in your organization that touches SharePoint falls into this category. What they actually DO in SharePoint is determined by your governance and what permissions you give these guys. Hopefully you have these guys on a fairly short leash and are NOT giving them access to tools like SharePoint Designer. Sadly end users are the ones who truly make your deployment a success by using it, but are also your biggest enemy in breaking it. :) We love you guys… really!!! Okay, all that’s fine and dandy, but what should MY SharePoint team look like? It depends! Okay… Are you just doing out of the box team sites with no custom development? Then you are probably fine with a great Admin team and a great No-Code Solution Development team. How many people do you need? Depends on how busy you can keep them. Sorry, can’t answer the question about numbers without knowing your specific needs. I can just tell you who you MIGHT need and what they will do for you. I’ll leave you with what my ideal SharePoint Team would look like for a particular scenario: Farm / Organization Structure Dev, QA, and 2 Production Farms. 5000 – 10000 Users Custom Development and Integration with legacy systems Team Sites, My Sites, Intranet, Document libraries and overall company collaboration Team Rockstar SharePoint Administrator 2-3 junior SharePoint Administrators SharePoint Architect / Lead Developer 2 Power User / No-Code Solution Developers 2-3 Custom Code developers Branding expert With a team of that size and skill set, they should be able to keep a substantial SharePoint deployment running smoothly and meet your business needs. This does NOT mean that you would not need to bring in contract help from time to time when you need an uber specialist in one area. Also, this team assumes there will be ongoing development for the life of your SharePoint farm. If you are just going to be doing sporadic custom development, it might make sense to partner with an awesome firm that specializes in that sort of work (I can give you the name of a couple if you are interested). Again though, the size of your team depends on the number of requests you are receiving and how much active deployment you are doing. So, don’t bring in a team that looks like this and then yell at me because they are sitting around with nothing to do or are so overwhelmed that nothing is getting done. I do URGE you to take the proper time to asses your needs and determine what team is BEST for your organization. Also, PLEASE PLEASE PLEASE do not skimp on the talent. When it comes to SharePoint you really do get what you pay for when it comes to employees, contractors, and software. SharePoint can become absolutely critical to your business and because you skimped on hiring a developer he created a web part that brings down the farm because he doesn’t know what he’s doing, or you hire an admin who thinks it’s fine to stick everything in the same Content Database and then can’t figure out why people are complaining. SharePoint can be an enormous blessing to an organization or it’s biggest curse. Spend the time and money to do it right, or be prepared to spending even more time and money later to fix it.

Read the article
Help analyzing traceroute

- by Abdulla

Hello, my name is Abdulla and I'm from Kuwait. Sorry for my question as I know its not technically challenging. I'm facing some problems with my internet connection. My company has a DSL 2mb connection. My main problem is latency, in the morning its good but after that its gets really bad. My Internet provider says there's nothing wrong and that everything is working perfectly. I tried to explain to them the latency issue but they say that as long as I'm getting the download speed there isn't anything I can do about it. I only want to know if this is true and that the company can't do anything before I change my internet provider, as I feel that the guys at the contact center might getting back to me without asking tech support. Below are 2 traces I made, one in the morning and the other in the afternoon: This was taken around 17:00 Microsoft Windows XP [Version 5.1.2600] (C) Copyright 1985-2001 Microsoft Corp. C:\Documents and Settings\Administrator>ping google.com Pinging google.com [66.102.9.104] with 32 bytes of data: Reply from 66.102.9.104: bytes=32 time=387ms TTL=49 Reply from 66.102.9.104: bytes=32 time=388ms TTL=49 Reply from 66.102.9.104: bytes=32 time=375ms TTL=49 Reply from 66.102.9.104: bytes=32 time=375ms TTL=49 Ping statistics for 66.102.9.104: Packets: Sent = 4, Received = 4, Lost = 0 (0% loss), Approximate round trip times in milli-seconds: Minimum = 375ms, Maximum = 388ms, Average = 381ms C:\Documents and Settings\Administrator>ping google.com /t Pinging google.com [66.102.9.104] with 32 bytes of data: Reply from 66.102.9.104: bytes=32 time=376ms TTL=49 Reply from 66.102.9.104: bytes=32 time=382ms TTL=49 Reply from 66.102.9.104: bytes=32 time=371ms TTL=49 Reply from 66.102.9.104: bytes=32 time=378ms TTL=49 Reply from 66.102.9.104: bytes=32 time=374ms TTL=49 Reply from 66.102.9.104: bytes=32 time=371ms TTL=49 Reply from 66.102.9.104: bytes=32 time=365ms TTL=49 Reply from 66.102.9.104: bytes=32 time=366ms TTL=49 Reply from 66.102.9.104: bytes=32 time=353ms TTL=49 Reply from 66.102.9.104: bytes=32 time=331ms TTL=49 Reply from 66.102.9.104: bytes=32 time=333ms TTL=49 Reply from 66.102.9.104: bytes=32 time=348ms TTL=49 Reply from 66.102.9.104: bytes=32 time=365ms TTL=49 Reply from 66.102.9.104: bytes=32 time=346ms TTL=49 Reply from 66.102.9.104: bytes=32 time=335ms TTL=49 Reply from 66.102.9.104: bytes=32 time=340ms TTL=49 Reply from 66.102.9.104: bytes=32 time=344ms TTL=49 Reply from 66.102.9.104: bytes=32 time=333ms TTL=49 Reply from 66.102.9.104: bytes=32 time=328ms TTL=49 Reply from 66.102.9.104: bytes=32 time=332ms TTL=49 Reply from 66.102.9.104: bytes=32 time=326ms TTL=49 Reply from 66.102.9.104: bytes=32 time=333ms TTL=49 Reply from 66.102.9.104: bytes=32 time=325ms TTL=49 Reply from 66.102.9.104: bytes=32 time=333ms TTL=49 Reply from 66.102.9.104: bytes=32 time=338ms TTL=49 Reply from 66.102.9.104: bytes=32 time=341ms TTL=49 Ping statistics for 66.102.9.104: Packets: Sent = 26, Received = 26, Lost = 0 (0% loss), Approximate round trip times in milli-seconds: Minimum = 325ms, Maximum = 382ms, Average = 348ms Control-C ^C C:\Documents and Settings\Administrator>travert google.com 'travert' is not recognized as an internal or external command, operable program or batch file. C:\Documents and Settings\Administrator>tracert google.com Tracing route to google.com [66.102.9.104] over a maximum of 30 hops: 1 <1 ms <1 ms <1 ms 192.168.0.1 2 6 ms 6 ms 6 ms 80-184-31-1.adsl.kems.net [80.184.31.1] 3 7 ms 7 ms 8 ms 168.187.0.226 4 7 ms 8 ms 9 ms 168.187.0.125 5 180 ms 187 ms 188 ms if-11-2.core1.RSD-Riyad.as6453.net [116.0.78.89] 6 209 ms 222 ms 204 ms 195.219.167.57 7 541 ms 536 ms 540 ms 195.219.167.42 8 553 ms 552 ms 538 ms Vlan1102.icore1.PVU-Paris.as6453.net [195.219.24 1.109] 9 547 ms 543 ms 542 ms xe-9-1-0.edge4.paris1.level3.net [4.68.110.213] 10 540 ms 523 ms 531 ms ae-33-51.ebr1.Paris1.Level3.net [4.69.139.193] 11 755 ms 761 ms 695 ms ae-45-45.ebr1.London1.Level3.net [4.69.143.101] 12 271 ms 263 ms 400 ms ae-11-51.car1.London1.Level3.net [4.69.139.66] 13 701 ms 730 ms 742 ms 195.50.118.210 14 659 ms 641 ms 660 ms 209.85.255.76 15 280 ms 283 ms 292 ms 209.85.251.190 16 308 ms 293 ms 296 ms 72.14.232.239 17 679 ms 700 ms 721 ms 64.233.174.18 18 268 ms 281 ms 269 ms lm-in-f104.1e100.net [66.102.9.104] Trace complete. C:\Documents and Settings\Administrator> This was taken at 10:00am Microsoft Windows XP [Version 5.1.2600] (C) Copyright 1985-2001 Microsoft Corp. C:\Documents and Settings\Administrator>ping google.com Pinging google.com [66.102.9.106] with 32 bytes of data: Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=111ms TTL=49 Reply from 66.102.9.106: bytes=32 time=112ms TTL=49 Reply from 66.102.9.106: bytes=32 time=120ms TTL=49 Ping statistics for 66.102.9.106: Packets: Sent = 4, Received = 4, Lost = 0 (0% loss), Approximate round trip times in milli-seconds: Minimum = 110ms, Maximum = 120ms, Average = 113ms C:\Documents and Settings\Administrator>ping google.com /t Pinging google.com [66.102.9.106] with 32 bytes of data: Reply from 66.102.9.106: bytes=32 time=109ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=111ms TTL=49 Reply from 66.102.9.106: bytes=32 time=111ms TTL=49 Reply from 66.102.9.106: bytes=32 time=112ms TTL=49 Reply from 66.102.9.106: bytes=32 time=112ms TTL=49 Reply from 66.102.9.106: bytes=32 time=116ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=109ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=109ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=112ms TTL=49 Reply from 66.102.9.106: bytes=32 time=109ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=115ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=109ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=113ms TTL=49 Reply from 66.102.9.106: bytes=32 time=115ms TTL=49 Reply from 66.102.9.106: bytes=32 time=109ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Ping statistics for 66.102.9.106: Packets: Sent = 32, Received = 32, Lost = 0 (0% loss), Approximate round trip times in milli-seconds: Minimum = 109ms, Maximum = 135ms, Average = 112ms Control-C ^C C:\Documents and Settings\Administrator>tracert google.com Tracing route to google.com [66.102.9.104] over a maximum of 30 hops: 1 <1 ms <1 ms <1 ms 192.168.0.1 2 6 ms 6 ms 6 ms 80-184-31-1.adsl.kems.net [80.184.31.1] 3 8 ms 7 ms 6 ms 168.187.0.226 4 6 ms 7 ms 7 ms 168.187.0.125 5 20 ms 20 ms 18 ms if-11-2.core1.RSD-Riyad.as6453.net [116.0.78.89] 6 171 ms 205 ms 215 ms 195.219.167.57 7 191 ms 215 ms 226 ms 195.219.167.42 8 * 103 ms 94 ms Vlan1102.icore1.PVU-Paris.as6453.net [195.219.24 1.109] 9 94 ms 95 ms 97 ms xe-9-1-0.edge4.paris1.level3.net [4.68.110.213] 10 94 ms 94 ms 94 ms ae-33-51.ebr1.Paris1.Level3.net [4.69.139.193] 11 101 ms 101 ms 101 ms ae-48-48.ebr1.London1.Level3.net [4.69.143.113] 12 102 ms 102 ms 101 ms ae-11-51.car1.London1.Level3.net [4.69.139.66] 13 103 ms 102 ms 103 ms 195.50.118.210 14 137 ms 103 ms 100 ms 209.85.255.76 15 130 ms 124 ms 124 ms 209.85.251.190 16 114 ms 116 ms 116 ms 72.14.232.239 17 135 ms 113 ms 126 ms 64.233.174.18 18 126 ms 125 ms 127 ms lm-in-f104.1e100.net [66.102.9.104] Trace complete. C:\Documents and Settings\Administrator>

Read the article
Basic Spatial Data with SQL Server and Entity Framework 5.0

- by Rick Strahl

In my most recent project we needed to do a bit of geo-spatial referencing. While spatial features have been in SQL Server for a while using those features inside of .NET applications hasn't been as straight forward as could be, because .NET natively doesn't support spatial types. There are workarounds for this with a few custom project like SharpMap or a hack using the Sql Server specific Geo types found in the Microsoft.SqlTypes assembly that ships with SQL server. While these approaches work for manipulating spatial data from .NET code, they didn't work with database access if you're using Entity Framework. Other ORM vendors have been rolling their own versions of spatial integration. In Entity Framework 5.0 running on .NET 4.5 the Microsoft ORM finally adds support for spatial types as well. In this post I'll describe basic geography features that deal with single location and distance calculations which is probably the most common usage scenario. SQL Server Transact-SQL Syntax for Spatial Data Before we look at how things work with Entity framework, lets take a look at how SQL Server allows you to use spatial data to get an understanding of the underlying semantics. The following SQL examples should work with SQL 2008 and forward. Let's start by creating a test table that includes a Geography field and also a pair of Long/Lat fields that demonstrate how you can work with the geography functions even if you don't have geography/geometry fields in the database. Here's the CREATE command:CREATE TABLE [dbo].[Geo]( [id] [int] IDENTITY(1,1) NOT NULL, [Location] [geography] NULL, [Long] [float] NOT NULL, [Lat] [float] NOT NULL ) Now using plain SQL you can insert data into the table using geography::STGeoFromText SQL CLR function:insert into Geo( Location , long, lat ) values ( geography::STGeomFromText ('POINT(-121.527200 45.712113)', 4326), -121.527200, 45.712113 ) insert into Geo( Location , long, lat ) values ( geography::STGeomFromText ('POINT(-121.517265 45.714240)', 4326), -121.517265, 45.714240 ) insert into Geo( Location , long, lat ) values ( geography::STGeomFromText ('POINT(-121.511536 45.714825)', 4326), -121.511536, 45.714825) The STGeomFromText function accepts a string that points to a geometric item (a point here but can also be a line or path or polygon and many others). You also need to provide an SRID (Spatial Reference System Identifier) which is an integer value that determines the rules for how geography/geometry values are calculated and returned. For mapping/distance functionality you typically want to use 4326 as this is the format used by most mapping software and geo-location libraries like Google and Bing. The spatial data in the Location field is stored in binary format which looks something like this: Once the location data is in the database you can query the data and do simple distance computations very easily. For example to calculate the distance of each of the values in the database to another spatial point is very easy to calculate. Distance calculations compare two points in space using a direct line calculation. For our example I'll compare a new point to all the points in the database. Using the Location field the SQL looks like this:-- create a source point DECLARE @s geography SET @s = geography:: STGeomFromText('POINT(-121.527200 45.712113)' , 4326); --- return the ids select ID, Location as Geo , Location .ToString() as Point , @s.STDistance( Location) as distance from Geo order by distance The code defines a new point which is the base point to compare each of the values to. You can also compare values from the database directly, but typically you'll want to match a location to another location and determine the difference for which you can use the geography::STDistance function. This query produces the following output: The STDistance function returns the straight line distance between the passed in point and the point in the database field. The result for SRID 4326 is always in meters. Notice that the first value passed was the same point so the difference is 0. The other two points are two points here in town in Hood River a little ways away - 808 and 1256 meters respectively. Notice also that you can order the result by the resulting distance, which effectively gives you results that are ordered radially out from closer to further away. This is great for searches of points of interest near a central location (YOU typically!). These geolocation functions are also available to you if you don't use the Geography/Geometry types, but plain float values. It's a little more work, as each point has to be created in the query using the string syntax, but the following code doesn't use a geography field but produces the same result as the previous query.--- using float fields select ID, geography::STGeomFromText ('POINT(' + STR (long, 15,7 ) + ' ' + Str(lat ,15, 7) + ')' , 4326), geography::STGeomFromText ('POINT(' + STR (long, 15,7 ) + ' ' + Str(lat ,15, 7) + ')' , 4326). ToString(), @s.STDistance( geography::STGeomFromText ('POINT(' + STR(long ,15, 7) + ' ' + Str(lat ,15, 7) + ')' , 4326)) as distance from geo order by distance Spatial Data in the Entity Framework Prior to Entity Framework 5.0 on .NET 4.5 consuming of the data above required using stored procedures or raw SQL commands to access the spatial data. In Entity Framework 5 however, Microsoft introduced the new DbGeometry and DbGeography types. These immutable location types provide a bunch of functionality for manipulating spatial points using geometry functions which in turn can be used to do common spatial queries like I described in the SQL syntax above. The DbGeography/DbGeometry types are immutable, meaning that you can't write to them once they've been created. They are a bit odd in that you need to use factory methods in order to instantiate them - they have no constructor() and you can't assign to properties like Latitude and Longitude. Creating a Model with Spatial Data Let's start by creating a simple Entity Framework model that includes a Location property of type DbGeography: public class GeoLocationContext : DbContext { public DbSet<GeoLocation> Locations { get; set; } } public class GeoLocation { public int Id { get; set; } public DbGeography Location { get; set; } public string Address { get; set; } } That's all there's to it. When you run this now against SQL Server, you get a Geography field for the Location property, which looks the same as the Location field in the SQL examples earlier. Adding Spatial Data to the Database Next let's add some data to the table that includes some latitude and longitude data. An easy way to find lat/long locations is to use Google Maps to pinpoint your location, then right click and click on What's Here. Click on the green marker to get the GPS coordinates. To add the actual geolocation data create an instance of the GeoLocation type and use the DbGeography.PointFromText() factory method to create a new point to assign to the Location property:[TestMethod] public void AddLocationsToDataBase() { var context = new GeoLocationContext(); // remove all context.Locations.ToList().ForEach( loc => context.Locations.Remove(loc)); context.SaveChanges(); var location = new GeoLocation() { // Create a point using native DbGeography Factory method Location = DbGeography.PointFromText( string.Format("POINT({0} {1})", -121.527200,45.712113) ,4326), Address = "301 15th Street, Hood River" }; context.Locations.Add(location); location = new GeoLocation() { Location = CreatePoint(45.714240, -121.517265), Address = "The Hatchery, Bingen" }; context.Locations.Add(location); location = new GeoLocation() { // Create a point using a helper function (lat/long) Location = CreatePoint(45.708457, -121.514432), Address = "Kaze Sushi, Hood River" }; context.Locations.Add(location); location = new GeoLocation() { Location = CreatePoint(45.722780, -120.209227), Address = "Arlington, OR" }; context.Locations.Add(location); context.SaveChanges(); } As promised, a DbGeography object has to be created with one of the static factory methods provided on the type as the Location.Longitude and Location.Latitude properties are read only. Here I'm using PointFromText() which uses a "Well Known Text" format to specify spatial data. In the first example I'm specifying to create a Point from a longitude and latitude value, using an SRID of 4326 (just like earlier in the SQL examples). You'll probably want to create a helper method to make the creation of Points easier to avoid that string format and instead just pass in a couple of double values. Here's my helper called CreatePoint that's used for all but the first point creation in the sample above:public static DbGeography CreatePoint(double latitude, double longitude) { var text = string.Format(CultureInfo.InvariantCulture.NumberFormat, "POINT({0} {1})", longitude, latitude); // 4326 is most common coordinate system used by GPS/Maps return DbGeography.PointFromText(text, 4326); } Using the helper the syntax becomes a bit cleaner, requiring only a latitude and longitude respectively. Note that my method intentionally swaps the parameters around because Latitude and Longitude is the common format I've seen with mapping libraries (especially Google Mapping/Geolocation APIs with their LatLng type). When the context is changed the data is written into the database using the SQL Geography type which looks the same as in the earlier SQL examples shown. Querying Once you have some location data in the database it's now super easy to query the data and find out the distance between locations. A common query is to ask for a number of locations that are near a fixed point - typically your current location and order it by distance. Using LINQ to Entities a query like this is easy to construct:[TestMethod] public void QueryLocationsTest() { var sourcePoint = CreatePoint(45.712113, -121.527200); var context = new GeoLocationContext(); // find any locations within 5 kilometers ordered by distance var matches = context.Locations .Where(loc => loc.Location.Distance(sourcePoint) < 5000) .OrderBy( loc=> loc.Location.Distance(sourcePoint) ) .Select( loc=> new { Address = loc.Address, Distance = loc.Location.Distance(sourcePoint) }); Assert.IsTrue(matches.Count() > 0); foreach (var location in matches) { Console.WriteLine("{0} ({1:n0} meters)", location.Address, location.Distance); } } This example produces: 301 15th Street, Hood River (0 meters)The Hatchery, Bingen (809 meters)Kaze Sushi, Hood River (1,074 meters) The first point in the database is the same as my source point I'm comparing against so the distance is 0. The other two are within the 5 mile radius, while the Arlington location which is 65 miles or so out is not returned. The result is ordered by distance from closest to furthest away. In the code, I first create a source point that is the basis for comparison. The LINQ query then selects all locations that are within 5km of the source point using the Location.Distance() function, which takes a source point as a parameter. You can either use a pre-defined value as I'm doing here, or compare against another database DbGeography property (say when you have to points in the same database for things like routes). What's nice about this query syntax is that it's very clean and easy to read and understand. You can calculate the distance and also easily order by the distance to provide a result that shows locations from closest to furthest away which is a common scenario for any application that places a user in the context of several locations. It's now super easy to accomplish this. Meters vs. Miles As with the SQL Server functions, the Distance() method returns data in meters, so if you need to work with miles or feet you need to do some conversion. Here are a couple of helpers that might be useful (can be found in GeoUtils.cs of the sample project):/// <summary> /// Convert meters to miles /// </summary> /// <param name="meters"></param> /// <returns></returns> public static double MetersToMiles(double? meters) { if (meters == null) return 0F; return meters.Value * 0.000621371192; } /// <summary> /// Convert miles to meters /// </summary> /// <param name="miles"></param> /// <returns></returns> public static double MilesToMeters(double? miles) { if (miles == null) return 0; return miles.Value * 1609.344; } Using these two helpers you can query on miles like this:[TestMethod] public void QueryLocationsMilesTest() { var sourcePoint = CreatePoint(45.712113, -121.527200); var context = new GeoLocationContext(); // find any locations within 5 miles ordered by distance var fiveMiles = GeoUtils.MilesToMeters(5); var matches = context.Locations .Where(loc => loc.Location.Distance(sourcePoint) <= fiveMiles) .OrderBy(loc => loc.Location.Distance(sourcePoint)) .Select(loc => new { Address = loc.Address, Distance = loc.Location.Distance(sourcePoint) }); Assert.IsTrue(matches.Count() > 0); foreach (var location in matches) { Console.WriteLine("{0} ({1:n1} miles)", location.Address, GeoUtils.MetersToMiles(location.Distance)); } } which produces: 301 15th Street, Hood River (0.0 miles)The Hatchery, Bingen (0.5 miles)Kaze Sushi, Hood River (0.7 miles) Nice 'n simple. .NET 4.5 Only Note that DbGeography and DbGeometry are exclusive to Entity Framework 5.0 (not 4.4 which ships in the same NuGet package or installer) and requires .NET 4.5. That's because the new DbGeometry and DbGeography (and related) types are defined in the 4.5 version of System.Data.Entity which is a CLR assembly and is only updated by major versions of .NET. Why this decision was made to add these types to System.Data.Entity rather than to the frequently updated EntityFramework assembly that would have possibly made this work in .NET 4.0 is beyond me, especially given that there are no native .NET framework spatial types to begin with. I find it also odd that there is no native CLR spatial type. The DbGeography and DbGeometry types are specific to Entity Framework and live on those assemblies. They will also work for general purpose, non-database spatial data manipulation, but then you are forced into having a dependency on System.Data.Entity, which seems a bit silly. There's also a System.Spatial assembly that's apparently part of WCF Data Services which in turn don't work with Entity framework. Another example of multiple teams at Microsoft not communicating and implementing the same functionality (differently) in several different places. Perplexed as a I may be, for EF specific code the Entity framework specific types are easy to use and work well. Working with pre-.NET 4.5 Entity Framework and Spatial Data If you can't go to .NET 4.5 just yet you can also still use spatial features in Entity Framework, but it's a lot more work as you can't use the DbContext directly to manipulate the location data. You can still run raw SQL statements to write data into the database and retrieve results using the same TSQL syntax I showed earlier using Context.Database.ExecuteSqlCommand(). Here's code that you can use to add location data into the database:[TestMethod] public void RawSqlEfAddTest() { string sqlFormat = @"insert into GeoLocations( Location, Address) values ( geography::STGeomFromText('POINT({0} {1})', 4326),@p0 )"; var sql = string.Format(sqlFormat,-121.527200, 45.712113); Console.WriteLine(sql); var context = new GeoLocationContext(); Assert.IsTrue(context.Database.ExecuteSqlCommand(sql,"301 N. 15th Street") > 0); } Here I'm using the STGeomFromText() function to add the location data. Note that I'm using string.Format here, which usually would be a bad practice but is required here. I was unable to use ExecuteSqlCommand() and its named parameter syntax as the longitude and latitude parameters are embedded into a string. Rest assured it's required as the following does not work:string sqlFormat = @"insert into GeoLocations( Location, Address) values ( geography::STGeomFromText('POINT(@p0 @p1)', 4326),@p2 )";context.Database.ExecuteSqlCommand(sql, -121.527200, 45.712113, "301 N. 15th Street") Explicitly assigning the point value with string.format works however. There are a number of ways to query location data. You can't get the location data directly, but you can retrieve the point string (which can then be parsed to get Latitude and Longitude) and you can return calculated values like distance. Here's an example of how to retrieve some geo data into a resultset using EF's and SqlQuery method:[TestMethod] public void RawSqlEfQueryTest() { var sqlFormat = @" DECLARE @s geography SET @s = geography:: STGeomFromText('POINT({0} {1})' , 4326); SELECT Address, Location.ToString() as GeoString, @s.STDistance( Location) as Distance FROM GeoLocations ORDER BY Distance"; var sql = string.Format(sqlFormat, -121.527200, 45.712113); var context = new GeoLocationContext(); var locations = context.Database.SqlQuery<ResultData>(sql); Assert.IsTrue(locations.Count() > 0); foreach (var location in locations) { Console.WriteLine(location.Address + " " + location.GeoString + " " + location.Distance); } } public class ResultData { public string GeoString { get; set; } public double Distance { get; set; } public string Address { get; set; } } Hopefully you don't have to resort to this approach as it's fairly limited. Using the new DbGeography/DbGeometry types makes this sort of thing so much easier. When I had to use code like this before I typically ended up retrieving data pks only and then running another query with just the PKs to retrieve the actual underlying DbContext entities. This was very inefficient and tedious but it did work. Summary For the current project I'm working on we actually made the switch to .NET 4.5 purely for the spatial features in EF 5.0. This app heavily relies on spatial queries and it was worth taking a chance with pre-release code to get this ease of integration as opposed to manually falling back to stored procedures or raw SQL string queries to return spatial specific queries. Using native Entity Framework code makes life a lot easier than the alternatives. It might be a late addition to Entity Framework, but it sure makes location calculations and storage easy. Where do you want to go today? ;-) Resources Download Sample Project© Rick Strahl, West Wind Technologies, 2005-2012Posted in ADO.NET Sql Server .NET Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article
How to find and fix performance problems in ORM powered applications

- by FransBouma

Once in a while we get requests about how to fix performance problems with our framework. As it comes down to following the same steps and looking into the same things every single time, I decided to write a blogpost about it instead, so more people can learn from this and solve performance problems in their O/R mapper powered applications. In some parts it's focused on LLBLGen Pro but it's also usable for other O/R mapping frameworks, as the vast majority of performance problems in O/R mapper powered applications are not specific for a certain O/R mapper framework. Too often, the developer looks at the wrong part of the application, trying to fix what isn't a problem in that part, and getting frustrated that 'things are so slow with <insert your favorite framework X here>'. I'm in the O/R mapper business for a long time now (almost 10 years, full time) and as it's a small world, we O/R mapper developers know almost all tricks to pull off by now: we all know what to do to make task ABC faster and what compromises (because there are almost always compromises) to deal with if we decide to make ABC faster that way. Some O/R mapper frameworks are faster in X, others in Y, but you can be sure the difference is mainly a result of a compromise some developers are willing to deal with and others aren't. That's why the O/R mapper frameworks on the market today are different in many ways, even though they all fetch and save entities from and to a database. I'm not suggesting there's no room for improvement in today's O/R mapper frameworks, there always is, but it's not a matter of 'the slowness of the application is caused by the O/R mapper' anymore. Perhaps query generation can be optimized a bit here, row materialization can be optimized a bit there, but it's mainly coming down to milliseconds. Still worth it if you're a framework developer, but it's not much compared to the time spend inside databases and in user code: if a complete fetch takes 40ms or 50ms (from call to entity object collection), it won't make a difference for your application as that 10ms difference won't be noticed. That's why it's very important to find the real locations of the problems so developers can fix them properly and don't get frustrated because their quest to get a fast, performing application failed. Performance tuning basics and rules Finding and fixing performance problems in any application is a strict procedure with four prescribed steps: isolate, analyze, interpret and fix, in that order. It's key that you don't skip a step nor make assumptions: these steps help you find the reason of a problem which seems to be there, and how to fix it or leave it as-is. Skipping a step, or when you assume things will be bad/slow without doing analysis will lead to the path of premature optimization and won't actually solve your problems, only create new ones. The most important rule of finding and fixing performance problems in software is that you have to understand what 'performance problem' actually means. Most developers will say "when a piece of software / code is slow, you have a performance problem". But is that actually the case? If I write a Linq query which will aggregate, group and sort 5 million rows from several tables to produce a resultset of 10 rows, it might take more than a couple of milliseconds before that resultset is ready to be consumed by other logic. If I solely look at the Linq query, the code consuming the resultset of the 10 rows and then look at the time it takes to complete the whole procedure, it will appear to me to be slow: all that time taken to produce and consume 10 rows? But if you look closer, if you analyze and interpret the situation, you'll see it does a tremendous amount of work, and in that light it might even be extremely fast. With every performance problem you encounter, always do realize that what you're trying to solve is perhaps not a technical problem at all, but a perception problem. The second most important rule you have to understand is based on the old saying "Penny wise, Pound Foolish": the part which takes e.g. 5% of the total time T for a given task isn't worth optimizing if you have another part which takes a much larger part of the total time T for that same given task. Optimizing parts which are relatively insignificant for the total time taken is not going to bring you better results overall, even if you totally optimize that part away. This is the core reason why analysis of the complete set of application parts which participate in a given task is key to being successful in solving performance problems: No analysis -> no problem -> no solution. One warning up front: hunting for performance will always include making compromises. Fast software can be made maintainable, but if you want to squeeze as much performance out of your software, you will inevitably be faced with the dilemma of compromising one or more from the group {readability, maintainability, features} for the extra performance you think you'll gain. It's then up to you to decide whether it's worth it. In almost all cases it's not. The reason for this is simple: the vast majority of performance problems can be solved by implementing the proper algorithms, the ones with proven Big O-characteristics so you know the performance you'll get plus you know the algorithm will work. The time taken by the algorithm implementing code is inevitable: you already implemented the best algorithm. You might find some optimizations on the technical level but in general these are minor. Let's look at the four steps to see how they guide us through the quest to find and fix performance problems. Isolate The first thing you need to do is to isolate the areas in your application which are assumed to be slow. For example, if your application is a web application and a given page is taking several seconds or even minutes to load, it's a good candidate to check out. It's important to start with the isolate step because it allows you to focus on a single code path per area with a clear begin and end and ignore the rest. The rest of the steps are taken per identified problematic area. Keep in mind that isolation focuses on tasks in an application, not code snippets. A task is something that's started in your application by either another task or the user, or another program, and has a beginning and an end. You can see a task as a piece of functionality offered by your application. Analyze Once you've determined the problem areas, you have to perform analysis on the code paths of each area, to see where the performance problems occur and which areas are not the problem. This is a multi-layered effort: an application which uses an O/R mapper typically consists of multiple parts: there's likely some kind of interface (web, webservice, windows etc.), a part which controls the interface and business logic, the O/R mapper part and the RDBMS, all connected with either a network or inter-process connections provided by the OS or other means. Each of these parts, including the connectivity plumbing, eat up a part of the total time it takes to complete a task, e.g. load a webpage with all orders of a given customer X. To understand which parts participate in the task / area we're investigating and how much they contribute to the total time taken to complete the task, analysis of each participating task is essential. Start with the code you wrote which starts the task, analyze the code and track the path it follows through your application. What does the code do along the way, verify whether it's correct or not. Analyze whether you have implemented the right algorithms in your code for this particular area. Remember we're looking at one area at a time, which means we're ignoring all other code paths, just the code path of the current problematic area, from begin to end and back. Don't dig in and start optimizing at the code level just yet. We're just analyzing. If your analysis reveals big architectural stupidity, it's perhaps a good idea to rethink the architecture at this point. For the rest, we're analyzing which means we collect data about what could be wrong, for each participating part of the complete application. Reviewing the code you wrote is a good tool to get deeper understanding of what is going on for a given task but ultimately it lacks precision and overview what really happens: humans aren't good code interpreters, computers are. We therefore need to utilize tools to get deeper understanding about which parts contribute how much time to the total task, triggered by which other parts and for example how many times are they called. There are two different kind of tools which are necessary: .NET profilers and O/R mapper / RDBMS profilers. .NET profiling .NET profilers (e.g. dotTrace by JetBrains or Ants by Red Gate software) show exactly which pieces of code are called, how many times they're called, and the time it took to run that piece of code, at the method level and sometimes even at the line level. The .NET profilers are essential tools for understanding whether the time taken to complete a given task / area in your application is consumed by .NET code, where exactly in your code, the path to that code, how many times that code was called by other code and thus reveals where hotspots are located: the areas where a solution can be found. Importantly, they also reveal which areas can be left alone: remember our penny wise pound foolish saying: if a profiler reveals that a group of methods are fast, or don't contribute much to the total time taken for a given task, ignore them. Even if the code in them is perhaps complex and looks like a candidate for optimization: you can work all day on that, it won't matter. As we're focusing on a single area of the application, it's best to start profiling right before you actually activate the task/area. Most .NET profilers support this by starting the application without starting the profiling procedure just yet. You navigate to the particular part which is slow, start profiling in the profiler, in your application you perform the actions which are considered slow, and afterwards you get a snapshot in the profiler. The snapshot contains the data collected by the profiler during the slow action, so most data is produced by code in the area to investigate. This is important, because it allows you to stay focused on a single area. O/R mapper and RDBMS profiling .NET profilers give you a good insight in the .NET side of things, but not in the RDBMS side of the application. As this article is about O/R mapper powered applications, we're also looking at databases, and the software making it possible to consume the database in your application: the O/R mapper. To understand which parts of the O/R mapper and database participate how much to the total time taken for task T, we need different tools. There are two kind of tools focusing on O/R mappers and database performance profiling: O/R mapper profilers and RDBMS profilers. For O/R mapper profilers, you can look at LLBLGen Prof by hibernating rhinos or the Linq to Sql/LLBLGen Pro profiler by Huagati. Hibernating rhinos also have profilers for other O/R mappers like NHibernate (NHProf) and Entity Framework (EFProf) and work the same as LLBLGen Prof. For RDBMS profilers, you have to look whether the RDBMS vendor has a profiler. For example for SQL Server, the profiler is shipped with SQL Server, for Oracle it's build into the RDBMS, however there are also 3rd party tools. Which tool you're using isn't really important, what's important is that you get insight in which queries are executed during the task / area we're currently focused on and how long they took. Here, the O/R mapper profilers have an advantage as they collect the time it took to execute the query from the application's perspective so they also collect the time it took to transport data across the network. This is important because a query which returns a massive resultset or a resultset with large blob/clob/ntext/image fields takes more time to get transported across the network than a small resultset and a database profiler doesn't take this into account most of the time. Another tool to use in this case, which is more low level and not all O/R mappers support it (though LLBLGen Pro and NHibernate as well do) is tracing: most O/R mappers offer some form of tracing or logging system which you can use to collect the SQL generated and executed and often also other activity behind the scenes. While tracing can produce a tremendous amount of data in some cases, it also gives insight in what's going on. Interpret After we've completed the analysis step it's time to look at the data we've collected. We've done code reviews to see whether we've done anything stupid and which parts actually take place and if the proper algorithms have been implemented. We've done .NET profiling to see which parts are choke points and how much time they contribute to the total time taken to complete the task we're investigating. We've performed O/R mapper profiling and RDBMS profiling to see which queries were executed during the task, how many queries were generated and executed and how long they took to complete, including network transportation. All this data reveals two things: which parts are big contributors to the total time taken and which parts are irrelevant. Both aspects are very important. The parts which are irrelevant (i.e. don't contribute significantly to the total time taken) can be ignored from now on, we won't look at them. The parts which contribute a lot to the total time taken are important to look at. We now have to first look at the .NET profiler results, to see whether the time taken is consumed in our own code, in .NET framework code, in the O/R mapper itself or somewhere else. For example if most of the time is consumed by DbCommand.ExecuteReader, the time it took to complete the task is depending on the time the data is fetched from the database. If there was just 1 query executed, according to tracing or O/R mapper profilers / RDBMS profilers, check whether that query is optimal, uses indexes or has to deal with a lot of data. Interpret means that you follow the path from begin to end through the data collected and determine where, along the path, the most time is contributed. It also means that you have to check whether this was expected or is totally unexpected. My previous example of the 10 row resultset of a query which groups millions of rows will likely reveal that a long time is spend inside the database and almost no time is spend in the .NET code, meaning the RDBMS part contributes the most to the total time taken, the rest is compared to that time, irrelevant. Considering the vastness of the source data set, it's expected this will take some time. However, does it need tweaking? Perhaps all possible tweaks are already in place. In the interpret step you then have to decide that further action in this area is necessary or not, based on what the analysis results show: if the analysis results were unexpected and in the area where the most time is contributed to the total time taken is room for improvement, action should be taken. If not, you can only accept the situation and move on. In all cases, document your decision together with the analysis you've done. If you decide that the perceived performance problem is actually expected due to the nature of the task performed, it's essential that in the future when someone else looks at the application and starts asking questions you can answer them properly and new analysis is only necessary if situations changed. Fix After interpreting the analysis results you've concluded that some areas need adjustment. This is the fix step: you're actively correcting the performance problem with proper action targeted at the real cause. In many cases related to O/R mapper powered applications it means you'll use different features of the O/R mapper to achieve the same goal, or apply optimizations at the RDBMS level. It could also mean you apply caching inside your application (compromise memory consumption over performance) to avoid unnecessary re-querying data and re-consuming the results. After applying a change, it's key you re-do the analysis and interpretation steps: compare the results and expectations with what you had before, to see whether your actions had any effect or whether it moved the problem to a different part of the application. Don't fall into the trap to do partly analysis: do the full analysis again: .NET profiling and O/R mapper / RDBMS profiling. It might very well be that the changes you've made make one part faster but another part significantly slower, in such a way that the overall problem hasn't changed at all. Performance tuning is dealing with compromises and making choices: to use one feature over the other, to accept a higher memory footprint, to go away from the strict-OO path and execute queries directly onto the RDBMS, these are choices and compromises which will cross your path if you want to fix performance problems with respect to O/R mappers or data-access and databases in general. In most cases it's not a big issue: alternatives are often good choices too and the compromises aren't that hard to deal with. What is important is that you document why you made a choice, a compromise: which analysis data, which interpretation led you to the choice made. This is key for good maintainability in the years to come. Most common performance problems with O/R mappers Below is an incomplete list of common performance problems related to data-access / O/R mappers / RDBMS code. It will help you with fixing the hotspots you found in the interpretation step. SELECT N+1: (Lazy-loading specific). Lazy loading triggered performance bottlenecks. Consider a list of Orders bound to a grid. You have a Field mapped onto a related field in Order, Customer.CompanyName. Showing this column in the grid will make the grid fetch (indirectly) for each row the Customer row. This means you'll get for the single list not 1 query (for the orders) but 1+(the number of orders shown) queries. To solve this: use eager loading using a prefetch path to fetch the customers with the orders. SELECT N+1 is easy to spot with an O/R mapper profiler or RDBMS profiler: if you see a lot of identical queries executed at once, you have this problem. Prefetch paths using many path nodes or sorting, or limiting. Eager loading problem. Prefetch paths can help with performance, but as 1 query is fetched per node, it can be the number of data fetched in a child node is bigger than you think. Also consider that data in every node is merged on the client within the parent. This is fast, but it also can take some time if you fetch massive amounts of entities. If you keep fetches small, you can use tuning parameters like the ParameterizedPrefetchPathThreshold setting to get more optimal queries. Deep inheritance hierarchies of type Target Per Entity/Type. If you use inheritance of type Target per Entity / Type (each type in the inheritance hierarchy is mapped onto its own table/view), fetches will join subtype- and supertype tables in many cases, which can lead to a lot of performance problems if the hierarchy has many types. With this problem, keep inheritance to a minimum if possible, or switch to a hierarchy of type Target Per Hierarchy, which means all entities in the inheritance hierarchy are mapped onto the same table/view. Of course this has its own set of drawbacks, but it's a compromise you might want to take. Fetching massive amounts of data by fetching large lists of entities. LLBLGen Pro supports paging (and limiting the # of rows returned), which is often key to process through large sets of data. Use paging on the RDBMS if possible (so a query is executed which returns only the rows in the page requested). When using paging in a web application, be sure that you switch server-side paging on on the datasourcecontrol used. In this case, paging on the grid alone is not enough: this can lead to fetching a lot of data which is then loaded into the grid and paged there. Keep note that analyzing queries for paging could lead to the false assumption that paging doesn't occur, e.g. when the query contains a field of type ntext/image/clob/blob and DISTINCT can't be applied while it should have (e.g. due to a join): the datareader will do DISTINCT filtering on the client. this is a little slower but it does perform paging functionality on the data-reader so it won't fetch all rows even if the query suggests it does. Fetch massive amounts of data because blob/clob/ntext/image fields aren't excluded. LLBLGen Pro supports field exclusion for queries. You can exclude fields (also in prefetch paths) per query to avoid fetching all fields of an entity, e.g. when you don't need them for the logic consuming the resultset. Excluding fields can greatly reduce the amount of time spend on data-transport across the network. Use this optimization if you see that there's a big difference between query execution time on the RDBMS and the time reported by the .NET profiler for the ExecuteReader method call. Doing client-side aggregates/scalar calculations by consuming a lot of data. If possible, try to formulate a scalar query or group by query using the projection system or GetScalar functionality of LLBLGen Pro to do data consumption on the RDBMS server. It's far more efficient to process data on the RDBMS server than to first load it all in memory, then traverse the data in-memory to calculate a value. Using .ToList() constructs inside linq queries. It might be you use .ToList() somewhere in a Linq query which makes the query be run partially in-memory. Example: var q = from c in metaData.Customers.ToList() where c.Country=="Norway" select c; This will actually fetch all customers in-memory and do an in-memory filtering, as the linq query is defined on an IEnumerable<T>, and not on the IQueryable<T>. Linq is nice, but it can often be a bit unclear where some parts of a Linq query might run. Fetching all entities to delete into memory first. To delete a set of entities it's rather inefficient to first fetch them all into memory and then delete them one by one. It's more efficient to execute a DELETE FROM ... WHERE query on the database directly to delete the entities in one go. LLBLGen Pro supports this feature, and so do some other O/R mappers. It's not always possible to do this operation in the context of an O/R mapper however: if an O/R mapper relies on a cache, these kind of operations are likely not supported because they make it impossible to track whether an entity is actually removed from the DB and thus can be removed from the cache. Fetching all entities to update with an expression into memory first. Similar to the previous point: it is more efficient to update a set of entities directly with a single UPDATE query using an expression instead of fetching the entities into memory first and then updating the entities in a loop, and afterwards saving them. It might however be a compromise you don't want to take as it is working around the idea of having an object graph in memory which is manipulated and instead makes the code fully aware there's a RDBMS somewhere. Conclusion Performance tuning is almost always about compromises and making choices. It's also about knowing where to look and how the systems in play behave and should behave. The four steps I provided should help you stay focused on the real problem and lead you towards the solution. Knowing how to optimally use the systems participating in your own code (.NET framework, O/R mapper, RDBMS, network/services) is key for success as well as knowing what's going on inside the application you built. I hope you'll find this guide useful in tracking down performance problems and dealing with them in a useful way.

Read the article
eBooks on iPad vs. Kindle: More Debate than Smackdown

- by andrewbrust

When the iPad was presented at its San Francisco launch event on January 28th, Steve Jobs spent a significant amount of time explaining how well the device would serve as an eBook reader. He showed the iBooks reader application and iBookstore and laid down the gauntlet before Amazon and its beloved Kindle device. Almost immediately afterwards, criticism came rushing forth that the iPad could never beat the Kindle for book reading. The curious part of that criticism is that virtually no one offering it had actually used the iPad yet. A few weeks later, on April 3rd, the iPad was released for sale in the United States. I bought one on that day and in the few additional weeks that have elapsed, I’ve given quite a workout to most of its capabilities, including its eBook features. I’ve also spent some time with the Kindle, albeit a first-generation model, to see how it actually compares to the iPad. I had some expectations going in, but I came away with conclusions about each device that were more scenario-based than absolute. I present my findings to you here. Vital Statistics Let’s start with an inventory of each device’s underlying technology. The iPad has a color, backlit LCD screen and an on-screen keyboard. It has a battery which, on a full charge, lasts anywhere from 6-10 hours. The Kindle offers a monochrome, reflective E Ink display, a physical keyboard and a battery that on my first gen loaner unit can go up to a week between charges (Amazon claims the battery on the Kindle 2 can last up to 2 weeks on a single charge). The Kindle connects to Amazon’s Kindle Store using a 3G modem (the technology and network vary depending on the model) that incurs no airtime service charges whatsoever. The iPad units that are on-sale today work over WiFi only. 3G-equipped models will be on sale shortly and will command a $130 premium over their WiFi-only counterparts. 3G service on the iPad, in the U.S. from AT&T, will be fee-based, with a 250MB plan at $14.99 per month and an unlimited plan at $29.99. No contract is required for 3G service. All these tech specs aside, I think a more useful observation is that the iPad is a multi-purpose Internet-connected entertainment device, while the Kindle is a dedicated reading device. The question is whether those differences in design and intended use create a clear-cut winner for reading electronic publications. Let’s take a look at each device, in isolation, now. Kindle To me, what’s most innovative about the Kindle is its E Ink display. E Ink really looks like ink on a sheet of paper. It requires no backlight, it’s fully visible in direct sunlight and it causes almost none of the eyestrain that LCD-based computer display technology (like that used on the iPad) does. It’s really versatile in an all-around way. Forgive me if this sounds precious, but reading on it is really a joy. In fact, it’s a genuinely relaxing experience. Through the Kindle Store, Amazon allows users to download books (including audio books), magazines, newspapers and blog feeds. Books and magazines can be purchased either on a single-issue basis or as an annual subscription. Books, of course, are purchased singly. Oddly, blogs are not free, but instead carry a monthly subscription fee, typically $1.99. To me this is ludicrous, but I suppose the free 3G service is partially to blame. Books and magazine issues download quickly. Magazine and blog subscriptions cause new issues or posts to be pushed to your device on an automated basis. Available blogs include 9000-odd feeds that Amazon offers on the Kindle Store; unless I missed something, arbitrary RSS feeds are not supported (though there are third party workarounds to this limitation). The shopping experience is integrated well, has an huge selection, and offers certain graphical perks. For example, magazine and newspaper logos are displayed in menus, and book cover thumbnails appear as well. A simple search mechanism is provided and text entry through the physical keyboard is relatively painless. It’s very easy and straightforward to enter the store, find something you like and start reading it quickly. If you know what you’re looking for, it’s even faster. Given Kindle’s high portability, very reliable battery, instant-on capability and highly integrated content acquisition, it makes reading on whim, and in random spurts of downtime, very attractive. The Kindle’s home screen lists all of your publications, and easily lets you select one, then start reading it. Once opened, publications display in crisp, attractive text that is adjustable in size. “Turning” pages is achieved through buttons dedicated to the task. Notes can be recorded, bookmarks can be saved and pages can be saved as clippings. I am not an avid book reader, and yet I found the Kindle made it really fun, convenient and soothing to read. There’s something about the easy access to the material and the simplicity of the display that makes the Kindle seduce you into chilling out and reading page after page. On the other hand, the Kindle has an awkward navigation interface. While menus are displayed clearly on the screen, the method of selecting menu items is tricky: alongside the right-hand edge of the main display is a thin column that acts as a second display. It has a white background, and a scrollable silver cursor that is moved up or down through the use of the device’s scrollwheel. Picking a menu item on the main display involves scrolling the silver cursor to a position parallel to that menu item and pushing the scrollwheel in. This navigation technique creates a disconnect, literally. You don’t really click on a selection so much as you gesture toward it. I got used to this technique quickly, but I didn’t love it. It definitely created a kind of anxiety in me, making me feel the need to speed through menus and get to my destination document quickly. Once there, I could calm down and relax. Books are great on the Kindle. Magazines and newspapers much less so. I found the rendering of photographs, and even illustrations, to be unacceptably crude. For this reason, I expect that reading textbooks on the Kindle may leave students wanting. I found that the original flow and layout of any publication was sacrificed on the Kindle. In effect, browsing a magazine or newspaper was almost impossible. Reading the text of individual articles was enjoyable, but having to read this way made the whole experience much more “a la carte” than cohesive and thematic between articles. I imagine that for academic journals this is ideal, but for consumer publications it imposes a stripped-down, low-fidelity experience that evokes a sense of deprivation. In general, the Kindle is great for reading text. For just about anything else, especially activity that involves exploratory browsing, meandering and short-attention-span reading, it presents a real barrier to entry and adoption. Avid book readers will enjoy the Kindle (if they’re not already). It’s a great device for losing oneself in a book over long sittings. Multitaskers who are more interested in periodicals, be they online or off, will like it much less, as they will find compromise, and even sacrifice, to be palpable. iPad The iPad is a very different device from the Kindle. While the Kindle is oriented to pages of text, the iPad orbits around applications and their interfaces. Be it the pinch and zoom experience in the browser, the rich media features that augment content on news and weather sites, or the ability to interact with social networking services like Twitter, the iPad is versatile. While it shares a slate-like form factor with the Kindle, it’s effectively an elegant personal computer. One of its many features is the iBook application and integration of the iBookstore. But it’s a multi-purpose device. That turns out to be good and bad, depending on what you’re reading. The iBookstore is great for browsing. It’s color, rich animation-laden user interface make it possible to shop for books, rather than merely search and acquire them. Unfortunately, its selection is rather sparse at the moment. If you’re looking for a New York Times bestseller, or other popular titles, you should be OK. If you want to read something more specialized, it’s much harder. Unlike the awkward navigation interface of the Kindle, the iPad offers a nearly flawless touch-screen interface that seduces the user into tinkering and kibitzing every bit as much as the Kindle lulls you into a deep, concentrated read. It’s a dynamic and interactive device, whereas the Kindle is static and passive. The iBook reader is slick and fun. Use the iPad in landscape mode and you can read the book in 2-up (left/right 2-page) display; use it in portrait mode and you can read one page at a time. Rather than clicking a hardware button to turn pages, you simply drag and wipe from right-to-left to flip the single or right-hand page. The page actually travels through an animated path as it would in a physical book. The intuitiveness of the interface is uncanny. The reader also accommodates saving of bookmarks, searching of the text, and the ability to highlight a word and look it up in a dictionary. Pages display brightly and clearly. They’re easy to read. But the backlight and the glare made me less comfortable than I was with the Kindle. The knowledge that completely different applications (including the Web and email and Twitter) were just a few taps away made me antsy and very tempted to task-switch. The knowledge that battery life is an issue created subtle discomfort. If the Kindle makes you feel like you’re in a library reading room, then the iPad makes you feel, at best, like you’re under fluorescent lights at a Barnes and Noble or Borders store. If you’re lucky, you’d be on a couch or at a reading table in the store, but you might also be standing up, in the aisles. Clearly, I didn’t find this conducive to focused and sustained reading. But that may have more to do with my own tendency to read periodicals far more than books, and my neurotic . And, truth be known, the book reading experience, when not explicitly compared to Kindle’s, was still pleasant. It is also important to point out that Kindle Store-sourced books can be read on the iPad through a Kindle reader application, from Amazon, specific to the device. This offered a less rich experience than the iBooks reader, but it was completely adequate. Despite the Kindle brand of the reader, however, it offered little in terms of simulating the reading experience on its namesake device. When it comes to periodicals, the iPad wins hands down. Magazines, even if merely scanned images of their print editions, read on the iPad in a way that felt similar to reading hard copy. The full color display, touch navigation and even the ability to render advertisements in their full glory makes the iPad a great way to read through any piece of work that is measured in pages, rather than chapters. There are many ways to get magazines and newspapers onto the iPad, including the Zinio reader, and publication-specific applications like the Wall Street Journal’s and Popular Science’s. The New York Times’ free Editors’ Choice application offers a Times Reader-like interface to a subset of the Gray Lady’s daily content. The completely Web-based but iPad-optimized Times Skimmer site (at www.nytimes.com/timesskimmer) works well too. Even conventional Web sites themselves can be read much like magazines, given the iPad’s ability to zoom in on the text and crop out advertisements on the margins. While the Kindle does have an experimental Web browser, it reminded me a lot of early mobile phone browsers, only in a larger size. For text-heavy sites with simple layout, it works fine. For just about anything else, it becomes more trouble than it’s worth. And given the way magazine articles make me think of things I want to look up online, I think that’s a real liability for the Kindle. Summing Up What I came to realize is that the Kindle isn’t so much a computer or even an Internet device as it is a printer. While it doesn’t use physical paper, it still renders its content a page at a time, just like a laser printer does, and its output appears strikingly similar. You can read the rendered text, but you can’t interact with it in any way. That’s why the navigation requires a separate cursor display area. And because of the page-oriented rendering behavior, turning pages causes a flash on the display and requires a sometimes long pause before the next page is rendered. The good side of this is that once the page is generated, no battery power is required to display it. That makes for great battery life, optimal viewing under most lighting conditions (as long as there is some light) and low-eyestrain text-centric display of content. The Kindle is highly portable, has an excellent selection in its store and is refreshingly distraction-free. All of this is ideal for reading books. And iPad doesn’t offer any of it. What iPad does offer is versatility, variety, richness and luxury. It’s flush with accoutrements even if it’s low on focused, sustained text display. That makes it inferior to the Kindle for book reading. But that also makes it better than the Kindle for almost everything else. As such, and given that its book reading experience is still decent (even if not superior), I think the iPad will give Kindle a run for its money. True book lovers, and people on a budget, will want the Kindle. People with a robust amount of discretionary income may want both devices. Everyone else who is interested in a slate form factor e-reading device, especially if they also wish to have leisure-friendly Internet access, will likely choose the iPad exclusively. One thing is for sure: iPad has reduced Kindle’s market, and may have shifted its mass market potential to a mere niche play. If Amazon is smart, it will improve its iPad-based Kindle reader app significantly. It can then leverage the iPad channel as a significant market for the Kindle Store. After all, selling the eBooks themselves is what Amazon should care most about.

Read the article
Please bear with me, can someone analyze this trace route please

- by Abdulla

Hello, my name is Abdulla and I'm from Kuwait. Sorry for my question as I know its not technically challenging. I'm facing some problems with my internet connection while gaming, I have DSL 2mb connection. My main problem is latency, in the morning its good but after that its gets really bad. My internet provider says there's nothing wrong and that everything is working perfectly. I tried to explain to them the latency issue but they say that as long as I'm getting the download speed there isn't anything I can do about it. I only want to know if this is true and that the company can't do anything before I change my internet provider, as I feel that the guys at the contact center might getting back to me without asking tech support. Below are 2 traces I made, one in the morning and the other in the afternoon: This was taken around 17:00 Microsoft Windows XP [Version 5.1.2600] (C) Copyright 1985-2001 Microsoft Corp. C:\Documents and Settings\Administrator>ping google.com Pinging google.com [66.102.9.104] with 32 bytes of data: Reply from 66.102.9.104: bytes=32 time=387ms TTL=49 Reply from 66.102.9.104: bytes=32 time=388ms TTL=49 Reply from 66.102.9.104: bytes=32 time=375ms TTL=49 Reply from 66.102.9.104: bytes=32 time=375ms TTL=49 Ping statistics for 66.102.9.104: Packets: Sent = 4, Received = 4, Lost = 0 (0% loss), Approximate round trip times in milli-seconds: Minimum = 375ms, Maximum = 388ms, Average = 381ms C:\Documents and Settings\Administrator>ping google.com /t Pinging google.com [66.102.9.104] with 32 bytes of data: Reply from 66.102.9.104: bytes=32 time=376ms TTL=49 Reply from 66.102.9.104: bytes=32 time=382ms TTL=49 Reply from 66.102.9.104: bytes=32 time=371ms TTL=49 Reply from 66.102.9.104: bytes=32 time=378ms TTL=49 Reply from 66.102.9.104: bytes=32 time=374ms TTL=49 Reply from 66.102.9.104: bytes=32 time=371ms TTL=49 Reply from 66.102.9.104: bytes=32 time=365ms TTL=49 Reply from 66.102.9.104: bytes=32 time=366ms TTL=49 Reply from 66.102.9.104: bytes=32 time=353ms TTL=49 Reply from 66.102.9.104: bytes=32 time=331ms TTL=49 Reply from 66.102.9.104: bytes=32 time=333ms TTL=49 Reply from 66.102.9.104: bytes=32 time=348ms TTL=49 Reply from 66.102.9.104: bytes=32 time=365ms TTL=49 Reply from 66.102.9.104: bytes=32 time=346ms TTL=49 Reply from 66.102.9.104: bytes=32 time=335ms TTL=49 Reply from 66.102.9.104: bytes=32 time=340ms TTL=49 Reply from 66.102.9.104: bytes=32 time=344ms TTL=49 Reply from 66.102.9.104: bytes=32 time=333ms TTL=49 Reply from 66.102.9.104: bytes=32 time=328ms TTL=49 Reply from 66.102.9.104: bytes=32 time=332ms TTL=49 Reply from 66.102.9.104: bytes=32 time=326ms TTL=49 Reply from 66.102.9.104: bytes=32 time=333ms TTL=49 Reply from 66.102.9.104: bytes=32 time=325ms TTL=49 Reply from 66.102.9.104: bytes=32 time=333ms TTL=49 Reply from 66.102.9.104: bytes=32 time=338ms TTL=49 Reply from 66.102.9.104: bytes=32 time=341ms TTL=49 Ping statistics for 66.102.9.104: Packets: Sent = 26, Received = 26, Lost = 0 (0% loss), Approximate round trip times in milli-seconds: Minimum = 325ms, Maximum = 382ms, Average = 348ms Control-C ^C C:\Documents and Settings\Administrator>travert google.com 'travert' is not recognized as an internal or external command, operable program or batch file. C:\Documents and Settings\Administrator>tracert google.com Tracing route to google.com [66.102.9.104] over a maximum of 30 hops: 1 <1 ms <1 ms <1 ms 192.168.0.1 2 6 ms 6 ms 6 ms 80-184-31-1.adsl.kems.net [80.184.31.1] 3 7 ms 7 ms 8 ms 168.187.0.226 4 7 ms 8 ms 9 ms 168.187.0.125 5 180 ms 187 ms 188 ms if-11-2.core1.RSD-Riyad.as6453.net [116.0.78.89] 6 209 ms 222 ms 204 ms 195.219.167.57 7 541 ms 536 ms 540 ms 195.219.167.42 8 553 ms 552 ms 538 ms Vlan1102.icore1.PVU-Paris.as6453.net [195.219.24 1.109] 9 547 ms 543 ms 542 ms xe-9-1-0.edge4.paris1.level3.net [4.68.110.213] 10 540 ms 523 ms 531 ms ae-33-51.ebr1.Paris1.Level3.net [4.69.139.193] 11 755 ms 761 ms 695 ms ae-45-45.ebr1.London1.Level3.net [4.69.143.101] 12 271 ms 263 ms 400 ms ae-11-51.car1.London1.Level3.net [4.69.139.66] 13 701 ms 730 ms 742 ms 195.50.118.210 14 659 ms 641 ms 660 ms 209.85.255.76 15 280 ms 283 ms 292 ms 209.85.251.190 16 308 ms 293 ms 296 ms 72.14.232.239 17 679 ms 700 ms 721 ms 64.233.174.18 18 268 ms 281 ms 269 ms lm-in-f104.1e100.net [66.102.9.104] Trace complete. C:\Documents and Settings\Administrator> This was taken at 10:00am Microsoft Windows XP [Version 5.1.2600] (C) Copyright 1985-2001 Microsoft Corp. C:\Documents and Settings\Administrator>ping google.com Pinging google.com [66.102.9.106] with 32 bytes of data: Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=111ms TTL=49 Reply from 66.102.9.106: bytes=32 time=112ms TTL=49 Reply from 66.102.9.106: bytes=32 time=120ms TTL=49 Ping statistics for 66.102.9.106: Packets: Sent = 4, Received = 4, Lost = 0 (0% loss), Approximate round trip times in milli-seconds: Minimum = 110ms, Maximum = 120ms, Average = 113ms C:\Documents and Settings\Administrator>ping google.com /t Pinging google.com [66.102.9.106] with 32 bytes of data: Reply from 66.102.9.106: bytes=32 time=109ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=111ms TTL=49 Reply from 66.102.9.106: bytes=32 time=111ms TTL=49 Reply from 66.102.9.106: bytes=32 time=112ms TTL=49 Reply from 66.102.9.106: bytes=32 time=112ms TTL=49 Reply from 66.102.9.106: bytes=32 time=116ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=109ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=109ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=112ms TTL=49 Reply from 66.102.9.106: bytes=32 time=109ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=115ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=109ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Reply from 66.102.9.106: bytes=32 time=113ms TTL=49 Reply from 66.102.9.106: bytes=32 time=115ms TTL=49 Reply from 66.102.9.106: bytes=32 time=109ms TTL=49 Reply from 66.102.9.106: bytes=32 time=110ms TTL=49 Ping statistics for 66.102.9.106: Packets: Sent = 32, Received = 32, Lost = 0 (0% loss), Approximate round trip times in milli-seconds: Minimum = 109ms, Maximum = 135ms, Average = 112ms Control-C ^C C:\Documents and Settings\Administrator>tracert google.com Tracing route to google.com [66.102.9.104] over a maximum of 30 hops: 1 <1 ms <1 ms <1 ms 192.168.0.1 2 6 ms 6 ms 6 ms 80-184-31-1.adsl.kems.net [80.184.31.1] 3 8 ms 7 ms 6 ms 168.187.0.226 4 6 ms 7 ms 7 ms 168.187.0.125 5 20 ms 20 ms 18 ms if-11-2.core1.RSD-Riyad.as6453.net [116.0.78.89] 6 171 ms 205 ms 215 ms 195.219.167.57 7 191 ms 215 ms 226 ms 195.219.167.42 8 * 103 ms 94 ms Vlan1102.icore1.PVU-Paris.as6453.net [195.219.24 1.109] 9 94 ms 95 ms 97 ms xe-9-1-0.edge4.paris1.level3.net [4.68.110.213] 10 94 ms 94 ms 94 ms ae-33-51.ebr1.Paris1.Level3.net [4.69.139.193] 11 101 ms 101 ms 101 ms ae-48-48.ebr1.London1.Level3.net [4.69.143.113] 12 102 ms 102 ms 101 ms ae-11-51.car1.London1.Level3.net [4.69.139.66] 13 103 ms 102 ms 103 ms 195.50.118.210 14 137 ms 103 ms 100 ms 209.85.255.76 15 130 ms 124 ms 124 ms 209.85.251.190 16 114 ms 116 ms 116 ms 72.14.232.239 17 135 ms 113 ms 126 ms 64.233.174.18 18 126 ms 125 ms 127 ms lm-in-f104.1e100.net [66.102.9.104] Trace complete. C:\Documents and Settings\Administrator>

Read the article
Sorting Algorithms

- by MarkPearl

General Every time I go back to university I find myself wading through sorting algorithms and their implementation in C++. Up to now I haven’t really appreciated their true value. However as I discovered this last week with Dictionaries in C# – having a knowledge of some basic programming principles can greatly improve the performance of a system and make one think twice about how to tackle a problem. I’m going to cover briefly in this post the following: Selection Sort Insertion Sort Shellsort Quicksort Mergesort Heapsort (not complete) Selection Sort Array based selection sort is a simple approach to sorting an unsorted array. Simply put, it repeats two basic steps to achieve a sorted collection. It starts with a collection of data and repeatedly parses it, each time sorting out one element and reducing the size of the next iteration of parsed data by one. So the first iteration would go something like this… Go through the entire array of data and find the lowest value Place the value at the front of the array The second iteration would go something like this… Go through the array from position two (position one has already been sorted with the smallest value) and find the next lowest value in the array. Place the value at the second position in the array This process would be completed until the entire array had been sorted. A positive about selection sort is that it does not make many item movements. In fact, in a worst case scenario every items is only moved once. Selection sort is however a comparison intensive sort. If you had 10 items in a collection, just to parse the collection you would have 10+9+8+7+6+5+4+3+2=54 comparisons to sort regardless of how sorted the collection was to start with. If you think about it, if you applied selection sort to a collection already sorted, you would still perform relatively the same number of iterations as if it was not sorted at all. Many of the following algorithms try and reduce the number of comparisons if the list is already sorted – leaving one with a best case and worst case scenario for comparisons. Likewise different approaches have different levels of item movement. Depending on what is more expensive, one may give priority to one approach compared to another based on what is more expensive, a comparison or a item move. Insertion Sort Insertion sort tries to reduce the number of key comparisons it performs compared to selection sort by not “doing anything” if things are sorted. Assume you had an collection of numbers in the following order… 10 18 25 30 23 17 45 35 There are 8 elements in the list. If we were to start at the front of the list – 10 18 25 & 30 are already sorted. Element 5 (23) however is smaller than element 4 (30) and so needs to be repositioned. We do this by copying the value at element 5 to a temporary holder, and then begin shifting the elements before it up one. So… Element 5 would be copied to a temporary holder 10 18 25 30 23 17 45 35 – T 23 Element 4 would shift to Element 5 10 18 25 30 30 17 45 35 – T 23 Element 3 would shift to Element 4 10 18 25 25 30 17 45 35 – T 23 Element 2 (18) is smaller than the temporary holder so we put the temporary holder value into Element 3. 10 18 23 25 30 17 45 35 – T 23 We now have a sorted list up to element 6. And so we would repeat the same process by moving element 6 to a temporary value and then shifting everything up by one from element 2 to element 5. As you can see, one major setback for this technique is the shifting values up one – this is because up to now we have been considering the collection to be an array. If however the collection was a linked list, we would not need to shift values up, but merely remove the link from the unsorted value and “reinsert” it in a sorted position. Which would reduce the number of transactions performed on the collection. So.. Insertion sort seems to perform better than selection sort – however an implementation is slightly more complicated. This is typical with most sorting algorithms – generally, greater performance leads to greater complexity. Also, insertion sort performs better if a collection of data is already sorted. If for instance you were handed a sorted collection of size n, then only n number of comparisons would need to be performed to verify that it is sorted. It’s important to note that insertion sort (array based) performs a number item moves – every time an item is “out of place” several items before it get shifted up. Shellsort – Diminishing Increment Sort So up to now we have covered Selection Sort & Insertion Sort. Selection Sort makes many comparisons and insertion sort (with an array) has the potential of making many item movements. Shellsort is an approach that takes the normal insertion sort and tries to reduce the number of item movements. In Shellsort, elements in a collection are viewed as sub-collections of a particular size. Each sub-collection is sorted so that the elements that are far apart move closer to their final position. Suppose we had a collection of 15 elements… 10 20 15 45 36 48 7 60 18 50 2 19 43 30 55 First we may view the collection as 7 sub-collections and sort each sublist, lets say at intervals of 7 10 60 55 – 20 18 – 15 50 – 45 2 – 36 19 – 48 43 – 7 30 10 55 60 – 18 20 – 15 50 – 2 45 – 19 36 – 43 48 – 7 30 (Sorted) We then sort each sublist at a smaller inter – lets say 4 10 55 60 18 – 20 15 50 2 – 45 19 36 43 – 48 7 30 10 18 55 60 – 2 15 20 50 – 19 36 43 45 – 7 30 48 (Sorted) We then sort elements at a distance of 1 (i.e. we apply a normal insertion sort) 10 18 55 60 2 15 20 50 19 36 43 45 7 30 48 2 7 10 15 18 19 20 30 36 43 45 48 50 55 (Sorted) The important thing with shellsort is deciding on the increment sequence of each sub-collection. From what I can tell, there isn’t any definitive method and depending on the order of your elements, different increment sequences may perform better than others. There are however certain increment sequences that you may want to avoid. An even based increment sequence (e.g. 2 4 8 16 32 …) should typically be avoided because it does not allow for even elements to be compared with odd elements until the final sort phase – which in a way would negate many of the benefits of using sub-collections. The performance on the number of comparisons and item movements of Shellsort is hard to determine, however it is considered to be considerably better than the normal insertion sort. Quicksort Quicksort uses a divide and conquer approach to sort a collection of items. The collection is divided into two sub-collections – and the two sub-collections are sorted and combined into one list in such a way that the combined list is sorted. The algorithm is in general pseudo code below… Divide the collection into two sub-collections Quicksort the lower sub-collection Quicksort the upper sub-collection Combine the lower & upper sub-collection together As hinted at above, quicksort uses recursion in its implementation. The real trick with quicksort is to get the lower and upper sub-collections to be of equal size. The size of a sub-collection is determined by what value the pivot is. Once a pivot is determined, one would partition to sub-collections and then repeat the process on each sub collection until you reach the base case. With quicksort, the work is done when dividing the sub-collections into lower & upper collections. The actual combining of the lower & upper sub-collections at the end is relatively simple since every element in the lower sub-collection is smaller than the smallest element in the upper sub-collection. Mergesort With quicksort, the average-case complexity was O(nlog2n) however the worst case complexity was still O(N*N). Mergesort improves on quicksort by always having a complexity of O(nlog2n) regardless of the best or worst case. So how does it do this? Mergesort makes use of the divide and conquer approach to partition a collection into two sub-collections. It then sorts each sub-collection and combines the sorted sub-collections into one sorted collection. The general algorithm for mergesort is as follows… Divide the collection into two sub-collections Mergesort the first sub-collection Mergesort the second sub-collection Merge the first sub-collection and the second sub-collection As you can see.. it still pretty much looks like quicksort – so lets see where it differs… Firstly, mergesort differs from quicksort in how it partitions the sub-collections. Instead of having a pivot – merge sort partitions each sub-collection based on size so that the first and second sub-collection of relatively the same size. This dividing keeps getting repeated until the sub-collections are the size of a single element. If a sub-collection is one element in size – it is now sorted! So the trick is how do we put all these sub-collections together so that they maintain their sorted order. Sorted sub-collections are merged into a sorted collection by comparing the elements of the sub-collection and then adjusting the sorted collection. Lets have a look at a few examples… Assume 2 sub-collections with 1 element each 10 & 20 Compare the first element of the first sub-collection with the first element of the second sub-collection. Take the smallest of the two and place it as the first element in the sorted collection. In this scenario 10 is smaller than 20 so 10 is taken from sub-collection 1 leaving that sub-collection empty, which means by default the next smallest element is in sub-collection 2 (20). So the sorted collection would be 10 20 Lets assume 2 sub-collections with 2 elements each 10 20 & 15 19 So… again we would Compare 10 with 15 – 10 is the winner so we add it to our sorted collection (10) leaving us with 20 & 15 19 Compare 20 with 15 – 15 is the winner so we add it to our sorted collection (10 15) leaving us with 20 & 19 Compare 20 with 19 – 19 is the winner so we add it to our sorted collection (10 15 19) leaving us with 20 & _ 20 is by default the winner so our sorted collection is 10 15 19 20. Make sense? Heapsort (still needs to be completed) So by now I am tired of sorting algorithms and trying to remember why they were so important. I think every year I go through this stuff I wonder to myself why are we made to learn about selection sort and insertion sort if they are so bad – why didn’t we just skip to Mergesort & Quicksort. I guess the only explanation I have for this is that sometimes you learn things so that you can implement them in future – and other times you learn things so that you know it isn’t the best way of implementing things and that you don’t need to implement it in future. Anyhow… luckily this is going to be the last one of my sorts for today. The first step in heapsort is to convert a collection of data into a heap. After the data is converted into a heap, sorting begins… So what is the definition of a heap? If we have to convert a collection of data into a heap, how do we know when it is a heap and when it is not? The definition of a heap is as follows: A heap is a list in which each element contains a key, such that the key in the element at position k in the list is at least as large as the key in the element at position 2k +1 (if it exists) and 2k + 2 (if it exists). Does that make sense? At first glance I’m thinking what the heck??? But then after re-reading my notes I see that we are doing something different – up to now we have really looked at data as an array or sequential collection of data that we need to sort – a heap represents data in a slightly different way – although the data is stored in a sequential collection, for a sequential collection of data to be in a valid heap – it is “semi sorted”. Let me try and explain a bit further with an example… Example 1 of Potential Heap Data Assume we had a collection of numbers as follows 1[1] 2[2] 3[3] 4[4] 5[5] 6[6] For this to be a valid heap element with value of 1 at position [1] needs to be greater or equal to the element at position [3] (2k +1) and position [4] (2k +2). So in the above example, the collection of numbers is not in a valid heap. Example 2 of Potential Heap Data Lets look at another collection of numbers as follows 6[1] 5[2] 4[3] 3[4] 2[5] 1[6] Is this a valid heap? Well… element with the value 6 at position 1 must be greater or equal to the element at position [3] and position [4]. Is 6 > 4 and 6 > 3? Yes it is. Lets look at element 5 as position 2. It must be greater than the values at [4] & [5]. Is 5 > 3 and 5 > 2? Yes it is. If you continued to examine this second collection of data you would find that it is in a valid heap based on the definition of a heap.

Read the article

Search Results

Search found 8275 results on 331 pages for 'bad appz'.

Page 326/331 | < Previous Page | 322 323 324 325 326 327 328 329 330 331 | Next Page >

- by Ciobanu Alexandru

- by user488001

- by Stephen Walther

- by Rick Strahl

- by Rick Strahl

- by Tim Dexter

- by Benjamin Nolan

- by Stephen Walther

- by terje

- by SeanMcAlinden

- by ScottGu

- by user35335

- by Martin Hinshelwood

- by 4x10

- by Paul White

- by danx

- by bobloki

- by Stephen Walther

- by Mark Rackley

- by Abdulla

- by Rick Strahl

- by FransBouma

- by andrewbrust

- by Abdulla

- by MarkPearl

< Previous Page | 322 323 324 325 326 327 328 329 330 331 | Next Page >