rig - Page 5 - Developer IT

What's the best-practice way to update an Adapter's underlying data?

- by skyler

I'm running into an IllegalStateException updating an underlying List to an Adapter (might be an ArrayAdapter or an extension of BaseAdapter, I don't remember). I do not have or remember the text of the exception at the moment, but it says something to the effect of the List's content changing without the Adapter having been notified of the change. This List /may/ be updated from another thread other than the UI thread (main). After I update this list (adding an item), I call notifyDataSetChanged. The issue seems to be that the Adapter, or ListView attached to the Adapter attempts to update itself before this method is invoked. When this happens, the IllegalStateException is thrown. If I set the ListView's visibility to GONE before the update, then VISIBLE again, no error occurs. But this isn't always practical. I read somewhere that you cannot modify the underlying this from another thread--this would seem to limit an MVC pattern, as with this particular List, I want to add items from different threads. I assumed that as long as I called notifyDataSetChanged() I'd be safe--that the Adapter didn't revisit the underlying List until this method was invoked but this doesn't seem to be the case. I suppose what I'm asking is, can it be safe to update the underlying List from threads other than the UI? Additionally, if I want to modify the data within an Adapter, do I modify the underlying List or the Adapter itself (via its add(), etc. methods). Modifying the data through the Adapter seems wrong. I came across a thread on another site from someone who seems to be having a similar problem to mine: http://osdir.com/ml/Android-Developers/2010-04/msg01199.html (this is from where I grabbed the Visibility.GONE and .VISIBLE idea). To give you a better idea of my particular problem, I'll describe a bit of how my List, Adapter, etc. are set up. I've an object named Queue that contains a LinkedList. Queue extends Observable, and when things are added to its internal list through its methods, I call setChanged() and notifyListeners(). This Queue object can have items added or removed from any number of threads. I have a single "queue view" Activity that contains an Adapter. This Activity, in its onCreate() method, registers an Observer listener to my Queue object. In the Observer's update() method I call notifyDataSetChanged() on the Adapter. I added a lot of log output and determined that when this IllegalStateExcption occurs that my Observer callback was never invoked. So it's as if the Adapter noticed the List's change before the Observer had a chance to notify its Observers, and call my method to notify the Adapter that the contents had changed. So I suppose what I'm asking is, is this a good way to rig-up an Adapter? Is this a problem because I'm updating the Adapter's contents from a thread other than the UI thread? If this is the case, I may have a solution in mind (give the Queue object a Handler to the UI thread when it's created, and make all List modifications using that Handler, but this seems improper). I realize that this is a very open-ended post, but I'm a bit lost on this and would appreciate any comments on what I've written.

Read the article

Load and Web Performance Testing using Visual Studio Ultimate 2010-Part 2

- by Tarun Arora

Welcome back, in part 1 of Load and Web Performance Testing using Visual Studio 2010 I talked about why Performance Testing the application is important, the test tools available in Visual Studio Ultimate 2010 and various test rig topologies. In this blog post I’ll get into the details of web performance & load tests as well as why it’s important to follow a goal based pattern while performance testing your application. Tools => Options => Test Tools Have you visited the treasures of Visual Studio Menu bar tools => Options => Test Tools lately? The options to enable disable prompts on creating, editing, deleting or running manual/automated tests can be controller from here. The default test project language and default test types created on a new test project creation could be selected/unselected from here. Ever wondered how you can change the default limit of 25 test results, this can again be changed from here. If you record a lot of Web Tests and wish for the web test recorder to start with “that” URL populated, well this again can be specified from here. If you haven’t so far, I would urge you to spend 2 minutes in the test tools options. Test Menu => Ready Steady Test Action! The Test tools are under the Test Menu in Visual Studio, apart from being able to create a new Test and Test List you can also load an existing vsmdi file. You can also manage your test controllers from here. A solution can have one or more test setting files, but there can only be one active test settings file at any time. Again, this selection can be done from here. You can open the various test windows from under the windows option from the test menu. If you open the Test view window you will see that you have the option to group the tests by work items, project, test type, etc. You can set these properties by right clicking a test in the test list and choosing properties from the context menu. So, what is a vsmdi file? vsmdi stands for Visual Studio Test Metadata File. Placed under the Solution Items this file keeps track of the list of unit tests in your solution. If you open the vsmdi file as an xml file you will see a series of Test Links nested with in the list Test List tags along with the Run Configuration tag. When in visual studio you run tests, the IDE looks at the vsmdi file to see what tests need to be run. You also have the option of using the vsmdi file in your team builds to specify which tests need to run as part of the build. Refer here for a walkthrough from a fellow blogger on how to use the vsmdi file in the team builds. Web Performance Test – The Truth! In Visual Studio 2010 “Web Tests” have been renamed to “Web Performance Tests”. Apart from renaming this test type there have been several improvements to this test type in visual studio 2010. I am very active on the MSDN Visual Studio And Load Testing forum and a frequent question from many users is “Do Web Tests support Pages that run JavaScript?” I will start with a little bit of background before answering this question. Web Performance Tests operate at the HTTP Layer, but why? To enable you to generate high loads with a relatively low amount of hardware, Web performance tests are driven at the protocol layer rather than instantiating a browser.The most common source of confusion is that users do not realize Web Performance Tests work at the HTTP layer. The tool adds to that misconception. After all, you record in IE, and when running a Web test you can select which browser to use, and then the result viewer shows the results in a browser window. So that means the tests run through the browser, right? NO! The Web test engine works at the HTTP layer, and does not instantiate a browser. What does that mean? In the diagram below, you can see there are no browsers running when the engine is sending and receiving requests. Does that mean I can’t test pages that use Java script? The best example for java script generating HTTP traffic is AJAX calls. The most common example of browser plugins are Silverlight or Flash. The Web test recorder will record HTTP traffic from AJAX calls and from most (but not all) browser plugins. This means you will still be able to web performance test pages that use java script or plugin and play back the results but the playback engine will not show the java script or plug in results in the ‘browser control’. If you want to test the page behaviour as a result of the java script or plug in consider using Coded UI Tests. This page looks like it failed, when in fact it succeeded! Looking closely at the response, and subsequent requests, it is clear the operation succeeded. As stated above, the reason why the browser control is pasting this message is because java script has been disabled in this control. So, to reiterate, the web performance test recorder: - Sends and receives data at the HTTP layer. - Does NOT run a browser. - Does NOT run java script. - Does NOT host ActiveX controls or plugins. There is a great series of blog posts from Ed Glas, i would highly recommend his blog to any one performing Load/Performance testing through Visual Studio. Demo – Web Performance Test [Demo] - Visual Studio Ultimate 2010: Test Settings and Configuration [Demo]–Visual Studio Ultimate 2010: Web Performance Test In this short video I try and answer the following questions, Why is performance Testing important? How does Visual Studio Help you performance Test your applications? How do i record a web performance test? How do make a web performance test data driven, transaction driven, loop driven, convert to code, add validations? Best practices for recording Web Performance Tests. I have a web performance test, what next? Creating the Web Performance Test was the first step towards load testing your application. Now that we have the base test we can test the page behaviour when N-users access the page. Have you ever had the head of business call you and mention that the marketing team has done a fantastic job and are expecting increased traffic on the web site, can the website survive the weekend with that additional load? This is the perfect opportunity to capacity test your application to see how your website holds up under various levels of load, you can work the results backwards to see how much hardware you may need to scale up your application to survive the weekend. Apart from that it is always a good idea to have some benchmarks around how the application performs under light loads for short duration, under heavy load for long duration and soak test the application run a constant load for a very week or two to record the effects of constant load for really long durations, this is a great way of identifying how your application handles the default IIS application pool reset which by default is configured to once every 25 hours. These bench marks will act as the perfect yard stick to measure performance gains when you start making improvements. BUT there are some best practices! => Goal Based Load Testing Approach Since the subject is vast and there are a lot of things to measure and analyse, … it is very easy to get distracted from the real goal! You can optimize your application once you know where the pain points are. There is no point performing a load test of 5000 users if your intranet application will only have a 100 simultaneous users, it is important to keep focussed on the real goals of the project. So the idea is to have a user story around your load testing scenarios and test realistically. So it is recommended that you follow the below outline, It is an Iterative process, refine your objectives, identify the key scenarios, what is the expected workload, key metrics you want to report, record the web performance tests, simulate load and analyse results. Is your application already deployed in Production? This is great! You can analyse the IIS Logs to understand the user behaviour… But what are IIS LOGS? The IIS logs allow you to record events for each application and Web site on the Web server. You can create separate logs for each of your applications and Web sites. Logging information in IIS goes beyond the scope of the event logging or performance monitoring features provided by Windows. The IIS logs can include information, such as who has visited your site, what the visitor viewed, and when the information was last viewed. You can use the IIS logs to identify any attempts to gain unauthorized access to your Web server. How to configure IIS LOGS? For those Ninjas who already have IIS Logs configured (by the way its on by default) and need a way to analyse the IIS Logs, can use the Windows IIS Utility – Log Parser. Log Parser is a very powerful tool that provides a generic SQL-like language on top of many types of data like IIS Logs, Event Viewer entries, XML files, CSV files, File System and others; and it allows you to export the result of the queries to many output formats such as CSV, XML, SQL Server, Charts and others; and it works well with IIS 5, 6, 7 and 7.5. Frequently used Log Parser queries. Demo – Load Test [Demo]–Visual Studio Ultimate 2010: Load Testing In this short video I try and answer the following questions, - Types of Performance Testing? - Perform Goal driven Load Testing, analyse Test Run Result and Generate a report? Recap A quick recap of what we have covered so far, Thank you for taking the time out and reading this blog post, in part III of this blog series I’ll be getting into the details of Test Result Analysis, Test Result Drill through, Test Report Generation, Test Run Comparison, and the Asp.net Profiler. If you enjoyed the post, remember to subscribe to http://feeds.feedburner.com/TarunArora. Questions/Feedback/Suggestions, etc please leave a comment. See you on in Part III Share this post : CodeProject

Read the article

SQL Developer Q&A from ODTUG Tips & Tricks Webcast

- by thatjeffsmith

Another great webcast yesterday – if you’re a paying member of ODTUG you can watch the show for yourself in their archives. If not, you can get my slide deck off of SlideShare. About 150 of you brave souls sat through an entire hour of me talking and then 10 more minutes of Q&A. We went through everything rapid-fire style, so I thought I would post the questions and my refined answers here for your perusal. In the order in which I received them: You showed the preference to choose between resultsets in same tab or ain a new tab. I understand that we can not have it both using different hotkeys? For example: F5 run and resultset to same tab, ctrl-f5 same but to new tab? Sometimes you want the one other times the other. The questioner is asking about this preference, Tools Preferences Database Worksheet ‘Show query results in new tabs.’ This is an all or nothing proposition. But, there’s another, perhaps better way: the document PINs. If you have a result set you don’t want to lose, ‘pin it.’ Pin multiple result sets or plans for review and comparisons. You mentioned that sometimes it’s hard to remember where a certain preference is. I agree. So enhancement request: add a search-box to the preferences window. Maybe like in, for example, UltraEdit. It shows you all preferences containing your search criteria. Actually, we do have a search mechanism type the search string, we auto-filter the preferences Is there a version of SQL Developer that will connect to an 8i database (Yes, I realize how old that database version is!) Sorry, no. We also don’t have a version that will run on Windows 3.11 for Workgroups…probably. How do we access your blog? Carefully, and with much trepidation. When you’re ready, go to http://www.thatjeffsmith.com Is there a way to get good formatting with predefined settings? I believe the questioner is referring to the script output a la SQL*Plus formatting commands. Yes, there is. You can build your formatting commands into your login.sql script, and those will be applied for your script execution sessions. Example here. Why this version 4.0 doesn’t support external plugins? It does, it just requires the plugin developer to re-factor it for OSGi. This came about when we updated the JDeveloper framework to the later 11g/12c stuff. Any change in hookup with SVN? The only change with Subversion is that internally we’re using 1.7 stuff now. You can use SQLDev to work with a 1.8 SVN server, but if you get a working copy with a 1.8 client SQLDev won’t be able to do anything with it… Command line utilities ? improvements Yes! The long answer is here. Is that a Hint or a Comment?? /*CSV*/ It’s a comment – the database won’t recognize it, but SQLDev does when it goes through our statement pre-processor. We’ll redirect the output through our CSV formatter before displaying the results in the Script Output panel. That’s why this will ONLY work in SQL Developer. Are you selecting “”Run Script”" to get that CSV or HTML output, rather than “”Run Statement”"? Yes, the formatter hints like the CSV one mentioned above only make sense in a script output panel vs a grid. How do you save relational models once they’re defined? I’ve had trouble with setting one up, “”saving”" it, then the design work I did is longer there when loading it later. File – Data Modeler – Save. If you’re running the Modeler inside of SQL Developer, the menu’ing interface can get a bit tricky. That’s why I recommend using the stand along if you’re doing anything with a model that takes more than 5 minutes. See how the Data Modeler menus are folded up under the SQL Dev menus? Can u unplug and plug into another container in a database with only sqldeveloper? Yes, you can ‘Detach’ a multitentant 12c Database ‘pluggable’ and plug it into another instance. You have the option to copy or move the files. This isn’t a trivial operation, pay attention Can you run APEX code directly on the adopter? No, at least not as I understand your question. Give me an example and I can give you a better example. Is there a way that when u click on a particular table it wouldn’t show the table with the info but just to see the columns underneath clicking on the node? Yes, another one of my tips! Disable Tools Preferences Datbase ObjectViewer ‘Open Object on Single Click.’ Is there a patch to allow a double click on a procedure on an open package body to take you to that procedure in the editor? This has been fixed for EA3 – to be released soon. Can you open the spec with the body? You can open the spec or the body, and then also open the other. But you can’t open both with a single click. So if you want you can set it to CSV but can you also see it as a regular result set in rows and then click in the results to export to excel? If you run your query as a statement with Ctrl-Enter, you can send the data to Excel via the Export dialog. Will it do intellisense like using the alias and pop up the column, object names? Yes! You can select more than one column… Can a DBA turn off items from a high level for users so the only thing they can perform would be selects? A DBA should turn things ON, not OFF. Create a user with only CONNECT and required SELECT privs and you’re good to go, regardless of which application they are using. I use PL/SQL Developer from allround automations and was SQL Developer illiterate and now I like this for myself as a DBA. Now I get to train developers on this tool since they have been asking how to use this tool. Thank you. No, THANK YOU! Can you run multi queries in the worksheet after you added it to the worksheet? Yes, highlight what you want to run, and hit Ctrl-Enter. Can you export the result sets to excel, etc. Yes. In version 4.0 and going forward, I recommend you use the XLSX option for exports. It will run faster and consume much, much less memory. Will this be available after the webinar? If you are a ODTUG member, check out the webinar recordings in the archives. That’s worth the $99 right there. Ask your boss if they have $99 in their training budget for you. If not, maybe time to look for another job? Can you run command lines from this tool? Like executes without issuing a command line prompt? Ok, I’m stumped on this one. Not sure what you’re asking. You can setup external tools under the Tools menu, and from there you could probably rig what you’re looking for, but I’m not sure what you’re looking for… This maybe?Where and when to put the program Is there any way to save a copy database command set (certain tables/views etc) in a script? Yes! Create a cart with the objects you want to be used in the Copy. Then use the new command-line interface to kick off SQL Developer to do the copy of those said objects. How can we export the preference and then import them into different or same version of SQL Developer ? Today, there’s no interface for this. But you could copy the files around manually…Kris Rice has a cool idea where you can set your preferences to be saved to your local drop box folder and then you can use SQL Developer from anywhere with the same preferences What happens to SQL*Plus commands like COL & BREAK Nothing. Those are not currently supported. Is there a place where all “”hotkey”" functionality is listed? thanks Yes. Tools – Preferences – Shortcut Keys. And you can change them! Any tips for the DBA side of things? will the SQL generated for objects have more information (e.g. user privileges) in v4? You can get this now. In Tools – Preferences – Database – Utilities – Export, check ‘Grants.’ Voila! You now have the code necessary to recreate your object privileges Is there a limit on the number of rows that could be imported / exported from/to excel ? The only hard-coded limit lies in Excel. For best performance, use v4 and XLSX formats for Exports. Is there a way to see/watch active sessions to see current SQL and the explain plan being used, etc. Kind of like that frog product. Cough, yes. Tools – Monitor Sessions. Click on session, see SQL and plan. The plan was added in v4. If you’re not in version 4, use the Reports – Active Sessions to get the plans. In the DBA section is there a way to manage say tablespaces to add data files, shrink, edit profiles, etc. Yes, we support all of that. View – DBA. Connect, go to the Storage node. Are you (Jeff) available for a live presentation at our Oracle User Group here in Indiana? Maybe. Email me and we’ll see, [email protected] Where do I go to download sql developer 4.0? The Internet of course! Can you directly edit query results? Nope. But what I think you’re asking is, can I edit the data in the tables that are reflected in my query results? You can change the query results by changing your query of course. Or this. Can you show html example? Sure. I’d embed the HTML here, but it’s a lot of code, try it for yourself! How can I quickly close many SQL worksheet windows, but not all? Window – Documents. Multi-select, hit the ‘Close Document(s)’ button. What does the vertical red line denote? That’s the margin. Tells you when you’ve typed too far and it’s time for a carriage return. Did DBA/Database Status/Instance Viewer make it officially into 4.0? It was sort-of included in the first EA. I have NO idea what you’re talking about, WINK-WINK. No, it’s not in v4.0. Is there a “”handy”" way to debug trigger code? Yes, open your trigger. Hit the debug button. Works great as long as it’s a DML trigger. Will you make your presentation file available for us ( in PPT and/or PDF format ) ? It’s on SlideShare. How do you get SqlDeveloper to escape ‘ correctly when you use the wizard to export data as insert statements? If it’s not doing that, it’s a bug. I’ll take a look at that scenario ASAP.

Read the article

Load and Web Performance Testing using Visual Studio Ultimate 2010-Part 3

- by Tarun Arora

Welcome back once again, in Part 1 of Load and Web Performance Testing using Visual Studio 2010 I talked about why Performance Testing the application is important, the test tools available in Visual Studio Ultimate 2010 and various test rig topologies, in Part 2 of Load and Web Performance Testing using Visual Studio 2010 I discussed the details of web performance & load tests as well as why it’s important to follow a goal based pattern while performance testing your application. In part 3 I’ll be discussing Test Result Analysis, Test Result Drill through, Test Report Generation, Test Run Comparison, Asp.net Profiler and some closing thoughts. Test Results – I see some creepy worms! In Part 2 we put together a web performance test and a load test, lets run the test to see load test to see how the Web site responds to the load simulation. While the load test is running you will be able to see close to real time analysis in the Load Test Analyser window. You can use the Load Test Analyser to conduct load test analysis in three ways: Monitor a running load test - A condensed set of the performance counter data is maintained in memory. To prevent the results memory requirements from growing unbounded, up to 200 samples for each performance counter are maintained. This includes 100 evenly spaced samples that span the current elapsed time of the run and the most recent 100 samples. After the load test run is completed - The test controller spools all collected performance counter data to a database while the test is running. Additional data, such as timing details and error details, is loaded into the database when the test completes. The performance data for a completed test is loaded from the database and analysed by the Load Test Analyser. Below you can see a screen shot of the summary view, this provides key results in a format that is compact and easy to read. You can also print the load test summary, this is generated after the test has completed or been stopped. Analyse the load test results of a previously run load test – We’ll see this in the section where i discuss comparison between two test runs. The performance counters can be plotted on the graphs. You also have the option to highlight a selected part of the test and view details, drill down to the user activity chart where you can hover over to see more details of the test run. Generate Report => Test Run Comparisons The level of reports you can generate using the Load Test Analyser is astonishing. You have the option to create excel reports and conduct side by side analysis of two test results or to track trend analysis. The tools also allows you to export the graph data either to MS Excel or to a CSV file. You can view the ASP.NET profiler report to conduct further analysis as well. View Data and Diagnostic Attachments opens the Choose Diagnostic Data Adapter Attachment dialog box to select an adapter to analyse the result type. For example, you can select an IntelliTrace adapter, click OK and open the IntelliTrace summary for the test agent that was used in the load test. Compare results This creates a set of reports that compares the data from two load test results using tables and bar charts. I have taken these screen shots from the MSDN documentation, I would highly recommend exploring the wealth of knowledge available on MSDN. Leaving Thoughts While load testing the application with an excessive load for a longer duration of time, i managed to bring the IIS to its knees by piling up a huge queue of requests waiting to be processed. This clearly means that the IIS had run out of threads as all the threads were busy processing existing request, one easy way of fixing this is by increasing the default number of allocated threads, but this might escalate the problem. The better suggestion is to try and drill down to the actual root cause of the problem. When ever the garbage collection runs it stops processing any pages so all requests that come in during that period are queued up, but realistically the garbage collection completes in fraction of a a second. To understand this better lets look at the .net heap, it is divided into large heap and small heap, anything greater than 85kB in size will be allocated to the Large object heap, the Large object heap is non compacting and remember large objects are expensive to move around, so if you are allocating something in the large object heap, make sure that you really need it! The small object heap on the other hand is divided into generations, so all objects that are supposed to be short-lived are suppose to live in Gen-0 and the long living objects eventually move to Gen-2 as garbage collection goes through. As you can see in the picture below all < 85 KB size objects are first assigned to Gen-0, when Gen-0 fills up and a new object comes in and finds Gen-0 full, the garbage collection process is started, the process checks for all the dead objects and assigns them as the valid candidate for deletion to free up memory and promotes all the remaining objects in Gen-0 to Gen-1. So in the future when ever you clean up Gen-1 you have to clean up Gen-0 as well. When you fill up Gen – 0 again, all of Gen – 1 dead objects are drenched and rest are moved to Gen-2 and Gen-0 objects are moved to Gen-1 to free up Gen-0, but by this time your Garbage collection process has started to take much more time than it usually takes. Now as I mentioned earlier when garbage collection is being run all page requests that come in during that period are queued up. Does this explain why possibly page requests are getting queued up, apart from this it could also be the case that you are waiting for a long running database process to complete. Lets explore the heap a bit more… What is really a case of crisis is when the objects are living long enough to make it to Gen-2 and then dying, this is definitely a high cost operation. But sometimes you need objects in memory, for example when you cache data you hold on to the objects because you need to use them right across the user session, which is acceptable. But if you wanted to see what extreme caching can do to your server then write a simple application that chucks in a lot of data in cache, run a load test over it for about 10-15 minutes, forcing a lot of data in memory causing the heap to run out of memory. If you get to such a state where you start running out of memory the IIS as a mode of recovery restarts the worker process. It is great way to free up all your memory in the heap but this would clear the cache. The problem with this is if the customer had 10 items in their shopping basket and that data was stored in the application cache, the user basket will now be empty forcing them either to get frustrated and go to a competitor website or if the customer is really patient, give it another try! How can you address this, well two ways of addressing this; 1. Workaround – A x86 bit processor only allows a maximum of 4GB of RAM, this means the machine effectively has around 3.4 GB of RAM available, the OS needs about 1.5 GB of RAM to run efficiently, the IIS and .net framework also need their share of memory, leaving you a heap of around 800 MB to play with. Because Team builds by default build your application in ‘Compile as any mode’ it means the application is build such that it will run in x86 bit mode if run on a x86 bit processor and run in a x64 bit mode if run on a x64 but processor. The problem with this is not all applications are really x64 bit compatible specially if you are using com objects or external libraries. So, as a quick win if you compiled your application in x86 bit mode by changing the compile as any selection to compile as x86 in the team build, you will be able to run your application on a x64 bit machine in x86 bit mode (WOW – By running Windows on Windows) and what that means is, you could use 8GB+ worth of RAM, if you take away everything else your application will roughly get a heap size of at least 4 GB to play with, which is immense. If you need a heap size of more than 4 GB you have either build a software for NASA or there is something fundamentally wrong in your application. 2. Solution – Now that you have put a workaround in place the IIS will not restart the worker process that regularly, which means you can take a breather and start working to get to the root cause of this memory leak. But this begs a question “How do I Identify possible memory leaks in my application?” Well i won’t say that there is one single tool that can tell you where the memory leak is, but trust me, ‘Performance Profiling’ is a great start point, it definitely gets you started in the right direction, let’s have a look at how. Performance Wizard - Start the Performance Wizard and select Instrumentation, this lets you measure function call counts and timings. Before running the performance session right click the performance session settings and chose properties from the context menu to bring up the Performance session properties page and as shown in the screen shot below, check the check boxes in the group ‘.NET memory profiling collection’ namely ‘Collect .NET object allocation information’ and ‘Also collect the .NET Object lifetime information’. Now if you fire off the profiling session on your pages you will notice that the results allows you to view ‘Object Lifetime’ which shows you the number of objects that made it to Gen-0, Gen-1, Gen-2, Large heap, etc. Another great feature about the profile is that if your application has > 5% cases where objects die right after making to the Gen-2 storage a threshold alert is generated to alert you. Since you have the option to also view the most expensive methods and by capturing the IntelliTrace data you can drill in to narrow down to the line of code that is the root cause of the problem. Well now that we have seen how crucial memory management is and how easy Visual Studio Ultimate 2010 makes it for us to identify and reproduce the problem with the best of breed tools in the product. Caching One of the main ways to improve performance is Caching. Which basically means you tell the web server that instead of going to the database for each request you keep the data in the webserver and when the user asks for it you serve it from the webserver itself. BUT that can have consequences! Let’s look at some code, trust me caching code is not very intuitive, I define a cache key for almost all searches made through the common search page and cache the results. The approach works fine, first time i get the data from the database and second time data is served from the cache, significant performance improvement, EXCEPT when two users try to do the same operation and run into each other. But it is easy to handle this by adding the lock as you can see in the snippet below. So, as long as a user comes in and finds that the cache is empty, the user locks and starts to get the cache no more concurrency issues. But lets say you are processing 10 requests per second, by the time i have locked the operation to get the results from the database, 9 other users came in and found that the cache key is null so after i have come out and populated the cache they will still go in to get the results again. The application will still be faster because the next set of 10 users and so on would continue to get data from the cache. BUT if we added another null check after locking to build the cache and before actual call to the db then the 9 users who follow me would not make the extra trip to the database at all and that would really increase the performance, but didn’t i say that the code won’t be very intuitive, may be you should leave a comment you don’t want another developer to come in and think what a fresher why is he checking for the cache key null twice !!! The downside of caching is, you are storing the data outside of the database and the data could be wrong because the updates applied to the database would make the data cached at the web server out of sync. So, how do you invalidate the cache? Well if you only had one way of updating the data lets say only one entry point to the data update you can write some logic to say that every time new data is entered set the cache object to null. But this approach will not work as soon as you have several ways of feeding data to the system or your system is scaled out across a farm of web servers. The perfect solution to this is Micro Caching which means you cache the query for a set time duration and invalidate the cache after that set duration. The advantage is every time the user queries for that data with in the time span for which you have cached the results there are no calls made to the database and the data is served right from the server which makes the response immensely quick. Now figuring out the appropriate time span for which you micro cache the query results really depends on the application. Lets say your website gets 10 requests per second, if you retain the cache results for even 1 minute you will have immense performance gains. You would reduce 90% hits to the database for searching. Ever wondered why when you go to e-bookers.com or xpedia.com or yatra.com to book a flight and you click on the book button because the fare seems too exciting and you get an error message telling you that the fare is not valid any more. Yes, exactly => That is a cache failure! These travel sites or price compare engines are not going to hit the database every time you hit the compare button instead the results will be served from the cache, because the query results are micro cached, its a perfect trade-off, by micro caching the results the site gains 100% performance benefits but every once in a while annoys a customer because the fare has expired. But the trade off works in the favour of these sites as they are still able to process up to 30+ page requests per second which means cater to the site traffic by may be losing 1 customer every once in a while to a competitor who is also using a similar caching technique what are the odds that the user will not come back to their site sooner or later? Recap Resources Below are some Key resource you might like to review. I would highly recommend the documentation, walkthroughs and videos available on MSDN. You can always make use of Fiddler to debug Web Performance Tests. Some community test extensions and plug ins available on Codeplex might also be of interest to you. The Road Ahead Thank you for taking the time out and reading this blog post, you may also want to read Part I and Part II if you haven’t so far. If you enjoyed the post, remember to subscribe to http://feeds.feedburner.com/TarunArora. Questions/Feedback/Suggestions, etc please leave a comment. Next ‘Load Testing in the cloud’, I’ll be working on exploring the possibilities of running Test controller/Agents in the Cloud. See you on the other side! Thank You! Share this post : CodeProject

Read the article

Seeking on a Heap, and Two Useful DMVs

- by Paul White

So far in this mini-series on seeks and scans, we have seen that a simple ‘seek’ operation can be much more complex than it first appears. A seek can contain one or more seek predicates – each of which can either identify at most one row in a unique index (a singleton lookup) or a range of values (a range scan). When looking at a query plan, we will often need to look at the details of the seek operator in the Properties window to see how many operations it is performing, and what type of operation each one is. As you saw in the first post in this series, the number of hidden seeking operations can have an appreciable impact on performance. Measuring Seeks and Scans I mentioned in my last post that there is no way to tell from a graphical query plan whether you are seeing a singleton lookup or a range scan. You can work it out – if you happen to know that the index is defined as unique and the seek predicate is an equality comparison, but there’s no separate property that says ‘singleton lookup’ or ‘range scan’. This is a shame, and if I had my way, the query plan would show different icons for range scans and singleton lookups – perhaps also indicating whether the operation was one or more of those operations underneath the covers. In light of all that, you might be wondering if there is another way to measure how many seeks of either type are occurring in your system, or for a particular query. As is often the case, the answer is yes – we can use a couple of dynamic management views (DMVs): sys.dm_db_index_usage_stats and sys.dm_db_index_operational_stats. Index Usage Stats The index usage stats DMV contains counts of index operations from the perspective of the Query Executor (QE) – the SQL Server component that is responsible for executing the query plan. It has three columns that are of particular interest to us: user_seeks – the number of times an Index Seek operator appears in an executed plan user_scans – the number of times a Table Scan or Index Scan operator appears in an executed plan user_lookups – the number of times an RID or Key Lookup operator appears in an executed plan An operator is counted once per execution (generating an estimated plan does not affect the totals), so an Index Seek that executes 10,000 times in a single plan execution adds 1 to the count of user seeks. Even less intuitively, an operator is also counted once per execution even if it is not executed at all. I will show you a demonstration of each of these things later in this post. Index Operational Stats The index operational stats DMV contains counts of index and table operations from the perspective of the Storage Engine (SE). It contains a wealth of interesting information, but the two columns of interest to us right now are: range_scan_count – the number of range scans (including unrestricted full scans) on a heap or index structure singleton_lookup_count – the number of singleton lookups in a heap or index structure This DMV counts each SE operation, so 10,000 singleton lookups will add 10,000 to the singleton lookup count column, and a table scan that is executed 5 times will add 5 to the range scan count. The Test Rig To explore the behaviour of seeks and scans in detail, we will need to create a test environment. The scripts presented here are best run on SQL Server 2008 Developer Edition, but the majority of the tests will work just fine on SQL Server 2005. A couple of tests use partitioning, but these will be skipped if you are not running an Enterprise-equivalent SKU. Ok, first up we need a database: USE master; GO IF DB_ID('ScansAndSeeks') IS NOT NULL DROP DATABASE ScansAndSeeks; GO CREATE DATABASE ScansAndSeeks; GO USE ScansAndSeeks; GO ALTER DATABASE ScansAndSeeks SET ALLOW_SNAPSHOT_ISOLATION OFF ; ALTER DATABASE ScansAndSeeks SET AUTO_CLOSE OFF, AUTO_SHRINK OFF, AUTO_CREATE_STATISTICS OFF, AUTO_UPDATE_STATISTICS OFF, PARAMETERIZATION SIMPLE, READ_COMMITTED_SNAPSHOT OFF, RESTRICTED_USER ; Notice that several database options are set in particular ways to ensure we get meaningful and reproducible results from the DMVs. In particular, the options to auto-create and update statistics are disabled. There are also three stored procedures, the first of which creates a test table (which may or may not be partitioned). The table is pretty much the same one we used yesterday: The table has 100 rows, and both the key_col and data columns contain the same values – the integers from 1 to 100 inclusive. The table is a heap, with a non-clustered primary key on key_col, and a non-clustered non-unique index on the data column. The only reason I have used a heap here, rather than a clustered table, is so I can demonstrate a seek on a heap later on. The table has an extra column (not shown because I am too lazy to update the diagram from yesterday) called padding – a CHAR(100) column that just contains 100 spaces in every row. It’s just there to discourage SQL Server from choosing table scan over an index + RID lookup in one of the tests. The first stored procedure is called ResetTest: CREATE PROCEDURE dbo.ResetTest @Partitioned BIT = 'false' AS BEGIN SET NOCOUNT ON ; IF OBJECT_ID(N'dbo.Example', N'U') IS NOT NULL BEGIN DROP TABLE dbo.Example; END ; -- Test table is a heap -- Non-clustered primary key on 'key_col' CREATE TABLE dbo.Example ( key_col INTEGER NOT NULL, data INTEGER NOT NULL, padding CHAR(100) NOT NULL DEFAULT SPACE(100), CONSTRAINT [PK dbo.Example key_col] PRIMARY KEY NONCLUSTERED (key_col) ) ; IF @Partitioned = 'true' BEGIN -- Enterprise, Trial, or Developer -- required for partitioning tests IF SERVERPROPERTY('EngineEdition') = 3 BEGIN EXECUTE (' DROP TABLE dbo.Example ; IF EXISTS ( SELECT 1 FROM sys.partition_schemes WHERE name = N''PS'' ) DROP PARTITION SCHEME PS ; IF EXISTS ( SELECT 1 FROM sys.partition_functions WHERE name = N''PF'' ) DROP PARTITION FUNCTION PF ; CREATE PARTITION FUNCTION PF (INTEGER) AS RANGE RIGHT FOR VALUES (20, 40, 60, 80, 100) ; CREATE PARTITION SCHEME PS AS PARTITION PF ALL TO ([PRIMARY]) ; CREATE TABLE dbo.Example ( key_col INTEGER NOT NULL, data INTEGER NOT NULL, padding CHAR(100) NOT NULL DEFAULT SPACE(100), CONSTRAINT [PK dbo.Example key_col] PRIMARY KEY NONCLUSTERED (key_col) ) ON PS (key_col); '); END ELSE BEGIN RAISERROR('Invalid SKU for partition test', 16, 1); RETURN; END; END ; -- Non-unique non-clustered index on the 'data' column CREATE NONCLUSTERED INDEX [IX dbo.Example data] ON dbo.Example (data) ; -- Add 100 rows INSERT dbo.Example WITH (TABLOCKX) ( key_col, data ) SELECT key_col = V.number, data = V.number FROM master.dbo.spt_values AS V WHERE V.[type] = N'P' AND V.number BETWEEN 1 AND 100 ; END; GO The second stored procedure, ShowStats, displays information from the Index Usage Stats and Index Operational Stats DMVs: CREATE PROCEDURE dbo.ShowStats @Partitioned BIT = 'false' AS BEGIN -- Index Usage Stats DMV (QE) SELECT index_name = ISNULL(I.name, I.type_desc), scans = IUS.user_scans, seeks = IUS.user_seeks, lookups = IUS.user_lookups FROM sys.dm_db_index_usage_stats AS IUS JOIN sys.indexes AS I ON I.object_id = IUS.object_id AND I.index_id = IUS.index_id WHERE IUS.database_id = DB_ID(N'ScansAndSeeks') AND IUS.object_id = OBJECT_ID(N'dbo.Example', N'U') ORDER BY I.index_id ; -- Index Operational Stats DMV (SE) IF @Partitioned = 'true' SELECT index_name = ISNULL(I.name, I.type_desc), partitions = COUNT(IOS.partition_number), range_scans = SUM(IOS.range_scan_count), single_lookups = SUM(IOS.singleton_lookup_count) FROM sys.dm_db_index_operational_stats ( DB_ID(N'ScansAndSeeks'), OBJECT_ID(N'dbo.Example', N'U'), NULL, NULL ) AS IOS JOIN sys.indexes AS I ON I.object_id = IOS.object_id AND I.index_id = IOS.index_id GROUP BY I.index_id, -- Key I.name, I.type_desc ORDER BY I.index_id; ELSE SELECT index_name = ISNULL(I.name, I.type_desc), range_scans = SUM(IOS.range_scan_count), single_lookups = SUM(IOS.singleton_lookup_count) FROM sys.dm_db_index_operational_stats ( DB_ID(N'ScansAndSeeks'), OBJECT_ID(N'dbo.Example', N'U'), NULL, NULL ) AS IOS JOIN sys.indexes AS I ON I.object_id = IOS.object_id AND I.index_id = IOS.index_id GROUP BY I.index_id, -- Key I.name, I.type_desc ORDER BY I.index_id; END; The final stored procedure, RunTest, executes a query written against the example table: CREATE PROCEDURE dbo.RunTest @SQL VARCHAR(8000), @Partitioned BIT = 'false' AS BEGIN -- No execution plan yet SET STATISTICS XML OFF ; -- Reset the test environment EXECUTE dbo.ResetTest @Partitioned ; -- Previous call will throw an error if a partitioned -- test was requested, but SKU does not support it IF @@ERROR = 0 BEGIN -- IO statistics and plan on SET STATISTICS XML, IO ON ; -- Test statement EXECUTE (@SQL) ; -- Plan and IO statistics off SET STATISTICS XML, IO OFF ; EXECUTE dbo.ShowStats @Partitioned; END; END; The Tests The first test is a simple scan of the heap table: EXECUTE dbo.RunTest @SQL = 'SELECT * FROM Example'; The top result set comes from the Index Usage Stats DMV, so it is the Query Executor’s (QE) view. The lower result is from Index Operational Stats, which shows statistics derived from the actions taken by the Storage Engine (SE). We see that QE performed 1 scan operation on the heap, and SE performed a single range scan. Let’s try a single-value equality seek on a unique index next: EXECUTE dbo.RunTest @SQL = 'SELECT key_col FROM Example WHERE key_col = 32'; This time we see a single seek on the non-clustered primary key from QE, and one singleton lookup on the same index by the SE. Now for a single-value seek on the non-unique non-clustered index: EXECUTE dbo.RunTest @SQL = 'SELECT data FROM Example WHERE data = 32'; QE shows a single seek on the non-clustered non-unique index, but SE shows a single range scan on that index – not the singleton lookup we saw in the previous test. That makes sense because we know that only a single-value seek into a unique index is a singleton seek. A single-value seek into a non-unique index might retrieve any number of rows, if you think about it. The next query is equivalent to the IN list example seen in the first post in this series, but it is written using OR (just for variety, you understand): EXECUTE dbo.RunTest @SQL = 'SELECT data FROM Example WHERE data = 32 OR data = 33'; The plan looks the same, and there’s no difference in the stats recorded by QE, but the SE shows two range scans. Again, these are range scans because we are looking for two values in the data column, which is covered by a non-unique index. I’ve added a snippet from the Properties window to show that the query plan does show two seek predicates, not just one. Now let’s rewrite the query using BETWEEN: EXECUTE dbo.RunTest @SQL = 'SELECT data FROM Example WHERE data BETWEEN 32 AND 33'; Notice the seek operator only has one predicate now – it’s just a single range scan from 32 to 33 in the index – as the SE output shows. For the next test, we will look up four values in the key_col column: EXECUTE dbo.RunTest @SQL = 'SELECT key_col FROM Example WHERE key_col IN (2,4,6,8)'; Just a single seek on the PK from the Query Executor, but four singleton lookups reported by the Storage Engine – and four seek predicates in the Properties window. On to a more complex example: EXECUTE dbo.RunTest @SQL = 'SELECT * FROM Example WITH (INDEX([PK dbo.Example key_col])) WHERE key_col BETWEEN 1 AND 8'; This time we are forcing use of the non-clustered primary key to return eight rows. The index is not covering for this query, so the query plan includes an RID lookup into the heap to fetch the data and padding columns. The QE reports a seek on the PK and a lookup on the heap. The SE reports a single range scan on the PK (to find key_col values between 1 and 8), and eight singleton lookups on the heap. Remember that a bookmark lookup (RID or Key) is a seek to a single value in a ‘unique index’ – it finds a row in the heap or cluster from a unique RID or clustering key – so that’s why lookups are always singleton lookups, not range scans. Our next example shows what happens when a query plan operator is not executed at all: EXECUTE dbo.RunTest @SQL = 'SELECT key_col FROM Example WHERE key_col = 8 AND @@TRANCOUNT < 0'; The Filter has a start-up predicate which is always false (if your @@TRANCOUNT is less than zero, call CSS immediately). The index seek is never executed, but QE still records a single seek against the PK because the operator appears once in an executed plan. The SE output shows no activity at all. This next example is 2008 and above only, I’m afraid: EXECUTE dbo.RunTest @SQL = 'SELECT * FROM Example WHERE key_col BETWEEN 1 AND 30', @Partitioned = 'true'; This is the first example to use a partitioned table. QE reports a single seek on the heap (yes – a seek on a heap), and the SE reports two range scans on the heap. SQL Server knows (from the partitioning definition) that it only needs to look at partitions 1 and 2 to find all the rows where key_col is between 1 and 30 – the engine seeks to find the two partitions, and performs a range scan seek on each partition. The final example for today is another seek on a heap – try to work out the output of the query before running it! EXECUTE dbo.RunTest @SQL = 'SELECT TOP (2) WITH TIES * FROM Example WHERE key_col BETWEEN 1 AND 50 ORDER BY $PARTITION.PF(key_col) DESC', @Partitioned = 'true'; Notice the lack of an explicit Sort operator in the query plan to enforce the ORDER BY clause, and the backward range scan. © 2011 Paul White email: [email protected] twitter: @SQL_Kiwi

Read the article

Need help unformatting text

- by Axilus

I am currently programming a Visual C# service to receive emails from various sources then I take certain info and organize it in a database using Regex to retrieve the deferent cell values (such as header, body, problem, cost, etc.etc.). My problem is that I am currently using a Hotmail account to email the service which the service then extracts data and writes it to a csv file; however this is all going fine an dandy except for the fact that the text is formated so when there is a "\n" or something of the sort, the program decides to not input the data that follows that into the cell. For instance, if I emailed this: Cost:$1000.00 Body: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed vulputate mattis dolor, a dapibus turpis lacinia mollis. Fusce in enim nulla, sit amet gravida dolor. Suspendisse at nisi velit, vel ornare odio. Integer metus justo, imperdiet et pellentesque in, facilisis dignissim metus. Suspendisse potenti. Vivamus purus nisl, hendrerit sit amet rutrum eu, euismod in felis. Maecenas blandit, metus ac eleifend vulputate, nibh ligula mollis mi, non malesuada nunc tellus ac risus. In at rutrum elit. Proin metus sem, ullamcorper ut rhoncus sed, semper nec tellus. Maecenas adipiscing nisl nec elit egestas vel bibendum justo vehicula. Aliquam erat volutpat. Nullam fermentum enim in magna consequat a lacinia felis iaculis. Ut odio justo, consectetur nec cursus eu, dignissim non sapien. Duis tincidunt fringilla aliquet. Vivamus elementum lobortis massa vel posuere. Aenean non congue odio. Aenean aliquam elit volutpat tortor tempor pharetra. Mauris non est eu orci ultricies lacinia. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Ut vitae orci lectus, sit amet convallis nunc. Vivamus feugiat ante at justo auctor at pretium ante congue. In hac habitasse platea dictumst. Sed at feugiat odio. The body cell would look as follows: <span class=3D"ecxecxApple-style-s= pan" style=3D"font-family:Arial=2C Helvetica=2C sans=3Bfont-size:11px"><p s= tyle=3D"text-align:justify=3Bfont-size:11px=3Bline-height:14px=3Bmargin-rig= ht:0px=3Bmargin-bottom:14px=3Bmargin-left:0px=3Bpadding-top:0px=3Bpadding-r= ight:0px=3Bpadding-bottom:0px=3Bpadding-left:0px">Lorem ipsum dolor sit ame= t=2C consectetur adipiscing elit. Praesent in augue nec justo tempor varius= eu et tellus. Nunc id massa tortor=2C ut lobortis sem. Class aptent taciti= sociosqu ad litora torquent per conubia nostra=2C per inceptos himenaeos. = Maecenas quis nisl nec quam tristique posuere sed at nibh. Cras fringilla v= estibulum metus vel porttitor. Cras iaculis=2C erat nec gravida accumsan=2C= metus felis vestibulum risus=2C quis venenatis nisl nulla sed diam. Aenean= quis viverra velit. Etiam quis massa lectus=2C faucibus facilisis sem. Cur= abitur non eros tellus. Sed at ligula neque. Donec elementum rhoncus volutp= at. Curabitur eu accumsan erat. Phasellus auctor odio dolor=2C ut ornare au= gue. Suspendisse vel est nibh. Vivamus facilisis placerat augue sit amet al= iquam. Maecenas viverra=2C ipsum a tincidunt elementum=2C arcu tellus rutru= m ipsum=2C et dignissim urna orci ac mi. Vivamus non odio massa. Nulla cong= ue massa eu leo pretium non consequat urna molestie.</p><p style=3D"text-al= ign:justify=3Bfont-size:11px=3Bline-height:14px=3Bmargin-right:0px=3Bmargin= -bottom:14px=3Bmargin-left:0px=3Bpadding-top:0px=3Bpadding-right:0px=3Bpadd= ing-bottom:0px=3Bpadding-left:0px">Integer neque odio=2C scelerisque at mol= estie quis=2C congue sed arcu. Praesent a arcu odio. Donec sollicitudin=2C = quam vel tincidunt lobortis=2C urna augue cursus lorem=2C in eleifend nunc = risus nec neque. Donec euismod mauris non nibh blandit sollicitudin. Vivamu= s sed tincidunt augue. Suspendisse iaculis massa ut tellus rutrum auctor. C= ras venenatis consequat urna in viverra. Ut blandit imperdiet dolor non sce= lerisque. Suspendisse potenti. Sed vitae lacus ac odio euismod tempus. Aene= an ut sem odio. Curabitur auctor purus a diam iaculis facilisis. Integer mo= lestie commodo mauris a imperdiet. Nunc aliquet tempus orci sit amet viverr= a.</p><p style=3D"text-align:justify=3Bfont-size:11px=3Bline-height:14px=3B= margin-right:0px=3Bmargin-bottom:14px=3Bmargin-left:0px=3Bpadding-top:0px= =3Bpadding-right:0px=3Bpadding-bottom:0px=3Bpadding-left:0px">Morbi ultrici= es fermentum magna=2C et ultricies urna convallis non. Aenean nibh felis=2C= faucibus et pellentesque ultrices=2C accumsan a ligula. Aliquam vulputate = nisi vitae mi pretium et pretium nulla aliquet. Nam egestas diam vel elit c= ommodo fermentum. Aenean venenatis bibendum tellus=2C eget scelerisque risu= s consequat ut. In porta interdum eleifend. Cras laoreet venenatis pulvinar= .. Praesent ultricies tristique lorem=2C quis interdum arcu scelerisque nec.= Quisque arcu tellus=2C consectetur vel mattis nec=2C feugiat ac quam. Prae= sent sit amet fermentum nulla. Nulla lobortis=2C odio vitae elementum aucto= r=2C libero turpis condimentum mi=2C sed aliquet felis sapien nec tortor. I= nteger vehicula=2C neque in egestas accumsan=2C felis metus sagittis nulla= =2C eu dapibus ligula ipsum ut sapien. Nulla quis urna tortor=2C sed facili= sis leo. In at metus sed velit venenatis varius. Fusce aliquam mattis enim= =2C vitae tincidunt sem cursus in.</p><p style=3D"text-align:justify=3Bfont= -size:11px=3Bline-height:14px=3Bmargin-right:0px=3Bmargin-bottom:14px=3Bmar= gin-left:0px=3Bpadding-top:0px=3Bpadding-right:0px=3Bpadding-bottom:0px=3Bp= adding-left:0px">Proin tincidunt ligula at ligula bibendum vitae condimentu= m nunc congue. Curabitur ac magna nibh=2C vel accumsan nisl. Duis nec eros = et purus vestibulum tincidunt at sit amet libero. Donec eu nibh eros. Pelle= ntesque habitant morbi tristique senectus et netus et malesuada fames ac tu= rpis egestas. Donec accumsan=2C tellus at luctus faucibus=2C est nibh sempe= r diam=2C vitae adipiscing lorem tellus vel nulla. Donec eget ipsum ut lore= m tristique ultricies. Aliquam sem diam=2C semper sit amet volutpat pretium= =2C lobortis et eros. Sed vel iaculis metus. Phasellus malesuada elementum = porta.</p><p style=3D"text-align:justify=3Bfont-size:11px=3Bline-height:14p= x=3Bmargin-right:0px=3Bmargin-bottom:14px=3Bmargin-left:0px=3Bpadding-top:0= px=3Bpadding-right:0px=3Bpadding-bottom:0px=3Bpadding-left:0px">Fusce tinci= dunt dignissim massa quis dapibus. Sed aliquet consequat orci=2C eu cursus = libero dapibus vitae. Pellentesque at felis felis=2C vitae condimentum libe= ro. Vivamus eros erat=2C elementum et tristique vitae=2C mattis et neque. P= raesent bibendum leo ac tortor congue id mollis libero ornare. Pellentesque= adipiscing accumsan mi=2C a bibendum purus dignissim id. Cum sociis natoqu= e penatibus et magnis dis parturient montes=2C nascetur ridiculus mus. Morb= i mollis nisi in nibh cursus facilisis. Ut eu quam dolor=2C sit amet congue= orci. Aliquam quam dolor=2C viverra vitae varius sed=2C molestie et quam. = Suspendisse purus mauris=2C fermentum condimentum pharetra at=2C molestie a= nunc. Nam rhoncus euismod venenatis. Nam pellentesque quam ac ipsum volutp= at a eleifend odio imperdiet. Class aptent taciti sociosqu ad litora torque= nt per conubia nostra=2C per inceptos himenaeos. Nulla in nunc magna. Lorem= ipsum dolor sit amet=2C consectetur adipiscing elit. Donec pretium tincidu= nt gravida.</p></span> As you can tell I need a way to get rid of all that html junk and make it readable again. Is there anyway to do this with Regex? Or an easier way if possible. Cheers

Read the article

How John Got 15x Improvement Without Really Trying

- by rchrd

The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here. How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

Read the article

Retrieving saved checkboxes' name and values from database

- by sermed

I have a form with checkboxes, each one has a value. When the registered user select any checkbox the value is incremented (the summation) and then then registred user save his selection of checkbox if he satisfied with the result of summation into database all this work fine ...i want to enable the registred user to view his selection history by retriving and displaying the checkboxes he selected in a page with thier values ... How I can do that? I'm just able to save the selected checkboxes as choice 1, choice 2, for example .. I want to view the selected checkboxes that is saved in database as the appear in the page when the user first select them: for example if the registred user selects these 3 options LEAD DEEP KEEL (1825) FULLY BATTENED MAINSAIL (558) TEAK SIDE DECKS (2889) They will be saved as for example (choice1, choice2, choice3). But if he want to view selected checkboxes the appear exactly as first he selects them: LEAD DEEP KEEL (1825) FULLY BATTENED MAINSAIL (558) TEAK SIDE DECKS (2889) This is my user table: $query="CREATE TABLE User( user_id varchar(20), password varchar(40), user_type varchar(20), firstname varchar(30), lastname varchar(30), street varchar(50), city varchar(50), county varchar(50), post_code varchar(10), country varchar(50), gender varchar(6), dob varchar(15), tel_no varchar(50), vals varchar(50), email varchar(50))"; and the code to inser the options selected to database <?php include("databaseconnection.php"); $str = ''; foreach($_POST as $key => $val) if (strpos($key,'choice') !== false) $str .= $key.','; $query = "INSERT INTO User (vals) VALUES('$str')"; $result=mysql_query($query,$conn); if ($result) { (mysql_error(); } else { echo " done"; } ?> And this is my form: function checkTotal() { document.listForm.total.value = ''; var sum = 0; for (i=0;i <form name="listForm" method="post" action="insert_options.php" > <TABLE cellPadding=3 width=600 border=0> <TBODY> <TR> <TH align=left width="87%" bgColor=#b0b3b4><SPAN class=whiteText>Item</SPAN></TH> <TH align=right width="13%" bgColor=#b0b3b4><SPAN class=whiteText>Select</SPAN></TH></TR> <TR> <TD bgcolor="#9da8af"colSpan=2><SPAN class=normalText><B>General</B></SPAN></TD></TR> <TR> <TD bgcolor="#c4c8ca"><SPAN class=normalText >TEAK SIDE DECKS (2889)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="2889" type="checkbox" onchange="checkTotal()" /></TD></TR> <TR> <TD bgColor=#c4c8ca><SPAN class=normalText>LEAD DEEP KEEL (1825)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="1825" type="checkbox" onchange="checkTotal()"></TD></TR> <TR> <TD bgColor=#c4c8ca><SPAN class=normalText>FULLY BATTENED MAINSAIL (558)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="558" type="checkbox" onchange="checkTotal()"></TD></TR> <TR> <TD bgColor=#c4c8ca><SPAN class=normalText>HIGH TECH SAILS FOR CONVENTIONAL RIG (1979)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="1979" type="checkbox" onchange="checkTotal()"></TD></TR> <TR> <TD bgColor=#c4c8ca><SPAN class=normalText>IN MAST REEFING WITH HIGH TECH SAILS (2539)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="2539" type="checkbox" onchange="checkTotal()"></TD></TR> <TR> <TD bgColor=#c4c8ca><SPAN class=normalText>SPlNNAKER GEAR (POLE LINES DECK FITTINGS) (820)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="820" type="checkbox" onchange="checkTotal()"></TD></TR> <TR> <TD bgColor=#c4c8ca><SPAN class=normalText>SPINNAKER POLE VERTICAL STOWAGE SYSTEM (214)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="214" type="checkbox" onchange="checkTotal()"></TD></TR> <TR> <TD bgColor=#c4c8ca><SPAN class=normalText>GAS ROD KICKER (208)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="208" type="checkbox" onchange="checkTotal()"></TD></TR> <TR> <TD bgColor=#c4c8ca><SPAN class=normalText>SIDE RAIL OPENINGS (BOTH SIDES) (392)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="392" type="checkbox" onchange="checkTotal()"></TD></TR> <TR> <TD bgColor=#c4c8ca><SPAN class=normalText>SPRING CLEATS MIDSHIPS -ALUMIMIUM (148)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="148" type="checkbox" onchange="checkTotal()"></TD></TR> <TR> <TD bgColor=#c4c8ca><SPAN class=normalText>ELECTRIC ANCHOR WINDLASS (1189)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="1189" type="checkbox" onchange="checkTotal()"> </TD></TR> <TR> <TD bgColor=#c4c8ca><SPAN class=normalText>ANCHOR CHAIN GALVANISED (50m) (202)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="202" type="checkbox" onchange="checkTotal()"> </TD></TR> <TR> <TD bgColor=#c4c8ca><SPAN class=normalText>ANCHOR CHAIN GALVANISED (50m) (1141)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="1141" type="checkbox" onchange="checkTotal()"></TD></TR> <TR> <TD bgcolor="#9da8af"colSpan=2><SPAN class=normalText><B>NAVIGATION & ELECTRONICS</B></SPAN></TD></TR> <TR> <TD bgcolor="#c4c8ca"><SPAN class=normalText >WIND VANE (STAINLESS STEEL)(41)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="41" type="checkbox" onchange="checkTotal()" /></TD></TR> <TR> <TD bgColor=#c4c8ca><SPAN class=normalText>RAYMARINE ST6O LOG & DEPTH (SEPARATE UNITS)(226)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="226" type="checkbox" onchange="checkTotal()"></TD></TR> <TR> <TD bgcolor="#9da8af"colSpan=2><SPAN class=normalText><B>ENGINES & ELECTRICS</B></SPAN></TD></TR> <TR> <TD bgColor=#c4c8ca><SPAN class=normalText>SHORE SUPPLY (220V) WITH 3 OUTLETS (EXCLUDJNG SHORE CABLE) (327)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="327" type="checkbox" onchange="checkTotal()"></TD></TR> <TR> <TD bgColor=#c4c8ca><SPAN class=normalText>3rd BATTERY(14OA/H)(196)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="196" type="checkbox" onchange="checkTotal()"></TD></TR> <TD bgColor=#c4c8ca><SPAN class=normalText>24 AMP BATTERY CHARGER (475)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="475" type="checkbox" onchange="checkTotal()"></TD></TR> <TD bgColor=#c4c8ca><SPAN class=normalText>2 BLADED FOLDING PROPELLER (UPGRADE)(299)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="299" type="checkbox" onchange="checkTotal()"></TD></TR> <TR> <TD bgcolor="#9da8af"colSpan=2><SPAN class=normalText><B>BELOW DECKS/DOMESTIC</B></SPAN></TD></TR> <TD bgColor=#c4c8ca><SPAN class=normalText>WARM WATER (FROM ENGINE & 220V)(749)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="749" type="checkbox" onchange="checkTotal()"></TD></TR> <TD bgColor=#c4c8ca><SPAN class=normalText>SHOWER IN AFT HEADS WITH PUMPOUT(446)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="446" type="checkbox" onchange="checkTotal()"></TD></TR> <TD bgColor=#c4c8ca><SPAN class=normalText>DECK SUCTION DISPOSAL FOR HOLDINGTANK(166)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="166" type="checkbox" onchange="checkTotal()"></TD></TR> <TD bgColor=#c4c8ca><SPAN class=normalText>REFRIGERATED COOLBOX (12V)(666)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="666" type="checkbox" onchange="checkTotal()"></TD></TR> <TD bgColor=#c4c8ca><SPAN class=normalText>LFS SAFETY PACKAGE (COCKPIT HARNESS POINTS STAINLESS STEEL JACKSTAYS)(208)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="208" type="checkbox" onchange="checkTotal()"></TD></TR> <TD bgColor=#c4c8ca><SPAN class=normalText>UPHOLSTERY UPGRADE IN SALOON (SUEDETYPE)(701)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="701" type="checkbox" onchange="checkTotal()"></TD></TR> <TR> <TD bgcolor="#9da8af"colSpan=2><SPAN class=normalText><B>NAVIGATION ELECTRONICS & ELECTRICS</B></SPAN></TD></TR> <TD bgColor=#c4c8ca><SPAN class=normalText>VHF RADIO AERIAL CABLED TO NAVIGATION AREA(178)</SPAN></TD> <TD align=right bgColor=#c4c8ca><input name="choice" value="178" type="checkbox" onchange="checkTotal()"></TD></TR> </table>

Developer IT