Search Results

Search found 2628 results on 106 pages for 'lexical analysis'.

Page 31/106 | < Previous Page | 27 28 29 30 31 32 33 34 35 36 37 38 | Next Page >

Load and Web Performance Testing using Visual Studio Ultimate 2010-Part 3

- by Tarun Arora

Welcome back once again, in Part 1 of Load and Web Performance Testing using Visual Studio 2010 I talked about why Performance Testing the application is important, the test tools available in Visual Studio Ultimate 2010 and various test rig topologies, in Part 2 of Load and Web Performance Testing using Visual Studio 2010 I discussed the details of web performance & load tests as well as why it’s important to follow a goal based pattern while performance testing your application. In part 3 I’ll be discussing Test Result Analysis, Test Result Drill through, Test Report Generation, Test Run Comparison, Asp.net Profiler and some closing thoughts. Test Results – I see some creepy worms! In Part 2 we put together a web performance test and a load test, lets run the test to see load test to see how the Web site responds to the load simulation. While the load test is running you will be able to see close to real time analysis in the Load Test Analyser window. You can use the Load Test Analyser to conduct load test analysis in three ways: Monitor a running load test - A condensed set of the performance counter data is maintained in memory. To prevent the results memory requirements from growing unbounded, up to 200 samples for each performance counter are maintained. This includes 100 evenly spaced samples that span the current elapsed time of the run and the most recent 100 samples. After the load test run is completed - The test controller spools all collected performance counter data to a database while the test is running. Additional data, such as timing details and error details, is loaded into the database when the test completes. The performance data for a completed test is loaded from the database and analysed by the Load Test Analyser. Below you can see a screen shot of the summary view, this provides key results in a format that is compact and easy to read. You can also print the load test summary, this is generated after the test has completed or been stopped. Analyse the load test results of a previously run load test – We’ll see this in the section where i discuss comparison between two test runs. The performance counters can be plotted on the graphs. You also have the option to highlight a selected part of the test and view details, drill down to the user activity chart where you can hover over to see more details of the test run. Generate Report => Test Run Comparisons The level of reports you can generate using the Load Test Analyser is astonishing. You have the option to create excel reports and conduct side by side analysis of two test results or to track trend analysis. The tools also allows you to export the graph data either to MS Excel or to a CSV file. You can view the ASP.NET profiler report to conduct further analysis as well. View Data and Diagnostic Attachments opens the Choose Diagnostic Data Adapter Attachment dialog box to select an adapter to analyse the result type. For example, you can select an IntelliTrace adapter, click OK and open the IntelliTrace summary for the test agent that was used in the load test. Compare results This creates a set of reports that compares the data from two load test results using tables and bar charts. I have taken these screen shots from the MSDN documentation, I would highly recommend exploring the wealth of knowledge available on MSDN. Leaving Thoughts While load testing the application with an excessive load for a longer duration of time, i managed to bring the IIS to its knees by piling up a huge queue of requests waiting to be processed. This clearly means that the IIS had run out of threads as all the threads were busy processing existing request, one easy way of fixing this is by increasing the default number of allocated threads, but this might escalate the problem. The better suggestion is to try and drill down to the actual root cause of the problem. When ever the garbage collection runs it stops processing any pages so all requests that come in during that period are queued up, but realistically the garbage collection completes in fraction of a a second. To understand this better lets look at the .net heap, it is divided into large heap and small heap, anything greater than 85kB in size will be allocated to the Large object heap, the Large object heap is non compacting and remember large objects are expensive to move around, so if you are allocating something in the large object heap, make sure that you really need it! The small object heap on the other hand is divided into generations, so all objects that are supposed to be short-lived are suppose to live in Gen-0 and the long living objects eventually move to Gen-2 as garbage collection goes through. As you can see in the picture below all < 85 KB size objects are first assigned to Gen-0, when Gen-0 fills up and a new object comes in and finds Gen-0 full, the garbage collection process is started, the process checks for all the dead objects and assigns them as the valid candidate for deletion to free up memory and promotes all the remaining objects in Gen-0 to Gen-1. So in the future when ever you clean up Gen-1 you have to clean up Gen-0 as well. When you fill up Gen – 0 again, all of Gen – 1 dead objects are drenched and rest are moved to Gen-2 and Gen-0 objects are moved to Gen-1 to free up Gen-0, but by this time your Garbage collection process has started to take much more time than it usually takes. Now as I mentioned earlier when garbage collection is being run all page requests that come in during that period are queued up. Does this explain why possibly page requests are getting queued up, apart from this it could also be the case that you are waiting for a long running database process to complete. Lets explore the heap a bit more… What is really a case of crisis is when the objects are living long enough to make it to Gen-2 and then dying, this is definitely a high cost operation. But sometimes you need objects in memory, for example when you cache data you hold on to the objects because you need to use them right across the user session, which is acceptable. But if you wanted to see what extreme caching can do to your server then write a simple application that chucks in a lot of data in cache, run a load test over it for about 10-15 minutes, forcing a lot of data in memory causing the heap to run out of memory. If you get to such a state where you start running out of memory the IIS as a mode of recovery restarts the worker process. It is great way to free up all your memory in the heap but this would clear the cache. The problem with this is if the customer had 10 items in their shopping basket and that data was stored in the application cache, the user basket will now be empty forcing them either to get frustrated and go to a competitor website or if the customer is really patient, give it another try! How can you address this, well two ways of addressing this; 1. Workaround – A x86 bit processor only allows a maximum of 4GB of RAM, this means the machine effectively has around 3.4 GB of RAM available, the OS needs about 1.5 GB of RAM to run efficiently, the IIS and .net framework also need their share of memory, leaving you a heap of around 800 MB to play with. Because Team builds by default build your application in ‘Compile as any mode’ it means the application is build such that it will run in x86 bit mode if run on a x86 bit processor and run in a x64 bit mode if run on a x64 but processor. The problem with this is not all applications are really x64 bit compatible specially if you are using com objects or external libraries. So, as a quick win if you compiled your application in x86 bit mode by changing the compile as any selection to compile as x86 in the team build, you will be able to run your application on a x64 bit machine in x86 bit mode (WOW – By running Windows on Windows) and what that means is, you could use 8GB+ worth of RAM, if you take away everything else your application will roughly get a heap size of at least 4 GB to play with, which is immense. If you need a heap size of more than 4 GB you have either build a software for NASA or there is something fundamentally wrong in your application. 2. Solution – Now that you have put a workaround in place the IIS will not restart the worker process that regularly, which means you can take a breather and start working to get to the root cause of this memory leak. But this begs a question “How do I Identify possible memory leaks in my application?” Well i won’t say that there is one single tool that can tell you where the memory leak is, but trust me, ‘Performance Profiling’ is a great start point, it definitely gets you started in the right direction, let’s have a look at how. Performance Wizard - Start the Performance Wizard and select Instrumentation, this lets you measure function call counts and timings. Before running the performance session right click the performance session settings and chose properties from the context menu to bring up the Performance session properties page and as shown in the screen shot below, check the check boxes in the group ‘.NET memory profiling collection’ namely ‘Collect .NET object allocation information’ and ‘Also collect the .NET Object lifetime information’. Now if you fire off the profiling session on your pages you will notice that the results allows you to view ‘Object Lifetime’ which shows you the number of objects that made it to Gen-0, Gen-1, Gen-2, Large heap, etc. Another great feature about the profile is that if your application has > 5% cases where objects die right after making to the Gen-2 storage a threshold alert is generated to alert you. Since you have the option to also view the most expensive methods and by capturing the IntelliTrace data you can drill in to narrow down to the line of code that is the root cause of the problem. Well now that we have seen how crucial memory management is and how easy Visual Studio Ultimate 2010 makes it for us to identify and reproduce the problem with the best of breed tools in the product. Caching One of the main ways to improve performance is Caching. Which basically means you tell the web server that instead of going to the database for each request you keep the data in the webserver and when the user asks for it you serve it from the webserver itself. BUT that can have consequences! Let’s look at some code, trust me caching code is not very intuitive, I define a cache key for almost all searches made through the common search page and cache the results. The approach works fine, first time i get the data from the database and second time data is served from the cache, significant performance improvement, EXCEPT when two users try to do the same operation and run into each other. But it is easy to handle this by adding the lock as you can see in the snippet below. So, as long as a user comes in and finds that the cache is empty, the user locks and starts to get the cache no more concurrency issues. But lets say you are processing 10 requests per second, by the time i have locked the operation to get the results from the database, 9 other users came in and found that the cache key is null so after i have come out and populated the cache they will still go in to get the results again. The application will still be faster because the next set of 10 users and so on would continue to get data from the cache. BUT if we added another null check after locking to build the cache and before actual call to the db then the 9 users who follow me would not make the extra trip to the database at all and that would really increase the performance, but didn’t i say that the code won’t be very intuitive, may be you should leave a comment you don’t want another developer to come in and think what a fresher why is he checking for the cache key null twice !!! The downside of caching is, you are storing the data outside of the database and the data could be wrong because the updates applied to the database would make the data cached at the web server out of sync. So, how do you invalidate the cache? Well if you only had one way of updating the data lets say only one entry point to the data update you can write some logic to say that every time new data is entered set the cache object to null. But this approach will not work as soon as you have several ways of feeding data to the system or your system is scaled out across a farm of web servers. The perfect solution to this is Micro Caching which means you cache the query for a set time duration and invalidate the cache after that set duration. The advantage is every time the user queries for that data with in the time span for which you have cached the results there are no calls made to the database and the data is served right from the server which makes the response immensely quick. Now figuring out the appropriate time span for which you micro cache the query results really depends on the application. Lets say your website gets 10 requests per second, if you retain the cache results for even 1 minute you will have immense performance gains. You would reduce 90% hits to the database for searching. Ever wondered why when you go to e-bookers.com or xpedia.com or yatra.com to book a flight and you click on the book button because the fare seems too exciting and you get an error message telling you that the fare is not valid any more. Yes, exactly => That is a cache failure! These travel sites or price compare engines are not going to hit the database every time you hit the compare button instead the results will be served from the cache, because the query results are micro cached, its a perfect trade-off, by micro caching the results the site gains 100% performance benefits but every once in a while annoys a customer because the fare has expired. But the trade off works in the favour of these sites as they are still able to process up to 30+ page requests per second which means cater to the site traffic by may be losing 1 customer every once in a while to a competitor who is also using a similar caching technique what are the odds that the user will not come back to their site sooner or later? Recap Resources Below are some Key resource you might like to review. I would highly recommend the documentation, walkthroughs and videos available on MSDN. You can always make use of Fiddler to debug Web Performance Tests. Some community test extensions and plug ins available on Codeplex might also be of interest to you. The Road Ahead Thank you for taking the time out and reading this blog post, you may also want to read Part I and Part II if you haven’t so far. If you enjoyed the post, remember to subscribe to http://feeds.feedburner.com/TarunArora. Questions/Feedback/Suggestions, etc please leave a comment. Next ‘Load Testing in the cloud’, I’ll be working on exploring the possibilities of running Test controller/Agents in the Cloud. See you on the other side! Thank You! Share this post : CodeProject

Read the article
CodePlex Daily Summary for Tuesday, November 01, 2011

CodePlex Daily Summary for Tuesday, November 01, 2011Popular ReleasesAgFx Windows Phone Application Data Caching Framework: AgFx November Update: This update is primarily a bug fix release, which provides a number of stability and performance enhancements to AgFx. Available as a NuGet package here. Release Notes Fix IsoStore exceptions during shutdown (Changelist 81729) Thread safety and sequencings issues (Changelists 78682,79252, 80707) Fix landscape layout issue in auth sample Fix async keyword collision Supporting overlapping/multiple notifications for Load requestsiTuner - The iTunes Companion: iTuner 1.4.4322: Added German (unverified, apologies if incorrect) Properly source invariant resources with correct resIDs Replaced obsolete lyric providers with working providers Fix Pseudolater to correctly morph every third char Fix null reference in CatalogBaseDevpad: 4.6: Whats new for Devpad 4.6: New Recent Files New Run in Safari Minor Bug Fix's, improvements and speed upsWindows Workflow Foundation on Codeplex: Microsoft.Activities v1.8.8: Microsoft.Activities Overview How do I install Microsoft.Activities? Updates in this release9318Technical Analysis Engine for .NET: Technical Analysis Engine 1.25: What's new in the 1.25 release (2011-11-01): - Added Williams %R indicator - Added Moving Average Envelopes indicatorBF3Rcon.NET: BF3Rcon 3.0: This release is targeted for RCON documentation based on R3. Everything should be beta stable, but it's alpha because I haven't been able to fully test it. When a stable release is ready, a proper changelog will be kept. Important Edit: There's one method that will keep this from working in Mono. GeneratePasswordHash uses void HashAlgorithm.Dispose(), which isn't in Mono. This will have to be changed to Clear() in the next release. If anyone needs a Mono version of this immediately, you can...BoxWorld: BoxWorld_2011.10.30: BoxWorld - 8.0.1110.30 This is the initial release of BoxWorld. I'd recommend downloading the installer as it contains the compiled code and everything all nicely contained. By default, you end up with this directory structure: C:\Program Files\ViperWorks\BoxWorld C:\Program Files\ViperWorks\BoxWorld\Data C:\Program Files\ViperWorks\BoxWorld\Interface C:\Program Files\ViperWorks\BoxWorld\Source In the root you have the compiled EXE files, one for the main release, one for the LITE release ...VidCoder: 1.2.1: Fixed a couple regressions: video encoder was blank in queue and crashes with the High Profile preset when opening the Settings window. Fixed problem with auto-update introduced in 1.2.0. If you have 1.2.0 you will need to update manually to get this.AssaultCube Reloaded: Release 2.3: THE RELEASE YOU'VE ALL BEEN WAITING FOR! IT CAN NOW BE CONSIDERED STABLE Linux has Debian 64-bit precompiled binaries, but you can compile your own as it also contains the source. If you are using Mac or other operating systems, download the Linux package. The server pack is ready for both Windows and Linux, but you might need to compile your own for linux (source included) If you are using Windows and require the source code, download the source package!Koober: Koober - The Ebook Creator 0.2: The official release of Koober as Open source. Koober is a ebook creator for Windows, and Koob Reader is the reader.A Microblog API (SINA weibo.com open API in C#, .Net???????API): AMicroblogAPI v1.0: AMicroBlogAPI is a C# implementation, a strong-typed deep encapsulation, a easy-to-use .net wrapper of SINA microblog API v1.0. App developers no longer need to parse various HTTP responses -- all responses are parsed into strong types; No longer need to construt the request query strings -- just simply gives the values of parameters; No longer need to implement the OAuth -- a single call could logs the user on. In this release (AMicroblogAPI v1.0), all basic data APIs are implemented. For ...patterns & practices: Enterprise Library Contrib: Enterprise Library Contrib - 5.0 (Oct 2011): This release of Enterprise Library Contrib is based on the Microsoft patterns & practices Enterprise Library 5.0 core and contains the following: Common extensionsTypeConfigurationElement<T> - A Polymorphic Configuration Element without having to be part of a PolymorphicConfigurationElementCollection. AnonymousConfigurationElement - A Configuration element that can be uniquely identified without having to define its name explicitly. Data Access Application Block extensionsMySql Provider - ...Network Monitor Open Source Parsers: Network Monitor Parsers 3.4.2748: The Network Monitor Parsers packages contain parsers for more than 400 network protocols, including RFC based public protocols and protocols for Microsoft products defined in the Microsoft Open Specifications for Windows and SQL Server. NetworkMonitor_Parsers.msi is the base parser package which defines parsers for commonly used public protocols and protocols for Microsoft Windows. In this release, NetowrkMonitor_Parsers.msi continues to improve quality and fix bugs. It has included the fo...Duckworth Lewis Professional Edition Calculator: DLcalc 3.0: DLcalc 3.0 can perform Duckworth/Lewis Professional Edition calculations 100% accurately. It also produces over-by-over and ball-by-ball PAR score tables.Media Companion: MC 3.420b Weekly: Ensure .NET 4.0 Full Framework is installed. (Available from http://www.microsoft.com/download/en/details.aspx?id=17718) Ensure the NFO ID fix is applied when transitioning from versions prior to 3.416b. (Details here) Movies Fixed: Fanart and poster scraping issues TV Shows (Re)Added: Rebuild single show Fixed: Issue when shows are moved from original location Ability to handle " for actor nicknames Crash when episode name contains "<" (does not scrape yet) Clears fanart when switch...patterns & practices - Unity: Unity 3.0 for .NET4.5 Preview: The Unity 3.0.1026.0 Preview enables Unity to work on .NET 4.5 with both the WinRT and desktop profiles. The major changes include: Unity projects updated to target .NET 4.5. Dynamic build plans modified to use compiled lambda expressions instead of Reflection.Emit Converting reflection to use the new TypeInfo for reflection. Projects updated to work with the Microsoft Visual Studio 2011 Preview Notes/Known Issues: The Microsoft.Practices.Unity.UnityServiceLocator class cannot be use...Managed Extensibility Framework: MEF 2 Preview 4: Detailed information on this release is available on the BCL team blog.AcDown????? - Anime&Comic Downloader: AcDown????? v3.6: ?? ● AcDown??????????、??????，??????????????????????，???????Acfun、Bilibili、???、???、???、Tucao.cc、SF???、?????80????，???????????、?????????。 ● AcDown???????????????????????????，???，???????????????????。 ● AcDown???????C#??，????.NET Framework 2.0??。?????"Acfun?????"。 ????32??64? Windows XP/Vista/7 ????????????? ??：????????Windows XP???,?????????.NET Framework 2.0???(x86)?.NET Framework 2.0???(x64)，?????"?????????"??? ??????????????，??????????： ??"AcDown?????"????????? ?? v3.6?? ??“????”...MVC Controls Toolkit: Mvc Controls Toolkit 1.5.0: Added: The new Client Blocks feaure of Views A new "move" js method for the TreeViews The NewHtmlCreated js event to the DataGrid Improved the ChoiceList structure that now allows also the selection list of a dropdown to be chosen with a lambda expression Improved the AcceptViewHintAttribute controller filter. Now a client can specify not only the name of a View or Partial View it prefers, but also to receive just the rough data in Json format. Now the the SMinimum and SMaximum par...Free SharePoint Master Pages: Buried Alive (Halloween) Theme: Release Notes *Created for Halloween, you will find theme file, custom css file and images. *Created by Al Roome @AlstarRoome Features: Custom styling for web part Custom background *Screenshot https://s3.amazonaws.com/kkhipple/post/sharepoint-showcase-halloween.pngNew ProjectsarlTH: "Airport Link Thailand" Application for Windows Phone 7.1Audio Tags Editor For Excel: The purpose of the project is to create a tool to edit tags of audio files through the excel application. The program is an Excel add-in.AW Load Tester: AW Load Tester is a simple and easy to use website load tester that very closely mimics a regular users browsing session on your website. Many popular load tester only download HTML. AW Load Tester downloads all the related items on a page simultaneously. AzMan - Be Good Do Right !: The AzMan is a collection of reusable software components designed to assist software developers with common enterprise development challenges. http://bbs.fengshupo.com/BestWayWP7UI: The User interface of best wat wp7 applicationCheckout Subsystem Optimizer: Checkout Subsystem OptimizerDBScripter - Library for scripting SQL Server database objects: This project is library that allows users to script SQL Server database objects. Library uses dynamic management views for extracting data about databases objects. This library could be used in various situations. The most interesting areas are comparing database objects and generation database documentation. For both cases examples have been prepared.Dependency Checker: Dependency Checker is a utility for verifying that a set of prerequisites is present on a machine. The set of dependencies is defined in a configuration file. For more information, see http://dependencychecker.codeplex.com/documentation DibTer: DibTer is a brand new CMS for ASP.NET Mainly features are NHibernate MVC JQuery HTML5Dynamics CRM Entity Based Stress Tool: Entity Based Stress Tool, is command line program that creates configurable server agents in order to stress a defined entity at a defined Message/Stage. This tool allows CRM Application Stressing, Plugin Stressing, and enlaborate an execution report.Edinger: Edinger is a simple text editor.FreeboxHDVideoPlayer: FreeboxHDVideoPlayer makes it easier for free.fr french ISP users to play and manage videos stored on the HDD of the Freebox HD. FreeboxHDVideoPlayer simplifie, pour les utilisateurs de free.fr, la lecture et la gestion des vidéos stockées sur le disque du boitier Freebox HD.guglex - google docs xtraz: my studying project for asp.net mvc 3. now displaying spreadsheet rows in card viewIP-Camera MJPEG HTTP Web-Server Emulator: Turns web-camera into IP-camera with mjpeg streaming and access it over network. Usage of i-Lids (Imagery Library for Intelligent Detection Systems) and other data for surveillance systems development. Sends WMV, folder with images, web-camera data over the wire.LIM: LIM is .net software for archive manage.Luminji.CorWeb: luminji's company web.Map Generator: Map Generator for your Civ/Col/Roguelike games in VB.NET.Microformat Parsers for .NET: Microformat's Parsers for .NET helps you to collect information you run into on the web, that is stored by means of microformats. It's written in C# 3.0.Mini Rx: Mini Rx provides LINQ like queries over events for C# and F# using extension methods, somewhat like the Reactive Extensions (Rx) but in a lightweight open source package with one non-strong named assembly.MVC Music Store by NNT: Demo Music Store Follow tut from Microsoft. Team Explorer VS 2010 .NET 4PHP MySQL Web Site Creator: PHP MySQL Web Site Creator helps you create a basic template for your web site using a UI form. I creates all PHP, CSS, HTML basic pages to connect to MySQL database.PQLRD: Paco KLk!Project Web And Application: Project MobileStore in ASP use ASP.NET MVC 3RoboZip: RoboZip combines MS Robocopy and Zip-functionality. Create a job and RoboZip will zip all files (from a list of filetypes) within a time span in the past. The zip file name can contain date and time values of the time period it covers. It's developed in C#. The zip component is the DotNetZip Library from http://dotnetzip.codeplex.com. The logging is done using Apache log4net from http://logging.apache.org/log4net. The idea here is not to only present the code, but also to document how the de...Scriptster Gearbox: Scriptster Gearbox is a full C# scripting system based on Scriptster (also on CodePlex). It is aimed at Visual Studio developers and power users who need to automate the sorts of things that DOS is generally good at, but which using C# can make more powerful and more familiar.Technical Analysis Engine for .NET: Technical Analysis Engine is a .NET based library for technical analysis. It is fast, free, easy to use and provides functionality for financial market analysis.ValidationDSL: A Fluent validation engine with ease to use and reusable around the multi-tenant applicationsWebFormsUtilities: WebFormsUtilities is a collection of tools that assist in using MVC/MVP functionality to webforms in an unobtrusive manner. This differs from a hybrid MVC/WebForms page in that you can use methods like UpdateModel() or TryValidateModel() in a WebForms Page class.Wen - WPF 3D Navigator: Control the camera and the lights of a WPF 3D application without any code modification.WindStudios: WindStudiosWrite My Expect Statement: Have Rhino Mocks automatically generate your expect() and return() statements for you.zion xtraz: some basic helpful classes for use in every project

Read the article
Using R to Analyze G1GC Log Files

- by user12620111

Using R to Analyze G1GC Log Files body, td { font-family: sans-serif; background-color: white; font-size: 12px; margin: 8px; } tt, code, pre { font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace; } h1 { font-size:2.2em; } h2 { font-size:1.8em; } h3 { font-size:1.4em; } h4 { font-size:1.0em; } h5 { font-size:0.9em; } h6 { font-size:0.8em; } a:visited { color: rgb(50%, 0%, 50%); } pre { margin-top: 0; max-width: 95%; border: 1px solid #ccc; white-space: pre-wrap; } pre code { display: block; padding: 0.5em; } code.r, code.cpp { background-color: #F8F8F8; } table, td, th { border: none; } blockquote { color:#666666; margin:0; padding-left: 1em; border-left: 0.5em #EEE solid; } hr { height: 0px; border-bottom: none; border-top-width: thin; border-top-style: dotted; border-top-color: #999999; } @media print { * { background: transparent !important; color: black !important; filter:none !important; -ms-filter: none !important; } body { font-size:12pt; max-width:100%; } a, a:visited { text-decoration: underline; } hr { visibility: hidden; page-break-before: always; } pre, blockquote { padding-right: 1em; page-break-inside: avoid; } tr, img { page-break-inside: avoid; } img { max-width: 100% !important; } @page :left { margin: 15mm 20mm 15mm 10mm; } @page :right { margin: 15mm 10mm 15mm 20mm; } p, h2, h3 { orphans: 3; widows: 3; } h2, h3 { page-break-after: avoid; } } pre .operator, pre .paren { color: rgb(104, 118, 135) } pre .literal { color: rgb(88, 72, 246) } pre .number { color: rgb(0, 0, 205); } pre .comment { color: rgb(76, 136, 107); } pre .keyword { color: rgb(0, 0, 255); } pre .identifier { color: rgb(0, 0, 0); } pre .string { color: rgb(3, 106, 7); } var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("")}while(p!=v.node);s.splice(r,1);while(r'+M[0]+""}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L1){O=D[D.length-2].cN?"":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.rr.keyword_count+r.r){r=s}if(s.keyword_count+s.rp.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((]+|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML=""+y.value+"";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p|=||=||=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=|| Using R to Analyze G1GC Log Files Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. } A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z,]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace. # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\$ " , " " ); gsub ( "\ $" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller that the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ... ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collect in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. Lets clip the data to the interesting period: par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cummulative Full GC line), changes near midnight. plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage coolection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage colloection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explaination for the large amount of referenced data resident in the Java heap. There are two are posibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the aplication needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.

Read the article
Design pattern for cost calculator app?

- by Anders Svensson

Hi, I have a problem that I’ve tried to get help for before, but I wasn’t able to solve it then, so I’m trying to simplify the problem now to see if I can get some more concrete help with this because it is driving me crazy… Basically, I have a working (more complex) version of this application, which is a project cost calculator. But because I am at the same time trying to learn to design my applications better, I would like some input on how I could improve this design. Basically the main thing I want is input on the conditionals that (here) appear repeated in two places. The suggestions I got before was to use the strategy pattern or factory pattern. I also know about the Martin Fowler book with the suggestion to Refactor conditional with polymorphism. I understand that principle in his simpler example. But how can I do either of these things here (if any would be suitable)? The way I see it, the calculation is dependent on a couple of conditions: 1. What kind of service is it, writing or analysis? 2. Is the project small, medium or large? (Please note that there may be other parameters as well, equally different, such as “are the products new or previously existing?” So such parameters should be possible to add, but I tried to keep the example simple with only two parameters to be able to get concrete help) So refactoring with polymorphism would imply creating a number of subclasses, which I already have for the first condition (type of service), and should I really create more subclasses for the second condition as well (size)? What would that become, AnalysisSmall, AnalysisMedium, AnalysisLarge, WritingSmall, etc…??? No, I know that’s not good, I just don’t see how to work with that pattern anyway else? I see the same problem basically for the suggestions of using the strategy pattern (and the factory pattern as I see it would just be a helper to achieve the polymorphism above). So please, if anyone has concrete suggestions as to how to design these classes the best way I would be really grateful! Please also consider whether I have chosen the objects correctly too, or if they need to be redesigned. (Responses like "you should consider the factory pattern" will obviously not be helpful... I've already been down that road and I'm stumped at precisely how in this case) Regards, Anders The code (very simplified, don’t mind the fact that I’m using strings instead of enums, not using a config file for data etc, that will be done as necessary in the real application once I get the hang of these design problems): public abstract class Service { protected Dictionary<string, int> _hours; protected const int SMALL = 2; protected const int MEDIUM = 8; public int NumberOfProducts { get; set; } public abstract int GetHours(); } public class Writing : Service { public Writing(int numberOfProducts) { NumberOfProducts = numberOfProducts; _hours = new Dictionary<string, int> { { "small", 125 }, { "medium", 100 }, { "large", 60 } }; } public override int GetHours() { if (NumberOfProducts <= SMALL) return _hours["small"] * NumberOfProducts; if (NumberOfProducts <= MEDIUM) return (_hours["small"] * SMALL) + (_hours["medium"] * (NumberOfProducts - SMALL)); return (_hours["small"] * SMALL) + (_hours["medium"] * (MEDIUM - SMALL)) + (_hours["large"] * (NumberOfProducts - MEDIUM)); } } public class Analysis : Service { public Analysis(int numberOfProducts) { NumberOfProducts = numberOfProducts; _hours = new Dictionary<string, int> { { "small", 56 }, { "medium", 104 }, { "large", 200 } }; } public override int GetHours() { if (NumberOfProducts <= SMALL) return _hours["small"]; if (NumberOfProducts <= MEDIUM) return _hours["medium"]; return _hours["large"]; } } public partial class Form1 : Form { public Form1() { InitializeComponent(); List<int> quantities = new List<int>(); for (int i = 0; i < 100; i++) { quantities.Add(i); } comboBoxNumberOfProducts.DataSource = quantities; } private void comboBoxNumberOfProducts_SelectedIndexChanged(object sender, EventArgs e) { Service writing = new Writing((int) comboBoxNumberOfProducts.SelectedItem); Service analysis = new Analysis((int) comboBoxNumberOfProducts.SelectedItem); labelWriterHours.Text = writing.GetHours().ToString(); labelAnalysisHours.Text = analysis.GetHours().ToString(); } }

Read the article
Open an Application .exe window inside another application

- by jpnavarini

I have an application in WPF running. I would like that, when a button is clicked inside this application, another application opens, with its window maximized. However, I don't want my first application to stop and wait. I want both to be open and running independently. When the button is clicked again, in case the application is minimized, the application is maximized. In case it is not, it is open again. How is it possible using C#? I have tried the following: Process process = Process.GetProcesses().FirstOrDefault(f => f.ProcessName.Contains("Analysis")); ShowWindow((process ?? Process.Start("..\\..\\..\\MS Analysis\\bin\\Debug\\Chemtech.RT.MS.Analysis.exe")).MainWindowHandle.ToInt32(), SW_MAXIMIZE); But the window does not open, even though the process does start.

Read the article
Is there a way of getting a file name and inserting into Matlab script?

- by torr

In a folder, I have both my .m file that contains the script and an imaging .dcm file that needs to be analyzed. Folder structure: Folder1/analysis.m Folder1/meas_dynamic_123.dcm My script (analysis.m) begins as follows: target =''; <== here should go the full path to the file + filename example: /Volumes/Data/Folder1/meas_dynamic_123.m txt = dir(target); // etc So I'm wondering if there is a way of when running analysis.m it will: automatically search the folder it's in, grab the full path + filename of file containing string dynamic in the name, insert its full path + name into target variable continue running the script Does anyone have any pointers on how to achieve this? Using ffpath?

Read the article
How to fix “The requested service, ‘net.pipe://localhost/SecurityTokenServiceApplication/appsts.svc’ could not be activated.”

- by ybbest

Problem: When I try to publish a SharePoint2013 workflow, I received the error: The requested service, ‘net.pipe://localhost/SecurityTokenServiceApplication/appsts.svc’ could not be activated. After that, my workflow stopped working and every time I start a work I receive the following error message: System.ApplicationException: PreconditionFailed ---> System.ApplicationException: Error in the application. --- End of inner exception stack trace --- at System.Activities.Statements.Throw.Execute(CodeActivityContext context) at System.Activities.CodeActivity.InternalExecute(ActivityInstance instance, ActivityExecutor executor, BookmarkManager bookmarkManager) at System.Activities.Runtime.ActivityExecutor.ExecuteActivityWorkItem.ExecuteBody(ActivityExecutor executor, BookmarkManager bookmarkManager, Location resultLocation) Analysis: After analysis, I found the error by visiting the http://localhost:32843/SecurityTokenServiceApplication/securitytoken.svc and the error I got on the message is Solution: The solution is basically getting more memory to the server. For development environment, you can restart your noderunner.exe or some other services to release some memories. To verify you have enough memory you can browse to http://localhost:32843/SecurityTokenServiceApplication/securitytoken.svc , it should return the information below. Then you can republish your workflow and it will work like a charm.

Read the article
Migration from Exchange to BPOS - Microsoft Assessment and Planning (MAP) Toolkit Link

- by Harish Pavithran

The Microsoft Assessment and Planning (MAP) Toolkit is an agentless toolkit that finds computers on a network and performs a detailed inventory of the computers using Windows Management Instrumentation (WMI) and the Remote Registry Service. The data and analysis provided by this toolkit can significantly simplify the planning process for migrating to Windows® 7, Windows Vista®, Microsoft Office 2007, Windows Server® 2008 R2, Windows Server 2008, Hyper-V, Microsoft Application Virtualization, Microsoft SQL Server 2008, and Forefront® Client Security and Network Access Protection. Assessments for Windows Server 2008 R2, Windows Server 2008, Windows 7, and Windows Vista include device driver availability as well as recommendations for hardware upgrades. If you are interested in server virtualization planning, MAP provides the ability to gather performance metrics from computers you are considering for virtualization and a feature to model a library of potential host hardware and storage configurations. This information can be used to quickly perform "what-if" analysis using Hyper-V and Microsoft Virtual Server 2005 R2 as virtualization platforms. http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=67240b76-3148-4e49-943d-4d9ea7f77730

Read the article
Trace File Source Adapter

The Trace File Source adapter is a useful addition to your SSIS toolbox. It allows you to read 2005 and 2008 profiler traces stored as .trc files and read them into the Data Flow. From there you can perform filtering and analysis using the power of SSIS. There is no need for a SQL Server connection this just uses the trace file. Example Usages Cache warming for SQL Server Analysis Services Reading the flight recorder Find out the longest running queries on a server Analyze statements for CPU, memory by user or some other criteria you choose Properties The Trace File Source adapter has two properties, both of which combine to control the source trace file that is read at runtime. SQL Server 2005 and SQL Server 2008 trace files are supported for both the Database Engine (SQL Server) and Analysis Services. The properties are managed by the Editor form or can be set directly from the Properties Grid in Visual Studio. Property Type Description AccessMode Enumeration This property determines how the Filename property is interpreted. The values available are: DirectInput Variable Filename String This property holds the path for trace file to load (*.trc). The value is either a full path, or the name of a variable which contains the full path to the trace file, depending on the AccessMode property. Trace Column Definition Hopefully the majority of you can skip this section entirely, but if you encounter some problems processing a trace file this may explain it and allow you to fix the problem. The component is built upon the trace management API provided by Microsoft. Unfortunately API methods that expose the schema of a trace file have known issues and are unreliable, put simply the data often differs from what was specified. To overcome these limitations the component uses some simple XML files. These files enable the trace column data types and sizing attributes to be overridden. For example SQL Server Profiler or TMO generated structures define EventClass as an integer, but the real value is a string. TraceDataColumnsSQL.xml - SQL Server Database Engine Trace Columns TraceDataColumnsAS.xml - SQL Server Analysis Services Trace Columns The files can be found in the %ProgramFiles%\Microsoft SQL Server\100\DTS\PipelineComponents folder, e.g. "C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents\TraceDataColumnsSQL.xml" "C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents\TraceDataColumnsAS.xml" If at runtime the component encounters a type conversion or sizing error it is most likely due to a discrepancy between the column definition as reported by the API and the actual value encountered. Whilst most common issues have already been fixed through these files we have implemented specific exception traps to direct you to the files to enable you to fix any further issues due to different usage or data scenarios that we have not tested. An example error that you can fix through these files is shown below. Buffer exception writing value to column 'Column Name'. The string value is 999 characters in length, the column is only 111. Columns can be overridden by the TraceDataColumns XML files in "C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents\TraceDataColumnsAS.xml". Installation The component is provided as an MSI file which you can download and run to install it. This simply places the files on disk in the correct locations and also installs the assemblies in the Global Assembly Cache as per Microsoft’s recommendations. You may need to restart the SQL Server Integration Services service, as this caches information about what components are installed, as well as restarting any open instances of Business Intelligence Development Studio (BIDS) / Visual Studio that you may be using to build your SSIS packages. Finally you will have to add the transformation to the Visual Studio toolbox manually. Right-click the toolbox, and select Choose Items.... Select the SSIS Data Flow Items tab, and then check the Trace File Source transformation in the Choose Toolbox Items window. This process has been described in detail in the related FAQ entry for How do I install a task or transform component? We recommend you follow best practice and apply the current Microsoft SQL Server Service pack to your SQL Server servers and workstations. Please note that the Microsoft Trace classes used in the component are not supported on 64-bit platforms. To use the Trace File Source on a 64-bit host you need to ensure you have the 32-bit (x86) tools available, and the way you execute your package is setup to use them, please see the help topic 64-bit Considerations for Integration Services for more details. Downloads Trace Sources for SQL Server 2005 -- Trace Sources for SQL Server 2008 Version History SQL Server 2008 Version 2.0.0.382 - SQL Sever 2008 public release. (9 Apr 2009) SQL Server 2005 Version 1.0.0.321 - SQL Server 2005 public release. (18 Nov 2008) -- Screenshots

Read the article
How to fix “The requested service, ‘net.pipe://localhost/SecurityTokenServiceApplication/appsts.svc’ could not be activated.”

- by ybbest

Problem: When I try to publish a SharePoint2013 workflow, I received the error: The requested service, ‘net.pipe://localhost/SecurityTokenServiceApplication/appsts.svc’ could not be activated. After that, my workflow stopped working and every time I start a work I receive the following error message: System.ApplicationException: PreconditionFailed ---> System.ApplicationException: Error in the application. --- End of inner exception stack trace --- at System.Activities.Statements.Throw.Execute(CodeActivityContext context) at System.Activities.CodeActivity.InternalExecute(ActivityInstance instance, ActivityExecutor executor, BookmarkManager bookmarkManager) at System.Activities.Runtime.ActivityExecutor.ExecuteActivityWorkItem.ExecuteBody(ActivityExecutor executor, BookmarkManager bookmarkManager, Location resultLocation) Analysis: After analysis, I found the error by visiting the http://localhost:32843/SecurityTokenServiceApplication/securitytoken.svc and the error I got on the message is Solution: The solution is basically getting more memory to the server. For development environment, you can restart your noderunner.exe or some other services to release some memories. To verify you have enough memory you can browse to http://localhost:32843/SecurityTokenServiceApplication/securitytoken.svc , it should return the information below. Then you can republish your workflow and it will work like a charm.

Read the article
New Video on the ASP.NET Chart Control

Learn how easy it is to create visually appealing statistical or financial analysis tools in a browser using the Chart control.

Read the article
Open Source R Language Undergoes a Commercial Revolution

Revolution Analytics has big plans for the data analysis language.

Read the article
Open Source R Language Undergoes a Commercial Revolution

Revolution Analytics has big plans for the data analysis language.

Read the article
Application Stores, Developer Communities, Content, Games and Widgets: Strategic Market Review and O

- by Manish Agrawal

Sample of Alan's one day workshop given on Application Stores, Developer Communities, Content, Games and Widgets: Strategic Market Review and Operator Opportunity / Risk Analysis Thanks to Alan

Read the article
GDD-BR 2010 [2F] Storage, Bigquery and Prediction APIs

GDD-BR 2010 [2F] Storage, Bigquery and Prediction APIs Speaker: Patrick Chanezon Track: Cloud Computing Time slot: F [15:30 - 16:15] Room: 2 Level: 101 Google is expanding our storage products by introducing Google Storage for Developers. It offers a RESTful API for storing and accessing data at Google. Developers can take advantage of the performance and reliability of Google's storage infrastructure, as well as the advanced security and sharing capabilities. We will demonstrate key functionality of the product as well as customer use cases. Google relies heavily on data analysis and has developed many tools to understand large datasets. Two of these tools are now available on a limited sign-up basis to developers: (1) BigQuery: interactive analysis of very large data sets and (2) Prediction API: make informed predictions from your data. We will demonstrate their use and give instructions on how to get access. From: GoogleDevelopers Views: 1 0 ratings Time: 39:27 More in Science & Technology

Read the article
New e learning course on Business Intelligence

- by simonsabin

I just got this from fello SQL MVP Chris Testa O'Neil "I am pleased to announce the release of the Author Model eCourseCollection 6233 AE: Implementing and Maintaining Business Intelligence in Microsoft® SQL Server® 2008: Integration Services, Reporting Services and Analysis Services This 24-hour collection provides you with the skills and knowledge required for implementing and maintaining business intelligence solutions on SQL Server 2008. You will learn about the SQL Server technologies, such as Integration Services, Analysis Services, and Reporting Services. This collection also helps students to prepare for Exam 70-448 and can be accessed from: http://www.microsoft.com/learning/elearning/course/6233.mspx

Read the article
Trace File Source Adapter

The Trace File Source adapter is a useful addition to your SSIS toolbox. It allows you to read 2005 and 2008 profiler traces stored as .trc files and read them into the Data Flow. From there you can perform filtering and analysis using the power of SSIS. There is no need for a SQL Server connection this just uses the trace file. Example Usages Cache warming for SQL Server Analysis Services Reading the flight recorder Find out the longest running queries on a server Analyze statements for CPU, memory by user or some other criteria you choose Properties The Trace File Source adapter has two properties, both of which combine to control the source trace file that is read at runtime. SQL Server 2005 and SQL Server 2008 trace files are supported for both the Database Engine (SQL Server) and Analysis Services. The properties are managed by the Editor form or can be set directly from the Properties Grid in Visual Studio. Property Type Description AccessMode Enumeration This property determines how the Filename property is interpreted. The values available are: DirectInput Variable Filename String This property holds the path for trace file to load (*.trc). The value is either a full path, or the name of a variable which contains the full path to the trace file, depending on the AccessMode property. Trace Column Definition Hopefully the majority of you can skip this section entirely, but if you encounter some problems processing a trace file this may explain it and allow you to fix the problem. The component is built upon the trace management API provided by Microsoft. Unfortunately API methods that expose the schema of a trace file have known issues and are unreliable, put simply the data often differs from what was specified. To overcome these limitations the component uses some simple XML files. These files enable the trace column data types and sizing attributes to be overridden. For example SQL Server Profiler or TMO generated structures define EventClass as an integer, but the real value is a string. TraceDataColumnsSQL.xml - SQL Server Database Engine Trace Columns TraceDataColumnsAS.xml - SQL Server Analysis Services Trace Columns The files can be found in the %ProgramFiles%\Microsoft SQL Server\100\DTS\PipelineComponents folder, e.g. "C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents\TraceDataColumnsSQL.xml" "C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents\TraceDataColumnsAS.xml" If at runtime the component encounters a type conversion or sizing error it is most likely due to a discrepancy between the column definition as reported by the API and the actual value encountered. Whilst most common issues have already been fixed through these files we have implemented specific exception traps to direct you to the files to enable you to fix any further issues due to different usage or data scenarios that we have not tested. An example error that you can fix through these files is shown below. Buffer exception writing value to column 'Column Name'. The string value is 999 characters in length, the column is only 111. Columns can be overridden by the TraceDataColumns XML files in "C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents\TraceDataColumnsAS.xml". Installation The component is provided as an MSI file which you can download and run to install it. This simply places the files on disk in the correct locations and also installs the assemblies in the Global Assembly Cache as per Microsoft’s recommendations. You may need to restart the SQL Server Integration Services service, as this caches information about what components are installed, as well as restarting any open instances of Business Intelligence Development Studio (BIDS) / Visual Studio that you may be using to build your SSIS packages. Finally you will have to add the transformation to the Visual Studio toolbox manually. Right-click the toolbox, and select Choose Items.... Select the SSIS Data Flow Items tab, and then check the Trace File Source transformation in the Choose Toolbox Items window. This process has been described in detail in the related FAQ entry for How do I install a task or transform component? We recommend you follow best practice and apply the current Microsoft SQL Server Service pack to your SQL Server servers and workstations. Please note that the Microsoft Trace classes used in the component are not supported on 64-bit platforms. To use the Trace File Source on a 64-bit host you need to ensure you have the 32-bit (x86) tools available, and the way you execute your package is setup to use them, please see the help topic 64-bit Considerations for Integration Services for more details. Downloads Trace Sources for SQL Server 2005 -- Trace Sources for SQL Server 2008 Version History SQL Server 2008 Version 2.0.0.382 - SQL Sever 2008 public release. (9 Apr 2009) SQL Server 2005 Version 1.0.0.321 - SQL Server 2005 public release. (18 Nov 2008) -- Screenshots

Read the article
College Courses through distance learning

- by Matt

I realize this isn't really a programming question, but didn't really know where to post this in the stackexchange and because I am a computer science major i thought id ask here. This is pretty unique to the programmer community since my degree is about 95% programming. I have 1 semester left, but i work full time. I would like to finish up in December, but to make things easier i like to take online classes whenever I can. So, my question is does anyone know of any colleges that offer distance learning courses for computer science? I have been searching around and found a few potential classes, but not sure yet. I would like to gather some classes and see what i can get approval for. Class I need: Only need one C SC 437 Geometric Algorithms C SC 445 Algorithms C SC 473 Automata Only need one C SC 452 Operating Systems C SC 453 Compilers/Systems Software While i only need of each of the above courses i still need to take two more electives. These also have to be upper 400 level classes. So i can take multiple in each category. Some other classes I can take are: CSC 447 - Green Computing CSC 425 - Computer Networking CSC 460 - Database Design CSC 466 - Computer Security I hoping to take one or two of these courses over the summer. If not, then online over the regular semester would be ok too. Any help in helping find these classes would be awesome. Maybe you went to a college that offered distance learning. Some of these classes may be considered to be graduate courses too. Descriptions are listed below if you need. Thanks! Descriptions Computer Security This is an introductory course covering the fundamentals of computer security. In particular, the course will cover basic concepts of computer security such as threat models and security policies, and will show how these concepts apply to specific areas such as communication security, software security, operating systems security, network security, web security, and hardware-based security. Computer Networking Theory and practice of computer networks, emphasizing the principles underlying the design of network software and the role of the communications system in distributed computing. Topics include routing, flow and congestion control, end-to-end protocols, and multicast. Database Design Functions of a database system. Data modeling and logical database design. Query languages and query optimization. Efficient data storage and access. Database access through standalone and web applications. Green Computing This course covers fundamental principles of energy management faced by designers of hardware, operating systems, and data centers. We will explore basic energy management option in individual components such as CPUs, network interfaces, hard drives, memory. We will further present the energy management policies at the operating system level that consider performance vs. energy saving tradeoffs. Finally we will consider large scale data centers where energy management is done at multiple layers from individual components in the system to shutting down entries subset of machines. We will also discuss energy generation and delivery and well as cooling issues in large data centers. Compilers/Systems Software Basic concepts of compilation and related systems software. Topics include lexical analysis, parsing, semantic analysis, code generation; assemblers, loaders, linkers; debuggers. Operating Systems Concepts of modern operating systems; concurrent processes; process synchronization and communication; resource allocation; kernels; deadlock; memory management; file systems. Algorithms Introduction to the design and analysis of algorithms: basic analysis techniques (asymptotics, sums, recurrences); basic design techniques (divide and conquer, dynamic programming, greedy, amortization); acquiring an algorithm repertoire (sorting, median finding, strong components, spanning trees, shortest paths, maximum flow, string matching); and handling intractability (approximation algorithms, branch and bound). Automata Introduction to models of computation (finite automata, pushdown automata, Turing machines), representations of languages (regular expressions, context-free grammars), and the basic hierarchy of languages (regular, context-free, decidable, and undecidable languages). Geometric Algorithms The study of algorithms for geometric objects, using a computational geometry approach, with an emphasis on applications for graphics, VLSI, GIS, robotics, and sensor networks. Topics may include the representation and overlaying of maps, finding nearest neighbors, solving linear programming problems, and searching geometric databases.

Read the article
How to Use RDA to Generate WLS Thread Dumps At Specified Intervals?

- by Daniel Mortimer

Introduction There are many ways to generate a thread dump of a WebLogic Managed Server. For example, take a look at: Taking Thread Dumps - [an excellent blog post on the Middleware Magic site]or Different ways to take thread dumps in WebLogic Server (Document 1098691.1) There is another method - use Remote Diagnostic Agent! The solution described below is not documented, but it is relatively straightforward to execute. One advantage of using RDA to collect the thread dumps is RDA will also collect configuration, log files, network, system, performance information at the same time. Instructions 1. Not familiar with Remote Diagnostic Agent? Take a look at my previous blog "Resolve SRs Faster Using RDA - Find the Right Profile" 2. Choose a profile, which includes the WebLogic Server data collection modules (for example the profile "WebLogicServer"). At RDA setup time you should see the prompt below: ------------------------------------------------------------------------------- S301WLS: Collects Oracle WebLogic Server Information ------------------------------------------------------------------------------- Enter the location of the directory where the domains to analyze are located (For example in UNIX, <BEA Home>/user_projects/domains or <Middleware Home>/user_projects/domains) Hit 'Return' to accept the default (/oracle/11AS/Middleware/user_projects/domains) > For a successful WLS connection, ensure that the domain Admin Server is up and running. Data Collection Type: 1 Collect for a single server (offline mode) 2 Collect for a single server (using WLS connection) 3 Collect for multiple servers (using WLS connection) Enter the item number Hit 'Return' to accept the default (1) > 2 Choose option 2 or 3. Note: Collect for a single server or multiple servers using WLS connection means that RDA will attempt to connect to execute online WLST commands against the targeted server(s). The thread dumps are collected using the WLST function - "threadDumps()". If WLST cannot connect to the managed server, RDA will proceed to collect other data and ignore the request to collect thread dumps. If in the final output you see no Thread Dump menu item, then it's likely that the managed server is in a state which prevents new connections to it. If faced with this scenario, you would have to employ alternative methods for collecting thread dumps. 3. The RDA setup will create a setup.cfg file in the RDA_HOME directory. Open this file in an editor. You will find the following parameters which govern the number of thread dumps and thread dump interval. #N.Number of thread dumps to capture WREQ_THREAD_DUMP=10 #N.Thread dump interval WREQ_THREAD_DUMP_INTERVAL=5000 The example lines above show the default settings. In other words, RDA will collect 10 thread dumps at 5000 millisecond (5 second) intervals. You may want to change this to something like: #N.Number of thread dumps to capture WREQ_THREAD_DUMP=10 #N.Thread dump interval WREQ_THREAD_DUMP_INTERVAL=30000 However, bear in mind, that such change will increase the total amount of time it takes for RDA to complete its run. 4. Once you are happy with the setup.cfg, run RDA. RDA will collect, render, generate and package all files in the output directory. 5. For ease of viewing, open up the RDA Start html file - "xxxx__start.htm". The thread dumps can be found under the WLST Collections for the target managed server(s). See screenshots belowScreenshot 1:RDA Start Page - Main Index Screenshot 2: Managed Server Sub Index Screenshot 3: WLST Collections Screenshot 4: Thread Dump Page - List of dump file links Screenshot 5: Thread Dump Dat File Link Additional Comment: A) You can view the thread dump files within the RDA Start Page framework, but most likely you will want to download the dat files for in-depth analysis via thread dump analysis tools such as: Thread Dump Analyzer - Samurai - a GUI based tail , thread dump analysis tool If you are new to thread dump analysis - take a look at this recorded Support Advisor Webcast Oracle WebLogic Server: Diagnosing Performance Issues through Java Thread Dumps[Slidedeck from webcast in PDF format]B) I have logged a couple of enhancement requests for the RDA Development Team to consider: Add timestamp to dump file links, dat filename and at the top of the body of the dat file Package the individual thread dumps in a zip so all dump files can be conveniently downloaded in one go.

Read the article
New release of "OLAP PivotTable Extensions"

- by Luca Zavarella

For those who are not familiar with this add-in, the OLAP PivotTable Extensions add features of interest to Excel 2007 or 2010 PivotTables pointing to an OLAP cube in Analysis Services. One of these features I like very much, is to know the MDX query code associated with the pivot used at that time in Excel: You can find all the details here: http://olappivottableextend.codeplex.com/ It was recently released a new version of the add-in (version 0.7.4), which does not introduce any new features, but fixes a significant bug: Release 0.7.4 now properly handles languages but introduces no new features. International users who run a different Windows language than their Excel UI language may be receiving an error message when they double click a cell and perform drillthrough which reads: "XML for Analysis parser: The LocaleIdentifier property is not overwritable and cannot be assigned a new value". This error was caused by OLAP PivotTable Extensions in some situations, but release 0.7.4 fixes this problem. Enjoy!

Read the article
Yahoo! Finance Managed

Download stock quotes, historic data, rates of exchanges and technical analysis charts from Yahoo! Finance

Read the article
Microsoft StyleCop is now Open Source

- by Scott Dorman

I have previously talked about Microsoft StyleCop. For those that might not know about it, StyleCop is a source analysis tool (different from the static analysis that FxCop performs) that analyzes the source code directly. As a result, it focuses on more design (or style) issues such as layout, readability and documentation. In an interesting move (and one that I am happy to see), Microsoft has decided to make StyleCop an open source project (under the MS-PL license) available on CodePlex. (The project site isn’t up yet, but should be in the next few weeks.) This will give the development community the opportunity to help contribute to the project, expanding support to other languages (currently it only supports C#), adding features, and fixing bugs. Shortly after the CodePlex site is up, StyleCop 4.4 will be released which includes support for C# 4.0 as well as a large number of bug fixes and other improvements. Technorati Tags: Code Style and Standard,StyleCop

Read the article
Oracle????????????????????????~????????????????????

- by Yusuke.Yamamoto

RDBMS ???????·????????????????????????????????????????????????????????????????????????? ????????Oracle ?????????????????????????????????? Oracle Database ???????????????????????????????? ????????????????????? ????Oracle???????????????????????????????????????????????????????????????????????????? ?????????????? Oracle Database ???????????????????????? ??????????????????????????????????2????????????? 1. ??????(Query Transformation) Query Transformation ???????SQL??????????????????SQL????????????????????? Query Transformation ???Predicate Transformation ? Common Sub-expression Elimination (CSE), Order-BY Elimination (OBYE), Outer Join Elimination (OJE), Simple View Meging (SVM), Predicate Move around (PM), Complex View Merging (CVM), Sub-query Unnesting (SU), Join Predicate Push Down (JPPD) ???? OR Expansion, Star Transformation (ST) ????????????? ···???????????????????????????????????????????????????? Predicate Transformation ?????? Transitive Predicate Generation ????????????? ?????????????SQL???deptno ? 10 ????????????????????????????? select e.ename, d.loc from emp e, dept d where e.deptno=d.deptno and e.deptno=10; ???????????????emp ??? deptno=10 ??????????????dept ??? d.deptno=10 ??????????????????? emp ?? deptno=10 ????????????????????emp ?? deptno=10 ??????10???????10? dept ????????????dept ??20???????????????????????10?*20?=200?????(??????????·?????????)? ??SQL?? Transitive Predicate Generation ??????SQL????????????????? select e.ename, d.location from emp e, dept d where e.deptno=d.deptno and e.deptno=10 and d.deptno=10; ^^^^^^^^^^^ ??????dept ?????? deptno=10 ??????????????????????????10?*1?=10(dept.deptno ?unique????)?1/20????????????????1/20????????????????10??????????30???????????????Query Transformation ???????????????????????????? ?:??????????? dept ?? 1-row table ??????dept ?? driving ???(Outer Table)??? emp ?? probe ???(Inner Table)????????????1?*10?=10 ????????????????????????????????????????????????????????1/20????????????? ?????? Query Transformation ??????SQL????????????????????????????????? Transformation ??????????????????????????????????? 2. ????·????(Access Path Analysis) Access Path Analysis ??Query Transformation ??SQL????????????(Access Path)?????????(Join Method)?????(Join Order)?????????? ??????????????????(FTS)?ROWID?????????????????????????????·?????(Nested Loop Join)???????(Hash Join)????/?????(Sort Merge Join)????????????????????????????????????????????????????????????????????????? Oracle Database ????????? Query Transformation ???? Logical Optimizer?Access Path Analysis ???? Physical Optimizer ????????? ??????????????????????????????????????????????????????????????????????????????????????????????????? ????????????????????????????????????????????????????? Oracle Database ????????????????????? "Oracle ????????" ?????????? Sustaining Engineering?? ?(??? ???) ???????????????? Sustaining Engineering ????????????????????????Oracle Database ???????????????????????? ?????????????????????Ruby????????????????????????? Oracle????????????????????????! Oracle????????????? Oracle????????????????????????

Read the article
Software development life cycle in the industry

- by jiewmeng

I am taking a module called "Requirements Analysis & Design" in a local university. Common module, I'd say (on software development life cycle (SDLC) and UML). But there is a lot of things I wonder if they are actually (strictly) practiced in the industry. For example, will a domain class diagram, an not anything extra (from design class), be strictly the output from Analysis or Discovery phase? I'm sure many times you will think a bit about the technical implementation too? Else you might end up with a design class diagram later that is very different from the original domain class diagram? I also find it hard to remember what diagrams are from Initiation, Discovery, Design etc etc. Plus these phases vary from SDLC to SDLC, I believe? So I usually will create a diagram when I think will be useful. Is it the wrong way?

Read the article
Google I/O 2010 - BigQuery and Prediction APIs

Google I/O 2010 - BigQuery and Prediction APIs Google I/O 2010 - BigQuery and Prediction APIs App Engine 101 Amit Agarwal, Max Lin, Gideon Mann, Siddartha Naidu Google relies heavily on data analysis and has developed many tools to understand large datasets. Two of these tools are now available on a limited sign-up basis to developers: (1) BigQuery: interactive analysis of very large data sets and (2) Prediction API: make informed predictions from your data. We will demonstrate their use and give instructions on how to get access. For all I/O 2010 sessions, please go to code.google.com From: GoogleDevelopers Views: 6 0 ratings Time: 57:48 More in Science & Technology

Read the article

Search Results

Search found 2628 results on 106 pages for 'lexical analysis'.

Page 31/106 | < Previous Page | 27 28 29 30 31 32 33 34 35 36 37 38 | Next Page >

- by Tarun Arora

- by user12620111

- by Anders Svensson

- by jpnavarini

- by torr

- by ybbest

- by Harish Pavithran

- by ybbest

- by Manish Agrawal

- by simonsabin

- by Matt

- by Daniel Mortimer

- by Luca Zavarella

- by Scott Dorman

- by Yusuke.Yamamoto

- by jiewmeng

< Previous Page | 27 28 29 30 31 32 33 34 35 36 37 38 | Next Page >