Search Results

Search found 53 results on 3 pages for 'profilers'.

Page 1/3 | 1 2 3 | Next Page >

Profilers for ASP.Net Web Applications?

- by Earlz

I was recently wanting to do some profiling on an ASP.Net project and was surprised to see that Visual Studio (at least seems to be) lacking a profiler. So my question is what profiler do you use for ASP.Net? Are there any decent ones out there that are free? I've seen a few general .Net profilers but have yet to see one that can be used with ASP.Net..

Read the article
Free JVM profilers for websites

- by 2Real

I'm looking for a JVM profiler (preferably open source) so I can look at the heap and cpu usage of my personal website. I've used Lambda Probe, and I like it because it provides a web interface for my remote Unix computer that has no display. I was wondering what else is available Thanks,

Read the article
My VSTS 2008 has no Analyze menu. Where did it go?

- by maxima120

I have VSTS 2008 but no Analyze menu. Why did it happen in first place? And how can I get it? Thanks

Read the article
How to find and fix performance problems in ORM powered applications

- by FransBouma

Once in a while we get requests about how to fix performance problems with our framework. As it comes down to following the same steps and looking into the same things every single time, I decided to write a blogpost about it instead, so more people can learn from this and solve performance problems in their O/R mapper powered applications. In some parts it's focused on LLBLGen Pro but it's also usable for other O/R mapping frameworks, as the vast majority of performance problems in O/R mapper powered applications are not specific for a certain O/R mapper framework. Too often, the developer looks at the wrong part of the application, trying to fix what isn't a problem in that part, and getting frustrated that 'things are so slow with <insert your favorite framework X here>'. I'm in the O/R mapper business for a long time now (almost 10 years, full time) and as it's a small world, we O/R mapper developers know almost all tricks to pull off by now: we all know what to do to make task ABC faster and what compromises (because there are almost always compromises) to deal with if we decide to make ABC faster that way. Some O/R mapper frameworks are faster in X, others in Y, but you can be sure the difference is mainly a result of a compromise some developers are willing to deal with and others aren't. That's why the O/R mapper frameworks on the market today are different in many ways, even though they all fetch and save entities from and to a database. I'm not suggesting there's no room for improvement in today's O/R mapper frameworks, there always is, but it's not a matter of 'the slowness of the application is caused by the O/R mapper' anymore. Perhaps query generation can be optimized a bit here, row materialization can be optimized a bit there, but it's mainly coming down to milliseconds. Still worth it if you're a framework developer, but it's not much compared to the time spend inside databases and in user code: if a complete fetch takes 40ms or 50ms (from call to entity object collection), it won't make a difference for your application as that 10ms difference won't be noticed. That's why it's very important to find the real locations of the problems so developers can fix them properly and don't get frustrated because their quest to get a fast, performing application failed. Performance tuning basics and rules Finding and fixing performance problems in any application is a strict procedure with four prescribed steps: isolate, analyze, interpret and fix, in that order. It's key that you don't skip a step nor make assumptions: these steps help you find the reason of a problem which seems to be there, and how to fix it or leave it as-is. Skipping a step, or when you assume things will be bad/slow without doing analysis will lead to the path of premature optimization and won't actually solve your problems, only create new ones. The most important rule of finding and fixing performance problems in software is that you have to understand what 'performance problem' actually means. Most developers will say "when a piece of software / code is slow, you have a performance problem". But is that actually the case? If I write a Linq query which will aggregate, group and sort 5 million rows from several tables to produce a resultset of 10 rows, it might take more than a couple of milliseconds before that resultset is ready to be consumed by other logic. If I solely look at the Linq query, the code consuming the resultset of the 10 rows and then look at the time it takes to complete the whole procedure, it will appear to me to be slow: all that time taken to produce and consume 10 rows? But if you look closer, if you analyze and interpret the situation, you'll see it does a tremendous amount of work, and in that light it might even be extremely fast. With every performance problem you encounter, always do realize that what you're trying to solve is perhaps not a technical problem at all, but a perception problem. The second most important rule you have to understand is based on the old saying "Penny wise, Pound Foolish": the part which takes e.g. 5% of the total time T for a given task isn't worth optimizing if you have another part which takes a much larger part of the total time T for that same given task. Optimizing parts which are relatively insignificant for the total time taken is not going to bring you better results overall, even if you totally optimize that part away. This is the core reason why analysis of the complete set of application parts which participate in a given task is key to being successful in solving performance problems: No analysis -> no problem -> no solution. One warning up front: hunting for performance will always include making compromises. Fast software can be made maintainable, but if you want to squeeze as much performance out of your software, you will inevitably be faced with the dilemma of compromising one or more from the group {readability, maintainability, features} for the extra performance you think you'll gain. It's then up to you to decide whether it's worth it. In almost all cases it's not. The reason for this is simple: the vast majority of performance problems can be solved by implementing the proper algorithms, the ones with proven Big O-characteristics so you know the performance you'll get plus you know the algorithm will work. The time taken by the algorithm implementing code is inevitable: you already implemented the best algorithm. You might find some optimizations on the technical level but in general these are minor. Let's look at the four steps to see how they guide us through the quest to find and fix performance problems. Isolate The first thing you need to do is to isolate the areas in your application which are assumed to be slow. For example, if your application is a web application and a given page is taking several seconds or even minutes to load, it's a good candidate to check out. It's important to start with the isolate step because it allows you to focus on a single code path per area with a clear begin and end and ignore the rest. The rest of the steps are taken per identified problematic area. Keep in mind that isolation focuses on tasks in an application, not code snippets. A task is something that's started in your application by either another task or the user, or another program, and has a beginning and an end. You can see a task as a piece of functionality offered by your application. Analyze Once you've determined the problem areas, you have to perform analysis on the code paths of each area, to see where the performance problems occur and which areas are not the problem. This is a multi-layered effort: an application which uses an O/R mapper typically consists of multiple parts: there's likely some kind of interface (web, webservice, windows etc.), a part which controls the interface and business logic, the O/R mapper part and the RDBMS, all connected with either a network or inter-process connections provided by the OS or other means. Each of these parts, including the connectivity plumbing, eat up a part of the total time it takes to complete a task, e.g. load a webpage with all orders of a given customer X. To understand which parts participate in the task / area we're investigating and how much they contribute to the total time taken to complete the task, analysis of each participating task is essential. Start with the code you wrote which starts the task, analyze the code and track the path it follows through your application. What does the code do along the way, verify whether it's correct or not. Analyze whether you have implemented the right algorithms in your code for this particular area. Remember we're looking at one area at a time, which means we're ignoring all other code paths, just the code path of the current problematic area, from begin to end and back. Don't dig in and start optimizing at the code level just yet. We're just analyzing. If your analysis reveals big architectural stupidity, it's perhaps a good idea to rethink the architecture at this point. For the rest, we're analyzing which means we collect data about what could be wrong, for each participating part of the complete application. Reviewing the code you wrote is a good tool to get deeper understanding of what is going on for a given task but ultimately it lacks precision and overview what really happens: humans aren't good code interpreters, computers are. We therefore need to utilize tools to get deeper understanding about which parts contribute how much time to the total task, triggered by which other parts and for example how many times are they called. There are two different kind of tools which are necessary: .NET profilers and O/R mapper / RDBMS profilers. .NET profiling .NET profilers (e.g. dotTrace by JetBrains or Ants by Red Gate software) show exactly which pieces of code are called, how many times they're called, and the time it took to run that piece of code, at the method level and sometimes even at the line level. The .NET profilers are essential tools for understanding whether the time taken to complete a given task / area in your application is consumed by .NET code, where exactly in your code, the path to that code, how many times that code was called by other code and thus reveals where hotspots are located: the areas where a solution can be found. Importantly, they also reveal which areas can be left alone: remember our penny wise pound foolish saying: if a profiler reveals that a group of methods are fast, or don't contribute much to the total time taken for a given task, ignore them. Even if the code in them is perhaps complex and looks like a candidate for optimization: you can work all day on that, it won't matter. As we're focusing on a single area of the application, it's best to start profiling right before you actually activate the task/area. Most .NET profilers support this by starting the application without starting the profiling procedure just yet. You navigate to the particular part which is slow, start profiling in the profiler, in your application you perform the actions which are considered slow, and afterwards you get a snapshot in the profiler. The snapshot contains the data collected by the profiler during the slow action, so most data is produced by code in the area to investigate. This is important, because it allows you to stay focused on a single area. O/R mapper and RDBMS profiling .NET profilers give you a good insight in the .NET side of things, but not in the RDBMS side of the application. As this article is about O/R mapper powered applications, we're also looking at databases, and the software making it possible to consume the database in your application: the O/R mapper. To understand which parts of the O/R mapper and database participate how much to the total time taken for task T, we need different tools. There are two kind of tools focusing on O/R mappers and database performance profiling: O/R mapper profilers and RDBMS profilers. For O/R mapper profilers, you can look at LLBLGen Prof by hibernating rhinos or the Linq to Sql/LLBLGen Pro profiler by Huagati. Hibernating rhinos also have profilers for other O/R mappers like NHibernate (NHProf) and Entity Framework (EFProf) and work the same as LLBLGen Prof. For RDBMS profilers, you have to look whether the RDBMS vendor has a profiler. For example for SQL Server, the profiler is shipped with SQL Server, for Oracle it's build into the RDBMS, however there are also 3rd party tools. Which tool you're using isn't really important, what's important is that you get insight in which queries are executed during the task / area we're currently focused on and how long they took. Here, the O/R mapper profilers have an advantage as they collect the time it took to execute the query from the application's perspective so they also collect the time it took to transport data across the network. This is important because a query which returns a massive resultset or a resultset with large blob/clob/ntext/image fields takes more time to get transported across the network than a small resultset and a database profiler doesn't take this into account most of the time. Another tool to use in this case, which is more low level and not all O/R mappers support it (though LLBLGen Pro and NHibernate as well do) is tracing: most O/R mappers offer some form of tracing or logging system which you can use to collect the SQL generated and executed and often also other activity behind the scenes. While tracing can produce a tremendous amount of data in some cases, it also gives insight in what's going on. Interpret After we've completed the analysis step it's time to look at the data we've collected. We've done code reviews to see whether we've done anything stupid and which parts actually take place and if the proper algorithms have been implemented. We've done .NET profiling to see which parts are choke points and how much time they contribute to the total time taken to complete the task we're investigating. We've performed O/R mapper profiling and RDBMS profiling to see which queries were executed during the task, how many queries were generated and executed and how long they took to complete, including network transportation. All this data reveals two things: which parts are big contributors to the total time taken and which parts are irrelevant. Both aspects are very important. The parts which are irrelevant (i.e. don't contribute significantly to the total time taken) can be ignored from now on, we won't look at them. The parts which contribute a lot to the total time taken are important to look at. We now have to first look at the .NET profiler results, to see whether the time taken is consumed in our own code, in .NET framework code, in the O/R mapper itself or somewhere else. For example if most of the time is consumed by DbCommand.ExecuteReader, the time it took to complete the task is depending on the time the data is fetched from the database. If there was just 1 query executed, according to tracing or O/R mapper profilers / RDBMS profilers, check whether that query is optimal, uses indexes or has to deal with a lot of data. Interpret means that you follow the path from begin to end through the data collected and determine where, along the path, the most time is contributed. It also means that you have to check whether this was expected or is totally unexpected. My previous example of the 10 row resultset of a query which groups millions of rows will likely reveal that a long time is spend inside the database and almost no time is spend in the .NET code, meaning the RDBMS part contributes the most to the total time taken, the rest is compared to that time, irrelevant. Considering the vastness of the source data set, it's expected this will take some time. However, does it need tweaking? Perhaps all possible tweaks are already in place. In the interpret step you then have to decide that further action in this area is necessary or not, based on what the analysis results show: if the analysis results were unexpected and in the area where the most time is contributed to the total time taken is room for improvement, action should be taken. If not, you can only accept the situation and move on. In all cases, document your decision together with the analysis you've done. If you decide that the perceived performance problem is actually expected due to the nature of the task performed, it's essential that in the future when someone else looks at the application and starts asking questions you can answer them properly and new analysis is only necessary if situations changed. Fix After interpreting the analysis results you've concluded that some areas need adjustment. This is the fix step: you're actively correcting the performance problem with proper action targeted at the real cause. In many cases related to O/R mapper powered applications it means you'll use different features of the O/R mapper to achieve the same goal, or apply optimizations at the RDBMS level. It could also mean you apply caching inside your application (compromise memory consumption over performance) to avoid unnecessary re-querying data and re-consuming the results. After applying a change, it's key you re-do the analysis and interpretation steps: compare the results and expectations with what you had before, to see whether your actions had any effect or whether it moved the problem to a different part of the application. Don't fall into the trap to do partly analysis: do the full analysis again: .NET profiling and O/R mapper / RDBMS profiling. It might very well be that the changes you've made make one part faster but another part significantly slower, in such a way that the overall problem hasn't changed at all. Performance tuning is dealing with compromises and making choices: to use one feature over the other, to accept a higher memory footprint, to go away from the strict-OO path and execute queries directly onto the RDBMS, these are choices and compromises which will cross your path if you want to fix performance problems with respect to O/R mappers or data-access and databases in general. In most cases it's not a big issue: alternatives are often good choices too and the compromises aren't that hard to deal with. What is important is that you document why you made a choice, a compromise: which analysis data, which interpretation led you to the choice made. This is key for good maintainability in the years to come. Most common performance problems with O/R mappers Below is an incomplete list of common performance problems related to data-access / O/R mappers / RDBMS code. It will help you with fixing the hotspots you found in the interpretation step. SELECT N+1: (Lazy-loading specific). Lazy loading triggered performance bottlenecks. Consider a list of Orders bound to a grid. You have a Field mapped onto a related field in Order, Customer.CompanyName. Showing this column in the grid will make the grid fetch (indirectly) for each row the Customer row. This means you'll get for the single list not 1 query (for the orders) but 1+(the number of orders shown) queries. To solve this: use eager loading using a prefetch path to fetch the customers with the orders. SELECT N+1 is easy to spot with an O/R mapper profiler or RDBMS profiler: if you see a lot of identical queries executed at once, you have this problem. Prefetch paths using many path nodes or sorting, or limiting. Eager loading problem. Prefetch paths can help with performance, but as 1 query is fetched per node, it can be the number of data fetched in a child node is bigger than you think. Also consider that data in every node is merged on the client within the parent. This is fast, but it also can take some time if you fetch massive amounts of entities. If you keep fetches small, you can use tuning parameters like the ParameterizedPrefetchPathThreshold setting to get more optimal queries. Deep inheritance hierarchies of type Target Per Entity/Type. If you use inheritance of type Target per Entity / Type (each type in the inheritance hierarchy is mapped onto its own table/view), fetches will join subtype- and supertype tables in many cases, which can lead to a lot of performance problems if the hierarchy has many types. With this problem, keep inheritance to a minimum if possible, or switch to a hierarchy of type Target Per Hierarchy, which means all entities in the inheritance hierarchy are mapped onto the same table/view. Of course this has its own set of drawbacks, but it's a compromise you might want to take. Fetching massive amounts of data by fetching large lists of entities. LLBLGen Pro supports paging (and limiting the # of rows returned), which is often key to process through large sets of data. Use paging on the RDBMS if possible (so a query is executed which returns only the rows in the page requested). When using paging in a web application, be sure that you switch server-side paging on on the datasourcecontrol used. In this case, paging on the grid alone is not enough: this can lead to fetching a lot of data which is then loaded into the grid and paged there. Keep note that analyzing queries for paging could lead to the false assumption that paging doesn't occur, e.g. when the query contains a field of type ntext/image/clob/blob and DISTINCT can't be applied while it should have (e.g. due to a join): the datareader will do DISTINCT filtering on the client. this is a little slower but it does perform paging functionality on the data-reader so it won't fetch all rows even if the query suggests it does. Fetch massive amounts of data because blob/clob/ntext/image fields aren't excluded. LLBLGen Pro supports field exclusion for queries. You can exclude fields (also in prefetch paths) per query to avoid fetching all fields of an entity, e.g. when you don't need them for the logic consuming the resultset. Excluding fields can greatly reduce the amount of time spend on data-transport across the network. Use this optimization if you see that there's a big difference between query execution time on the RDBMS and the time reported by the .NET profiler for the ExecuteReader method call. Doing client-side aggregates/scalar calculations by consuming a lot of data. If possible, try to formulate a scalar query or group by query using the projection system or GetScalar functionality of LLBLGen Pro to do data consumption on the RDBMS server. It's far more efficient to process data on the RDBMS server than to first load it all in memory, then traverse the data in-memory to calculate a value. Using .ToList() constructs inside linq queries. It might be you use .ToList() somewhere in a Linq query which makes the query be run partially in-memory. Example: var q = from c in metaData.Customers.ToList() where c.Country=="Norway" select c; This will actually fetch all customers in-memory and do an in-memory filtering, as the linq query is defined on an IEnumerable<T>, and not on the IQueryable<T>. Linq is nice, but it can often be a bit unclear where some parts of a Linq query might run. Fetching all entities to delete into memory first. To delete a set of entities it's rather inefficient to first fetch them all into memory and then delete them one by one. It's more efficient to execute a DELETE FROM ... WHERE query on the database directly to delete the entities in one go. LLBLGen Pro supports this feature, and so do some other O/R mappers. It's not always possible to do this operation in the context of an O/R mapper however: if an O/R mapper relies on a cache, these kind of operations are likely not supported because they make it impossible to track whether an entity is actually removed from the DB and thus can be removed from the cache. Fetching all entities to update with an expression into memory first. Similar to the previous point: it is more efficient to update a set of entities directly with a single UPDATE query using an expression instead of fetching the entities into memory first and then updating the entities in a loop, and afterwards saving them. It might however be a compromise you don't want to take as it is working around the idea of having an object graph in memory which is manipulated and instead makes the code fully aware there's a RDBMS somewhere. Conclusion Performance tuning is almost always about compromises and making choices. It's also about knowing where to look and how the systems in play behave and should behave. The four steps I provided should help you stay focused on the real problem and lead you towards the solution. Knowing how to optimally use the systems participating in your own code (.NET framework, O/R mapper, RDBMS, network/services) is key for success as well as knowing what's going on inside the application you built. I hope you'll find this guide useful in tracking down performance problems and dealing with them in a useful way.

Read the article
Analysing and measuring the performance of a .NET application (survey results)

- by Laila

Back in December last year, I asked myself: could it be that .NET developers think that you need three days and a PhD to do performance profiling on their code? What if developers are shunning profilers because they perceive them as too complex to use? If so, then what method do they use to measure and analyse the performance of their .NET applications? Do they even care about performance? So, a few weeks ago, I decided to get a 1-minute survey up and running in the hopes that some good, hard data would clear the matter up once and for all. I posted the survey on Simple Talk and got help from a few people to promote it. The survey consisted of 3 simple questions: Amazingly, 533 developers took the time to respond - which means I had enough data to get representative results! So before I go any further, I would like to thank all of you who contributed, because I now have some pretty good answers to the troubling questions I was asking myself. To thank you properly, I thought I would share some of the results with you. First of all, application performance is indeed important to most of you. In fact, performance is an intrinsic part of the development cycle for a good 40% of you, which is much higher than I had anticipated, I have to admit. (I know, "Have a little faith Laila!") When asked what tool you use to measure and analyse application performance, I found that nearly half of the respondents use logging statements, a third use performance counters, and 70% of respondents use a profiler of some sort (a 3rd party performance profilers, the CLR profiler or the Visual Studio profiler). The importance attributed to logging statements did surprise me a little. I am still not sure why somebody would go to the trouble of manually instrumenting code in order to measure its performance, instead of just using a profiler. I personally find the process of annotating code, calculating times from log files, and relating it all back to your source terrifyingly laborious. Not to mention that you then need to remember to turn it all off later! Even when you have logging in place throughout all your code anyway, you still have a fair amount of potentially error-prone calculation to sift through the results; in addition, you'll only get method-level rather than line-level timings, and you won't get timings from any framework or library methods you don't have source for. To top it all, we all know that bottlenecks are rarely where you would expect them to be, so you could be wasting time looking for a performance problem in the wrong place. On the other hand, profilers do all the work for you: they automatically collect the CPU and wall-clock timings, and present the results from method timing all the way down to individual lines of code. Maybe I'm missing a trick. I would love to know about the types of scenarios where you actively prefer to use logging statements. Finally, while a third of the respondents didn't have a strong opinion about code performance profilers, those who had an opinion thought that they were mainly complex to use and time consuming. Three respondents in particular summarised this perfectly: "sometimes, they are rather complex to use, adding an additional time-sink to the process of trying to resolve the existing problem". "they are simple to use, but the results are hard to understand" "Complex to find the more advanced things, easy to find some low hanging fruit". These results confirmed my suspicions: Profilers are seen to be designed for more advanced users who can use them effectively and make sense of the results. I found yet more interesting information when I started comparing samples of "developers for whom performance is an important part of the dev cycle", with those "to whom performance is only looked at in times of crisis", and "developers to whom performance is not important, as long as the app works". See the three graphs below. Sample of developers to whom performance is an important part of the dev cycle: Sample of developers to whom performance is important only in times of crisis: Sample of developers to whom performance is not important, as long as the app works: As you can see, there is a strong correlation between the usage of a profiler and the importance attributed to performance: indeed, the more important performance is to a development team, the more likely they are to use a profiler. In addition, developers to whom performance is an important part of the dev cycle have a higher tendency to use a much wider range of methods for performance measurement and analysis. And, unsurprisingly, the less important performance is, the less varied the methods of measurement are. So all in all, to come back to my random questions: .NET developers do care about performance. Those who care the most use a wider range of performance measurement methods than those who care less. But overall, logging statements, performance counters and third party performance profilers are the performance measurement methods of choice for most developers. Finally, although most of you find code profilers complex to use, those of you who care the most about performance tend to use profilers more than those of you to whom performance is not so important.

Read the article
How can you get the call tree with python profilers?

- by Oliver

I used to use a nice Apple profiler that is built into the System Monitor application. As long as your C++ code was compiled with debug information, you could sample your running application and it would print out an indented tree telling you what percent of the parent function's time was spent in this function (and the body vs. other function calls). For instance, if main called function_1 and function_2, function_2 calls function_3, and then main calls function_3: main (100%, 1% in function body): function_1 (9%, 9% in function body): function_2 (90%, 85% in function body): function_3 (100%, 100% in function body) function_3 (1%, 1% in function body) I would see this and think, "Something is taking a long time in the code in the body of function_2. If I want my program to be faster, that's where I should start." Does anyone know how I can most easily get this exact profiling output for a python program? I've seen people say to do this: import cProfile, pstats prof = cProfile.Profile() prof = prof.runctx("real_main(argv)", globals(), locals()) stats = pstats.Stats(prof) stats.sort_stats("time") # Or cumulative stats.print_stats(80) # 80 = how many to print but it's quite messy compared to that elegant call tree. Please let me know if you can easily do this, it would help quite a bit. Cheers!

Read the article
Which .NET performance and/or memory profilers will allow me to profile a DLL?

- by Eric

I write a lot of .NET based plug-ins for other programs which are usually compiled as a DLL which is up to the native application to start up. I've been using Equatec's profiler, which works great, but now would like something with more features, including the ability to profile memory usage. I tried out Red Gate's Ant Profiler, but as far as I can see there is no way to profile a DLL. The only option is to profile an EXE. So my question is what other profiling tools are available that will allow me to profile a single library DLL rather than an EXE. I'm assuming this would require injecting profile code into the library as Equatec does?

Read the article
ASP.NET High CPU Bringing Servers to their Knees

- by user880954

Ok, our new build is having 100% cpu spikes on each server at random intervals. For long durations it make the site totally unresponsive - this will be at peak times as people in different countries log on to the site etc. We've looked at perfmom, memory profilers, CLR profiler, sql profilers, Red gate ants profiler, tried load testing in UAT - but cannot even reproduce the problem. This could mean only thousands of users hitting the live site causes it to happen. One pattern we did notice was that the new code - the broken build - actually uses noticably less threads. We are also using spring for IOC - does this have a bed reputation? To make things worse, we cannot deploy to live due to the business impact - so cannot narrow the problem down to subset of the new features we've added. We truly are destroyed - has anyone got any battle scars that may save us a few lives?

Read the article
Revisiting ANTS Performance Profiler 7.4

- by James Michael Hare

Last year, I did a small review on the ANTS Performance Profiler 6.3, now that it’s a year later and a major version number higher, I thought I’d revisit the review and revise my last post. This post will take the same examples as the original post and update them to show what’s new in version 7.4 of the profiler. Background A performance profiler’s main job is to keep track of how much time is typically spent in each unit of code. This helps when we have a program that is not running at the performance we expect, and we want to know where the program is experiencing issues. There are many profilers out there of varying capabilities. Red Gate’s typically seem to be the very easy to “jump in” and get started with very little training required. So let’s dig into the Performance Profiler. I’ve constructed a very crude program with some obvious inefficiencies. It’s a simple program that generates random order numbers (or really could be any unique identifier), adds it to a list, sorts the list, then finds the max and min number in the list. Ignore the fact it’s very contrived and obviously inefficient, we just want to use it as an example to show off the tool: 1: // our test program 2: public static class Program 3: { 4: // the number of iterations to perform 5: private static int _iterations = 1000000; 6: 7: // The main method that controls it all 8: public static void Main() 9: { 10: var list = new List<string>(); 11: 12: for (int i = 0; i < _iterations; i++) 13: { 14: var x = GetNextId(); 15: 16: AddToList(list, x); 17: 18: var highLow = GetHighLow(list); 19: 20: if ((i % 1000) == 0) 21: { 22: Console.WriteLine("{0} - High: {1}, Low: {2}", i, highLow.Item1, highLow.Item2); 23: Console.Out.Flush(); 24: } 25: } 26: } 27: 28: // gets the next order id to process (random for us) 29: public static string GetNextId() 30: { 31: var random = new Random(); 32: var num = random.Next(1000000, 9999999); 33: return num.ToString(); 34: } 35: 36: // add it to our list - very inefficiently! 37: public static void AddToList(List<string> list, string item) 38: { 39: list.Add(item); 40: list.Sort(); 41: } 42: 43: // get high and low of order id range - very inefficiently! 44: public static Tuple<int,int> GetHighLow(List<string> list) 45: { 46: return Tuple.Create(list.Max(s => Convert.ToInt32(s)), list.Min(s => Convert.ToInt32(s))); 47: } 48: } So let’s run it through the profiler and see what happens! Visual Studio Integration First, let’s look at how the ANTS profilers integrate with Visual Studio’s menu system. Once you install the ANTS profilers, you will get an ANTS menu item with several options: Notice that you can either Profile Performance or Launch ANTS Performance Profiler. These sound similar but achieve two slightly different actions: Profile Performance: this immediately launches the profiler with all defaults selected to profile the active project in Visual Studio. Launch ANTS Performance Profiler: this launches the profiler much the same way as starting it from the Start Menu. The profiler will pre-populate the application and path information, but allow you to change the settings before beginning the profile run. So really, the main difference is that Profile Performance immediately begins profiling with the default selections, where Launch ANTS Performance Profiler allows you to change the defaults and attach to an already-running application. Let’s Fire it Up! So when you fire up ANTS either via Start Menu or Launch ANTS Performance Profiler menu in Visual Studio, you are presented with a very simple dialog to get you started: Notice you can choose from many different options for application type. You can profile executables, services, web applications, or just attach to a running process. In fact, in version 7.4 we see two new options added: ASP.NET Web Application (IIS Express) SharePoint web application (IIS) So this gives us an additional way to profile ASP.NET applications and the ability to profile SharePoint applications as well. You can also choose your level of detail in the Profiling Mode drop down. If you choose Line-Level and method-level timings detail, you will get a lot more detail on the method durations, but this will also slow down profiling somewhat. If you really need the profiler to be as unintrusive as possible, you can change it to Sample method-level timings. This is performing very light profiling, where basically the profiler collects timings of a method by examining the call-stack at given intervals. Which method you choose depends a lot on how much detail you need to find the issue and how sensitive your program issues are to timing. So for our example, let’s just go with the line and method timing detail. So, we check that all the options are correct (if you launch from VS2010, the executable and path are filled in already), and fire it up by clicking the [Start Profiling] button. Profiling the Application Once you start profiling the application, you will see a real-time graph of CPU usage that will indicate how much your application is using the CPU(s) on your system. During this time, you can select segments of the graph and bookmark them, giving them mnemonic names. This can be useful if you want to compare performance in one part of the run to another part of the run. Notice that once you select a block, it will give you the call tree breakdown for that selection only, and the relative performance of those calls. Once you feel you have collected enough information, you can click [Stop Profiling] to stop the application run and information collection and begin a more thorough analysis. Analyzing Method Timings So now that we’ve halted the run, we can look around the GUI and see what we can see. By default, the times are shown in terms of percentage of time of the total run of the application, though you can change it in the View menu item to milliseconds, ticks, or seconds as well. This won’t affect the percentages of methods, it only affects what units the times are shown. Notice also that the major hotspot seems to be in a method without source, ANTS Profiler will filter these out by default, but you can right-click on the line and remove the filter to see more detail. This proves especially handy when a bottleneck is due to a method in the BCL. So now that we’ve removed the filter, we see a bit more detail: In addition, ANTS Performance Profiler gives you the ability to decompile the methods without source so that you can dive even deeper, though typically this isn’t necessary for our purposes. When looking at timings, there are generally two types of timings for each method call: Time: This is the time spent ONLY in this method, not including calls this method makes to other methods. Time With Children: This is the total of time spent in both this method AND including calls this method makes to other methods. In other words, the Time tells you how much work is being done exclusively in this method, and the Time With Children tells you how much work is being done inclusively in this method and everything it calls. You can also choose to display the methods in a tree or in a grid. The tree view is the default and it shows the method calls arranged in terms of the tree representing all method calls and the parent method that called them, etc. This is useful for when you find a hot-spot method, you can see who is calling it to determine if the problem is the method itself, or if it is being called too many times. The grid method represents each method only once with its totals and is useful for quickly seeing what method is the trouble spot. In addition, you can choose to display Methods with source which are generally the methods you wrote (as opposed to native or BCL code), or Any Method which shows not only your methods, but also native calls, JIT overhead, synchronization waits, etc. So these are just two ways of viewing the same data, and you’re free to choose the organization that best suits what information you are after. Analyzing Method Source If we look at the timings above, we see that our AddToList() method (and in particular, it’s call to the List<T>.Sort() method in the BCL) is the hot-spot in this analysis. If ANTS sees a method that is consuming the most time, it will flag it as a hot-spot to help call out potential areas of concern. This doesn’t mean the other statistics aren’t meaningful, but that the hot-spot is most likely going to be your biggest bang-for-the-buck to concentrate on. So let’s select the AddToList() method, and see what it shows in the source window below: Notice the source breakout in the bottom pane when you select a method (from either tree or grid view). This shows you the timings in this method per line of code. This gives you a major indicator of where the trouble-spot in this method is. So in this case, we see that performing a Sort() on the List<T> after every Add() is killing our performance! Of course, this was a very contrived, duh moment, but you’d be surprised how many performance issues become duh moments. Note that this one line is taking up 86% of the execution time of this application! If we eliminate this bottleneck, we should see drastic improvement in the performance. So to fix this, if we still wanted to maintain the List<T> we’d have many options, including: delay Sort() until after all Add() methods, using a SortedSet, SortedList, or SortedDictionary depending on which is most appropriate, or forgoing the sorting all together and using a Dictionary. Rinse, Repeat! So let’s just change all instances of List<string> to SortedSet<string> and run this again through the profiler: Now we see the AddToList() method is no longer our hot-spot, but now the Max() and Min() calls are! This is good because we’ve eliminated one hot-spot and now we can try to correct this one as well. As before, we can then optimize this part of the code (possibly by taking advantage of the fact the list is now sorted and returning the first and last elements). We can then rinse and repeat this process until we have eliminated as many bottlenecks as possible. Calls by Web Request Another feature that was added recently is the ability to view .NET methods grouped by the HTTP requests that caused them to run. This can be helpful in determining which pages, web services, etc. are causing hot spots in your web applications. Summary If you like the other ANTS tools, you’ll like the ANTS Performance Profiler as well. It is extremely easy to use with very little product knowledge required to get up and running. There are profilers built into the higher product lines of Visual Studio, of course, which are also powerful and easy to use. But for quickly jumping in and finding hot spots rapidly, Red Gate’s Performance Profiler 7.4 is an excellent choice. Technorati Tags: Influencers,ANTS,Performance Profiler,Profiler

Read the article
Performance Optimization – It Is Faster When You Can Measure It

- by Alois Kraus

Performance optimization in bigger systems is hard because the measured numbers can vary greatly depending on the measurement method of your choice. To measure execution timing of specific methods in your application you usually use Time Measurement Method Potential Pitfalls Stopwatch Most accurate method on recent processors. Internally it uses the RDTSC instruction. Since the counter is processor specific you can get greatly different values when your thread is scheduled to another core or the core goes into a power saving mode. But things do change luckily: Intel's Designer's vol3b, section 16.11.1 "16.11.1 Invariant TSC The time stamp counter in newer processors may support an enhancement, referred to as invariant TSC. Processor's support for invariant TSC is indicated by CPUID.80000007H:EDX[8]. The invariant TSC will run at a constant rate in all ACPI P-, C-. and T-states. This is the architectural behavior moving forward. On processors with invariant TSC support, the OS may use the TSC for wall clock timer services (instead of ACPI or HPET timers). TSC reads are much more efficient and do not incur the overhead associated with a ring transition or access to a platform resource." DateTime.Now Good but it has only a resolution of 16ms which can be not enough if you want more accuracy. Reporting Method Potential Pitfalls Console.WriteLine Ok if not called too often. Debug.Print Are you really measuring performance with Debug Builds? Shame on you. Trace.WriteLine Better but you need to plug in some good output listener like a trace file. But be aware that the first time you call this method it will read your app.config and deserialize your system.diagnostics section which does also take time. In general it is a good idea to use some tracing library which does measure the timing for you and you only need to decorate some methods with tracing so you can later verify if something has changed for the better or worse. In my previous article I did compare measuring performance with quantum mechanics. This analogy does work surprising well. When you measure a quantum system there is a lower limit how accurately you can measure something. The Heisenberg uncertainty relation does tell us that you cannot measure of a quantum system the impulse and location of a particle at the same time with infinite accuracy. For programmers the two variables are execution time and memory allocations. If you try to measure the timings of all methods in your application you will need to store them somewhere. The fastest storage space besides the CPU cache is the memory. But if your timing values do consume all available memory there is no memory left for the actual application to run. On the other hand if you try to record all memory allocations of your application you will also need to store the data somewhere. This will cost you memory and execution time. These constraints are always there and regardless how good the marketing of tool vendors for performance and memory profilers are: Any measurement will disturb the system in a non predictable way. Commercial tool vendors will tell you they do calculate this overhead and subtract it from the measured values to give you the most accurate values but in reality it is not entirely true. After falling into the trap to trust the profiler timings several times I have got into the habit to Measure with a profiler to get an idea where potential bottlenecks are. Measure again with tracing only the specific methods to check if this method is really worth optimizing. Optimize it Measure again. Be surprised that your optimization has made things worse. Think harder Implement something that really works. Measure again Finished! - Or look for the next bottleneck. Recently I have looked into issues with serialization performance. For serialization DataContractSerializer was used and I was not sure if XML is really the most optimal wire format. After looking around I have found protobuf-net which uses Googles Protocol Buffer format which is a compact binary serialization format. What is good for Google should be good for us. A small sample app to check out performance was a matter of minutes: using ProtoBuf; using System; using System.Diagnostics; using System.IO; using System.Reflection; using System.Runtime.Serialization; [DataContract, Serializable] class Data { [DataMember(Order=1)] public int IntValue { get; set; } [DataMember(Order = 2)] public string StringValue { get; set; } [DataMember(Order = 3)] public bool IsActivated { get; set; } [DataMember(Order = 4)] public BindingFlags Flags { get; set; } } class Program { static MemoryStream _Stream = new MemoryStream(); static MemoryStream Stream { get { _Stream.Position = 0; _Stream.SetLength(0); return _Stream; } } static void Main(string[] args) { DataContractSerializer ser = new DataContractSerializer(typeof(Data)); Data data = new Data { IntValue = 100, IsActivated = true, StringValue = "Hi this is a small string value to check if serialization does work as expected" }; var sw = Stopwatch.StartNew(); int Runs = 1000 * 1000; for (int i = 0; i < Runs; i++) { //ser.WriteObject(Stream, data); Serializer.Serialize<Data>(Stream, data); } sw.Stop(); Console.WriteLine("Did take {0:N0}ms for {1:N0} objects", sw.Elapsed.TotalMilliseconds, Runs); Console.ReadLine(); } } The results are indeed promising: Serializer Time in ms N objects protobuf-net 807 1000000 DataContract 4402 1000000 Nearly a factor 5 faster and a much more compact wire format. Lets use it! After switching over to protbuf-net the transfered wire data has dropped by a factor two (good) and the performance has worsened by nearly a factor two. How is that possible? We have measured it? Protobuf-net is much faster! As it turns out protobuf-net is faster but it has a cost: For the first time a type is de/serialized it does use some very smart code-gen which does not come for free. Lets try to measure this one by setting of our performance test app the Runs value not to one million but to 1. Serializer Time in ms N objects protobuf-net 85 1 DataContract 24 1 The code-gen overhead is significant and can take up to 200ms for more complex types. The break even point where the code-gen cost is amortized by its faster serialization performance is (assuming small objects) somewhere between 20.000-40.000 serialized objects. As it turned out my specific scenario involved about 100 types and 1000 serializations in total. That explains why the good old DataContractSerializer is not so easy to take out of business. The final approach I ended up was to reduce the number of types and to serialize primitive types via BinaryWriter directly which turned out to be a pretty good alternative. It sounded good until I measured again and found that my optimizations so far do not help much. After looking more deeper at the profiling data I did found that one of the 1000 calls did take 50% of the time. So how do I find out which call it was? Normal profilers do fail short at this discipline. A (totally undeserved) relatively unknown profiler is SpeedTrace which does unlike normal profilers create traces of your applications by instrumenting your IL code at runtime. This way you can look at the full call stack of the one slow serializer call to find out if this stack was something special. Unfortunately the call stack showed nothing special. But luckily I have my own tracing as well and I could see that the slow serializer call did happen during the serialization of a bool value. When you encounter after much analysis something unreasonable you cannot explain it then the chances are good that your thread was suspended by the garbage collector. If there is a problem with excessive GCs remains to be investigated but so far the serialization performance seems to be mostly ok. When you do profile a complex system with many interconnected processes you can never be sure that the timings you just did measure are accurate at all. Some process might be hitting the disc slowing things down for all other processes for some seconds as well. There is a big difference between warm and cold startup. If you restart all processes you can basically forget the first run because of the OS disc cache, JIT and GCs make the measured timings very flexible. When you are in need of a random number generator you should measure cold startup times of a sufficiently complex system. After the first run you can try again getting different and much lower numbers. Now try again at least two times to get some feeling how stable the numbers are. Oh and try to do the same thing the next day. It might be that the bottleneck you found yesterday is gone today. Thanks to GC and other random stuff it can become pretty hard to find stuff worth optimizing if no big bottlenecks except bloatloads of code are left anymore. When I have found a spot worth optimizing I do make the code changes and do measure again to check if something has changed. If it has got slower and I am certain that my change should have made it faster I can blame the GC again. The thing is that if you optimize stuff and you allocate less objects the GC times will shift to some other location. If you are unlucky it will make your faster working code slower because you see now GCs at times where none were before. This is where the stuff does get really tricky. A safe escape hatch is to create a repro of the slow code in an isolated application so you can change things fast in a reliable manner. Then the normal profilers do also start working again. As Vance Morrison does point out it is much more complex to profile a system against the wall clock compared to optimize for CPU time. The reason is that for wall clock time analysis you need to understand how your system does work and which threads (if you have not one but perhaps 20) are causing a visible delay to the end user and which threads can wait a long time without affecting the user experience at all. Next time: Commercial profiler shootout.

Read the article
WMemoryProfiler is Released

- by Alois Kraus

What is it? WMemoryProfiler is a managed profiling Api to aid integration testing. This free library can get managed heap statistics and memory usage for your own process (remember testing) and other processes as well. The best thing is that it does work from .NET 2.0 up to .NET 4.5 in x86 and x64. To make it more interesting it can attach to any running .NET process. The reason why I do mention this is that commercial profilers do support this functionality only for their professional editions. An normally only since .NET 4.0 since the profiling API only since then does support attaching to a running process. This thing does differ in many aspects from “normal” profilers because while profiling yourself you can get all objects from all managed heaps back as an object array. If you ever wanted to change the state of an object which does only exist a method local in another thread you can get your hands on it now … Enough theory. Show me some code /// <summary> /// Show feature to not only get statisics out of a process but also the newly allocated /// instances since the last call to MarkCurrentObjects. /// GetNewObjects does return the newly allocated objects as object array /// </summary> static void InstanceTracking() { using (var dumper = new MemoryDumper()) // if you have problems use to see the debugger windows true,true)) { dumper.MarkCurrentObjects(); Allocate(); ILookup<Type, object> newObjects = dumper.GetNewObjects() .ToLookup( x => x.GetType() ); Console.WriteLine("New Strings:"); foreach (var newStr in newObjects[typeof(string)] ) { Console.WriteLine("Str: {0}", newStr); } } } … New Strings: Str: qqd Str: String data: Str: String data: 0 Str: String data: 1 … This is really hot stuff. Not only you can get heap statistics but you can directly examine the new objects and make queries upon them. When I do find more time I can reconstruct the object root graph from it from my own process. It this cool or what? You can also peek into the Finalization Queue to check if you did accidentally forget to dispose a whole bunch of objects … /// <summary> /// .NET 4.0 or above only. Get all finalizable objects which are ready for finalization and have no other object roots anymore. /// </summary> static void NotYetFinalizedObjects() { using (var dumper = new MemoryDumper()) { object[] finalizable = dumper.GetObjectsReadyForFinalization(); Console.WriteLine("Currently {0} objects of types {1} are ready for finalization. Consider disposing them before.", finalizable.Length, String.Join(",", finalizable.ToLookup( x=> x.GetType() ) .Select( x=> x.Key.Name)) ); } } How does it work? The W of WMemoryProfiler is a good hint. It does employ Windbg and SOS dll to do the heavy lifting and concentrates on an easy to use Api which does hide completely Windbg. If you do not want to see Windbg you will never see it. In my experience the most complex thing is actually to download Windbg from the Windows 8 Stanalone SDK. This is described in the Readme and the exception you are greeted with if it is missing in much greater detail. So I will not go into this here. What Next? Depending on the feedback I do get I can imagine some features which might be useful as well Calculate first order GC Roots from the actual object graph Identify global statics in Types in object graph Support read out of finalization queue of .NET 2.0 as well. Support Memory Dump analysis (again a feature only supported by commercial profilers in their professional editions if it is supported at all) Deserialize objects from a memory dump into a live process back (this would need some more investigation but it is doable) The last item needs some explanation. Why on earth would you want to do that? The basic idea is to store in your live process some logging/tracing data which can become quite big but since it is never written to it is very fast to generate. When your process crashes with a memory dump you could transfer this data structure back into a live viewer which can then nicely display your program state at the point it did crash. This is an advanced trouble shooting technique I have not seen anywhere yet but it could be quite useful. You can have here a look at the current feature list of WMemoryProfiler with some examples. How To Get Started? First I would download the released source package (it is tiny). And compile the complete project. Then you can compile the Example project (it has this name) and uncomment in the main method the scenario you want to check out. If you are greeted with an exception it is time to install the Windows 8 Standalone SDK which is described in great detail in the exception text. Thats it for the first round. I have seen something more limited in the Java world some years ago (now I cannot find the link anymore) but anyway. Now we have something much better.

Read the article
Which has been the most reliable, fastest Windows C++ profiler that you have used?

- by carleeto

I need to profile a real time C++ app on Windows. Most of the available profilers are either terribly expensive, total overkill, or both. I don't need any .NET stuff. Since it is a real time app, I need the profiler to be as fast as possible. It would be excellent if it integrated in some way with Visual Studio 2005/2008, but that's not necessary. If this description reminds you of a profiler that you have used, I would really like to know about it. I am hoping to draw from people's use of C++ profilers on Windows to pinpoint one that will do the job. Thanks.

Read the article
How to profile a silverlight application?

- by rudigrobler

Is their any profilers that support Silverlight? I have tried ANTS (Version 3.1) without any success? Does version 4 support it? Any other products I can try? Updated since the release of Silverlight 4, it is now possible to do full profiling on SL applications... check out this article on the topic At PDC, I announced that Silverlight 4 came with the new CoreCLR capability of being profile-able by the VS2010 profilers: this means that for the first time, we give you the power to profile the managed and native code (user or platform) used by a Silverlight application. woohoo. kudos to the CLR team. Sidenote: From silverlight 1-3, one could only use things like xperf (see XPerf: A CPU Sampler for Silverlight) which is very powerful to see the layout/text/media/gfx/etc pipelines, but only gives the native callstack.) From SilverLite (PDC video, TechEd Iceland, VS2010, profiling, Silverlight 4)

Read the article
What's a very easy C++ profiler (VC++)?

- by John

I've used a few profilers in the past and never found them particularly easy. Maybe I picked bad ones, maybe I didn't really know what I was expecting! But I'd like to know if there are any 'standard' profilers which simply drop in and work? I don't believe I need massively fine-detailed reports, just to pick up major black-spots. Ease of use is more important to me at this point. It's VC++ 2008 we're using (I run standard edition personally). I don't suppose there are any tools in the IDE for this, I can't see any from looking at the main menus?

Read the article
Inside Red Gate - Ricky Leeks

- by Simon Cooper

So, one of our profilers has a problem. Red Gate produces two .NET profilers - ANTS Performance Profiler (APP) and ANTS Memory Profiler (AMP). Both products help .NET developers solve problems they are virtually guaranteed to encounter at some point in their careers - slow code, and high memory usage, respectively. Everyone understands slow code - the symptoms are very obvious (an operation takes 2 hours when it should take 10 seconds), you know when you've solved it (the same operation now takes 15 seconds), and everyone understands how you can use a profiler like APP to help solve your particular problem. High memory usage is a much more subtle and misunderstood concept. How can .NET have memory leaks? The garbage collector, and how the CLR uses and frees memory, is one of the most misunderstood concepts in .NET. There's hundreds of blog posts out there covering various aspects of the GC and .NET memory, some of them helpful, some of them confusing, and some of them are just plain wrong. There's a lot of misconceptions out there. And, if you have got an application that uses far too much memory, it can be hard to wade through all the contradictory information available to even get an idea as to what's going on, let alone trying to solve it. That's where a memory profiler, like AMP, comes into play. Unfortunately, that's not the end of the issue. .NET memory management is a large, complicated, and misunderstood problem. Even armed with a profiler, you need to understand what .NET is doing with your objects, how it processes them, and how it frees them, to be able to use the profiler effectively to solve your particular problem. And that's what's wrong with AMP - even with all the thought, designs, UX sessions, and research we've put into AMP itself, some users simply don't have the knowledge required to be able to understand what AMP is telling them about how their application uses memory, and so they have problems understanding & solving their memory problem. Ricky Leeks This is where Ricky Leeks comes in. Created by one of the many...colourful...people in Red Gate, he headlines and promotes several tutorials, pages, and articles all with information on how .NET memory management actually works, with the goal to help educate developers on .NET memory management. And educating us all on how far you can push various vegetable-based puns. This, in turn, not only helps them understand and solve any memory issues they may be having, but helps them proactively code against such memory issues in their existing code. Ricky's latest outing is an interview on .NET Rocks, providing information on the Top 5 .NET Memory Management Gotchas, along with information on a free ebook on .NET Memory Management. Don't worry, there's loads more vegetable-based jokes where those came from...

Read the article
What are the tools used by modern desktop/"native" application developers? [closed]

- by kunjaa

Besides the usual editor and debugger, what do the modern desktop (windows and linux) application developers use for their development. I am more interested in profilers, code analyzers, memory analyzers, packaging tools, GUI frameworks, libraries and any other handy tools and secrets that you couldnt live without. For example, as a web application developer, I have my Firebug and its extensions, Wireshark, jQuery and its extensions, client side and server side mvc frameworks, selenium tests, jsfiddle etc. Edit : Ok let us constrain this by saying you are using C++

Read the article
Structural and Sampling (JavaScript) Profiling in Google Chrome

Structural and Sampling (JavaScript) Profiling in Google Chrome Slow JavaScript code on your pages? Chrome provides both a sampling, and a structural profiler to help you track down, isolate, and fix the underlying problem. Tune in to learn how to use both profilers, and how to improve your own workflow to build better, faster browser applications! We'll talk about chrome://tracing, the built-in JS profiler in DevTools, and much more. From: GoogleDevelopers Views: 0 3 ratings Time: 01:00:00 More in Science & Technology

Read the article
Talking JavaOne with Rock Star Kirk Pepperdine

- by Janice J. Heiss

Kirk Pepperdine is not only a JavaOne Rock Star but a Java Champion and a highly regarded expert in Java performance tuning who works as a consultant, educator, and author. He is the principal consultant at Kodewerk Ltd. He speaks frequently at conferences and co-authored the Ant Developer's Handbook. In the rapidly shifting world of information technology, Pepperdine, as much as anyone, keeps up with what's happening with Java performance tuning. Pepperdine will participate in the following sessions: CON5405 - Are Your Garbage Collection Logs Speaking to You? BOF6540 - Java Champions and JUG Leaders Meet Oracle Executives (with Jeff Genender, Mattias Karlsson, Henrik Stahl, Georges Saab) HOL6500 - Finding and Solving Java Deadlocks (with Heinz Kabutz, Ellen Kraffmiller Martijn Verburg, Jeff Genender, and Henri Tremblay) I asked him what technological changes need to be taken into account in performance tuning. “The volume of data we're dealing with just seems to be getting bigger and bigger all the time,” observed Pepperdine. “A couple of years ago you'd never think of needing a heap that was 64g, but today there are deployments where the heap has grown to 256g and tomorrow there are plans for heaps that are even larger. Dealing with all that data simply requires more horse power and some very specialized techniques. In some cases, teams are trying to push hardware to the breaking point. Under those conditions, you need to be very clever just to get things to work -- let alone to get them to be fast. We are very quickly moving from a world where everything happens in a transaction to one where if you were to even consider using a transaction, you've lost." When asked about the greatest misconceptions about performance tuning that he currently encounters, he said, “If you have a performance problem, you should start looking at code at the very least and for that extra step, whip out an execution profiler. I'm not going to say that I never use execution profilers or look at code. What I will say is that execution profilers are effective for a small subset of performance problems and code is literally the last thing you should look at.And what is the most exciting thing happening in the world of Java today? “Interesting question because so many people would say that nothing exciting is happening in Java. Some might be disappointed that a few features have slipped in terms of scheduling. But I'd disagree with the first group and I'm not so concerned about the slippage because I still see a lot of exciting things happening. First, lambda will finally be with us and with lambda will come better ways.” For JavaOne, he is proctoring for Heinz Kabutz's lab. “I'm actually looking forward to that more than I am to my own talk,” he remarked. “Heinz will be the third non-Sun/Oracle employee to present a lab and the first since Oracle began hosting JavaOne. He's got a great message. He's spent a ton of time making sure things are going to work, and we've got a great team of proctors to help out. After that, getting my talk done, the Java Champion's panel session and then kicking back and just meeting up and talking to some Java heads."Finally, what should Java developers know that they currently do not know? “’Write Once, Run Everywhere’ is a great slogan and Java has come closer to that dream than any other technology stack that I've used. That said, different hardware bits work differently and as hard as we try, the JVM can't hide all the differences. Plus, if we are to get good performance we need to work with our hardware and not against it. All this implies that Java developers need to know more about the hardware they are deploying to.” Originally published on blogs.oracle.com/javaone.

Read the article
Talking JavaOne with Rock Star Kirk Pepperdine

- by Janice J. Heiss

Kirk Pepperdine is not only a JavaOne Rock Star but a Java Champion and a highly regarded expert in Java performance tuning who works as a consultant, educator, and author. He is the principal consultant at Kodewerk Ltd. He speaks frequently at conferences and co-authored the Ant Developer's Handbook. In the rapidly shifting world of information technology, Pepperdine, as much as anyone, keeps up with what's happening with Java performance tuning. Pepperdine will participate in the following sessions: CON5405 - Are Your Garbage Collection Logs Speaking to You? BOF6540 - Java Champions and JUG Leaders Meet Oracle Executives (with Jeff Genender, Mattias Karlsson, Henrik Stahl, Georges Saab) HOL6500 - Finding and Solving Java Deadlocks (with Heinz Kabutz, Ellen Kraffmiller Martijn Verburg, Jeff Genender, and Henri Tremblay) I asked him what technological changes need to be taken into account in performance tuning. “The volume of data we're dealing with just seems to be getting bigger and bigger all the time,” observed Pepperdine. “A couple of years ago you'd never think of needing a heap that was 64g, but today there are deployments where the heap has grown to 256g and tomorrow there are plans for heaps that are even larger. Dealing with all that data simply requires more horse power and some very specialized techniques. In some cases, teams are trying to push hardware to the breaking point. Under those conditions, you need to be very clever just to get things to work -- let alone to get them to be fast. We are very quickly moving from a world where everything happens in a transaction to one where if you were to even consider using a transaction, you've lost." When asked about the greatest misconceptions about performance tuning that he currently encounters, he said, “If you have a performance problem, you should start looking at code at the very least and for that extra step, whip out an execution profiler. I'm not going to say that I never use execution profilers or look at code. What I will say is that execution profilers are effective for a small subset of performance problems and code is literally the last thing you should look at.And what is the most exciting thing happening in the world of Java today? “Interesting question because so many people would say that nothing exciting is happening in Java. Some might be disappointed that a few features have slipped in terms of scheduling. But I'd disagree with the first group and I'm not so concerned about the slippage because I still see a lot of exciting things happening. First, lambda will finally be with us and with lambda will come better ways.” For JavaOne, he is proctoring for Heinz Kabutz's lab. “I'm actually looking forward to that more than I am to my own talk,” he remarked. “Heinz will be the third non-Sun/Oracle employee to present a lab and the first since Oracle began hosting JavaOne. He's got a great message. He's spent a ton of time making sure things are going to work, and we've got a great team of proctors to help out. After that, getting my talk done, the Java Champion's panel session and then kicking back and just meeting up and talking to some Java heads."Finally, what should Java developers know that they currently do not know? “’Write Once, Run Everywhere’ is a great slogan and Java has come closer to that dream than any other technology stack that I've used. That said, different hardware bits work differently and as hard as we try, the JVM can't hide all the differences. Plus, if we are to get good performance we need to work with our hardware and not against it. All this implies that Java developers need to know more about the hardware they are deploying to.”

Read the article
Commit is VERY slow in my NHibernate / SQLite project

- by Tom Bushell

I've just started doing some real-world performance testing on my Fluent NHibernate / SQLite project, and am experiencing some serious delays when when I Commit to the database. By serious, I mean taking 20 - 30 seconds to Commit 30 K of data! This delay seems to get worse as the database grows. When the SQLite DB file is empty, commits happen almost instantly, but when it grows to 10 Meg, I see these huge delays. The database has 16 tables, averaging 10 columns each. One possible problem is that I'm storing a dozen or so IList members, but they are typically only 200 elements long. But this is a recent addition to Fluent NHibernate automapping, which stores each float in a single table row, so maybe that's a potential problem. Any suggestions on how to track this down? I suspect SQLite is the culprit, but maybe it's NHibernate? I don't have any experience with profilers, but am thinking of getting one. I'm aware of NHibernate Profiler - any recommendations for profilers that work well with SQLite? Here's the method that saves the data - it's just a SaveOrUpdate call and a Commit, if you ignore all the error handling and debug logging. public static void SaveMeasurement(object measurement) { Debug.WriteLine("\r\n---SaveMeasurement---"); // Get the application's database session var session = GetSession(); using (var transaction = session.BeginTransaction()) { try { session.SaveOrUpdate(measurement); } catch (Exception e) { throw new ApplicationException( "\r\n SaveMeasurement->SaveOrUpdate failed\r\n\r\n", e); } try { Debug.WriteLine("\r\n---Commit---"); transaction.Commit(); Debug.WriteLine("\r\n---Commit Complete---"); } catch (Exception e) { throw new ApplicationException( "\r\n SaveMeasurement->Commit failed\r\n\r\n", e); } } }

Read the article
Simple Criminal Minds

- by andyleonard

My favorite mother-in-law hooked my bride on the television series Criminal Minds during her last visit. I started watching the show as well, and sort of followed along. I was ok with the show until a recent episode in which Fonzie used his motorcycle to jump a shark in a pool in front of Al's. Ok, wrong show. Sort of. For those unfamiliar with the show, Criminal Minds is about the FBI's Behavioral Analysis Unit. Profilers, in other words. They're called to various places around the country to profile...(read more)

Read the article
High Usage of RAM by wxPython's GUI and need some advice to reduce it

- by user67024

I've recently developed a GUI in wxPython for windows platform. It contains a five tabs, 4 of them are just richTextCtrl boxes and the other one has controls for uploading files, buttons, textctrls, a slider etc.. As I was new to GUI development in Python, I used wxFormBuilder to generate some of the code using a good amount of sizers. So, now the problem is that the GUI starts off with a initial memory of around 40MB which is too much for such a simple application (Or so I think) . Also, when the functions handling the functions use huge lists as the program is for debugging large data logs and identifying the problems in'em implying that I can't afford memory for GUI. So, how can I reduce that start working memory size? Is it a general issue in wxPython? And currently trying use profilers but not sure if it's going to help.

Read the article
Profiling Startup Of VS2012 – YourKit Profiler

- by Alois Kraus

The YourKit (v7.0.5) profiler is interesting in terms of price (79€ single place license, 409€ + 1 year support and upgrades) and feature set. You do get a performance and memory profiler in one package for which you normally need also to pay extra from the other vendors. As an interesting side note the profiler UI is written in Java because they do also sell Java profilers with the same feature set. To get all methods of a VS startup you need first to configure it to include System* in the profiled methods and you need to configure * to measure wall clock time. By default it does record only CPU times which allows you to optimize CPU hungry operations. But you will never see a Thread.Sleep(10000) in the profiler blocking the UI in this mode. It can profile as all others processes started from within the profiler but it can also profile the next or all started processes. As usual it can profile in sampling and tracing mode. But since it is a memory profiler as well it does by default also record all object allocations > 1MB. With allocation recording enabled VS2012 did crash but without allocation recording there were no problems. The CPU tab contains the time line of the application and when you click in the graph you the call stacks of all threads at this time. This is really a nice feature. When you select a time region you the CPU Usage estimation for this time window. I have seen many applications consuming 100% CPU only because they did create garbage like crazy. For this is the Garbage Collection tab interesting in conjunction with a time range. This view is like the CPU table only that the CPU graph (green) is missing. All relevant information except for GCs/s is already visible in the CPU tab. Very handy to pinpoint excessive GC or CPU bound issues. The Threads tab does show the thread names and their lifetime. This is useful to see thread interactions or which thread is hottest in terms of CPU consumption. On the CPU tab the call tree does exist in a merged and thread specific view. When you click on a method you get below a list of all called methods. There you can sort for methods with a high own time which are worth optimizing. In the Method List you can select which scope you want to see. Back Traces are the methods which did call you. Callees ist the list of methods called directly or indirectly by your method as a flat list. This is not a call stack but still very useful to see which methods were slow so you can see the “root” cause quite quickly without the need to click trough long call stacks. The last view Merged Calles is a call stacked view of the previous view. This does help a lot to understand did call each method at run time. You would get the same view with a debugger for one call invocation but here you get the full statistics (invocation count) as well. Since YourKit is also a memory profiler you can directly see which objects you have on your managed heap and which objects do hold most of your precious memory. You can in in the Object Explorer view also examine the contents of your objects (strings or whatsoever) to get a better understanding which objects where potentially allocating this stuff. YourKit is a very easy to use combined memory and performance profiler in one product. The unbeatable single license price makes it very attractive to straightly buy it. Although it is a Java UI it is very responsive and the memory consumption is considerably lower compared to dotTrace and ANTS profiler. What I do really like is to start the YourKit ui and then start the processes I want to profile as usual. There is no need to alter your own application code to be able to inject a profiler into your new started processes. For performance and memory profiling you can simply select the process you want to investigate from the list of started processes. That's the way I like to use profilers. Just get out of the way and let the application run without any special preparations. Next: Telerik JustTrace

Read the article
What tools does your company use to manage application performance of asp.net applications?

- by CodeToGlory

I am not talking about application profilers or debuggers but more specific to managing the applications in production environment. So essentially monitor, identify bottlenecks, deploy fixes.

Read the article
Critical tools that every Java Developer should have in his toolbelt?

- by Timur Fanshteyn

I was trying to compile a list of tools that a good Java Developer should be know of, and keep in his Developer Tool Belt I can think of a few Eclipse Development Environment - There are other IDEs, but you should know how Eclipse of eclipse. JUnit - Java Unit Testing Framework. Of course there are others, but... ANT Maven Soap UI - for testing SOAP endpoints jrat - Java Profiler. I don't know of other good Java profilers Java Decompiler - For when you just have to know what's in the jar file

Read the article

Search Results

Search found 53 results on 3 pages for 'profilers'.

Page 1/3 | 1 2 3 | Next Page >

- by Earlz

- by 2Real

- by maxima120

- by FransBouma

- by Laila

- by Oliver

- by Eric

- by user880954

- by James Michael Hare

- by Alois Kraus

- by Alois Kraus

- by carleeto

- by rudigrobler

- by John

- by Simon Cooper

- by kunjaa

- by Janice J. Heiss

- by Janice J. Heiss

- by Tom Bushell

- by andyleonard

- by user67024

- by Alois Kraus

- by CodeToGlory

- by Timur Fanshteyn

1 2 3 | Next Page >