Search Results

Search found 156 results on 7 pages for 'evenly'.

Page 7/7 | < Previous Page | 3 4 5 6 7

Parallelism in .NET – Part 11, Divide and Conquer via Parallel.Invoke

- by Reed

Many algorithms are easily written to work via recursion. For example, most data-oriented tasks where a tree of data must be processed are much more easily handled by starting at the root, and recursively “walking” the tree. Some algorithms work this way on flat data structures, such as arrays, as well. This is a form of divide and conquer: an algorithm design which is based around breaking up a set of work recursively, “dividing” the total work in each recursive step, and “conquering” the work when the remaining work is small enough to be solved easily. Recursive algorithms, especially ones based on a form of divide and conquer, are often a very good candidate for parallelization. This is apparent from a common sense standpoint. Since we’re dividing up the total work in the algorithm, we have an obvious, built-in partitioning scheme. Once partitioned, the data can be worked upon independently, so there is good, clean isolation of data. Implementing this type of algorithm is fairly simple. The Parallel class in .NET 4 includes a method suited for this type of operation: Parallel.Invoke. This method works by taking any number of delegates defined as an Action, and operating them all in parallel. The method returns when every delegate has completed: Parallel.Invoke( () => { Console.WriteLine("Action 1 executing in thread {0}", Thread.CurrentThread.ManagedThreadId); }, () => { Console.WriteLine("Action 2 executing in thread {0}", Thread.CurrentThread.ManagedThreadId); }, () => { Console.WriteLine("Action 3 executing in thread {0}", Thread.CurrentThread.ManagedThreadId); } ); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Running this simple example demonstrates the ease of using this method. For example, on my system, I get three separate thread IDs when running the above code. By allowing any number of delegates to be executed directly, concurrently, the Parallel.Invoke method provides us an easy way to parallelize any algorithm based on divide and conquer. We can divide our work in each step, and execute each task in parallel, recursively. For example, suppose we wanted to implement our own quicksort routine. The quicksort algorithm can be designed based on divide and conquer. In each iteration, we pick a pivot point, and use that to partition the total array. We swap the elements around the pivot, then recursively sort the lists on each side of the pivot. For example, let’s look at this simple, sequential implementation of quicksort: public static void QuickSort<T>(T[] array) where T : IComparable<T> { QuickSortInternal(array, 0, array.Length - 1); } private static void QuickSortInternal<T>(T[] array, int left, int right) where T : IComparable<T> { if (left >= right) { return; } SwapElements(array, left, (left + right) / 2); int last = left; for (int current = left + 1; current <= right; ++current) { if (array[current].CompareTo(array[left]) < 0) { ++last; SwapElements(array, last, current); } } SwapElements(array, left, last); QuickSortInternal(array, left, last - 1); QuickSortInternal(array, last + 1, right); } static void SwapElements<T>(T[] array, int i, int j) { T temp = array[i]; array[i] = array[j]; array[j] = temp; } Here, we implement the quicksort algorithm in a very common, divide and conquer approach. Running this against the built-in Array.Sort routine shows that we get the exact same answers (although the framework’s sort routine is slightly faster). On my system, for example, I can use framework’s sort to sort ten million random doubles in about 7.3s, and this implementation takes about 9.3s on average. Looking at this routine, though, there is a clear opportunity to parallelize. At the end of QuickSortInternal, we recursively call into QuickSortInternal with each partition of the array after the pivot is chosen. This can be rewritten to use Parallel.Invoke by simply changing it to: // Code above is unchanged... SwapElements(array, left, last); Parallel.Invoke( () => QuickSortInternal(array, left, last - 1), () => QuickSortInternal(array, last + 1, right) ); } This routine will now run in parallel. When executing, we now see the CPU usage across all cores spike while it executes. However, there is a significant problem here – by parallelizing this routine, we took it from an execution time of 9.3s to an execution time of approximately 14 seconds! We’re using more resources as seen in the CPU usage, but the overall result is a dramatic slowdown in overall processing time. This occurs because parallelization adds overhead. Each time we split this array, we spawn two new tasks to parallelize this algorithm! This is far, far too many tasks for our cores to operate upon at a single time. In effect, we’re “over-parallelizing” this routine. This is a common problem when working with divide and conquer algorithms, and leads to an important observation: When parallelizing a recursive routine, take special care not to add more tasks than necessary to fully utilize your system. This can be done with a few different approaches, in this case. Typically, the way to handle this is to stop parallelizing the routine at a certain point, and revert back to the serial approach. Since the first few recursions will all still be parallelized, our “deeper” recursive tasks will be running in parallel, and can take full advantage of the machine. This also dramatically reduces the overhead added by parallelizing, since we’re only adding overhead for the first few recursive calls. There are two basic approaches we can take here. The first approach would be to look at the total work size, and if it’s smaller than a specific threshold, revert to our serial implementation. In this case, we could just check right-left, and if it’s under a threshold, call the methods directly instead of using Parallel.Invoke. The second approach is to track how “deep” in the “tree” we are currently at, and if we are below some number of levels, stop parallelizing. This approach is a more general-purpose approach, since it works on routines which parse trees as well as routines working off of a single array, but may not work as well if a poor partitioning strategy is chosen or the tree is not balanced evenly. This can be written very easily. If we pass a maxDepth parameter into our internal routine, we can restrict the amount of times we parallelize by changing the recursive call to: // Code above is unchanged... SwapElements(array, left, last); if (maxDepth < 1) { QuickSortInternal(array, left, last - 1, maxDepth); QuickSortInternal(array, last + 1, right, maxDepth); } else { --maxDepth; Parallel.Invoke( () => QuickSortInternal(array, left, last - 1, maxDepth), () => QuickSortInternal(array, last + 1, right, maxDepth)); } We no longer allow this to parallelize indefinitely – only to a specific depth, at which time we revert to a serial implementation. By starting the routine with a maxDepth equal to Environment.ProcessorCount, we can restrict the total amount of parallel operations significantly, but still provide adequate work for each processing core. With this final change, my timings are much better. On average, I get the following timings: Framework via Array.Sort: 7.3 seconds Serial Quicksort Implementation: 9.3 seconds Naive Parallel Implementation: 14 seconds Parallel Implementation Restricting Depth: 4.7 seconds Finally, we are now faster than the framework’s Array.Sort implementation.

Read the article
Project Jigsaw: Late for the train: The Q&A

- by Mark Reinhold

I recently proposed, to the Java community in general and to the SE 8 (JSR 337) Expert Group in particular, to defer Project Jigsaw from Java 8 to Java 9. I also proposed to aim explicitly for a regular two-year release cycle going forward. Herewith a summary of the key questions I’ve seen in reaction to these proposals, along with answers. Making the decision Q Has the Java SE 8 Expert Group decided whether to defer the addition of a module system and the modularization of the Platform to Java SE 9? A No, it has not yet decided. Q By when do you expect the EG to make this decision? A In the next month or so. Q How can I make sure my voice is heard? A The EG will consider all relevant input from the wider community. If you have a prominent blog, column, or other communication channel then there’s a good chance that we’ve already seen your opinion. If not, you’re welcome to send it to the Java SE 8 Comments List, which is the EG’s official feedback channel. Q What’s the overall tone of the feedback you’ve received? A The feedback has been about evenly divided as to whether Java 8 should be delayed for Jigsaw, Jigsaw should be deferred to Java 9, or some other, usually less-realistic, option should be taken. Project Jigsaw Q Why is Project Jigsaw taking so long? A Project Jigsaw started at Sun, way back in August 2008. Like many efforts during the final years of Sun, it was not well staffed. Jigsaw initially ran on a shoestring, with just a handful of mostly part-time engineers, so progress was slow. During the integration of Sun into Oracle all work on Jigsaw was halted for a time, but it was eventually resumed after a thorough consideration of the alternatives. Project Jigsaw was really only fully staffed about a year ago, around the time that Java 7 shipped. We’ve added a few more engineers to the team since then, but that can’t make up for the inadequate initial staffing and the time lost during the transition. Q So it’s really just a matter of staffing limitations and corporate-integration distractions? A Aside from these difficulties, the other main factor in the duration of the project is the sheer technical difficulty of modularizing the JDK. Q Why is modularizing the JDK so hard? A There are two main reasons. The first is that the JDK code base is deeply interconnected at both the API and the implementation levels, having been built over many years primarily in the style of a monolithic software system. We’ve spent considerable effort eliminating or at least simplifying as many API and implementation dependences as possible, so that both the Platform and its implementations can be presented as a coherent set of interdependent modules, but some particularly thorny cases remain. Q What’s the second reason? A We want to maintain as much compatibility with prior releases as possible, most especially for existing classpath-based applications but also, to the extent feasible, for applications composed of modules. Q Is modularizing the JDK even necessary? Can’t you just put it in one big module? A Modularizing the JDK, and more specifically modularizing the Java SE Platform, will enable standard yet flexible Java runtime configurations scaling from large servers down to small embedded devices. In the long term it will enable the convergence of Java SE with the higher-end Java ME Platforms. Q Is Project Jigsaw just about modularizing the JDK? A As originally conceived, Project Jigsaw was indeed focused primarily upon modularizing the JDK. The growing demand for a truly standard module system for the Java Platform, which could be used not just for the Platform itself but also for libraries and applications built on top of it, later motivated expanding the scope of the effort. Q As a developer, why should I care about Project Jigsaw? A The introduction of a modular Java Platform will, in the long term, fundamentally change the way that Java implementations, libraries, frameworks, tools, and applications are designed, built, and deployed. Q How much progress has Project Jigsaw made? A We’ve actually made a lot of progress. Much of the core functionality of the module system has been prototyped and works at both compile time and run time. We’ve extended the Java programming language with module declarations, worked out a structure for modular source trees and corresponding compiled-class trees, and implemented these features in javac. We’ve defined an efficient module-file format, extended the JVM to bootstrap a modular JRE, and designed and implemented a preliminary API. We’ve used the module system to make a good first cut at dividing the JDK and the Java SE API into a coherent set of modules. Among other things, we’re currently working to retrofit the java.util.ServiceLoader API to support modular services. Q I want to help! How can I get involved? A Check out the project page, read the draft requirements and design overview documents, download the latest prototype build, and play with it. You can tell us what you think, and follow the rest of our work in real time, on the jigsaw-dev list. The Java Platform Module System JSR Q What’s the relationship between Project Jigsaw and the eventual Java Platform Module System JSR? A At a high level, Project Jigsaw has two phases. In the first phase we’re exploring an approach to modularity that’s markedly different from that of existing Java modularity solutions. We’ve assumed that we can change the Java programming language, the virtual machine, and the APIs. Doing so enables a design which can strongly enforce module boundaries in all program phases, from compilation to deployment to execution. That, in turn, leads to better usability, diagnosability, security, and performance. The ultimate goal of the first phase is produce a working prototype which can inform the work of the Module-System JSR EG. Q What will happen in the second phase of Project Jigsaw? A The second phase will produce the reference implementation of the specification created by the Module-System JSR EG. The EG might ultimately choose an entirely different approach than the one we’re exploring now. If and when that happens then Project Jigsaw will change course as necessary, but either way I think that the end result will be better for having been informed by our current work. Maven & OSGi Q Why not just use Maven? A Maven is a software project management and comprehension tool. As such it can be seen as a kind of build-time module system but, by its nature, it does nothing to support modularity at run time. Q Why not just adopt OSGi? A OSGi is a rich dynamic component system which includes not just a module system but also a life-cycle model and a dynamic service registry. The latter two facilities are useful to some kinds of sophisticated applications, but I don’t think they’re of wide enough interest to be standardized as part of the Java SE Platform. Q Okay, then why not just adopt the module layer of OSGi? A The OSGi module layer is not operative at compile time; it only addresses modularity during packaging, deployment, and execution. As it stands, moreover, it’s useful for library and application modules but, since it’s built strictly on top of the Java SE Platform, it can’t be used to modularize the Platform itself. Q If Maven addresses modularity at build time, and the OSGi module layer addresses modularity during deployment and at run time, then why not just use the two together, as many developers already do? A The combination of Maven and OSGi is certainly very useful in practice today. These systems have, however, been built on top of the existing Java platform; they have not been able to change the platform itself. This means, among other things, that module boundaries are weakly enforced, if at all, which makes it difficult to diagnose configuration errors and impossible to run untrusted code securely. The prototype Jigsaw module system, by contrast, aims to define a platform-level solution which extends both the language and the JVM in order to enforce module boundaries strongly and uniformly in all program phases. Q If the EG chooses an approach like the one currently being taken in the Jigsaw prototype, will Maven and OSGi be made obsolete? A No, not at all! No matter what approach is taken, to ensure wide adoption it’s essential that the standard Java Platform Module System interact well with Maven. Applications that depend upon the sophisticated features of OSGi will no doubt continue to use OSGi, so it’s critical that implementations of OSGi be able to run on top of the Java module system and, if suitably modified, support OSGi bundles that depend upon Java modules. Ideas for how to do that are currently being explored in Project Penrose. Java 8 & Java 9 Q Without Jigsaw, won’t Java 8 be a pretty boring release? A No, far from it! It’s still slated to include the widely-anticipated Project Lambda (JSR 335), work on which has been going very well, along with the new Date/Time API (JSR 310), Type Annotations (JSR 308), and a set of smaller features already in progress. Q Won’t deferring Jigsaw to Java 9 delay the eventual convergence of the higher-end Java ME Platforms with Java SE? A It will slow that transition, but it will not stop it. To allow progress toward that convergence to be made with Java 8 I’ve suggested to the Java SE 8 EG that we consider specifying a small number of Profiles which would allow compact configurations of the SE Platform to be built and deployed. Q If Jigsaw is deferred to Java 9, would the Oracle engineers currently working on it be reassigned to other Java 8 features and then return to working on Jigsaw again after Java 8 ships? A No, these engineers would continue to work primarily on Jigsaw from now until Java 9 ships. Q Why not drop Lambda and finish Jigsaw instead? A Even if the engineers currently working on Lambda could instantly switch over to Jigsaw and immediately become productive—which of course they can’t—there are less than nine months remaining in the Java 8 schedule for work on major features. That’s just not enough time for the broad review, testing, and feedback which such a fundamental change to the Java Platform requires. Q Why not ship the module system in Java 8, and then modularize the platform in Java 9? A If we deliver a module system in one release but don’t use it to modularize the JDK until some later release then we run a big risk of getting something fundamentally wrong. If that happens then we’d have to fix it in the later release, and fixing fundamental design flaws after the fact almost always leads to a poor end result. Q Why not ship Jigsaw in an 8.5 release, less than two years after 8? Or why not just ship a new release every year, rather than every other year? A Many more developers work on the JDK today than a couple of years ago, both because Oracle has dramatically increased its own investment and because other organizations and individuals have joined the OpenJDK Community. Collectively we don’t, however, have the bandwidth required to ship and then provide long-term support for a big JDK release more frequently than about every other year. Q What’s the feedback been on the two-year release-cycle proposal? A For just about every comment that we should release more frequently, so that new features are available sooner, there’s been another asking for an even slower release cycle so that large teams of enterprise developers who ship mission-critical applications have a chance to migrate at a comfortable pace.

Read the article
Quartz Thread Execution Parallel or Sequential?

- by vikas

We have a quartz based scheduler application which runs about 1000 jobs per minute which are evenly distributed across seconds of each minute i.e. about 16-17 jobs per second. Ideally, these 16-17 jobs should fire at same time, however our first statement, which simply logs the time of execution, of execute method of the job is being called very late. e.g. let us assume we have 1000 jobs scheduled per minute from 05:00 to 05:04. So, ideally the job which is scheduled at 05:03:50 should have logged the first statement of the execute method at 05:03:50, however, it is doing it at about 05:06:38. I have tracked down the time taken by the scheduled job which comes around 15-20 milliseconds. This scheduled job is fast enough because we just send a message on an ActiveMQ queue. We have specified the number of threads of quartz to be 100 and even tried with increasing it to 200 and more, but no gain. One more thing we noticed is that logs from scheduler are coming sequential after first 1 minute i.e. [Quartz_Worker_28] <Some log statement> .. .. [Quartz_Worker_29] <Some log statement> .. .. [Quartz_Worker_30] <Some log statement> .. .. So it suggesting that after some time quartz is running threads almost sequential. May be this is happening due to the time taken in notifying the job completion to persistence store (which is a separate postgres database in this case) and/or context switching. What can be the reason behind this strange behavior? EDIT: More detailed Log [06/07/12 10:08:37:192][QuartzScheduler_Worker-34][INFO] org.quartz.plugins.history.LoggingTriggerHistoryPlugin - Trigger [<trigger_name>] fired job [<job_name>] scheduled at: 06-07-2012 10:08:33.458, next scheduled at: 06-07-2012 10:34:53.000 [06/07/12 10:08:37:192][QuartzScheduler_Worker-34][INFO] <my_package>.scheduler.quartz.ScheduledLocateJob - execute begin--------- ScheduledLocateJob with key: <job_name> started at Fri Jul 06 10:08:37 EDT 2012 [06/07/12 10:08:37:192][QuartzScheduler_Worker-34][INFO] <my_package>.scheduler.quartz.ScheduledLocateJob <some log statement> [06/07/12 10:08:37:192][QuartzScheduler_Worker-34][INFO] <my_package>.scheduler.quartz.ScheduledLocateJob <some log statement> [06/07/12 10:08:37:192][QuartzScheduler_Worker-34][INFO] <my_package>.scheduler.quartz.ScheduledLocateJob <some log statement> [06/07/12 10:08:37:220][QuartzScheduler_Worker-34][INFO] <my_package>.scheduler.quartz.ScheduledLocateJob - execute end--------- ScheduledLocateJob with key: <job_name> ended at Fri Jul 06 10:08:37 EDT 2012 [06/07/12 10:08:37:220][QuartzScheduler_Worker-34][INFO] org.quartz.plugins.history.LoggingTriggerHistoryPlugin - Trigger [<trigger_name>] completed firing job [<job_name>] with resulting trigger instruction code: DO NOTHING. Next scheduled at: 06-07-2012 10:34:53.000 I am doubting on this section of the above log scheduled at: 06-07-2012 10:08:33.458, next scheduled at: 06-07-2012 10:34:53.000 because this job was scheduled for 10:04:53, but it fired at 10:08:33 and still quartz didn't consider it as misfire. Shouldn't it be a misfire?

Read the article
Does this sound like a stack overflow?

- by Jordan S

I think I might be having a stack overflow problem or something similar in my embedded firmware code. I am a new programmer and have never dealt with a SO so I'm not sure if that is what's happening or not. The firmware controls a device with a wheel that has magnets evenly spaced around it and the board has a hall effect sensor that senses when magnet is over it. My firmware operates the stepper and also count steps while monitoring the magnet sensor in order to detect if the wheel has stalled. I am using a timer interrupt on my chip (8 bit, 8057 acrh.) to set output ports to control the motor and for the stall detection. The stall detection code looks like this... // Enter ISR // Change the ports to the appropriate value for the next step // ... StallDetector++; // Increment the stall detector if(PosSensor != LastPosMagState) { StallDetector = 0; LastPosMagState = PosSensor; } else { if (PosSensor == ON) { if (StallDetector > (MagnetSize + 10)) { HandleStallEvent(); } } else if (PosSensor == OFF) { if (StallDetector > (GapSize + 10)) { HandleStallEvent(); } } } this code is called every time the ISR is triggered. PosSensor is the magnet sensor. MagnetSize is the number of stepper steps that it takes to get through the magnet field. GapSize is the number of steps between two magnets. So I want to detect if the wheel gets stuck either with the sensor over a magnet or not over a magnet. This works great for a long time but then after a while the first stall event will occur because 'StallDetector (MagnetSize + 10)' but when I look at the value of StallDetector it is always around 220! This doesn't make sense because MagnetSize is always around 35. So the stall event should have been triggered at like 46 but somehow it got all the way up to 220? And I don't set the value of stall detector anywhere else in my code. Do you have any advice on how I can track down the root of this problem? The ISR looks like this void Timer3_ISR(void) interrupt 14 { OperateStepper(); // This is the function shown above TMR3CN &= ~0x80; // Clear Timer3 interrupt flag } HandleStallEvent just sets a few variable back to their default values so that it can attempt another move... #pragma save #pragma nooverlay void HandleStallEvent() { ///* PulseMotor = 0; //Stop the wheel from moving SetMotorPower(0); //Set motor power low MotorSpeed = LOW_SPEED; SetSpeedHz(); ERROR_STATE = 2; DEVICE_IS_HOMED = FALSE; DEVICE_IS_HOMING = FALSE; DEVICE_IS_MOVING = FALSE; HOMING_STATE = 0; MOVING_STATE = 0; CURRENT_POSITION = 0; StallDetector = 0; return; //*/ } #pragma restore

Read the article
ActionScript Gradient Banding Problem

- by TheDarkIn1978

i'm having a strange issue with banding between certain colors of a gradient. to create the gradient, i'm drawing evenly spaced circle wedges from the center to the border, and filling each circle wedge from a bitmap line gradient pixel in a loop. public class ColorWheel extends Sprite { private static const DEFAULT_RADIUS:Number = 100; private static const DEFAULT_BANDING_QUALITY:int = 3600; public function ColorWheel(nRadius:Number = DEFAULT_RADIUS) { init(nRadius); } public function init(nRadius:Number = DEFAULT_RADIUS):void { var nRadians : Number; var nColor : Number; var objMatrix : Matrix = new Matrix(); var nX : Number; var nY : Number; var previousX : Number = nRadius; var previousY : Number = 0; var leftToRightColors:Array = new Array(0xFF0000, 0xFFFF00, 0x00FF00, 0x00FFFF, 0x0000FF, 0xFF00FF); leftToRightColors.push(leftToRightColors[0]); var leftToRightAlphas:Array = new Array(); var leftToRightRatios:Array = new Array(); var leftToRightPartition:Number = 255 / (leftToRightColors.length - 1); //Push arrays for (var j:int = 0; j < leftToRightColors.length; j++) { leftToRightAlphas.push(1); leftToRightRatios.push(j * leftToRightPartition); } var leftToRightColorsMatrix:Matrix = new Matrix(); leftToRightColorsMatrix.createGradientBox(DEFAULT_BANDING_QUALITY, 1); //Produce a horizontal leftToRightLine sprite var leftToRightLine:Sprite = new Sprite(); leftToRightLine.graphics.lineStyle(1, 0, 1, false, LineScaleMode.NONE, CapsStyle.NONE); leftToRightLine.graphics.lineGradientStyle(GradientType.LINEAR, leftToRightColors, leftToRightAlphas, leftToRightRatios, leftToRightColorsMatrix); leftToRightLine.graphics.moveTo(0, 0); leftToRightLine.graphics.lineTo(DEFAULT_BANDING_QUALITY, 0); //Assign bitmapData to the leftToRightLine var leftToRightLineBitmapData:BitmapData = new BitmapData(leftToRightLine.width, leftToRightLine.height); leftToRightLineBitmapData.draw(leftToRightLine); for(var i:int = 1; i < (DEFAULT_BANDING_QUALITY + 1); i++) { // Convert the degree to radians. nRadians = i * (Math.PI / (DEFAULT_BANDING_QUALITY / 2)); // OR the individual color channels together. nColor = leftToRightLineBitmapData.getPixel(i-1, 0); // Calculate the coordinate in which the line should be drawn to. nX = nRadius * Math.cos(nRadians); nY = nRadius * Math.sin(nRadians); // Create a matrix for the wedges gradient color. objMatrix.createGradientBox(nRadius * 2, nRadius * 2, nRadians, -nRadius, -nRadius); graphics.beginGradientFill(GradientType.LINEAR, [nColor, nColor], [1, 1], [127, 255], objMatrix); graphics.moveTo( 0, 0 ); graphics.lineTo( previousX, previousY ); graphics.lineTo( nX, nY ); graphics.lineTo( 0, 0 ); graphics.endFill(); previousX = nX; previousY = nY; } } } i'm creating a circle with 3600 wedges, although it doesn't look like it based on the screen shot within the orange color that is produced from gradating from red to yellow numbers. adding a orange number between red and yellow doesn't help. but if i create the circle with only 360 wedges, the gradient banding is much more obvious. 3600 is probably overkill, and doesn't really add more detail over, say, making the circle of 1440 wedges, but i don't know any other way to slightly elevate this banding issue. any ideas how i can fix this, or what i'm doing wrong? could it be caused by the circleMatrix rotation?

Read the article
Load and Web Performance Testing using Visual Studio Ultimate 2010-Part 3

- by Tarun Arora

Welcome back once again, in Part 1 of Load and Web Performance Testing using Visual Studio 2010 I talked about why Performance Testing the application is important, the test tools available in Visual Studio Ultimate 2010 and various test rig topologies, in Part 2 of Load and Web Performance Testing using Visual Studio 2010 I discussed the details of web performance & load tests as well as why it’s important to follow a goal based pattern while performance testing your application. In part 3 I’ll be discussing Test Result Analysis, Test Result Drill through, Test Report Generation, Test Run Comparison, Asp.net Profiler and some closing thoughts. Test Results – I see some creepy worms! In Part 2 we put together a web performance test and a load test, lets run the test to see load test to see how the Web site responds to the load simulation. While the load test is running you will be able to see close to real time analysis in the Load Test Analyser window. You can use the Load Test Analyser to conduct load test analysis in three ways: Monitor a running load test - A condensed set of the performance counter data is maintained in memory. To prevent the results memory requirements from growing unbounded, up to 200 samples for each performance counter are maintained. This includes 100 evenly spaced samples that span the current elapsed time of the run and the most recent 100 samples. After the load test run is completed - The test controller spools all collected performance counter data to a database while the test is running. Additional data, such as timing details and error details, is loaded into the database when the test completes. The performance data for a completed test is loaded from the database and analysed by the Load Test Analyser. Below you can see a screen shot of the summary view, this provides key results in a format that is compact and easy to read. You can also print the load test summary, this is generated after the test has completed or been stopped. Analyse the load test results of a previously run load test – We’ll see this in the section where i discuss comparison between two test runs. The performance counters can be plotted on the graphs. You also have the option to highlight a selected part of the test and view details, drill down to the user activity chart where you can hover over to see more details of the test run. Generate Report => Test Run Comparisons The level of reports you can generate using the Load Test Analyser is astonishing. You have the option to create excel reports and conduct side by side analysis of two test results or to track trend analysis. The tools also allows you to export the graph data either to MS Excel or to a CSV file. You can view the ASP.NET profiler report to conduct further analysis as well. View Data and Diagnostic Attachments opens the Choose Diagnostic Data Adapter Attachment dialog box to select an adapter to analyse the result type. For example, you can select an IntelliTrace adapter, click OK and open the IntelliTrace summary for the test agent that was used in the load test. Compare results This creates a set of reports that compares the data from two load test results using tables and bar charts. I have taken these screen shots from the MSDN documentation, I would highly recommend exploring the wealth of knowledge available on MSDN. Leaving Thoughts While load testing the application with an excessive load for a longer duration of time, i managed to bring the IIS to its knees by piling up a huge queue of requests waiting to be processed. This clearly means that the IIS had run out of threads as all the threads were busy processing existing request, one easy way of fixing this is by increasing the default number of allocated threads, but this might escalate the problem. The better suggestion is to try and drill down to the actual root cause of the problem. When ever the garbage collection runs it stops processing any pages so all requests that come in during that period are queued up, but realistically the garbage collection completes in fraction of a a second. To understand this better lets look at the .net heap, it is divided into large heap and small heap, anything greater than 85kB in size will be allocated to the Large object heap, the Large object heap is non compacting and remember large objects are expensive to move around, so if you are allocating something in the large object heap, make sure that you really need it! The small object heap on the other hand is divided into generations, so all objects that are supposed to be short-lived are suppose to live in Gen-0 and the long living objects eventually move to Gen-2 as garbage collection goes through. As you can see in the picture below all < 85 KB size objects are first assigned to Gen-0, when Gen-0 fills up and a new object comes in and finds Gen-0 full, the garbage collection process is started, the process checks for all the dead objects and assigns them as the valid candidate for deletion to free up memory and promotes all the remaining objects in Gen-0 to Gen-1. So in the future when ever you clean up Gen-1 you have to clean up Gen-0 as well. When you fill up Gen – 0 again, all of Gen – 1 dead objects are drenched and rest are moved to Gen-2 and Gen-0 objects are moved to Gen-1 to free up Gen-0, but by this time your Garbage collection process has started to take much more time than it usually takes. Now as I mentioned earlier when garbage collection is being run all page requests that come in during that period are queued up. Does this explain why possibly page requests are getting queued up, apart from this it could also be the case that you are waiting for a long running database process to complete. Lets explore the heap a bit more… What is really a case of crisis is when the objects are living long enough to make it to Gen-2 and then dying, this is definitely a high cost operation. But sometimes you need objects in memory, for example when you cache data you hold on to the objects because you need to use them right across the user session, which is acceptable. But if you wanted to see what extreme caching can do to your server then write a simple application that chucks in a lot of data in cache, run a load test over it for about 10-15 minutes, forcing a lot of data in memory causing the heap to run out of memory. If you get to such a state where you start running out of memory the IIS as a mode of recovery restarts the worker process. It is great way to free up all your memory in the heap but this would clear the cache. The problem with this is if the customer had 10 items in their shopping basket and that data was stored in the application cache, the user basket will now be empty forcing them either to get frustrated and go to a competitor website or if the customer is really patient, give it another try! How can you address this, well two ways of addressing this; 1. Workaround – A x86 bit processor only allows a maximum of 4GB of RAM, this means the machine effectively has around 3.4 GB of RAM available, the OS needs about 1.5 GB of RAM to run efficiently, the IIS and .net framework also need their share of memory, leaving you a heap of around 800 MB to play with. Because Team builds by default build your application in ‘Compile as any mode’ it means the application is build such that it will run in x86 bit mode if run on a x86 bit processor and run in a x64 bit mode if run on a x64 but processor. The problem with this is not all applications are really x64 bit compatible specially if you are using com objects or external libraries. So, as a quick win if you compiled your application in x86 bit mode by changing the compile as any selection to compile as x86 in the team build, you will be able to run your application on a x64 bit machine in x86 bit mode (WOW – By running Windows on Windows) and what that means is, you could use 8GB+ worth of RAM, if you take away everything else your application will roughly get a heap size of at least 4 GB to play with, which is immense. If you need a heap size of more than 4 GB you have either build a software for NASA or there is something fundamentally wrong in your application. 2. Solution – Now that you have put a workaround in place the IIS will not restart the worker process that regularly, which means you can take a breather and start working to get to the root cause of this memory leak. But this begs a question “How do I Identify possible memory leaks in my application?” Well i won’t say that there is one single tool that can tell you where the memory leak is, but trust me, ‘Performance Profiling’ is a great start point, it definitely gets you started in the right direction, let’s have a look at how. Performance Wizard - Start the Performance Wizard and select Instrumentation, this lets you measure function call counts and timings. Before running the performance session right click the performance session settings and chose properties from the context menu to bring up the Performance session properties page and as shown in the screen shot below, check the check boxes in the group ‘.NET memory profiling collection’ namely ‘Collect .NET object allocation information’ and ‘Also collect the .NET Object lifetime information’. Now if you fire off the profiling session on your pages you will notice that the results allows you to view ‘Object Lifetime’ which shows you the number of objects that made it to Gen-0, Gen-1, Gen-2, Large heap, etc. Another great feature about the profile is that if your application has > 5% cases where objects die right after making to the Gen-2 storage a threshold alert is generated to alert you. Since you have the option to also view the most expensive methods and by capturing the IntelliTrace data you can drill in to narrow down to the line of code that is the root cause of the problem. Well now that we have seen how crucial memory management is and how easy Visual Studio Ultimate 2010 makes it for us to identify and reproduce the problem with the best of breed tools in the product. Caching One of the main ways to improve performance is Caching. Which basically means you tell the web server that instead of going to the database for each request you keep the data in the webserver and when the user asks for it you serve it from the webserver itself. BUT that can have consequences! Let’s look at some code, trust me caching code is not very intuitive, I define a cache key for almost all searches made through the common search page and cache the results. The approach works fine, first time i get the data from the database and second time data is served from the cache, significant performance improvement, EXCEPT when two users try to do the same operation and run into each other. But it is easy to handle this by adding the lock as you can see in the snippet below. So, as long as a user comes in and finds that the cache is empty, the user locks and starts to get the cache no more concurrency issues. But lets say you are processing 10 requests per second, by the time i have locked the operation to get the results from the database, 9 other users came in and found that the cache key is null so after i have come out and populated the cache they will still go in to get the results again. The application will still be faster because the next set of 10 users and so on would continue to get data from the cache. BUT if we added another null check after locking to build the cache and before actual call to the db then the 9 users who follow me would not make the extra trip to the database at all and that would really increase the performance, but didn’t i say that the code won’t be very intuitive, may be you should leave a comment you don’t want another developer to come in and think what a fresher why is he checking for the cache key null twice !!! The downside of caching is, you are storing the data outside of the database and the data could be wrong because the updates applied to the database would make the data cached at the web server out of sync. So, how do you invalidate the cache? Well if you only had one way of updating the data lets say only one entry point to the data update you can write some logic to say that every time new data is entered set the cache object to null. But this approach will not work as soon as you have several ways of feeding data to the system or your system is scaled out across a farm of web servers. The perfect solution to this is Micro Caching which means you cache the query for a set time duration and invalidate the cache after that set duration. The advantage is every time the user queries for that data with in the time span for which you have cached the results there are no calls made to the database and the data is served right from the server which makes the response immensely quick. Now figuring out the appropriate time span for which you micro cache the query results really depends on the application. Lets say your website gets 10 requests per second, if you retain the cache results for even 1 minute you will have immense performance gains. You would reduce 90% hits to the database for searching. Ever wondered why when you go to e-bookers.com or xpedia.com or yatra.com to book a flight and you click on the book button because the fare seems too exciting and you get an error message telling you that the fare is not valid any more. Yes, exactly => That is a cache failure! These travel sites or price compare engines are not going to hit the database every time you hit the compare button instead the results will be served from the cache, because the query results are micro cached, its a perfect trade-off, by micro caching the results the site gains 100% performance benefits but every once in a while annoys a customer because the fare has expired. But the trade off works in the favour of these sites as they are still able to process up to 30+ page requests per second which means cater to the site traffic by may be losing 1 customer every once in a while to a competitor who is also using a similar caching technique what are the odds that the user will not come back to their site sooner or later? Recap Resources Below are some Key resource you might like to review. I would highly recommend the documentation, walkthroughs and videos available on MSDN. You can always make use of Fiddler to debug Web Performance Tests. Some community test extensions and plug ins available on Codeplex might also be of interest to you. The Road Ahead Thank you for taking the time out and reading this blog post, you may also want to read Part I and Part II if you haven’t so far. If you enjoyed the post, remember to subscribe to http://feeds.feedburner.com/TarunArora. Questions/Feedback/Suggestions, etc please leave a comment. Next ‘Load Testing in the cloud’, I’ll be working on exploring the possibilities of running Test controller/Agents in the Cloud. See you on the other side! Thank You! Share this post : CodeProject

Read the article

< Previous Page | 3 4 5 6 7