Search Results

Search found 983 results on 40 pages for 'marc alexander reed'.

Page 2/40 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

Parallelism in .NET – Part 9, Configuration in PLINQ and TPL

- by Reed

Parallel LINQ and the Task Parallel Library contain many options for configuration. Although the default configuration options are often ideal, there are times when customizing the behavior is desirable. Both frameworks provide full configuration support. When working with Data Parallelism, there is one primary configuration option we often need to control – the number of threads we want the system to use when parallelizing our routine. By default, PLINQ and the TPL both use the ThreadPool to schedule tasks. Given the major improvements in the ThreadPool in CLR 4, this default behavior is often ideal. However, there are times that the default behavior is not appropriate. For example, if you are working on multiple threads simultaneously, and want to schedule parallel operations from within both threads, you might want to consider restricting each parallel operation to using a subset of the processing cores of the system. Not doing this might over-parallelize your routine, which leads to inefficiencies from having too many context switches. In the Task Parallel Library, configuration is handled via the ParallelOptions class. All of the methods of the Parallel class have an overload which accepts a ParallelOptions argument. We configure the Parallel class by setting the ParallelOptions.MaxDegreeOfParallelism property. For example, let’s revisit one of the simple data parallel examples from Part 2: Parallel.For(0, pixelData.GetUpperBound(0), row => { for (int col=0; col < pixelData.GetUpperBound(1); ++col) { pixelData[row, col] = AdjustContrast(pixelData[row, col], minPixel, maxPixel); } }); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Here, we’re looping through an image, and calling a method on each pixel in the image. If this was being done on a separate thread, and we knew another thread within our system was going to be doing a similar operation, we likely would want to restrict this to using half of the cores on the system. This could be accomplished easily by doing: var options = new ParallelOptions(); options.MaxDegreeOfParallelism = Math.Max(Environment.ProcessorCount / 2, 1); Parallel.For(0, pixelData.GetUpperBound(0), options, row => { for (int col=0; col < pixelData.GetUpperBound(1); ++col) { pixelData[row, col] = AdjustContrast(pixelData[row, col], minPixel, maxPixel); } }); Now, we’re restricting this routine to using no more than half the cores in our system. Note that I included a check to prevent a single core system from supplying zero; without this check, we’d potentially cause an exception. I also did not hard code a specific value for the MaxDegreeOfParallelism property. One of our goals when parallelizing a routine is allowing it to scale on better hardware. Specifying a hard-coded value would contradict that goal. Parallel LINQ also supports configuration, and in fact, has quite a few more options for configuring the system. The main configuration option we most often need is the same as our TPL option: we need to supply the maximum number of processing threads. In PLINQ, this is done via a new extension method on ParallelQuery<T>: ParallelEnumerable.WithDegreeOfParallelism. Let’s revisit our declarative data parallelism sample from Part 6: double min = collection.AsParallel().Min(item => item.PerformComputation()); Here, we’re performing a computation on each element in the collection, and saving the minimum value of this operation. If we wanted to restrict this to a limited number of threads, we would add our new extension method: int maxThreads = Math.Max(Environment.ProcessorCount / 2, 1); double min = collection .AsParallel() .WithDegreeOfParallelism(maxThreads) .Min(item => item.PerformComputation()); This automatically restricts the PLINQ query to half of the threads on the system. PLINQ provides some additional configuration options. By default, PLINQ will occasionally revert to processing a query in parallel. This occurs because many queries, if parallelized, typically actually cause an overall slowdown compared to a serial processing equivalent. By analyzing the “shape” of the query, PLINQ often decides to run a query serially instead of in parallel. This can occur for (taken from MSDN): Queries that contain a Select, indexed Where, indexed SelectMany, or ElementAt clause after an ordering or filtering operator that has removed or rearranged original indices. Queries that contain a Take, TakeWhile, Skip, SkipWhile operator and where indices in the source sequence are not in the original order. Queries that contain Zip or SequenceEquals, unless one of the data sources has an originally ordered index and the other data source is indexable (i.e. an array or IList(T)). Queries that contain Concat, unless it is applied to indexable data sources. Queries that contain Reverse, unless applied to an indexable data source. If the specific query follows these rules, PLINQ will run the query on a single thread. However, none of these rules look at the specific work being done in the delegates, only at the “shape” of the query. There are cases where running in parallel may still be beneficial, even if the shape is one where it typically parallelizes poorly. In these cases, you can override the default behavior by using the WithExecutionMode extension method. This would be done like so: var reversed = collection .AsParallel() .WithExecutionMode(ParallelExecutionMode.ForceParallelism) .Select(i => i.PerformComputation()) .Reverse(); Here, the default behavior would be to not parallelize the query unless collection implemented IList<T>. We can force this to run in parallel by adding the WithExecutionMode extension method in the method chain. Finally, PLINQ has the ability to configure how results are returned. When a query is filtering or selecting an input collection, the results will need to be streamed back into a single IEnumerable<T> result. For example, the method above returns a new, reversed collection. In this case, the processing of the collection will be done in parallel, but the results need to be streamed back to the caller serially, so they can be enumerated on a single thread. This streaming introduces overhead. IEnumerable<T> isn’t designed with thread safety in mind, so the system needs to handle merging the parallel processes back into a single stream, which introduces synchronization issues. There are two extremes of how this could be accomplished, but both extremes have disadvantages. The system could watch each thread, and whenever a thread produces a result, take that result and send it back to the caller. This would mean that the calling thread would have access to the data as soon as data is available, which is the benefit of this approach. However, it also means that every item is introducing synchronization overhead, since each item needs to be merged individually. On the other extreme, the system could wait until all of the results from all of the threads were ready, then push all of the results back to the calling thread in one shot. The advantage here is that the least amount of synchronization is added to the system, which means the query will, on a whole, run the fastest. However, the calling thread will have to wait for all elements to be processed, so this could introduce a long delay between when a parallel query begins and when results are returned. The default behavior in PLINQ is actually between these two extremes. By default, PLINQ maintains an internal buffer, and chooses an optimal buffer size to maintain. Query results are accumulated into the buffer, then returned in the IEnumerable<T> result in chunks. This provides reasonably fast access to the results, as well as good overall throughput, in most scenarios. However, if we know the nature of our algorithm, we may decide we would prefer one of the other extremes. This can be done by using the WithMergeOptions extension method. For example, if we know that our PerformComputation() routine is very slow, but also variable in runtime, we may want to retrieve results as they are available, with no bufferring. This can be done by changing our above routine to: var reversed = collection .AsParallel() .WithExecutionMode(ParallelExecutionMode.ForceParallelism) .WithMergeOptions(ParallelMergeOptions.NotBuffered) .Select(i => i.PerformComputation()) .Reverse(); On the other hand, if are already on a background thread, and we want to allow the system to maximize its speed, we might want to allow the system to fully buffer the results: var reversed = collection .AsParallel() .WithExecutionMode(ParallelExecutionMode.ForceParallelism) .WithMergeOptions(ParallelMergeOptions.FullyBuffered) .Select(i => i.PerformComputation()) .Reverse(); Notice, also, that you can specify multiple configuration options in a parallel query. By chaining these extension methods together, we generate a query that will always run in parallel, and will always complete before making the results available in our IEnumerable<T>.

Read the article
Parallelism in .NET – Part 12, More on Task Decomposition

- by Reed

Many tasks can be decomposed using a Data Decomposition approach, but often, this is not appropriate. Frequently, decomposing the problem into distinctive tasks that must be performed is a more natural abstraction. However, as I mentioned in Part 1, Task Decomposition tends to be a bit more difficult than data decomposition, and can require a bit more effort. Before we being parallelizing our algorithm based on the tasks being performed, we need to decompose our problem, and take special care of certain considerations such as ordering and grouping of tasks. Up to this point in this series, I’ve focused on parallelization techniques which are most appropriate when a problem space can be decomposed by data. Using PLINQ and the Parallel class, I’ve shown how problem spaces where there is a collection of data, and each element needs to be processed, can potentially be parallelized. However, there are many other routines where this is not appropriate. Often, instead of working on a collection of data, there is a single piece of data which must be processed using an algorithm or series of algorithms. Here, there is no collection of data, but there may still be opportunities for parallelism. As I mentioned before, in cases like this, the approach is to look at your overall routine, and decompose your problem space based on tasks. The idea here is to look for discrete “tasks,” individual pieces of work which can be conceptually thought of as a single operation. Let’s revisit the example I used in Part 1, an application startup path. Say we want our program, at startup, to do a bunch of individual actions, or “tasks”. The following is our list of duties we must perform right at startup: Display a splash screen Request a license from our license manager Check for an update to the software from our web server If an update is available, download it Setup our menu structure based on our current license Open and display our main, welcome Window Hide the splash screen The first step in Task Decomposition is breaking up the problem space into discrete tasks. This, naturally, can be abstracted as seven discrete tasks. In the serial version of our program, if we were to diagram this, the general process would appear as: These tasks, obviously, provide some opportunities for parallelism. Before we can parallelize this routine, we need to analyze these tasks, and find any dependencies between tasks. In this case, our dependencies include: The splash screen must be displayed first, and as quickly as possible. We can’t download an update before we see whether one exists. Our menu structure depends on our license, so we must check for the license before setting up the menus. Since our welcome screen will notify the user of an update, we can’t show it until we’ve downloaded the update. Since our welcome screen includes menus that are customized based off the licensing, we can’t display it until we’ve received a license. We can’t hide the splash until our welcome screen is displayed. By listing our dependencies, we start to see the natural ordering that must occur for the tasks to be processed correctly. The second step in Task Decomposition is determining the dependencies between tasks, and ordering tasks based on their dependencies. Looking at these tasks, and looking at all the dependencies, we quickly see that even a simple decomposition such as this one can get quite complicated. In order to simplify the problem of defining the dependencies, it’s often a useful practice to group our tasks into larger, discrete tasks. The goal when grouping tasks is that you want to make each task “group” have as few dependencies as possible to other tasks or groups, and then work out the dependencies within that group. Typically, this works best when any external dependency is based on the “last” task within the group when it’s ordered, although that is not a firm requirement. This process is often called Grouping Tasks. In our case, we can easily group together tasks, effectively turning this into four discrete task groups: 1. Show our splash screen – This needs to be left as its own task. First, multiple things depend on this task, mainly because we want this to start before any other action, and start as quickly as possible. 2. Check for Update and Download the Update if it Exists - These two tasks logically group together. We know we only download an update if the update exists, so that naturally follows. This task has one dependency as an input, and other tasks only rely on the final task within this group. 3. Request a License, and then Setup the Menus – Here, we can group these two tasks together. Although we mentioned that our welcome screen depends on the license returned, it also depends on setting up the menu, which is the final task here. Setting up our menus cannot happen until after our license is requested. By grouping these together, we further reduce our problem space. 4. Display welcome and hide splash - Finally, we can display our welcome window and hide our splash screen. This task group depends on all three previous task groups – it cannot happen until all three of the previous groups have completed. By grouping the tasks together, we reduce our problem space, and can naturally see a pattern for how this process can be parallelized. The diagram below shows one approach: The orange boxes show each task group, with each task represented within. We can, now, effectively take these tasks, and run a large portion of this process in parallel, including the portions which may be the most time consuming. We’ve now created two parallel paths which our process execution can follow, hopefully speeding up the application startup time dramatically. The main point to remember here is that, when decomposing your problem space by tasks, you need to: Define each discrete action as an individual Task Discover dependencies between your tasks Group tasks based on their dependencies Order the tasks and groups of tasks

Read the article
Parallelism in .NET – Part 2, Simple Imperative Data Parallelism

- by Reed

In my discussion of Decomposition of the problem space, I mentioned that Data Decomposition is often the simplest abstraction to use when trying to parallelize a routine. If a problem can be decomposed based off the data, we will often want to use what MSDN refers to as Data Parallelism as our strategy for implementing our routine. The Task Parallel Library in .NET 4 makes implementing Data Parallelism, for most cases, very simple. Data Parallelism is the main technique we use to parallelize a routine which can be decomposed based off data. Data Parallelism refers to taking a single collection of data, and having a single operation be performed concurrently on elements in the collection. One side note here: Data Parallelism is also sometimes referred to as the Loop Parallelism Pattern or Loop-level Parallelism. In general, for this series, I will try to use the terminology used in the MSDN Documentation for the Task Parallel Library. This should make it easier to investigate these topics in more detail. Once we’ve determined we have a problem that, potentially, can be decomposed based on data, implementation using Data Parallelism in the TPL is quite simple. Let’s take our example from the Data Decomposition discussion – a simple contrast stretching filter. Here, we have a collection of data (pixels), and we need to run a simple operation on each element of the pixel. Once we know the minimum and maximum values, we most likely would have some simple code like the following: for (int row=0; row < pixelData.GetUpperBound(0); ++row) { for (int col=0; col < pixelData.GetUpperBound(1); ++col) { pixelData[row, col] = AdjustContrast(pixelData[row, col], minPixel, maxPixel); } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This simple routine loops through a two dimensional array of pixelData, and calls the AdjustContrast routine on each pixel. As I mentioned, when you’re decomposing a problem space, most iteration statements are potentially candidates for data decomposition. Here, we’re using two for loops – one looping through rows in the image, and a second nested loop iterating through the columns. We then perform one, independent operation on each element based on those loop positions. This is a prime candidate – we have no shared data, no dependencies on anything but the pixel which we want to change. Since we’re using a for loop, we can easily parallelize this using the Parallel.For method in the TPL: Parallel.For(0, pixelData.GetUpperBound(0), row => { for (int col=0; col < pixelData.GetUpperBound(1); ++col) { pixelData[row, col] = AdjustContrast(pixelData[row, col], minPixel, maxPixel); } }); Here, by simply changing our first for loop to a call to Parallel.For, we can parallelize this portion of our routine. Parallel.For works, as do many methods in the TPL, by creating a delegate and using it as an argument to a method. In this case, our for loop iteration block becomes a delegate creating via a lambda expression. This lets you write code that, superficially, looks similar to the familiar for loop, but functions quite differently at runtime. We could easily do this to our second for loop as well, but that may not be a good idea. There is a balance to be struck when writing parallel code. We want to have enough work items to keep all of our processors busy, but the more we partition our data, the more overhead we introduce. In this case, we have an image of data – most likely hundreds of pixels in both dimensions. By just parallelizing our first loop, each row of pixels can be run as a single task. With hundreds of rows of data, we are providing fine enough granularity to keep all of our processors busy. If we parallelize both loops, we’re potentially creating millions of independent tasks. This introduces extra overhead with no extra gain, and will actually reduce our overall performance. This leads to my first guideline when writing parallel code: Partition your problem into enough tasks to keep each processor busy throughout the operation, but not more than necessary to keep each processor busy. Also note that I parallelized the outer loop. I could have just as easily partitioned the inner loop. However, partitioning the inner loop would have led to many more discrete work items, each with a smaller amount of work (operate on one pixel instead of one row of pixels). My second guideline when writing parallel code reflects this: Partition your problem in a way to place the most work possible into each task. This typically means, in practice, that you will want to parallelize the routine at the “highest” point possible in the routine, typically the outermost loop. If you’re looking at parallelizing methods which call other methods, you’ll want to try to partition your work high up in the stack – as you get into lower level methods, the performance impact of parallelizing your routines may not overcome the overhead introduced. Parallel.For works great for situations where we know the number of elements we’re going to process in advance. If we’re iterating through an IList<T> or an array, this is a typical approach. However, there are other iteration statements common in C#. In many situations, we’ll use foreach instead of a for loop. This can be more understandable and easier to read, but also has the advantage of working with collections which only implement IEnumerable<T>, where we do not know the number of elements involved in advance. As an example, lets take the following situation. Say we have a collection of Customers, and we want to iterate through each customer, check some information about the customer, and if a certain case is met, send an email to the customer and update our instance to reflect this change. Normally, this might look something like: foreach(var customer in customers) { // Run some process that takes some time... DateTime lastContact = theStore.GetLastContact(customer); TimeSpan timeSinceContact = DateTime.Now - lastContact; // If it's been more than two weeks, send an email, and update... if (timeSinceContact.Days > 14) { theStore.EmailCustomer(customer); customer.LastEmailContact = DateTime.Now; } } Here, we’re doing a fair amount of work for each customer in our collection, but we don’t know how many customers exist. If we assume that theStore.GetLastContact(customer) and theStore.EmailCustomer(customer) are both side-effect free, thread safe operations, we could parallelize this using Parallel.ForEach: Parallel.ForEach(customers, customer => { // Run some process that takes some time... DateTime lastContact = theStore.GetLastContact(customer); TimeSpan timeSinceContact = DateTime.Now - lastContact; // If it's been more than two weeks, send an email, and update... if (timeSinceContact.Days > 14) { theStore.EmailCustomer(customer); customer.LastEmailContact = DateTime.Now; } }); Just like Parallel.For, we rework our loop into a method call accepting a delegate created via a lambda expression. This keeps our new code very similar to our original iteration statement, however, this will now execute in parallel. The same guidelines apply with Parallel.ForEach as with Parallel.For. The other iteration statements, do and while, do not have direct equivalents in the Task Parallel Library. These, however, are very easy to implement using Parallel.ForEach and the yield keyword. Most applications can benefit from implementing some form of Data Parallelism. Iterating through collections and performing “work” is a very common pattern in nearly every application. When the problem can be decomposed by data, we often can parallelize the workload by merely changing foreach statements to Parallel.ForEach method calls, and for loops to Parallel.For method calls. Any time your program operates on a collection, and does a set of work on each item in the collection where that work is not dependent on other information, you very likely have an opportunity to parallelize your routine.

Read the article
Parallelism in .NET – Part 13, Introducing the Task class

- by Reed

Once we’ve used a task-based decomposition to decompose a problem, we need a clean abstraction usable to implement the resulting decomposition. Given that task decomposition is founded upon defining discrete tasks, .NET 4 has introduced a new API for dealing with task related issues, the aptly named Task class. The Task class is a wrapper for a delegate representing a single, discrete task within your decomposition. We will go into various methods of construction for tasks later, but, when reduced to its fundamentals, an instance of a Task is nothing more than a wrapper around a delegate with some utility functionality added. In order to fully understand the Task class within the new Task Parallel Library, it is important to realize that a task really is just a delegate – nothing more. In particular, note that I never mentioned threading or parallelism in my description of a Task. Although the Task class exists in the new System.Threading.Tasks namespace: Tasks are not directly related to threads or multithreading. Of course, Task instances will typically be used in our implementation of concurrency within an application, but the Task class itself does not provide the concurrency used. The Task API supports using Tasks in an entirely single threaded, synchronous manner. Tasks are very much like standard delegates. You can execute a task synchronously via Task.RunSynchronously(), or you can use Task.Start() to schedule a task to run, typically asynchronously. This is very similar to using delegate.Invoke to execute a delegate synchronously, or using delegate.BeginInvoke to execute it asynchronously. The Task class adds some nice functionality on top of a standard delegate which improves usability in both synchronous and multithreaded environments. The first addition provided by Task is a means of handling cancellation via the new unified cancellation mechanism of .NET 4. If the wrapped delegate within a Task raises an OperationCanceledException during it’s operation, which is typically generated via calling ThrowIfCancellationRequested on a CancellationToken, or if the CancellationToken used to construct a Task instance is flagged as canceled, the Task’s IsCanceled property will be set to true automatically. This provides a clean way to determine whether a Task has been canceled, often without requiring specific exception handling. Tasks also provide a clean API which can be used for waiting on a task. Although the Task class explicitly implements IAsyncResult, Tasks provide a nicer usage model than the traditional .NET Asynchronous Programming Model. Instead of needing to track an IAsyncResult handle, you can just directly call Task.Wait() to block until a Task has completed. Overloads exist for providing a timeout, a CancellationToken, or both to prevent waiting indefinitely. In addition, the Task class provides static methods for waiting on multiple tasks – Task.WaitAll and Task.WaitAny, again with overloads providing time out options. This provides a very simple, clean API for waiting on single or multiple tasks. Finally, Tasks provide a much nicer model for Exception handling. If the delegate wrapped within a Task raises an exception, the exception will automatically get wrapped into an AggregateException and exposed via the Task.Exception property. This exception is stored with the Task directly, and does not tear down the application. Later, when Task.Wait() (or Task.WaitAll or Task.WaitAny) is called on this task, an AggregateException will be raised at that point if any of the tasks raised an exception. For example, suppose we have the following code: Task taskOne = new Task( () => { throw new ApplicationException("Random Exception!"); }); Task taskTwo = new Task( () => { throw new ArgumentException("Different exception here"); }); // Start the tasks taskOne.Start(); taskTwo.Start(); try { Task.WaitAll(new[] { taskOne, taskTwo }); } catch (AggregateException e) { Console.WriteLine(e.InnerExceptions.Count); foreach (var inner in e.InnerExceptions) Console.WriteLine(inner.Message); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Here, our routine will print: 2 Different exception here Random Exception! Note that we had two separate tasks, each of which raised two distinctly different types of exceptions. We can handle this cleanly, with very little code, in a much nicer manner than the Asynchronous Programming API. We no longer need to handle TargetInvocationException or worry about implementing the Event-based Asynchronous Pattern properly by setting the AsyncCompletedEventArgs.Error property. Instead, we just raise our exception as normal, and handle AggregateException in a single location in our calling code.

Read the article
Parallelism in .NET – Part 18, Task Continuations with Multiple Tasks

- by Reed

In my introduction to Task continuations I demonstrated how the Task class provides a more expressive alternative to traditional callbacks. Task continuations provide a much cleaner syntax to traditional callbacks, but there are other reasons to switch to using continuations… Task continuations provide a clean syntax, and a very simple, elegant means of synchronizing asynchronous method results with the user interface. In addition, continuations provide a very simple, elegant means of working with collections of tasks. Prior to .NET 4, working with multiple related asynchronous method calls was very tricky. If, for example, we wanted to run two asynchronous operations, followed by a single method call which we wanted to run when the first two methods completed, we’d have to program all of the handling ourselves. We would likely need to take some approach such as using a shared callback which synchronized against a common variable, or using a WaitHandle shared within the callbacks to allow one to wait for the second. Although this could be accomplished easily enough, it requires manually placing this handling into every algorithm which requires this form of blocking. This is error prone, difficult, and can easily lead to subtle bugs. Similar to how the Task class static methods providing a way to block until multiple tasks have completed, TaskFactory contains static methods which allow a continuation to be scheduled upon the completion of multiple tasks: TaskFactory.ContinueWhenAll. This allows you to easily specify a single delegate to run when a collection of tasks has completed. For example, suppose we have a class which fetches data from the network. This can be a long running operation, and potentially fail in certain situations, such as a server being down. As a result, we have three separate servers which we will “query” for our information. Now, suppose we want to grab data from all three servers, and verify that the results are the same from all three. With traditional asynchronous programming in .NET, this would require using three separate callbacks, and managing the synchronization between the various operations ourselves. The Task and TaskFactory classes simplify this for us, allowing us to write: var server1 = Task.Factory.StartNew( () => networkClass.GetResults(firstServer) ); var server2 = Task.Factory.StartNew( () => networkClass.GetResults(secondServer) ); var server3 = Task.Factory.StartNew( () => networkClass.GetResults(thirdServer) ); var result = Task.Factory.ContinueWhenAll( new[] {server1, server2, server3 }, (tasks) => { // Propogate exceptions (see below) Task.WaitAll(tasks); return this.CompareTaskResults( tasks[0].Result, tasks[1].Result, tasks[2].Result); }); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This is clean, simple, and elegant. The one complication is the Task.WaitAll(tasks); statement. Although the continuation will not complete until all three tasks (server1, server2, and server3) have completed, there is a potential snag. If the networkClass.GetResults method fails, and raises an exception, we want to make sure to handle it cleanly. By using Task.WaitAll, any exceptions raised within any of our original tasks will get wrapped into a single AggregateException by the WaitAll method, providing us a simplified means of handling the exceptions. If we wait on the continuation, we can trap this AggregateException, and handle it cleanly. Without this line, it’s possible that an exception could remain uncaught and unhandled by a task, which later might trigger a nasty UnobservedTaskException. This would happen any time two of our original tasks failed. Just as we can schedule a continuation to occur when an entire collection of tasks has completed, we can just as easily setup a continuation to run when any single task within a collection completes. If, for example, we didn’t need to compare the results of all three network locations, but only use one, we could still schedule three tasks. We could then have our completion logic work on the first task which completed, and ignore the others. This is done via TaskFactory.ContinueWhenAny: var server1 = Task.Factory.StartNew( () => networkClass.GetResults(firstServer) ); var server2 = Task.Factory.StartNew( () => networkClass.GetResults(secondServer) ); var server3 = Task.Factory.StartNew( () => networkClass.GetResults(thirdServer) ); var result = Task.Factory.ContinueWhenAny( new[] {server1, server2, server3 }, (firstTask) => { return this.ProcessTaskResult(firstTask.Result); }); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Here, instead of working with all three tasks, we’re just using the first task which finishes. This is very useful, as it allows us to easily work with results of multiple operations, and “throw away” the others. However, you must take care when using ContinueWhenAny to properly handle exceptions. At some point, you should always wait on each task (or use the Task.Result property) in order to propogate any exceptions raised from within the task. Failing to do so can lead to an UnobservedTaskException.

Read the article
Parallelism in .NET – Part 16, Creating Tasks via a TaskFactory

- by Reed

The Task class in the Task Parallel Library supplies a large set of features. However, when creating the task, and assigning it to a TaskScheduler, and starting the Task, there are quite a few steps involved. This gets even more cumbersome when multiple tasks are involved. Each task must be constructed, duplicating any options required, then started individually, potentially on a specific scheduler. At first glance, this makes the new Task class seem like more work than ThreadPool.QueueUserWorkItem in .NET 3.5. In order to simplify this process, and make Tasks simple to use in simple cases, without sacrificing their power and flexibility, the Task Parallel Library added a new class: TaskFactory. The TaskFactory class is intended to “Provide support for creating and scheduling Task objects.” Its entire purpose is to simplify development when working with Task instances. The Task class provides access to the default TaskFactory via the Task.Factory static property. By default, TaskFactory uses the default TaskScheduler to schedule tasks on a ThreadPool thread. By using Task.Factory, we can automatically create and start a task in a single “fire and forget” manner, similar to how we did with ThreadPool.QueueUserWorkItem: Task.Factory.StartNew(() => this.ExecuteBackgroundWork(myData) ); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This provides us with the same level of simplicity we had with ThreadPool.QueueUserWorkItem, but even more power. For example, we can now easily wait on the task: // Start our task on a background thread var task = Task.Factory.StartNew(() => this.ExecuteBackgroundWork(myData) ); // Do other work on the main thread, // while the task above executes in the background this.ExecuteWorkSynchronously(); // Wait for the background task to finish task.Wait(); TaskFactory simplifies creation and startup of simple background tasks dramatically. In addition to using the default TaskFactory, it’s often useful to construct a custom TaskFactory. The TaskFactory class includes an entire set of constructors which allow you to specify the default configuration for every Task instance created by that factory. This is particularly useful when using a custom TaskScheduler. For example, look at the sample code for starting a task on the UI thread in Part 15: // Given the following, constructed on the UI thread // TaskScheduler uiScheduler = TaskScheduler.FromCurrentSynchronizationContext(); // When inside a background task, we can do string status = GetUpdatedStatus(); (new Task(() => { statusLabel.Text = status; })) .Start(uiScheduler); This is actually quite a bit more complicated than necessary. When we create the uiScheduler instance, we can use that to construct a TaskFactory that will automatically schedule tasks on the UI thread. To do that, we’d create the following on our main thread, prior to constructing our background tasks: // Construct a task scheduler from the current SynchronizationContext (UI thread) var uiScheduler = TaskScheduler.FromCurrentSynchronizationContext(); // Construct a new TaskFactory using our UI scheduler var uiTaskFactory = new TaskFactory(uiScheduler); If we do this, when we’re on a background thread, we can use this new TaskFactory to marshal a Task back onto the UI thread. Our previous code simplifies to: // When inside a background task, we can do string status = GetUpdatedStatus(); // Update our UI uiTaskFactory.StartNew( () => statusLabel.Text = status); Notice how much simpler this becomes! By taking advantage of the convenience provided by a custom TaskFactory, we can now marshal to set data on the UI thread in a single, clear line of code!

Read the article
Parallelism in .NET – Introduction

- by Reed

Parallel programming is something that every professional developer should understand, but is rarely discussed or taught in detail in a formal manner. Software users are no longer content with applications that lock up the user interface regularly, or take large amounts of time to process data unnecessarily. Modern development requires the use of parallelism. There is no longer any excuses for us as developers. Learning to write parallel software is challenging. It requires more than reading that one chapter on parallelism in our programming language book of choice… Today’s systems are no longer getting faster with each generation; in many cases, newer computers are actually slower than previous generation systems. Modern hardware is shifting towards conservation of power, with processing scalability coming from having multiple computer cores, not faster and faster CPUs. Our CPU frequencies no longer double on a regular basis, but Moore’s Law is still holding strong. Now, however, instead of scaling transistors in order to make processors faster, hardware manufacturers are scaling the transistors in order to add more discrete hardware processing threads to the system. This changes how we should think about software. In order to take advantage of modern systems, we need to redesign and rewrite our algorithms to work in parallel. As with any design domain, it helps tremendously to have a common language, as well as a common set of patterns and tools. For .NET developers, this is an exciting time for parallel programming. Version 4 of the .NET Framework is adding the Task Parallel Library. This has been back-ported to .NET 3.5sp1 as part of the Reactive Extensions for .NET, and is available for use today in both .NET 3.5 and .NET 4.0 beta. In order to fully utilize the Task Parallel Library and parallelism, both in .NET 4 and previous versions, we need to understand the proper terminology. For this series, I will provide an introduction to some of the basic concepts in parallelism, and relate them to the tools available in .NET.

Read the article
Parallelism in .NET – Part 4, Imperative Data Parallelism: Aggregation

- by Reed

In the article on simple data parallelism, I described how to perform an operation on an entire collection of elements in parallel. Often, this is not adequate, as the parallel operation is going to be performing some form of aggregation. Simple examples of this might include taking the sum of the results of processing a function on each element in the collection, or finding the minimum of the collection given some criteria. This can be done using the techniques described in simple data parallelism, however, special care needs to be taken into account to synchronize the shared data appropriately. The Task Parallel Library has tools to assist in this synchronization. The main issue with aggregation when parallelizing a routine is that you need to handle synchronization of data. Since multiple threads will need to write to a shared portion of data. Suppose, for example, that we wanted to parallelize a simple loop that looked for the minimum value within a dataset: double min = double.MaxValue; foreach(var item in collection) { double value = item.PerformComputation(); min = System.Math.Min(min, value); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This seems like a good candidate for parallelization, but there is a problem here. If we just wrap this into a call to Parallel.ForEach, we’ll introduce a critical race condition, and get the wrong answer. Let’s look at what happens here: // Buggy code! Do not use! double min = double.MaxValue; Parallel.ForEach(collection, item => { double value = item.PerformComputation(); min = System.Math.Min(min, value); }); This code has a fatal flaw: min will be checked, then set, by multiple threads simultaneously. Two threads may perform the check at the same time, and set the wrong value for min. Say we get a value of 1 in thread 1, and a value of 2 in thread 2, and these two elements are the first two to run. If both hit the min check line at the same time, both will determine that min should change, to 1 and 2 respectively. If element 1 happens to set the variable first, then element 2 sets the min variable, we’ll detect a min value of 2 instead of 1. This can lead to wrong answers. Unfortunately, fixing this, with the Parallel.ForEach call we’re using, would require adding locking. We would need to rewrite this like: // Safe, but slow double min = double.MaxValue; // Make a "lock" object object syncObject = new object(); Parallel.ForEach(collection, item => { double value = item.PerformComputation(); lock(syncObject) min = System.Math.Min(min, value); }); This will potentially add a huge amount of overhead to our calculation. Since we can potentially block while waiting on the lock for every single iteration, we will most likely slow this down to where it is actually quite a bit slower than our serial implementation. The problem is the lock statement – any time you use lock(object), you’re almost assuring reduced performance in a parallel situation. This leads to two observations I’ll make: When parallelizing a routine, try to avoid locks. That being said: Always add any and all required synchronization to avoid race conditions. These two observations tend to be opposing forces – we often need to synchronize our algorithms, but we also want to avoid the synchronization when possible. Looking at our routine, there is no way to directly avoid this lock, since each element is potentially being run on a separate thread, and this lock is necessary in order for our routine to function correctly every time. However, this isn’t the only way to design this routine to implement this algorithm. Realize that, although our collection may have thousands or even millions of elements, we have a limited number of Processing Elements (PE). Processing Element is the standard term for a hardware element which can process and execute instructions. This typically is a core in your processor, but many modern systems have multiple hardware execution threads per core. The Task Parallel Library will not execute the work for each item in the collection as a separate work item. Instead, when Parallel.ForEach executes, it will partition the collection into larger “chunks” which get processed on different threads via the ThreadPool. This helps reduce the threading overhead, and help the overall speed. In general, the Parallel class will only use one thread per PE in the system. Given the fact that there are typically fewer threads than work items, we can rethink our algorithm design. We can parallelize our algorithm more effectively by approaching it differently. Because the basic aggregation we are doing here (Min) is communitive, we do not need to perform this in a given order. We knew this to be true already – otherwise, we wouldn’t have been able to parallelize this routine in the first place. With this in mind, we can treat each thread’s work independently, allowing each thread to serially process many elements with no locking, then, after all the threads are complete, “merge” together the results. This can be accomplished via a different set of overloads in the Parallel class: Parallel.ForEach<TSource,TLocal>. The idea behind these overloads is to allow each thread to begin by initializing some local state (TLocal). The thread will then process an entire set of items in the source collection, providing that state to the delegate which processes an individual item. Finally, at the end, a separate delegate is run which allows you to handle merging that local state into your final results. To rewriting our routine using Parallel.ForEach<TSource,TLocal>, we need to provide three delegates instead of one. The most basic version of this function is declared as: public static ParallelLoopResult ForEach<TSource, TLocal>( IEnumerable<TSource> source, Func<TLocal> localInit, Func<TSource, ParallelLoopState, TLocal, TLocal> body, Action<TLocal> localFinally ) The first delegate (the localInit argument) is defined as Func<TLocal>. This delegate initializes our local state. It should return some object we can use to track the results of a single thread’s operations. The second delegate (the body argument) is where our main processing occurs, although now, instead of being an Action<T>, we actually provide a Func<TSource, ParallelLoopState, TLocal, TLocal> delegate. This delegate will receive three arguments: our original element from the collection (TSource), a ParallelLoopState which we can use for early termination, and the instance of our local state we created (TLocal). It should do whatever processing you wish to occur per element, then return the value of the local state after processing is completed. The third delegate (the localFinally argument) is defined as Action<TLocal>. This delegate is passed our local state after it’s been processed by all of the elements this thread will handle. This is where you can merge your final results together. This may require synchronization, but now, instead of synchronizing once per element (potentially millions of times), you’ll only have to synchronize once per thread, which is an ideal situation. Now that I’ve explained how this works, lets look at the code: // Safe, and fast! double min = double.MaxValue; // Make a "lock" object object syncObject = new object(); Parallel.ForEach( collection, // First, we provide a local state initialization delegate. () => double.MaxValue, // Next, we supply the body, which takes the original item, loop state, // and local state, and returns a new local state (item, loopState, localState) => { double value = item.PerformComputation(); return System.Math.Min(localState, value); }, // Finally, we provide an Action<TLocal>, to "merge" results together localState => { // This requires locking, but it's only once per used thread lock(syncObj) min = System.Math.Min(min, localState); } ); Although this is a bit more complicated than the previous version, it is now both thread-safe, and has minimal locking. This same approach can be used by Parallel.For, although now, it’s Parallel.For<TLocal>. When working with Parallel.For<TLocal>, you use the same triplet of delegates, with the same purpose and results. Also, many times, you can completely avoid locking by using a method of the Interlocked class to perform the final aggregation in an atomic operation. The MSDN example demonstrating this same technique using Parallel.For uses the Interlocked class instead of a lock, since they are doing a sum operation on a long variable, which is possible via Interlocked.Add. By taking advantage of local state, we can use the Parallel class methods to parallelize algorithms such as aggregation, which, at first, may seem like poor candidates for parallelization. Doing so requires careful consideration, and often requires a slight redesign of the algorithm, but the performance gains can be significant if handled in a way to avoid excessive synchronization.

Read the article
Parallelism in .NET – Part 11, Divide and Conquer via Parallel.Invoke

- by Reed

Many algorithms are easily written to work via recursion. For example, most data-oriented tasks where a tree of data must be processed are much more easily handled by starting at the root, and recursively “walking” the tree. Some algorithms work this way on flat data structures, such as arrays, as well. This is a form of divide and conquer: an algorithm design which is based around breaking up a set of work recursively, “dividing” the total work in each recursive step, and “conquering” the work when the remaining work is small enough to be solved easily. Recursive algorithms, especially ones based on a form of divide and conquer, are often a very good candidate for parallelization. This is apparent from a common sense standpoint. Since we’re dividing up the total work in the algorithm, we have an obvious, built-in partitioning scheme. Once partitioned, the data can be worked upon independently, so there is good, clean isolation of data. Implementing this type of algorithm is fairly simple. The Parallel class in .NET 4 includes a method suited for this type of operation: Parallel.Invoke. This method works by taking any number of delegates defined as an Action, and operating them all in parallel. The method returns when every delegate has completed: Parallel.Invoke( () => { Console.WriteLine("Action 1 executing in thread {0}", Thread.CurrentThread.ManagedThreadId); }, () => { Console.WriteLine("Action 2 executing in thread {0}", Thread.CurrentThread.ManagedThreadId); }, () => { Console.WriteLine("Action 3 executing in thread {0}", Thread.CurrentThread.ManagedThreadId); } ); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Running this simple example demonstrates the ease of using this method. For example, on my system, I get three separate thread IDs when running the above code. By allowing any number of delegates to be executed directly, concurrently, the Parallel.Invoke method provides us an easy way to parallelize any algorithm based on divide and conquer. We can divide our work in each step, and execute each task in parallel, recursively. For example, suppose we wanted to implement our own quicksort routine. The quicksort algorithm can be designed based on divide and conquer. In each iteration, we pick a pivot point, and use that to partition the total array. We swap the elements around the pivot, then recursively sort the lists on each side of the pivot. For example, let’s look at this simple, sequential implementation of quicksort: public static void QuickSort<T>(T[] array) where T : IComparable<T> { QuickSortInternal(array, 0, array.Length - 1); } private static void QuickSortInternal<T>(T[] array, int left, int right) where T : IComparable<T> { if (left >= right) { return; } SwapElements(array, left, (left + right) / 2); int last = left; for (int current = left + 1; current <= right; ++current) { if (array[current].CompareTo(array[left]) < 0) { ++last; SwapElements(array, last, current); } } SwapElements(array, left, last); QuickSortInternal(array, left, last - 1); QuickSortInternal(array, last + 1, right); } static void SwapElements<T>(T[] array, int i, int j) { T temp = array[i]; array[i] = array[j]; array[j] = temp; } Here, we implement the quicksort algorithm in a very common, divide and conquer approach. Running this against the built-in Array.Sort routine shows that we get the exact same answers (although the framework’s sort routine is slightly faster). On my system, for example, I can use framework’s sort to sort ten million random doubles in about 7.3s, and this implementation takes about 9.3s on average. Looking at this routine, though, there is a clear opportunity to parallelize. At the end of QuickSortInternal, we recursively call into QuickSortInternal with each partition of the array after the pivot is chosen. This can be rewritten to use Parallel.Invoke by simply changing it to: // Code above is unchanged... SwapElements(array, left, last); Parallel.Invoke( () => QuickSortInternal(array, left, last - 1), () => QuickSortInternal(array, last + 1, right) ); } This routine will now run in parallel. When executing, we now see the CPU usage across all cores spike while it executes. However, there is a significant problem here – by parallelizing this routine, we took it from an execution time of 9.3s to an execution time of approximately 14 seconds! We’re using more resources as seen in the CPU usage, but the overall result is a dramatic slowdown in overall processing time. This occurs because parallelization adds overhead. Each time we split this array, we spawn two new tasks to parallelize this algorithm! This is far, far too many tasks for our cores to operate upon at a single time. In effect, we’re “over-parallelizing” this routine. This is a common problem when working with divide and conquer algorithms, and leads to an important observation: When parallelizing a recursive routine, take special care not to add more tasks than necessary to fully utilize your system. This can be done with a few different approaches, in this case. Typically, the way to handle this is to stop parallelizing the routine at a certain point, and revert back to the serial approach. Since the first few recursions will all still be parallelized, our “deeper” recursive tasks will be running in parallel, and can take full advantage of the machine. This also dramatically reduces the overhead added by parallelizing, since we’re only adding overhead for the first few recursive calls. There are two basic approaches we can take here. The first approach would be to look at the total work size, and if it’s smaller than a specific threshold, revert to our serial implementation. In this case, we could just check right-left, and if it’s under a threshold, call the methods directly instead of using Parallel.Invoke. The second approach is to track how “deep” in the “tree” we are currently at, and if we are below some number of levels, stop parallelizing. This approach is a more general-purpose approach, since it works on routines which parse trees as well as routines working off of a single array, but may not work as well if a poor partitioning strategy is chosen or the tree is not balanced evenly. This can be written very easily. If we pass a maxDepth parameter into our internal routine, we can restrict the amount of times we parallelize by changing the recursive call to: // Code above is unchanged... SwapElements(array, left, last); if (maxDepth < 1) { QuickSortInternal(array, left, last - 1, maxDepth); QuickSortInternal(array, last + 1, right, maxDepth); } else { --maxDepth; Parallel.Invoke( () => QuickSortInternal(array, left, last - 1, maxDepth), () => QuickSortInternal(array, last + 1, right, maxDepth)); } We no longer allow this to parallelize indefinitely – only to a specific depth, at which time we revert to a serial implementation. By starting the routine with a maxDepth equal to Environment.ProcessorCount, we can restrict the total amount of parallel operations significantly, but still provide adequate work for each processing core. With this final change, my timings are much better. On average, I get the following timings: Framework via Array.Sort: 7.3 seconds Serial Quicksort Implementation: 9.3 seconds Naive Parallel Implementation: 14 seconds Parallel Implementation Restricting Depth: 4.7 seconds Finally, we are now faster than the framework’s Array.Sort implementation.

Read the article
Parallelism in .NET – Part 6, Declarative Data Parallelism

- by Reed

When working with a problem that can be decomposed by data, we have a collection, and some operation being performed upon the collection. I’ve demonstrated how this can be parallelized using the Task Parallel Library and imperative programming using imperative data parallelism via the Parallel class. While this provides a huge step forward in terms of power and capabilities, in many cases, special care must still be given for relative common scenarios. C# 3.0 and Visual Basic 9.0 introduced a new, declarative programming model to .NET via the LINQ Project. When working with collections, we can now write software that describes what we want to occur without having to explicitly state how the program should accomplish the task. By taking advantage of LINQ, many operations become much shorter, more elegant, and easier to understand and maintain. Version 4.0 of the .NET framework extends this concept into the parallel computation space by introducing Parallel LINQ. Before we delve into PLINQ, let’s begin with a short discussion of LINQ. LINQ, the extensions to the .NET Framework which implement language integrated query, set, and transform operations, is implemented in many flavors. For our purposes, we are interested in LINQ to Objects. When dealing with parallelizing a routine, we typically are dealing with in-memory data storage. More data-access oriented LINQ variants, such as LINQ to SQL and LINQ to Entities in the Entity Framework fall outside of our concern, since the parallelism there is the concern of the data base engine processing the query itself. LINQ (LINQ to Objects in particular) works by implementing a series of extension methods, most of which work on IEnumerable<T>. The language enhancements use these extension methods to create a very concise, readable alternative to using traditional foreach statement. For example, let’s revisit our minimum aggregation routine we wrote in Part 4: double min = double.MaxValue; foreach(var item in collection) { double value = item.PerformComputation(); min = System.Math.Min(min, value); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Here, we’re doing a very simple computation, but writing this in an imperative style. This can be loosely translated to English as: Create a very large number, and save it in min Loop through each item in the collection. For every item: Perform some computation, and save the result If the computation is less than min, set min to the computation Although this is fairly easy to follow, it’s quite a few lines of code, and it requires us to read through the code, step by step, line by line, in order to understand the intention of the developer. We can rework this same statement, using LINQ: double min = collection.Min(item => item.PerformComputation()); Here, we’re after the same information. However, this is written using a declarative programming style. When we see this code, we’d naturally translate this to English as: Save the Min value of collection, determined via calling item.PerformComputation() That’s it – instead of multiple logical steps, we have one single, declarative request. This makes the developer’s intentions very clear, and very easy to follow. The system is free to implement this using whatever method required. Parallel LINQ (PLINQ) extends LINQ to Objects to support parallel operations. This is a perfect fit in many cases when you have a problem that can be decomposed by data. To show this, let’s again refer to our minimum aggregation routine from Part 4, but this time, let’s review our final, parallelized version: // Safe, and fast! double min = double.MaxValue; // Make a "lock" object object syncObject = new object(); Parallel.ForEach( collection, // First, we provide a local state initialization delegate. () => double.MaxValue, // Next, we supply the body, which takes the original item, loop state, // and local state, and returns a new local state (item, loopState, localState) => { double value = item.PerformComputation(); return System.Math.Min(localState, value); }, // Finally, we provide an Action<TLocal>, to "merge" results together localState => { // This requires locking, but it's only once per used thread lock(syncObj) min = System.Math.Min(min, localState); } ); Here, we’re doing the same computation as above, but fully parallelized. Describing this in English becomes quite a feat: Create a very large number, and save it in min Create a temporary object we can use for locking Call Parallel.ForEach, specifying three delegates For the first delegate: Initialize a local variable to hold the local state to a very large number For the second delegate: For each item in the collection, perform some computation, save the result If the result is less than our local state, save the result in local state For the final delegate: Take a lock on our temporary object to protect our min variable Save the min of our min and local state variables Although this solves our problem, and does it in a very efficient way, we’ve created a set of code that is quite a bit more difficult to understand and maintain. PLINQ provides us with a very nice alternative. In order to use PLINQ, we need to learn one new extension method that works on IEnumerable<T> – ParallelEnumerable.AsParallel(). That’s all we need to learn in order to use PLINQ: one single method. We can write our minimum aggregation in PLINQ very simply: double min = collection.AsParallel().Min(item => item.PerformComputation()); By simply adding “.AsParallel()” to our LINQ to Objects query, we converted this to using PLINQ and running this computation in parallel! This can be loosely translated into English easily, as well: Process the collection in parallel Get the Minimum value, determined by calling PerformComputation on each item Here, our intention is very clear and easy to understand. We just want to perform the same operation we did in serial, but run it “as parallel”. PLINQ completely extends LINQ to Objects: the entire functionality of LINQ to Objects is available. By simply adding a call to AsParallel(), we can specify that a collection should be processed in parallel. This is simple, safe, and incredibly useful.

Read the article
Parallelism in .NET – Part 20, Using Task with Existing APIs

- by Reed

Although the Task class provides a huge amount of flexibility for handling asynchronous actions, the .NET Framework still contains a large number of APIs that are based on the previous asynchronous programming model. While Task and Task<T> provide a much nicer syntax as well as extending the flexibility, allowing features such as continuations based on multiple tasks, the existing APIs don’t directly support this workflow. There is a method in the TaskFactory class which can be used to adapt the existing APIs to the new Task class: TaskFactory.FromAsync. This method provides a way to convert from the BeginOperation/EndOperation method pair syntax common through .NET Framework directly to a Task<T> containing the results of the operation in the task’s Result parameter. While this method does exist, it unfortunately comes at a cost – the method overloads are far from simple to decipher, and the resulting code is not always as easily understood as newer code based directly on the Task class. For example, a single call to handle WebRequest.BeginGetResponse/EndGetReponse, one of the easiest “pairs” of methods to use, looks like the following: var task = Task.Factory.FromAsync<WebResponse>( request.BeginGetResponse, request.EndGetResponse, null); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } The compiler is unfortunately unable to infer the correct type, and, as a result, the WebReponse must be explicitly mentioned in the method call. As a result, I typically recommend wrapping this into an extension method to ease use. For example, I would place the above in an extension method like: public static class WebRequestExtensions { public static Task<WebResponse> GetReponseAsync(this WebRequest request) { return Task.Factory.FromAsync<WebResponse>( request.BeginGetResponse, request.EndGetResponse, null); } } This dramatically simplifies usage. For example, if we wanted to asynchronously check to see if this blog supported XHTML 1.0, and report that in a text box to the user, we could do: var webRequest = WebRequest.Create("http://www.reedcopsey.com"); webRequest.GetReponseAsync().ContinueWith(t => { using (var sr = new StreamReader(t.Result.GetResponseStream())) { string str = sr.ReadLine();; this.textBox1.Text = string.Format("Page at {0} supports XHTML 1.0: {1}", t.Result.ResponseUri, str.Contains("XHTML 1.0")); } }, TaskScheduler.FromCurrentSynchronizationContext()); By using a continuation with a TaskScheduler based on the current synchronization context, we can keep this request asynchronous, check based on the first line of the response string, and report the results back on our UI directly.

Read the article
Parallelism in .NET – Part 1, Decomposition

- by Reed

The first step in designing any parallelized system is Decomposition. Decomposition is nothing more than taking a problem space and breaking it into discrete parts. When we want to work in parallel, we need to have at least two separate things that we are trying to run. We do this by taking our problem and decomposing it into parts. There are two common abstractions that are useful when discussing parallel decomposition: Data Decomposition and Task Decomposition. These two abstractions allow us to think about our problem in a way that helps leads us to correct decision making in terms of the algorithms we’ll use to parallelize our routine. To start, I will make a couple of minor points. I’d like to stress that Decomposition has nothing to do with specific algorithms or techniques. It’s about how you approach and think about the problem, not how you solve the problem using a specific tool, technique, or library. Decomposing the problem is about constructing the appropriate mental model: once this is done, you can choose the appropriate design and tools, which is a subject for future posts. Decomposition, being unrelated to tools or specific techniques, is not specific to .NET in any way. This should be the first step to parallelizing a problem, and is valid using any framework, language, or toolset. However, this gives us a starting point – without a proper understanding of decomposition, it is difficult to understand the proper usage of specific classes and tools within the .NET framework. Data Decomposition is often the simpler abstraction to use when trying to parallelize a routine. In order to decompose our problem domain by data, we take our entire set of data and break it into smaller, discrete portions, or chunks. We then work on each chunk in the data set in parallel. This is particularly useful if we can process each element of data independently of the rest of the data. In a situation like this, there are some wonderfully simple techniques we can use to take advantage of our data. By decomposing our domain by data, we can very simply parallelize our routines. In general, we, as developers, should be always searching for data that can be decomposed. Finding data to decompose if fairly simple, in many instances. Data decomposition is typically used with collections of data. Any time you have a collection of items, and you’re going to perform work on or with each of the items, you potentially have a situation where parallelism can be exploited. This is fairly easy to do in practice: look for iteration statements in your code, such as for and foreach. Granted, every for loop is not a candidate to be parallelized. If the collection is being modified as it’s iterated, or the processing of elements depends on other elements, the iteration block may need to be processed in serial. However, if this is not the case, data decomposition may be possible. Let’s look at one example of how we might use data decomposition. Suppose we were working with an image, and we were applying a simple contrast stretching filter. When we go to apply the filter, once we know the minimum and maximum values, we can apply this to each pixel independently of the other pixels. This means that we can easily decompose this problem based off data – we will do the same operation, in parallel, on individual chunks of data (each pixel). Task Decomposition, on the other hand, is focused on the individual tasks that need to be performed instead of focusing on the data. In order to decompose our problem domain by tasks, we need to think about our algorithm in terms of discrete operations, or tasks, which can then later be parallelized. Task decomposition, in practice, can be a bit more tricky than data decomposition. Here, we need to look at what our algorithm actually does, and how it performs its actions. Once we have all of the basic steps taken into account, we can try to analyze them and determine whether there are any constraints in terms of shared data or ordering. There are no simple things to look for in terms of finding tasks we can decompose for parallelism; every algorithm is unique in terms of its tasks, so every algorithm will have unique opportunities for task decomposition. For example, say we want our software to perform some customized actions on startup, prior to showing our main screen. Perhaps we want to check for proper licensing, notify the user if the license is not valid, and also check for updates to the program. Once we verify the license, and that there are no updates, we’ll start normally. In this case, we can decompose this problem into tasks – we have a few tasks, but there are at least two discrete, independent tasks (check licensing, check for updates) which we can perform in parallel. Once those are completed, we will continue on with our other tasks. One final note – Data Decomposition and Task Decomposition are not mutually exclusive. Often, you’ll mix the two approaches while trying to parallelize a single routine. It’s possible to decompose your problem based off data, then further decompose the processing of each element of data based on tasks. This just provides a framework for thinking about our algorithms, and for discussing the problem.

Read the article
Performance and Optimization Isn’t Evil

- by Reed

Donald Knuth is a fairly amazing guy. I consider him one of the most influential contributors to computer science of all time. Unfortunately, most of the time I hear his name, I cringe. This is because it’s typically somebody quoting a small portion of one of his famous statements on optimization: “premature optimization is the root of all evil.” I mention that this is only a portion of the entire quote, and, as such, I feel that Knuth is being quoted out of context. Optimization is important. It is a critical part of every software development effort, and should never be ignored. A developer who ignores optimization is not a professional. Every developer should understand optimization – know what to optimize, when to optimize it, and how to think about code in a way that is intelligent and productive from day one. I want to start by discussing my own, personal motivation here. I recently wrote about a performance issue I ran across, and was slammed by multiple comments and emails that effectively boiled down to: “You’re an idiot. Premature optimization is the root of all evil. This doesn’t matter.” It didn’t matter that I discovered this while measuring in a profiler, and that it was a portion of my code base that can take “many hours to complete.” Even so, multiple people instantly jump to “it’s premature – it doesn’t matter.” This is a common thread I see. For example, StackOverflow has many pages of posts with answers that boil down to (mis)quoting Knuth. In fact, just about any question relating to a performance related issue gets this quote thrown at it immediately – whether it deserves it or not. That being said, I did receive some positive comments and emails as well. Many people want to understand how to optimize their code, approaches to take, tools and techniques they can use, and any other advice they can discover. First, lets get back to Knuth – I mentioned before that Knuth is being quoted out of context. Lets start by looking at the entire quote from his 1974 paper Structured Programming with go to Statements: “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.” Ironically, if you read Knuth’s original paper, this statement was made in the middle of a discussion of how Knuth himself had changed how he approaches optimization. It was never a statement saying “don’t optimize”, but rather, “optimizing intelligently provides huge advantages.” His approach had three benefits: “a) it doesn’t take long” … “b) the payoff is real”, c) you can “be less efficient in the other parts of my programs, which therefore are more readable and more easily written and debugged.” Looking at Knuth’s premise here, and reading that section of his paper, really leads to a few observations: Optimization is important “he will be wise to look carefully at the critical code” Normally, 3% of your code – three lines out of every 100 you write, are “critical code” and will require some optimization: “we should not pass up our opportunities in that critical 3%” Optimization, if done well, should not be time consuming: “it doesn’t take long” Optimization, if done correctly, provides real benefits: “the payoff is real” None of this is new information. People who care about optimization have been discussing this for years – for example, Rico Mariani’s Designing For Performance (a fantastic article) discusses many of the same issues very intelligently. That being said, many developers seem unable or unwilling to consider optimization. Many others don’t seem to know where to start. As such, I’m going to spend some time writing about optimization – what is it, how should we think about it, and what can we do to improve our own code.

Read the article
Parallelism in .NET – Part 19, TaskContinuationOptions

- by Reed

My introduction to Task continuations demonstrates continuations on the Task class. In addition, I’ve shown how continuations allow handling of multiple tasks in a clean, concise manner. Continuations can also be used to handle exceptional situations using a clean, simple syntax. In addition to standard Task continuations , the Task class provides some options for filtering continuations automatically. This is handled via the TaskContinationOptions enumeration, which provides hints to the TaskScheduler that it should only continue based on the operation of the antecedent task. This is especially useful when dealing with exceptions. For example, we can extend the sample from our earlier continuation discussion to include support for handling exceptions thrown by the Factorize method: // Get a copy of the UI-thread task scheduler up front to use later var uiScheduler = TaskScheduler.FromCurrentSynchronizationContext(); // Start our task var factorize = Task.Factory.StartNew( () => { int primeFactor1 = 0; int primeFactor2 = 0; bool result = Factorize(10298312, ref primeFactor1, ref primeFactor2); return new { Result = result, Factor1 = primeFactor1, Factor2 = primeFactor2 }; }); // When we succeed, report the results to the UI factorize.ContinueWith(task => textBox1.Text = string.Format("{0}/{1} [Succeeded {2}]", task.Result.Factor1, task.Result.Factor2, task.Result.Result), CancellationToken.None, TaskContinuationOptions.NotOnFaulted, uiScheduler); // When we have an exception, report it factorize.ContinueWith(task => textBox1.Text = string.Format("Error: {0}", task.Exception.Message), CancellationToken.None, TaskContinuationOptions.OnlyOnFaulted, uiScheduler); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } The above code works by using a combination of features. First, we schedule our task, the same way as in the previous example. However, in this case, we use a different overload of Task.ContinueWith which allows us to specify both a specific TaskScheduler (in order to have your continuation run on the UI’s synchronization context) as well as a TaskContinuationOption. In the first continuation, we tell the continuation that we only want it to run when there was not an exception by specifying TaskContinuationOptions.NotOnFaulted. When our factorize task completes successfully, this continuation will automatically run on the UI thread, and provide the appropriate feedback. However, if the factorize task has an exception – for example, if the Factorize method throws an exception due to an improper input value, the second continuation will run. This occurs due to the specification of TaskContinuationOptions.OnlyOnFaulted in the options. In this case, we’ll report the error received to the user. We can use TaskContinuationOptions to filter our continuations by whether or not an exception occurred and whether or not a task was cancelled. This allows us to handle many situations, and is especially useful when trying to maintain a valid application state without ever blocking the user interface. The same concepts can be extended even further, and allow you to chain together many tasks based on the success of the previous ones. Continuations can even be used to create a state machine with full error handling, all without blocking the user interface thread.

Read the article
Launching a WPF Window in a Separate Thread, Part 1

- by Reed

Typically, I strongly recommend keeping the user interface within an application’s main thread, and using multiple threads to move the actual “work” into background threads. However, there are rare times when creating a separate, dedicated thread for a Window can be beneficial. This is even acknowledged in the MSDN samples, such as the Multiple Windows, Multiple Threads sample. However, doing this correctly is difficult. Even the referenced MSDN sample has major flaws, and will fail horribly in certain scenarios. To ease this, I wrote a small class that alleviates some of the difficulties involved. The MSDN Multiple Windows, Multiple Threads Sample shows how to launch a new thread with a WPF Window, and will work in most cases. The sample code (commented and slightly modified) works out to the following: // Create a thread Thread newWindowThread = new Thread(new ThreadStart( () => { // Create and show the Window Window1 tempWindow = new Window1(); tempWindow.Show(); // Start the Dispatcher Processing System.Windows.Threading.Dispatcher.Run(); })); // Set the apartment state newWindowThread.SetApartmentState(ApartmentState.STA); // Make the thread a background thread newWindowThread.IsBackground = true; // Start the thread newWindowThread.Start(); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This sample creates a thread, marks it as single threaded apartment state, and starts the Dispatcher on that thread. That is the minimum requirements to get a Window displaying and handling messages correctly, but, unfortunately, has some serious flaws. The first issue – the created thread will run continuously until the application shuts down, given the code in the sample. The problem is that the ThreadStart delegate used ends with running the Dispatcher. However, nothing ever stops the Dispatcher processing. The thread was created as a Background thread, which prevents it from keeping the application alive, but the Dispatcher will continue to pump dispatcher frames until the application shuts down. In order to fix this, we need to call Dispatcher.InvokeShutdown after the Window is closed. This would require modifying the above sample to subscribe to the Window’s Closed event, and, at that point, shutdown the Dispatcher: // Create a thread Thread newWindowThread = new Thread(new ThreadStart( () => { Window1 tempWindow = new Window1(); // When the window closes, shut down the dispatcher tempWindow.Closed += (s,e) => Dispatcher.CurrentDispatcher.BeginInvokeShutdown(DispatcherPriority.Background); tempWindow.Show(); // Start the Dispatcher Processing System.Windows.Threading.Dispatcher.Run(); })); // Setup and start thread as before This eliminates the first issue. Now, when the Window is closed, the new thread’s Dispatcher will shut itself down, which in turn will cause the thread to complete. The above code will work correctly for most situations. However, there is still a potential problem which could arise depending on the content of the Window1 class. This is particularly nasty, as the code could easily work for most windows, but fail on others. The problem is, at the point where the Window is constructed, there is no active SynchronizationContext. This is unlikely to be a problem in most cases, but is an absolute requirement if there is code within the constructor of Window1 which relies on a context being in place. While this sounds like an edge case, it’s fairly common. For example, if a BackgroundWorker is started within the constructor, or a TaskScheduler is built using TaskScheduler.FromCurrentSynchronizationContext() with the expectation of synchronizing work to the UI thread, an exception will be raised at some point. Both of these classes rely on the existence of a proper context being installed to SynchronizationContext.Current, which happens automatically, but not until Dispatcher.Run is called. In the above case, SynchronizationContext.Current will return null during the Window’s construction, which can cause exceptions to occur or unexpected behavior. Luckily, this is fairly easy to correct. We need to do three things, in order, prior to creating our Window: Create and initialize the Dispatcher for the new thread manually Create a synchronization context for the thread which uses the Dispatcher Install the synchronization context Creating the Dispatcher is quite simple – The Dispatcher.CurrentDispatcher property gets the current thread’s Dispatcher and “creates a new Dispatcher if one is not already associated with the thread.” Once we have the correct Dispatcher, we can create a SynchronizationContext which uses the dispatcher by creating a DispatcherSynchronizationContext. Finally, this synchronization context can be installed as the current thread’s context via SynchronizationContext.SetSynchronizationContext. These three steps can easily be added to the above via a single line of code: // Create a thread Thread newWindowThread = new Thread(new ThreadStart( () => { // Create our context, and install it: SynchronizationContext.SetSynchronizationContext( new DispatcherSynchronizationContext( Dispatcher.CurrentDispatcher)); Window1 tempWindow = new Window1(); // When the window closes, shut down the dispatcher tempWindow.Closed += (s,e) => Dispatcher.CurrentDispatcher.BeginInvokeShutdown(DispatcherPriority.Background); tempWindow.Show(); // Start the Dispatcher Processing System.Windows.Threading.Dispatcher.Run(); })); // Setup and start thread as before This now forces the synchronization context to be in place before the Window is created and correctly shuts down the Dispatcher when the window closes. However, there are quite a few steps. In my next post, I’ll show how to make this operation more reusable by creating a class with a far simpler API…

Read the article
Parallelism in .NET – Part 8, PLINQ’s ForAll Method

- by Reed

Parallel LINQ extends LINQ to Objects, and is typically very similar. However, as I previously discussed, there are some differences. Although the standard way to handle simple Data Parellelism is via Parallel.ForEach, it’s possible to do the same thing via PLINQ. PLINQ adds a new method unavailable in standard LINQ which provides new functionality… LINQ is designed to provide a much simpler way of handling querying, including filtering, ordering, grouping, and many other benefits. Reading the description in LINQ to Objects on MSDN, it becomes clear that the thinking behind LINQ deals with retrieval of data. LINQ works by adding a functional programming style on top of .NET, allowing us to express filters in terms of predicate functions, for example. PLINQ is, generally, very similar. Typically, when using PLINQ, we write declarative statements to filter a dataset or perform an aggregation. However, PLINQ adds one new method, which provides a very different purpose: ForAll. The ForAll method is defined on ParallelEnumerable, and will work upon any ParallelQuery<T>. Unlike the sequence operators in LINQ and PLINQ, ForAll is intended to cause side effects. It does not filter a collection, but rather invokes an action on each element of the collection. At first glance, this seems like a bad idea. For example, Eric Lippert clearly explained two philosophical objections to providing an IEnumerable<T>.ForEach extension method, one of which still applies when parallelized. The sole purpose of this method is to cause side effects, and as such, I agree that the ForAll method “violates the functional programming principles that all the other sequence operators are based upon”, in exactly the same manner an IEnumerable<T>.ForEach extension method would violate these principles. Eric Lippert’s second reason for disliking a ForEach extension method does not necessarily apply to ForAll – replacing ForAll with a call to Parallel.ForEach has the same closure semantics, so there is no loss there. Although ForAll may have philosophical issues, there is a pragmatic reason to include this method. Without ForAll, we would take a fairly serious performance hit in many situations. Often, we need to perform some filtering or grouping, then perform an action using the results of our filter. Using a standard foreach statement to perform our action would avoid this philosophical issue: // Filter our collection var filteredItems = collection.AsParallel().Where( i => i.SomePredicate() ); // Now perform an action foreach (var item in filteredItems) { // These will now run serially item.DoSomething(); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This would cause a loss in performance, since we lose any parallelism in place, and cause all of our actions to be run serially. We could easily use a Parallel.ForEach instead, which adds parallelism to the actions: // Filter our collection var filteredItems = collection.AsParallel().Where( i => i.SomePredicate() ); // Now perform an action once the filter completes Parallel.ForEach(filteredItems, item => { // These will now run in parallel item.DoSomething(); }); This is a noticeable improvement, since both our filtering and our actions run parallelized. However, there is still a large bottleneck in place here. The problem lies with my comment “perform an action once the filter completes”. Here, we’re parallelizing the filter, then collecting all of the results, blocking until the filter completes. Once the filtering of every element is completed, we then repartition the results of the filter, reschedule into multiple threads, and perform the action on each element. By moving this into two separate statements, we potentially double our parallelization overhead, since we’re forcing the work to be partitioned and scheduled twice as many times. This is where the pragmatism comes into play. By violating our functional principles, we gain the ability to avoid the overhead and cost of rescheduling the work: // Perform an action on the results of our filter collection .AsParallel() .Where( i => i.SomePredicate() ) .ForAll( i => i.DoSomething() ); The ability to avoid the scheduling overhead is a compelling reason to use ForAll. This really goes back to one of the key points I discussed in data parallelism: Partition your problem in a way to place the most work possible into each task. Here, this means leaving the statement attached to the expression, even though it causes side effects and is not standard usage for LINQ. This leads to my one guideline for using ForAll: The ForAll extension method should only be used to process the results of a parallel query, as returned by a PLINQ expression. Any other usage scenario should use Parallel.ForEach, instead.

Read the article
Parallelism in .NET – Part 17, Think Continuations, not Callbacks

- by Reed

In traditional asynchronous programming, we’d often use a callback to handle notification of a background task’s completion. The Task class in the Task Parallel Library introduces a cleaner alternative to the traditional callback: continuation tasks. Asynchronous programming methods typically required callback functions. For example, MSDN’s Asynchronous Delegates Programming Sample shows a class that factorizes a number. The original method in the example has the following signature: public static bool Factorize(int number, ref int primefactor1, ref int primefactor2) { //... .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } However, calling this is quite “tricky”, even if we modernize the sample to use lambda expressions via C# 3.0. Normally, we could call this method like so: int primeFactor1 = 0; int primeFactor2 = 0; bool answer = Factorize(10298312, ref primeFactor1, ref primeFactor2); Console.WriteLine("{0}/{1} [Succeeded {2}]", primeFactor1, primeFactor2, answer); If we want to make this operation run in the background, and report to the console via a callback, things get tricker. First, we need a delegate definition: public delegate bool AsyncFactorCaller( int number, ref int primefactor1, ref int primefactor2); Then we need to use BeginInvoke to run this method asynchronously: int primeFactor1 = 0; int primeFactor2 = 0; AsyncFactorCaller caller = new AsyncFactorCaller(Factorize); caller.BeginInvoke(10298312, ref primeFactor1, ref primeFactor2, result => { int factor1 = 0; int factor2 = 0; bool answer = caller.EndInvoke(ref factor1, ref factor2, result); Console.WriteLine("{0}/{1} [Succeeded {2}]", factor1, factor2, answer); }, null); This works, but is quite difficult to understand from a conceptual standpoint. To combat this, the framework added the Event-based Asynchronous Pattern, but it isn’t much easier to understand or author. Using .NET 4’s new Task<T> class and a continuation, we can dramatically simplify the implementation of the above code, as well as make it much more understandable. We do this via the Task.ContinueWith method. This method will schedule a new Task upon completion of the original task, and provide the original Task (including its Result if it’s a Task<T>) as an argument. Using Task, we can eliminate the delegate, and rewrite this code like so: var background = Task.Factory.StartNew( () => { int primeFactor1 = 0; int primeFactor2 = 0; bool result = Factorize(10298312, ref primeFactor1, ref primeFactor2); return new { Result = result, Factor1 = primeFactor1, Factor2 = primeFactor2 }; }); background.ContinueWith(task => Console.WriteLine("{0}/{1} [Succeeded {2}]", task.Result.Factor1, task.Result.Factor2, task.Result.Result)); This is much simpler to understand, in my opinion. Here, we’re explicitly asking to start a new task, then continue the task with a resulting task. In our case, our method used ref parameters (this was from the MSDN Sample), so there is a little bit of extra boiler plate involved, but the code is at least easy to understand. That being said, this isn’t dramatically shorter when compared with our C# 3 port of the MSDN code above. However, if we were to extend our requirements a bit, we can start to see more advantages to the Task based approach. For example, supposed we need to report the results in a user interface control instead of reporting it to the Console. This would be a common operation, but now, we have to think about marshaling our calls back to the user interface. This is probably going to require calling Control.Invoke or Dispatcher.Invoke within our callback, forcing us to specify a delegate within the delegate. The maintainability and ease of understanding drops. However, just as a standard Task can be created with a TaskScheduler that uses the UI synchronization context, so too can we continue a task with a specific context. There are Task.ContinueWith method overloads which allow you to provide a TaskScheduler. This means you can schedule the continuation to run on the UI thread, by simply doing: Task.Factory.StartNew( () => { int primeFactor1 = 0; int primeFactor2 = 0; bool result = Factorize(10298312, ref primeFactor1, ref primeFactor2); return new { Result = result, Factor1 = primeFactor1, Factor2 = primeFactor2 }; }).ContinueWith(task => textBox1.Text = string.Format("{0}/{1} [Succeeded {2}]", task.Result.Factor1, task.Result.Factor2, task.Result.Result), TaskScheduler.FromCurrentSynchronizationContext()); This is far more understandable than the alternative. By using Task.ContinueWith in conjunction with TaskScheduler.FromCurrentSynchronizationContext(), we get a simple way to push any work onto a background thread, and update the user interface on the proper UI thread. This technique works with Windows Presentation Foundation as well as Windows Forms, with no change in methodology.

Read the article
2010 Visual C# MVP Award

- by Reed

I received a pleasant surprise today. I was presented this morning with the 2010 Microsoft® MVP Award for Visual C#. According to the award email, this “award is given to exceptional technical community leaders who actively share their high quality, real world expertise with others.” I feel honored and proud to receive this award, and hope that I can continue to be a valuable member of the community in the future. Thank you to everyone who nominated me!

Read the article
Reflections on GiveCamp

- by Reed

I participated in the Seattle GiveCamp over the weekend, and am entirely impressed. GiveCamp is a great event – I especially like how rewarding it is for everybody involved. I strongly encourage any and all developers to watch for future GiveCamp events, and consider participating, for many reasons… GiveCamp provides real value to organizations that truly need help. The Seattle event alone succeeded in helping sixteen non-profit organizations in many different ways. The projects involved varied dramatically, including website redesigns, SEO, reworking data management workflows, and even game development. Many non-profits have a strong need for good, quality technical help. However, nearly every non-profit organization has an incredibly limited budget. GiveCamp is a way to really give back, and provide incredibly valuable help to organizations that truly benefit. My experience has shown many developers to be incredibly generous – this is a chance to dedicate your energy to helping others in a way that really takes advantage of your expertise. Your time as a developer is incredibly valuable, and this puts something of incredible value directly into the hands of places its needed. First, and foremost, GiveCamp is about providing technical help to non-profit organizations in need. GiveCamp can make you a better developer. This is a fantastic opportunity for us, as developers, to work with new people, in a new setting. The incredibly short time frame (one weekend for a deliverable project) and intense motivation to succeed provides a huge opportunity for learning from peers. I’d personally like to thank off the developers with whom I worked – I learned something from each and every one of you. I hope to see and work with all of you again someday. GiveCamp provides an opportunity for you to work outside of your comfort zone. While it’s always nice to be an expert, it’s also valuable to work on a project where you have little or no direct experience. My team focused on a complete reworking of our organizations message and a complete new website redesign and deployment using WordPress. While I’d used WordPress for my blog, and had some experience, this is completely unrelated to my professional work. In fact, nobody on our team normally worked directly with the technologies involved – yet together we managed to succeed in delivering our goals. As developers, it’s easy to want to stay abreast of new technology surrounding our expertise, but its rare that we get a chance to sit down and work on something practical that is completely outside of our normal realm of work. I’m a desktop developer by trade, and spent much of the weekend working with CSS and Photoshop. Many of the projects organizations need don’t match perfectly with the skill set in the room – yet all of the software professionals rose to the occasion and delivered practical, usable applications. GiveCamp is a short term, known commitment. While this seems obvious, I think it’s an important aspect to remember. This is a huge part of what makes it successful – you can work, completely focused, on a project, then walk away completely when you’re done. There is no expectation of continued involvement. While many of the professionals I’ve talked to are willing to contribute some amount of their time beyond the camp, this is not expected. The freedom this provides is immense. In addition, the motivation this brings is incredibly valuable. Every developer in the room was very focused on delivering in time – you have one shot to get it as good as possible, and leave it with the organization in a way that can be maintained by them. This is a rare experience – and excellent practice at time management for everyone involved. GiveCamp provides a great way to meet and network with your peers. Not only do you get to network with other software professionals in your area – you get to network with amazing people. Every single person in the room is there to try to help people. The balance of altruism, intelligence, and expertise in the room is something I’ve never before experienced. During the presentations of what was accomplished, I felt blessed to participate. I know many people in the room were incredibly touched by the level of dedication and accomplishment over the weekend. GiveCamp is fun. At the end of the experience, I would have signed up again, even if it was a painful, tedious weekend – merely due to the amazing accomplishments achieved throughout the event. However, the event is fun. Everybody I talked to, the entire weekend, was having a good time. While there were many faces focused into a near grimace at times (including mine, I’ll admit), this was always in response to a particularly challenging problem or task. The challenges just added to the overall enjoyment of the weekend – part of why I became a developer in the first place is my love for challenge and puzzles, and a short deadline using unfamiliar technology provided plenty of opportunity for puzzles. As soon as people would stand up, it was another smile. If you’re a developer, I’d recommend looking at GiveCamp more closely. Watch for an event in your area. If there isn’t one, consider building a team and organizing an event. The experience is worth the commitment.

Read the article
C# Performance Pitfall – Interop Scenarios Change the Rules

- by Reed

C# and .NET, overall, really do have fantastic performance in my opinion. That being said, the performance characteristics dramatically differ from native programming, and take some relearning if you’re used to doing performance optimization in most other languages, especially C, C++, and similar. However, there are times when revisiting tricks learned in native code play a critical role in performance optimization in C#. I recently ran across a nasty scenario that illustrated to me how dangerous following any fixed rules for optimization can be… The rules in C# when optimizing code are very different than C or C++. Often, they’re exactly backwards. For example, in C and C++, lifting a variable out of loops in order to avoid memory allocations often can have huge advantages. If some function within a call graph is allocating memory dynamically, and that gets called in a loop, it can dramatically slow down a routine. This can be a tricky bottleneck to track down, even with a profiler. Looking at the memory allocation graph is usually the key for spotting this routine, as it’s often “hidden” deep in call graph. For example, while optimizing some of my scientific routines, I ran into a situation where I had a loop similar to: for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i]); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This loop was at a fairly high level in the call graph, and often could take many hours to complete, depending on the input data. As such, any performance optimization we could achieve would be greatly appreciated by our users. After a fair bit of profiling, I noticed that a couple of function calls down the call graph (inside of ProcessElement), there was some code that effectively was doing: // Allocate some data required DataStructure* data = new DataStructure(num); // Call into a subroutine that passed around and manipulated this data highly CallSubroutine(data); // Read and use some values from here double values = data->Foo; // Cleanup delete data; // ... return bar; Normally, if “DataStructure” was a simple data type, I could just allocate it on the stack. However, it’s constructor, internally, allocated it’s own memory using new, so this wouldn’t eliminate the problem. In this case, however, I could change the call signatures to allow the pointer to the data structure to be passed into ProcessElement and through the call graph, allowing the inner routine to reuse the same “data” memory instead of allocating. At the highest level, my code effectively changed to something like: DataStructure* data = new DataStructure(numberToProcess); for (i=0; i<numberToProcess; ++i) { // Do some work ProcessElement(element[i], data); } delete data; Granted, this dramatically reduced the maintainability of the code, so it wasn’t something I wanted to do unless there was a significant benefit. In this case, after profiling the new version, I found that it increased the overall performance dramatically – my main test case went from 35 minutes runtime down to 21 minutes. This was such a significant improvement, I felt it was worth the reduction in maintainability. In C and C++, it’s generally a good idea (for performance) to: Reduce the number of memory allocations as much as possible, Use fewer, larger memory allocations instead of many smaller ones, and Allocate as high up the call stack as possible, and reuse memory I’ve seen many people try to make similar optimizations in C# code. For good or bad, this is typically not a good idea. The garbage collector in .NET completely changes the rules here. In C#, reallocating memory in a loop is not always a bad idea. In this scenario, for example, I may have been much better off leaving the original code alone. The reason for this is the garbage collector. The GC in .NET is incredibly effective, and leaving the allocation deep inside the call stack has some huge advantages. First and foremost, it tends to make the code more maintainable – passing around object references tends to couple the methods together more than necessary, and overall increase the complexity of the code. This is something that should be avoided unless there is a significant reason. Second, (unlike C and C++) memory allocation of a single object in C# is normally cheap and fast. Finally, and most critically, there is a large advantage to having short lived objects. If you lift a variable out of the loop and reuse the memory, its much more likely that object will get promoted to Gen1 (or worse, Gen2). This can cause expensive compaction operations to be required, and also lead to (at least temporary) memory fragmentation as well as more costly collections later. As such, I’ve found that it’s often (though not always) faster to leave memory allocations where you’d naturally place them – deep inside of the call graph, inside of the loops. This causes the objects to stay very short lived, which in turn increases the efficiency of the garbage collector, and can dramatically improve the overall performance of the routine as a whole. In C#, I tend to: Keep variable declarations in the tightest scope possible Declare and allocate objects at usage While this tends to cause some of the same goals (reducing unnecessary allocations, etc), the goal here is a bit different – it’s about keeping the objects rooted for as little time as possible in order to (attempt) to keep them completely in Gen0, or worst case, Gen1. It also has the huge advantage of keeping the code very maintainable – objects are used and “released” as soon as possible, which keeps the code very clean. It does, however, often have the side effect of causing more allocations to occur, but keeping the objects rooted for a much shorter time. Now – nowhere here am I suggesting that these rules are hard, fast rules that are always true. That being said, my time spent optimizing over the years encourages me to naturally write code that follows the above guidelines, then profile and adjust as necessary. In my current project, however, I ran across one of those nasty little pitfalls that’s something to keep in mind – interop changes the rules. In this case, I was dealing with an API that, internally, used some COM objects. In this case, these COM objects were leading to native allocations (most likely C++) occurring in a loop deep in my call graph. Even though I was writing nice, clean managed code, the normal managed code rules for performance no longer apply. After profiling to find the bottleneck in my code, I realized that my inner loop, a innocuous looking block of C# code, was effectively causing a set of native memory allocations in every iteration. This required going back to a “native programming” mindset for optimization. Lifting these variables and reusing them took a 1:10 routine down to 0:20 – again, a very worthwhile improvement. Overall, the lessons here are: Always profile if you suspect a performance problem – don’t assume any rule is correct, or any code is efficient just because it looks like it should be Remember to check memory allocations when profiling, not just CPU cycles Interop scenarios often cause managed code to act very differently than “normal” managed code. Native code can be hidden very cleverly inside of managed wrappers

Read the article
Setting useLegacyV2RuntimeActivationPolicy At Runtime

- by Reed

Version 4.0 of the .NET Framework included a new CLR which is almost entirely backwards compatible with the 2.0 version of the CLR. However, by default, mixed-mode assemblies targeting .NET 3.5sp1 and earlier will fail to load in a .NET 4 application. Fixing this requires setting useLegacyV2RuntimeActivationPolicy in your app.Config for the application. While there are many good reasons for this decision, there are times when this is extremely frustrating, especially when writing a library. As such, there are (rare) times when it would be beneficial to set this in code, at runtime, as well as verify that it’s running correctly prior to receiving a FileLoadException. Typically, loading a pre-.NET 4 mixed mode assembly is handled simply by changing your app.Config file, and including the relevant attribute in the startup element: <?xml version="1.0" encoding="utf-8" ?> <configuration> <startup useLegacyV2RuntimeActivationPolicy="true"> <supportedRuntime version="v4.0"/> </startup> </configuration> .csharpcode { background-color: #ffffff; font-family: consolas, "Courier New", courier, monospace; color: black; font-size: small } .csharpcode pre { background-color: #ffffff; font-family: consolas, "Courier New", courier, monospace; color: black; font-size: small } .csharpcode pre { margin: 0em } .csharpcode .rem { color: #008000 } .csharpcode .kwrd { color: #0000ff } .csharpcode .str { color: #006080 } .csharpcode .op { color: #0000c0 } .csharpcode .preproc { color: #cc6633 } .csharpcode .asp { background-color: #ffff00 } .csharpcode .html { color: #800000 } .csharpcode .attr { color: #ff0000 } .csharpcode .alt { background-color: #f4f4f4; margin: 0em; width: 100% } .csharpcode .lnum { color: #606060 } This causes your application to run correctly, and load the older, mixed-mode assembly without issues. For full details on what’s happening here and why, I recommend reading Mark Miller’s detailed explanation of this attribute and the reasoning behind it. Before I show any code, let me say: I strongly recommend using the official approach of using app.config to set this policy. That being said, there are (rare) times when, for one reason or another, changing the application configuration file is less than ideal. While this is the supported approach to handling this issue, the CLR Hosting API includes a means of setting this programmatically via the ICLRRuntimeInfo interface. Normally, this is used if you’re hosting the CLR in a native application in order to set this, at runtime, prior to loading the assemblies. However, the F# Samples include a nice trick showing how to load this API and bind this policy, at runtime. This was required in order to host the Managed DirectX API, which is built against an older version of the CLR. This is fairly easy to port to C#. Instead of a direct port, I also added a little addition – by trapping the COM exception received if unable to bind (which will occur if the 2.0 CLR is already bound), I also allow a runtime check of whether this property was setup properly: public static class RuntimePolicyHelper { public static bool LegacyV2RuntimeEnabledSuccessfully { get; private set; } static RuntimePolicyHelper() { ICLRRuntimeInfo clrRuntimeInfo = (ICLRRuntimeInfo)RuntimeEnvironment.GetRuntimeInterfaceAsObject( Guid.Empty, typeof(ICLRRuntimeInfo).GUID); try { clrRuntimeInfo.BindAsLegacyV2Runtime(); LegacyV2RuntimeEnabledSuccessfully = true; } catch (COMException) { // This occurs with an HRESULT meaning // "A different runtime was already bound to the legacy CLR version 2 activation policy." LegacyV2RuntimeEnabledSuccessfully = false; } } [ComImport] [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)] [Guid("BD39D1D2-BA2F-486A-89B0-B4B0CB466891")] private interface ICLRRuntimeInfo { void xGetVersionString(); void xGetRuntimeDirectory(); void xIsLoaded(); void xIsLoadable(); void xLoadErrorString(); void xLoadLibrary(); void xGetProcAddress(); void xGetInterface(); void xSetDefaultStartupFlags(); void xGetDefaultStartupFlags(); [MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)] void BindAsLegacyV2Runtime(); } } Using this, it’s possible to not only set this at runtime, but also verify, prior to loading your mixed mode assembly, whether this will succeed. In my case, this was quite useful – I am working on a library purely for internal use which uses a numerical package that is supplied with both a completely managed as well as a native solver. The native solver uses a CLR 2 mixed-mode assembly, but is dramatically faster than the pure managed approach. By checking RuntimePolicyHelper.LegacyV2RuntimeEnabledSuccessfully at runtime, I can decide whether to enable the native solver, and only do so if I successfully bound this policy. There are some tricks required here – To enable this sort of fallback behavior, you must make these checks in a type that doesn’t cause the mixed mode assembly to be loaded. In my case, this forced me to encapsulate the library I was using entirely in a separate class, perform the check, then pass through the required calls to that class. Otherwise, the library will load before the hosting process gets enabled, which in turn will fail. This code will also, of course, try to enable the runtime policy before the first time you use this class – which typically means just before the first time you check the boolean value. As a result, checking this early on in the application is more likely to allow it to work. Finally, if you’re using a library, this has to be called prior to the 2.0 CLR loading. This will cause it to fail if you try to use it to enable this policy in a plugin for most third party applications that don’t have their app.config setup properly, as they will likely have already loaded the 2.0 runtime. As an example, take a simple audio player. The code below shows how this can be used to properly, at runtime, only use the “native” API if this will succeed, and fallback (or raise a nicer exception) if this will fail: public class AudioPlayer { private IAudioEngine audioEngine; public AudioPlayer() { if (RuntimePolicyHelper.LegacyV2RuntimeEnabledSuccessfully) { // This will load a CLR 2 mixed mode assembly this.audioEngine = new AudioEngineNative(); } else { this.audioEngine = new AudioEngineManaged(); } } public void Play(string filename) { this.audioEngine.Play(filename); } } Now – the warning: This approach works, but I would be very hesitant to use it in public facing production code, especially for anything other than initializing your own application. While this should work in a library, using it has a very nasty side effect: you change the runtime policy of the executing application in a way that is very hidden and non-obvious.

Read the article
C# 5 Async, Part 2: Asynchrony Today

- by Reed

The .NET Framework has always supported asynchronous operations. However, different mechanisms for supporting exist throughout the framework. While there are at least three separate asynchronous patterns used through the framework, only the latest is directly usable with the new Visual Studio Async CTP. Before delving into details on the new features, I will talk about existing asynchronous code, and demonstrate how to adapt it for use with the new pattern. The first asynchronous pattern used in the .NET framework was the Asynchronous Programming Model (APM). This pattern was based around callbacks. A method is used to start the operation. It typically is named as BeginSomeOperation. This method is passed a callback defined as an AsyncCallback, and returns an object that implements IAsyncResult. Later, the IAsyncResult is used in a call to a method named EndSomeOperation, which blocks until completion and returns the value normally directly returned from the synchronous version of the operation. Often, the EndSomeOperation call would be called from the callback function passed, which allows you to write code that never blocks. While this pattern works perfectly to prevent blocking, it can make quite confusing code, and be difficult to implement. For example, the sample code provided for FileStream’s BeginRead/EndRead methods is not simple to understand. In addition, implementing your own asynchronous methods requires creating an entire class just to implement the IAsyncResult. Given the complexity of the APM, other options have been introduced in later versions of the framework. The next major pattern introduced was the Event-based Asynchronous Pattern (EAP). This provides a simpler pattern for asynchronous operations. It works by providing a method typically named SomeOperationAsync, which signals its completion via an event typically named SomeOperationCompleted. The EAP provides a simpler model for asynchronous programming. It is much easier to understand and use, and far simpler to implement. Instead of requiring a custom class and callbacks, the standard event mechanism in C# is used directly. For example, the WebClient class uses this extensively. A method is used, such as DownloadDataAsync, and the results are returned via the DownloadDataCompleted event. While the EAP is far simpler to understand and use than the APM, it is still not ideal. By separating your code into method calls and event handlers, the logic of your program gets more complex. It also typically loses the ability to block until the result is received, which is often useful. Blocking often requires writing the code to block by hand, which is error prone and adds complexity. As a result, .NET 4 introduced a third major pattern for asynchronous programming. The Task<T> class introduced a new, simpler concept for asynchrony. Task and Task<T> effectively represent an operation that will complete at some point in the future. This is a perfect model for thinking about asynchronous code, and is the preferred model for all new code going forward. Task and Task<T> provide all of the advantages of both the APM and the EAP models – you have the ability to block on results (via Task.Wait() or Task<T>.Result), and you can stay completely asynchronous via the use of Task Continuations. In addition, the Task class provides a new model for task composition and error and cancelation handling. This is a far superior option to the previous asynchronous patterns. The Visual Studio Async CTP extends the Task based asynchronous model, allowing it to be used in a much simpler manner. However, it requires the use of Task and Task<T> for all operations.

Read the article
Attached Property port of my Window Close Behavior

- by Reed

Nishant Sivakumar just posted a nice article on The Code Project. It is a port of the MVVM-friendly Blend Behavior I wrote about in a previous article to WPF using Attached Properties. While similar to the WindowCloseBehavior code I posted on the Expression Code Gallery, Nishant Sivakumar’s version works in WPF without taking a dependency on the Expression Blend SDK. I highly recommend reading this article: Handling a Window’s Closed and Closing Events in the View-Model. It is a very nice alternative approach to this common problem in MVVM.

Read the article
Async CTP Refresh for Visual Studio 2010 SP1 Released

- by Reed

The Visual Studio team today released an update to the Visual Studio Async CTP which allows it to be used with Visual Studio SP1. This new CTP includes some very nice new additions over the previous CTP. The main highlights of this release include: Compatibility with Visual Studio SP1 APIs for Windows Phone 7 Compatibility with non-English installations Compatibility with Visual Studio Express Edition More efficient Async methods due to a change in the API Numerous bug fixes New EULA which allows distribution in production environments Anybody using the Async CTP should consider upgrading to the new version immediately. For details, visit the Visual Studio Asynchronous Programming page on MSDN.

Read the article
ConcurrentDictionary<TKey,TValue> used with Lazy<T>

- by Reed

In a recent thread on the MSDN forum for the TPL, Stephen Toub suggested mixing ConcurrentDictionary<T,U> with Lazy<T>. This provides a fantastic model for creating a thread safe dictionary of values where the construction of the value type is expensive. This is an incredibly useful pattern for many operations, such as value caches. The ConcurrentDictionary<TKey, TValue> class was added in .NET 4, and provides a thread-safe, lock free collection of key value pairs. While this is a fantastic replacement for Dictionary<TKey, TValue>, it has a potential flaw when used with values where construction of the value class is expensive. The typical way this is used is to call a method such as GetOrAdd to fetch or add a value to the dictionary. It handles all of the thread safety for you, but as a result, if two threads call this simultaneously, two instances of TValue can easily be constructed. If TValue is very expensive to construct, or worse, has side effects if constructed too often, this is less than desirable. While you can easily work around this with locking, Stephen Toub provided a very clever alternative – using Lazy<TValue> as the value in the dictionary instead. This looks like the following. Instead of calling: MyValue value = dictionary.GetOrAdd( key, () => new MyValue(key)); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } We would instead use a ConcurrentDictionary<TKey, Lazy<TValue>>, and write: MyValue value = dictionary.GetOrAdd( key, () => new Lazy<MyValue>( () => new MyValue(key))) .Value; This simple change dramatically changes how the operation works. Now, if two threads call this simultaneously, instead of constructing two MyValue instances, we construct two Lazy<MyValue> instances. However, the Lazy<T> class is very cheap to construct. Unlike “MyValue”, we can safely afford to construct this twice and “throw away” one of the instances. We then call Lazy<T>.Value at the end to fetch our “MyValue” instance. At this point, GetOrAdd will always return the same instance of Lazy<MyValue>. Since Lazy<T> doesn’t construct the MyValue instance until requested, the actual MyClass instance returned is only constructed once.

Read the article

Search Results

Search found 983 results on 40 pages for 'marc alexander reed'.

Page 2/40 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

- by Reed

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >