Search Results

Search found 6559 results on 263 pages for 'parallel foreach'.

Page 33/263 | < Previous Page | 29 30 31 32 33 34 35 36 37 38 39 40  | Next Page >

  • how to cancel an awaiting task c#

    - by user1748906
    I am trying to cancel a task awaiting for network IO using CancellationTokenSource, but I have to wait until TcpClient connects: try { while (true) { token.Token.ThrowIfCancellationRequested(); Thread.Sleep(int.MaxValue); //simulating a TcpListener waiting for request } } any ideas ? Secondly, is it OK to start each client in a separate task ?

    Read the article

  • Parallelism in .NET – Part 5, Partitioning of Work

    - by Reed
    When parallelizing any routine, we start by decomposing the problem.  Once the problem is understood, we need to break our work into separate tasks, so each task can be run on a different processing element.  This process is called partitioning. Partitioning our tasks is a challenging feat.  There are opposing forces at work here: too many partitions adds overhead, too few partitions leaves processors idle.  Trying to work the perfect balance between the two extremes is the goal for which we should aim.  Luckily, the Task Parallel Library automatically handles much of this process.  However, there are situations where the default partitioning may not be appropriate, and knowledge of our routines may allow us to guide the framework to making better decisions. First off, I’d like to say that this is a more advanced topic.  It is perfectly acceptable to use the parallel constructs in the framework without considering the partitioning taking place.  The default behavior in the Task Parallel Library is very well-behaved, even for unusual work loads, and should rarely be adjusted.  I have found few situations where the default partitioning behavior in the TPL is not as good or better than my own hand-written partitioning routines, and recommend using the defaults unless there is a strong, measured, and profiled reason to avoid using them.  However, understanding partitioning, and how the TPL partitions your data, helps in understanding the proper usage of the TPL. I indirectly mentioned partitioning while discussing aggregation.  Typically, our systems will have a limited number of Processing Elements (PE), which is the terminology used for hardware capable of processing a stream of instructions.  For example, in a standard Intel i7 system, there are four processor cores, each of which has two potential hardware threads due to Hyperthreading.  This gives us a total of 8 PEs – theoretically, we can have up to eight operations occurring concurrently within our system. In order to fully exploit this power, we need to partition our work into Tasks.  A task is a simple set of instructions that can be run on a PE.  Ideally, we want to have at least one task per PE in the system, since fewer tasks means that some of our processing power will be sitting idle.  A naive implementation would be to just take our data, and partition it with one element in our collection being treated as one task.  When we loop through our collection in parallel, using this approach, we’d just process one item at a time, then reuse that thread to process the next, etc.  There’s a flaw in this approach, however.  It will tend to be slower than necessary, often slower than processing the data serially. The problem is that there is overhead associated with each task.  When we take a simple foreach loop body and implement it using the TPL, we add overhead.  First, we change the body from a simple statement to a delegate, which must be invoked.  In order to invoke the delegate on a separate thread, the delegate gets added to the ThreadPool’s current work queue, and the ThreadPool must pull this off the queue, assign it to a free thread, then execute it.  If our collection had one million elements, the overhead of trying to spawn one million tasks would destroy our performance. The answer, here, is to partition our collection into groups, and have each group of elements treated as a single task.  By adding a partitioning step, we can break our total work into small enough tasks to keep our processors busy, but large enough tasks to avoid overburdening the ThreadPool.  There are two clear, opposing goals here: Always try to keep each processor working, but also try to keep the individual partitions as large as possible. When using Parallel.For, the partitioning is always handled automatically.  At first, partitioning here seems simple.  A naive implementation would merely split the total element count up by the number of PEs in the system, and assign a chunk of data to each processor.  Many hand-written partitioning schemes work in this exactly manner.  This perfectly balanced, static partitioning scheme works very well if the amount of work is constant for each element.  However, this is rarely the case.  Often, the length of time required to process an element grows as we progress through the collection, especially if we’re doing numerical computations.  In this case, the first PEs will finish early, and sit idle waiting on the last chunks to finish.  Sometimes, work can decrease as we progress, since previous computations may be used to speed up later computations.  In this situation, the first chunks will be working far longer than the last chunks.  In order to balance the workload, many implementations create many small chunks, and reuse threads.  This adds overhead, but does provide better load balancing, which in turn improves performance. The Task Parallel Library handles this more elaborately.  Chunks are determined at runtime, and start small.  They grow slowly over time, getting larger and larger.  This tends to lead to a near optimum load balancing, even in odd cases such as increasing or decreasing workloads.  Parallel.ForEach is a bit more complicated, however. When working with a generic IEnumerable<T>, the number of items required for processing is not known in advance, and must be discovered at runtime.  In addition, since we don’t have direct access to each element, the scheduler must enumerate the collection to process it.  Since IEnumerable<T> is not thread safe, it must lock on elements as it enumerates, create temporary collections for each chunk to process, and schedule this out.  By default, it uses a partitioning method similar to the one described above.  We can see this directly by looking at the Visual Partitioning sample shipped by the Task Parallel Library team, and available as part of the Samples for Parallel Programming.  When we run the sample, with four cores and the default, Load Balancing partitioning scheme, we see this: The colored bands represent each processing core.  You can see that, when we started (at the top), we begin with very small bands of color.  As the routine progresses through the Parallel.ForEach, the chunks get larger and larger (seen by larger and larger stripes). Most of the time, this is fantastic behavior, and most likely will out perform any custom written partitioning.  However, if your routine is not scaling well, it may be due to a failure in the default partitioning to handle your specific case.  With prior knowledge about your work, it may be possible to partition data more meaningfully than the default Partitioner. There is the option to use an overload of Parallel.ForEach which takes a Partitioner<T> instance.  The Partitioner<T> class is an abstract class which allows for both static and dynamic partitioning.  By overriding Partitioner<T>.SupportsDynamicPartitions, you can specify whether a dynamic approach is available.  If not, your custom Partitioner<T> subclass would override GetPartitions(int), which returns a list of IEnumerator<T> instances.  These are then used by the Parallel class to split work up amongst processors.  When dynamic partitioning is available, GetDynamicPartitions() is used, which returns an IEnumerable<T> for each partition.  If you do decide to implement your own Partitioner<T>, keep in mind the goals and tradeoffs of different partitioning strategies, and design appropriately. The Samples for Parallel Programming project includes a ChunkPartitioner class in the ParallelExtensionsExtras project.  This provides example code for implementing your own, custom allocation strategies, including a static allocator of a given chunk size.  Although implementing your own Partitioner<T> is possible, as I mentioned above, this is rarely required or useful in practice.  The default behavior of the TPL is very good, often better than any hand written partitioning strategy.

    Read the article

  • How to show Label1.Text for each item in a foreach statement?

    - by Mike108
    I want to show each item Id that is doing now dynamically in a foreach statement. But the following code only shows the last item Id in Label1.Text. How to show Label1.Text for each item in a foreach statement? I use asp.net C#. protected void Button1_Click(object sender, EventArgs e) { List<int> list = new List<int>() { 1,2,3,4,5 }; foreach (var item in list) { Label1.Text = string.Format("I'm doing item {0} now.", item.ToString()); Thread.Sleep(1 * 1000); } }

    Read the article

  • How to do client callback for each item in a foreach statement using c#?

    - by Mike108
    I want to show each item Id that is doing now dynamically in a foreach statement. How to do some kind of client callback to show the item Id in a foreach statement? <asp:Label ID="Label1" runat="server" Text="Label"></asp:Label> <asp:Button ID="Button1" runat="server" onclick="Button1_Click" Text="Button" /> . protected void Button1_Click(object sender, EventArgs e) { List<int> list = new List<int>() { 1,2,3,4,5 }; foreach (var item in list) { Label1.Text = string.Format("I'm doing item {0} now.", item.ToString()); Page.RegisterStartupScript("", string.Format("<script>alert('doing item {0} now')</script>", item.ToString())); Thread.Sleep(1 * 1000); } }

    Read the article

  • How to get user inputs of trinidad componant which are placed in tr:foreach tag?

    - by Navnath
    Hi There, Can any one tell me how do I get user inputs from jsf component which are placed inside tr:foreach tag? I am trying to show multiple table where there are some fields which user can be input for it. I don't know how many table are going to display on page, because that decide at run time. So I put that table tag inside foreach tag. Now I want to reat each tables each record. But I am not able to do that because, there is no any binding attribute for foreach tag. Just sample code..... Thank You, Navnath Kumbhar.

    Read the article

  • Is there a way to control how pytest-xdist runs tests in parallel?

    - by superselector
    I have the following directory layout: runner.py lib/ tests/ testsuite1/ testsuite1.py testsuite2/ testsuite2.py testsuite3/ testsuite3.py testsuite4/ testsuite4.py The format of testsuite*.py modules is as follows: import pytest class testsomething: def setup_class(self): ''' do some setup ''' # Do some setup stuff here def teardown_class(self): '''' do some teardown''' # Do some teardown stuff here def test1(self): # Do some test1 related stuff def test2(self): # Do some test2 related stuff .... .... .... def test40(self): # Do some test40 related stuff if __name__=='__main()__' pytest.main(args=[os.path.abspath(__file__)]) The problem I have is that I would like to execute the 'testsuites' in parallel i.e. I want testsuite1, testsuite2, testsuite3 and testsuite4 to start execution in parallel but individual tests within the testsuites need to be executed serially. When I use the 'xdist' plugin from py.test and kick off the tests using 'py.test -n 4', py.test is gathering all the tests and randomly load balancing the tests among 4 workers. This leads to the 'setup_class' method to be executed every time of each test within a 'testsuitex.py' module (which defeats my purpose. I want setup_class to be executed only once per class and tests executed serially there after). Essentially what I want the execution to look like is: worker1: executes all tests in testsuite1.py serially worker2: executes all tests in testsuite2.py serially worker3: executes all tests in testsuite3.py serially worker4: executes all tests in testsuite4.py serially while worker1, worker2, worker3 and worker4 are all executed in parallel. Is there a way to achieve this in 'pytest-xidst' framework? The only option that I can think of is to kick off different processes to execute each test suite individually within runner.py: def test_execute_func(testsuite_path): subprocess.process('py.test %s' % testsuite_path) if __name__=='__main__': #Gather all the testsuite names for each testsuite: multiprocessing.Process(test_execute_func,(testsuite_path,))

    Read the article

  • How can I test if a point lies between two parallel lines?

    - by Harold
    In the game I'm designing there is a blast that shoots out from an origin point towards the direction of the mouse. The width of this blast is always going to be the same. Along the bottom of the screen (what's currently) squares move about which should be effected by the blast that the player controls. Currently I am trying to work out a way to discover if the corners of these squares are within the blast's two bounding lines. I thought the best way to do this would be to rotate the corners of the square around an origin point as if the blast were completely horizontal and see if the Y values of the corners were less than or equal to the width of the blast which would mean that they lie within the effected region, but I can't work out

    Read the article

  • How to get scripted programs governing game entities run in parallel with a game loop?

    - by Jim
    I recently discovered Crobot which is (briefly) a game where each player codes a virtual robot in a pseudo-C language. Each robot is then put in an arena where it fights against other robots. A robots' source code has this shape : /* Beginning file robot.r */ main() { while (1) { /* Do whatever you want */ ... move(); ... fire(); } } /* End file robot.r */ You can see that : The code is totally independent from any library/include Some predefined functions are available (move, fire, etc…) The program has its own game loop, and consequently is not called every frame My question is: How to achieve a similar result using scripted languages in collaboration with a C/C++ main program ? I found a possible approach using Python, multi-threading and shared memory, although I am not sure yet that it is possible this way. TCP/IP seems a bit too complicated for this kind of application.

    Read the article

  • How to prioritize tasks when you have multiple programming projects running in parallel?

    - by Vinko Vrsalovic
    Say you have 5 customers, you develop 2 or 3 different projects for each. Each project has Xi tasks. Each project takes from 2 to 10 man weeks. Given that there are few resources, it is desired to minimize the management overhead. Two questions in this scenario: What tools would you use to prioritize the tasks and track their completion, while tending to minimize the overhead? What criteria would you take into consideration to determine which task to assign to the next available resource given that the primary objective is to increase throughput (more projects finished per time unit, this objective conflicts with starting one project and finishing it and then moving on to the next)? Ideas, management techniques, algorithms are welcome

    Read the article

  • Why is lowlatency kernel not being updated in parallel with the generic kernel?

    - by FlabbergastedPickle
    All, Any idea when we'll see updates to the lowlatency version of the Ubuntu 12.04 kernel? It is still stuck at 3.2.0.23 whereas the generic kernel is already several updates ahead of it at a version 3.2.0.25? NB: I am using a 64-bit version but I don't think this is limited to the 64-bit kernels alone but rather affects both 32-bit and 64-bit builds. Please do correct me if I am wrong about this.

    Read the article

  • How to find optimal path visit every node with parallel workers complicated by dynamic edge costs?

    - by Aaron Anodide
    Say you have an acyclic directed graph with weighted edges and create N workers. My goal is to calculate the optimal way those workers can traverse the entire graph in parralel. However, edge costs may change along the way. Example: A -1-> B A -2-> C B -3-> C (if A has already been visited) B -5-> C (if A has not already been visited) Does what I describe lend itself to a standard algorithmic approach, or alternately can someone suggest if I'm looking at this in an inherently flawed way (i have an intuition I might be)?

    Read the article

  • Shows how to use the new Tasks namespace to download multiple documents in parallel.

    In C# 4.0, Task parallelism is the lowest-level approach to parallelization with PFX. The classes for working at this level are defined in the System.Threading.Tasks namespace.  read moreBy Peter BrombergDid you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Can multiple windows users connect to a Mac Mini OS X Server and run applications in parallel?

    - by ilight
    I want to validate the current situation :- I have multiple users who have to use designing applications like Adobe Photoshop, Illustrator etc and maybe some Mac specific applications like iWork and they need to be working on the applications in parallel. Can I setup a Mac Mini OS X Server and create separate user accounts and give to these users so that they can remote login to the OS X Server simultaneously from their Windows machines and use any application they want? In crux, can they share the server resources and applications from their windows machines?

    Read the article

  • Parallelism in .NET – Part 6, Declarative Data Parallelism

    - by Reed
    When working with a problem that can be decomposed by data, we have a collection, and some operation being performed upon the collection.  I’ve demonstrated how this can be parallelized using the Task Parallel Library and imperative programming using imperative data parallelism via the Parallel class.  While this provides a huge step forward in terms of power and capabilities, in many cases, special care must still be given for relative common scenarios. C# 3.0 and Visual Basic 9.0 introduced a new, declarative programming model to .NET via the LINQ Project.  When working with collections, we can now write software that describes what we want to occur without having to explicitly state how the program should accomplish the task.  By taking advantage of LINQ, many operations become much shorter, more elegant, and easier to understand and maintain.  Version 4.0 of the .NET framework extends this concept into the parallel computation space by introducing Parallel LINQ. Before we delve into PLINQ, let’s begin with a short discussion of LINQ.  LINQ, the extensions to the .NET Framework which implement language integrated query, set, and transform operations, is implemented in many flavors.  For our purposes, we are interested in LINQ to Objects.  When dealing with parallelizing a routine, we typically are dealing with in-memory data storage.  More data-access oriented LINQ variants, such as LINQ to SQL and LINQ to Entities in the Entity Framework fall outside of our concern, since the parallelism there is the concern of the data base engine processing the query itself. LINQ (LINQ to Objects in particular) works by implementing a series of extension methods, most of which work on IEnumerable<T>.  The language enhancements use these extension methods to create a very concise, readable alternative to using traditional foreach statement.  For example, let’s revisit our minimum aggregation routine we wrote in Part 4: double min = double.MaxValue; foreach(var item in collection) { double value = item.PerformComputation(); min = System.Math.Min(min, value); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Here, we’re doing a very simple computation, but writing this in an imperative style.  This can be loosely translated to English as: Create a very large number, and save it in min Loop through each item in the collection. For every item: Perform some computation, and save the result If the computation is less than min, set min to the computation Although this is fairly easy to follow, it’s quite a few lines of code, and it requires us to read through the code, step by step, line by line, in order to understand the intention of the developer. We can rework this same statement, using LINQ: double min = collection.Min(item => item.PerformComputation()); Here, we’re after the same information.  However, this is written using a declarative programming style.  When we see this code, we’d naturally translate this to English as: Save the Min value of collection, determined via calling item.PerformComputation() That’s it – instead of multiple logical steps, we have one single, declarative request.  This makes the developer’s intentions very clear, and very easy to follow.  The system is free to implement this using whatever method required. Parallel LINQ (PLINQ) extends LINQ to Objects to support parallel operations.  This is a perfect fit in many cases when you have a problem that can be decomposed by data.  To show this, let’s again refer to our minimum aggregation routine from Part 4, but this time, let’s review our final, parallelized version: // Safe, and fast! double min = double.MaxValue; // Make a "lock" object object syncObject = new object(); Parallel.ForEach( collection, // First, we provide a local state initialization delegate. () => double.MaxValue, // Next, we supply the body, which takes the original item, loop state, // and local state, and returns a new local state (item, loopState, localState) => { double value = item.PerformComputation(); return System.Math.Min(localState, value); }, // Finally, we provide an Action<TLocal>, to "merge" results together localState => { // This requires locking, but it's only once per used thread lock(syncObj) min = System.Math.Min(min, localState); } ); Here, we’re doing the same computation as above, but fully parallelized.  Describing this in English becomes quite a feat: Create a very large number, and save it in min Create a temporary object we can use for locking Call Parallel.ForEach, specifying three delegates For the first delegate: Initialize a local variable to hold the local state to a very large number For the second delegate: For each item in the collection, perform some computation, save the result If the result is less than our local state, save the result in local state For the final delegate: Take a lock on our temporary object to protect our min variable Save the min of our min and local state variables Although this solves our problem, and does it in a very efficient way, we’ve created a set of code that is quite a bit more difficult to understand and maintain. PLINQ provides us with a very nice alternative.  In order to use PLINQ, we need to learn one new extension method that works on IEnumerable<T> – ParallelEnumerable.AsParallel(). That’s all we need to learn in order to use PLINQ: one single method.  We can write our minimum aggregation in PLINQ very simply: double min = collection.AsParallel().Min(item => item.PerformComputation()); By simply adding “.AsParallel()” to our LINQ to Objects query, we converted this to using PLINQ and running this computation in parallel!  This can be loosely translated into English easily, as well: Process the collection in parallel Get the Minimum value, determined by calling PerformComputation on each item Here, our intention is very clear and easy to understand.  We just want to perform the same operation we did in serial, but run it “as parallel”.  PLINQ completely extends LINQ to Objects: the entire functionality of LINQ to Objects is available.  By simply adding a call to AsParallel(), we can specify that a collection should be processed in parallel.  This is simple, safe, and incredibly useful.

    Read the article

  • April 2010 Meeting of Israel Dot Net Developers User Group (IDNDUG)

    - by Jackie Goldstein
    Note the special date of this meeting - Thursday April 29, 2010 The April 2010 meeting of the Israel Dot Net Developers User Group will be held on Thursday April 29, 2010 .   This meeting will focus on parallel programming – in general and the support in VS 2010.  Our speaker will be Asaf Shelly, a recognized expert in parallel programming. Abstract : (1) Parallel Programming in Microsoft's Environments. The fundamentals of Windows have always been parallel. Starting with message queues...(read more)

    Read the article

  • Windows Azure Recipe: High Performance Computing

    - by Clint Edmonson
    One of the most attractive ways to use a cloud platform is for parallel processing. Commonly known as high-performance computing (HPC), this approach relies on executing code on many machines at the same time. On Windows Azure, this means running many role instances simultaneously, all working in parallel to solve some problem. Doing this requires some way to schedule applications, which means distributing their work across these instances. To allow this, Windows Azure provides the HPC Scheduler. This service can work with HPC applications built to use the industry-standard Message Passing Interface (MPI). Software that does finite element analysis, such as car crash simulations, is one example of this type of application, and there are many others. The HPC Scheduler can also be used with so-called embarrassingly parallel applications, such as Monte Carlo simulations. Whatever problem is addressed, the value this component provides is the same: It handles the complex problem of scheduling parallel computing work across many Windows Azure worker role instances. Drivers Elastic compute and storage resources Cost avoidance Solution Here’s a sketch of a solution using our Windows Azure HPC SDK: Ingredients Web Role – this hosts a HPC scheduler web portal to allow web based job submission and management. It also exposes an HTTP web service API to allow other tools (including Visual Studio) to post jobs as well. Worker Role – typically multiple worker roles are enlisted, including at least one head node that schedules jobs to be run among the remaining compute nodes. Database – stores state information about the job queue and resource configuration for the solution. Blobs, Tables, Queues, Caching (optional) – many parallel algorithms persist intermediate and/or permanent data as a result of their processing. These fast, highly reliable, parallelizable storage options are all available to all the jobs being processed. Training Here is a link to online Windows Azure training labs where you can learn more about the individual ingredients described above. (Note: The entire Windows Azure Training Kit can also be downloaded for offline use.) Windows Azure HPC Scheduler (3 labs)  The Windows Azure HPC Scheduler includes modules and features that enable you to launch and manage high-performance computing (HPC) applications and other parallel workloads within a Windows Azure service. The scheduler supports parallel computational tasks such as parametric sweeps, Message Passing Interface (MPI) processes, and service-oriented architecture (SOA) requests across your computing resources in Windows Azure. With the Windows Azure HPC Scheduler SDK, developers can create Windows Azure deployments that support scalable, compute-intensive, parallel applications. See my Windows Azure Resource Guide for more guidance on how to get started, including links web portals, training kits, samples, and blogs related to Windows Azure.

    Read the article

  • Parallel Programming. Boost's MPI, OpenMP, TBB, or something else?

    - by unknownthreat
    Hello, I am totally a novice in parallel programming, but I do know how to program C++. Now, I am looking around for parallel programming library. I just want to give it a try, just for fun, and right now, I found 3 APIs, but I am not sure which one should I stick with. Right now, I see Boost's MPI, OpenMP and TBB. For anyone who have experienced with any of these 3 API (or any other parallelism API), could you please tell me the difference between these? Are there any factor to consider, like AMD or Intel architecture?

    Read the article

  • What tool can record multiple parallel stream to files of defined size?

    - by Hauke
    I would like to record record multiple audio web streams like this one in parallel to an mp3 or wma file for a duration of several days. I would like to be able to limit the file size or the duration stored in each file. The tool can be for any operating system. I do not need anything fancy like song recognition, metadata or silence detection. I haven't been able to find such a piece of software so far. Example: Tap channel "News" results in: News-090902-0000-0100.mp3, News-090902-0100-0200.mp3, etc... Who knows what tool can do this? It can be commercial software. Link in fulltext: 88.84.145.116:8000/listen.pls

    Read the article

  • Load images in parallel - supported by browser or a feature to implement?

    - by Michael Mao
    Hi all: I am not a pro in web development and Apache server still remains a mystery to me. we've got a project which runs on LAMP, pretty much like all the commercial hosting plans. I am confused about one problem : does modern browsers support image loading in parallel? or this requires some special feature/config set up from server side? Can this be done with PHP coding or by some server-side configuration? Is a special content delivery networking needed for this? The benchmark demonstration will be the flickr website. I am too suprised to see how all image thumbnails are loaded in a short time after a search as if there were only one image to load. Sorry I cannot present any code to you... completed lost in this:(

    Read the article

  • Parallelism in .NET – Part 3, Imperative Data Parallelism: Early Termination

    - by Reed
    Although simple data parallelism allows us to easily parallelize many of our iteration statements, there are cases that it does not handle well.  In my previous discussion, I focused on data parallelism with no shared state, and where every element is being processed exactly the same. Unfortunately, there are many common cases where this does not happen.  If we are dealing with a loop that requires early termination, extra care is required when parallelizing. Often, while processing in a loop, once a certain condition is met, it is no longer necessary to continue processing.  This may be a matter of finding a specific element within the collection, or reaching some error case.  The important distinction here is that, it is often impossible to know until runtime, what set of elements needs to be processed. In my initial discussion of data parallelism, I mentioned that this technique is a candidate when you can decompose the problem based on the data involved, and you wish to apply a single operation concurrently on all of the elements of a collection.  This covers many of the potential cases, but sometimes, after processing some of the elements, we need to stop processing. As an example, lets go back to our previous Parallel.ForEach example with contacting a customer.  However, this time, we’ll change the requirements slightly.  In this case, we’ll add an extra condition – if the store is unable to email the customer, we will exit gracefully.  The thinking here, of course, is that if the store is currently unable to email, the next time this operation runs, it will handle the same situation, so we can just skip our processing entirely.  The original, serial case, with this extra condition, might look something like the following: foreach(var customer in customers) { // Run some process that takes some time... DateTime lastContact = theStore.GetLastContact(customer); TimeSpan timeSinceContact = DateTime.Now - lastContact; // If it's been more than two weeks, send an email, and update... if (timeSinceContact.Days > 14) { // Exit gracefully if we fail to email, since this // entire process can be repeated later without issue. if (theStore.EmailCustomer(customer) == false) break; customer.LastEmailContact = DateTime.Now; } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Here, we’re processing our loop, but at any point, if we fail to send our email successfully, we just abandon this process, and assume that it will get handled correctly the next time our routine is run.  If we try to parallelize this using Parallel.ForEach, as we did previously, we’ll run into an error almost immediately: the break statement we’re using is only valid when enclosed within an iteration statement, such as foreach.  When we switch to Parallel.ForEach, we’re no longer within an iteration statement – we’re a delegate running in a method. This needs to be handled slightly differently when parallelized.  Instead of using the break statement, we need to utilize a new class in the Task Parallel Library: ParallelLoopState.  The ParallelLoopState class is intended to allow concurrently running loop bodies a way to interact with each other, and provides us with a way to break out of a loop.  In order to use this, we will use a different overload of Parallel.ForEach which takes an IEnumerable<T> and an Action<T, ParallelLoopState> instead of an Action<T>.  Using this, we can parallelize the above operation by doing: Parallel.ForEach(customers, (customer, parallelLoopState) => { // Run some process that takes some time... DateTime lastContact = theStore.GetLastContact(customer); TimeSpan timeSinceContact = DateTime.Now - lastContact; // If it's been more than two weeks, send an email, and update... if (timeSinceContact.Days > 14) { // Exit gracefully if we fail to email, since this // entire process can be repeated later without issue. if (theStore.EmailCustomer(customer) == false) parallelLoopState.Break(); else customer.LastEmailContact = DateTime.Now; } }); There are a couple of important points here.  First, we didn’t actually instantiate the ParallelLoopState instance.  It was provided directly to us via the Parallel class.  All we needed to do was change our lambda expression to reflect that we want to use the loop state, and the Parallel class creates an instance for our use.  We also needed to change our logic slightly when we call Break().  Since Break() doesn’t stop the program flow within our block, we needed to add an else case to only set the property in customer when we succeeded.  This same technique can be used to break out of a Parallel.For loop. That being said, there is a huge difference between using ParallelLoopState to cause early termination and to use break in a standard iteration statement.  When dealing with a loop serially, break will immediately terminate the processing within the closest enclosing loop statement.  Calling ParallelLoopState.Break(), however, has a very different behavior. The issue is that, now, we’re no longer processing one element at a time.  If we break in one of our threads, there are other threads that will likely still be executing.  This leads to an important observation about termination of parallel code: Early termination in parallel routines is not immediate.  Code will continue to run after you request a termination. This may seem problematic at first, but it is something you just need to keep in mind while designing your routine.  ParallelLoopState.Break() should be thought of as a request.  We are telling the runtime that no elements that were in the collection past the element we’re currently processing need to be processed, and leaving it up to the runtime to decide how to handle this as gracefully as possible.  Although this may seem problematic at first, it is a good thing.  If the runtime tried to immediately stop processing, many of our elements would be partially processed.  It would be like putting a return statement in a random location throughout our loop body – which could have horrific consequences to our code’s maintainability. In order to understand and effectively write parallel routines, we, as developers, need a subtle, but profound shift in our thinking.  We can no longer think in terms of sequential processes, but rather need to think in terms of requests to the system that may be handled differently than we’d first expect.  This is more natural to developers who have dealt with asynchronous models previously, but is an important distinction when moving to concurrent programming models. As an example, I’ll discuss the Break() method.  ParallelLoopState.Break() functions in a way that may be unexpected at first.  When you call Break() from a loop body, the runtime will continue to process all elements of the collection that were found prior to the element that was being processed when the Break() method was called.  This is done to keep the behavior of the Break() method as close to the behavior of the break statement as possible. We can see the behavior in this simple code: var collection = Enumerable.Range(0, 20); var pResult = Parallel.ForEach(collection, (element, state) => { if (element > 10) { Console.WriteLine("Breaking on {0}", element); state.Break(); } Console.WriteLine(element); }); If we run this, we get a result that may seem unexpected at first: 0 2 1 5 6 3 4 10 Breaking on 11 11 Breaking on 12 12 9 Breaking on 13 13 7 8 Breaking on 15 15 What is occurring here is that we loop until we find the first element where the element is greater than 10.  In this case, this was found, the first time, when one of our threads reached element 11.  It requested that the loop stop by calling Break() at this point.  However, the loop continued processing until all of the elements less than 11 were completed, then terminated.  This means that it will guarantee that elements 9, 7, and 8 are completed before it stops processing.  You can see our other threads that were running each tried to break as well, but since Break() was called on the element with a value of 11, it decides which elements (0-10) must be processed. If this behavior is not desirable, there is another option.  Instead of calling ParallelLoopState.Break(), you can call ParallelLoopState.Stop().  The Stop() method requests that the runtime terminate as soon as possible , without guaranteeing that any other elements are processed.  Stop() will not stop the processing within an element, so elements already being processed will continue to be processed.  It will prevent new elements, even ones found earlier in the collection, from being processed.  Also, when Stop() is called, the ParallelLoopState’s IsStopped property will return true.  This lets longer running processes poll for this value, and return after performing any necessary cleanup. The basic rule of thumb for choosing between Break() and Stop() is the following. Use ParallelLoopState.Stop() when possible, since it terminates more quickly.  This is particularly useful in situations where you are searching for an element or a condition in the collection.  Once you’ve found it, you do not need to do any other processing, so Stop() is more appropriate. Use ParallelLoopState.Break() if you need to more closely match the behavior of the C# break statement. Both methods behave differently than our C# break statement.  Unfortunately, when parallelizing a routine, more thought and care needs to be put into every aspect of your routine than you may otherwise expect.  This is due to my second observation: Parallelizing a routine will almost always change its behavior. This sounds crazy at first, but it’s a concept that’s so simple its easy to forget.  We’re purposely telling the system to process more than one thing at the same time, which means that the sequence in which things get processed is no longer deterministic.  It is easy to change the behavior of your routine in very subtle ways by introducing parallelism.  Often, the changes are not avoidable, even if they don’t have any adverse side effects.  This leads to my final observation for this post: Parallelization is something that should be handled with care and forethought, added by design, and not just introduced casually.

    Read the article

< Previous Page | 29 30 31 32 33 34 35 36 37 38 39 40  | Next Page >