Search Results

Search found 1775 results on 71 pages for 'parallel foreeach'.

Page 4/71

  • Dump debugging with Parallel Stacks

    One of the areas where we fixed many bugs for Beta 2 in our parallel debugging windows is managed dump debugging. So it was really cool to see Tess use the Parallel Stacks window in that scenario in her video demo with Scott. Other than the neat ability to open managed dumps in VS2010, Parallel Stacks was the only debugging feature she needed for diagnosing the issue! Check out the video; it's definitely worth 10 minutes of your time. Comments about this post are welcome at the original blog.

    Read the article

  • How can I assign a name to a task in TPL

    - by mehrandvd
    I'm going to have lots of tasks running in my application, and each bunch of tasks runs for a different reason. I would like to name these tasks so that when I watch the Parallel Tasks window I can recognize them easily. From another point of view, consider that I'm using tasks at the framework level to populate a list. A developer who uses my framework is also using tasks for her own work. If she looks at the Parallel Tasks window she will find some tasks she has no idea about. I want to name the tasks so she can distinguish the framework tasks from her own. It would be very convenient if there were an API like:

        var task = new Task(action, "Growth calculation task");

    or maybe:

        var task = Task.Factory.StartNew(action, "Populating the datagrid");

    or even, while working with Parallel.ForEach:

        Parallel.ForEach(list, action, "Salary Calculation Task");

    Is it possible to name a task? Is it possible to give Parallel.ForEach a naming structure (maybe using a lambda) so it creates tasks with that naming? Is there such an API somewhere that I'm missing? I've also tried using an inherited task and overriding its ToString(), but unfortunately the Parallel Tasks window doesn't use ToString():

        class NamedTask : Task
        {
            private string TaskName { get; set; }

            public NamedTask(Action action, string taskName) : base(action)
            {
                TaskName = taskName;
            }

            public override string ToString()
            {
                return TaskName;
            }
        }
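
    A workaround sometimes suggested (not part of the original question) is to pass a descriptive string as the task's state object: the string travels with the task and can be read back from Task.AsyncState while debugging. A minimal sketch using only the built-in StartNew(Action<Object>, Object) overload:

        using System;
        using System.Threading.Tasks;

        class TaskNamingSketch
        {
            static void Main()
            {
                // The second argument becomes task.AsyncState.
                var task = Task.Factory.StartNew(
                    state => Console.WriteLine("running: {0}", state),
                    "Growth calculation task");

                task.Wait();
                Console.WriteLine(task.AsyncState); // "Growth calculation task"
            }
        }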

    Read the article

  • Parallel.For System.OutOfMemoryException

    - by Martin Neal
    We have a fairly simple program that's used for creating backups. I'm attempting to parallelize it, but I'm getting an OutOfMemoryException wrapped in an AggregateException. Some of the source folders are quite large, and the program doesn't crash until about 40 minutes after it starts. I don't know where to start looking, so the code below is a near-exact dump of all the code, sans directory structure and exception-logging code. Any advice as to where to start looking?

        using System;
        using System.Diagnostics;
        using System.IO;
        using System.Threading.Tasks;

        namespace SelfBackup
        {
            class Program
            {
                static readonly string[] saSrc = {
                    "\\src\\dir1\\",
                    //...
                    "\\src\\dirN\\", // this folder is over 6 GB
                };
                static readonly string[] saDest = {
                    "\\dest\\dir1\\",
                    //...
                    "\\dest\\dirN\\",
                };

                static void Main(string[] args)
                {
                    Parallel.For(0, saDest.Length, i =>
                    {
                        string sDest = saDest[i];
                        try
                        {
                            if (Directory.Exists(sDest))
                            {
                                // Delete directory first so old stuff gets cleaned up
                                Directory.Delete(sDest, true);
                            }
                            // recursive function
                            clsCopyDirectory.copyDirectory(saSrc[i], sDest);
                        }
                        catch (Exception e)
                        {
                            // standard error logging
                            CL.EmailError();
                        }
                    });
                }
            }
        }

        ///////////////////////////////////////

        using System.IO;
        using System.Threading.Tasks;

        namespace SelfBackup
        {
            static class clsCopyDirectory
            {
                static public void copyDirectory(string Src, string Dst)
                {
                    Directory.CreateDirectory(Dst);

                    /* Copy all the files in the folder. If and when .NET 4.0 is
                       installed, change Directory.GetFiles to Directory.EnumerateFiles
                       for slightly better performance. */
                    Parallel.ForEach<string>(Directory.GetFiles(Src), file =>
                    {
                        /* An exception thrown here may be arbitrarily deep into this
                           recursive function; there's also a good chance that if one copy
                           fails here, so too will other files in the same directory, so we
                           don't want to spam out hundreds of error e-mails, but we don't
                           want to abort altogether. Instead, the best solution is probably
                           to throw back up to the original caller of copyDirectory and
                           move on to the next Src/Dst pair by not catching any possible
                           exception here. */
                        File.Copy(file,                                      // src
                                  Path.Combine(Dst, Path.GetFileName(file)), // dest
                                  true);                                     // bool overwrite
                    });

                    // Call this function again for every directory in the folder.
                    Parallel.ForEach(Directory.GetDirectories(Src), dir =>
                    {
                        copyDirectory(dir, Path.Combine(Dst, Path.GetFileName(dir)));
                    });
                }
            }
        }
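
    One direction worth checking (an assumption on my part, not a confirmed diagnosis): nested Parallel.ForEach over I/O-bound copies keeps asking for more worker threads, and an unbounded fan-out of threads and in-flight copies is a common way for a long-running job like this to exhaust memory. A minimal sketch of throttling the loops with ParallelOptions.MaxDegreeOfParallelism (the cap of 4 is arbitrary):

        using System.IO;
        using System.Threading.Tasks;

        static class ThrottledCopy
        {
            // Arbitrary cap; tune for the disk and CPU count.
            static readonly ParallelOptions Options =
                new ParallelOptions { MaxDegreeOfParallelism = 4 };

            public static void CopyDirectory(string src, string dst)
            {
                Directory.CreateDirectory(dst);

                // Copy the files of one directory with bounded parallelism.
                Parallel.ForEach(Directory.GetFiles(src), Options, file =>
                    File.Copy(file, Path.Combine(dst, Path.GetFileName(file)), true));

                // Recurse sequentially; the file copies above already keep the disk busy.
                foreach (var dir in Directory.GetDirectories(src))
                    CopyDirectory(dir, Path.Combine(dst, Path.GetFileName(dir)));
            }
        }

    Whether fan-out is the actual culprit would still need to be confirmed with a memory profiler or a dump of the crashed process.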

    Read the article

  • NHibernate, the Parallel Framework, and SQL Server

    - by andy
    hey guys, we have a loop that:

    1. Loops over several thousand XML files. Altogether we're parsing millions of "user" nodes.
    2. In each iteration we parse a "user" XML fragment and do custom deserialization.
    3. Finally, in each iteration, we send our object to NHibernate for saving. We use: .SaveOrUpdateAndFlush(user);

    This is a lengthy process, and we thought it would be a perfect candidate for testing out the .NET 4.0 parallel libraries, so we wrapped the loop in a Parallel.ForEach(). After doing this, we started getting "random" timeout exceptions from SQL Server and finally, after leaving it running all night, unhandled OutOfMemory exceptions. I haven't done deep debugging on this yet, but what do you guys think? Is this simply a limitation of SQL Server, or could it be our NHibernate setup, or what? cheers andy
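
    A hedged sketch of one way to tame this (sessionFactory, userXmlFiles, and Deserialize are stand-ins for the poster's code, and the cap of 4 is arbitrary): bound the parallelism, give each worker its own ISession via the thread-local overload, and Clear() the session after each flush so the first-level cache doesn't accumulate millions of entities.

        // Uses System.Threading.Tasks and NHibernate.
        var options = new ParallelOptions { MaxDegreeOfParallelism = 4 }; // arbitrary cap

        Parallel.ForEach(
            userXmlFiles,                       // stand-in for the poster's file list
            options,
            () => sessionFactory.OpenSession(), // one ISession per worker
            (file, loopState, session) =>
            {
                var user = Deserialize(file);   // stand-in for the custom deserialization
                session.SaveOrUpdate(user);
                session.Flush();
                session.Clear();                // keep the first-level cache from growing
                return session;
            },
            session => session.Dispose());

    Capping the degree of parallelism also keeps the number of concurrent SQL connections well under the pool size, which is one plausible source of the "random" timeouts.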

    Read the article

  • Using Parallel Extensions In Web Applications

    - by Greg
    I'd like to hear some opinions as to what role, if any, parallel computing approaches, including the potential use of the parallel extensions (the June CTP, for example), have in web applications. What scenarios does this approach fit or not fit? My understanding of how exactly IIS and web browsers thread tasks is fairly limited, and I would appreciate some insight if someone out there has a good understanding. I'm more curious to know whether the way that IIS and web browsers work limits the ROI of creating threaded and/or asynchronous tasks in web applications in general. Thanks in advance.

    Read the article

  • Mock Assertions on objects inside Parallel ForEach's???

    - by jacko
    Any idea how we can assert that a mock object was called when it is being accessed inside Parallel.ForEach via a closure? I assume that because each invocation is on a different thread, Rhino Mocks loses track of the object? Pseudocode:

        var someStub = MockRepository.GenerateStub();
        Parallel.ForEach(collectionOfInts, anInt => someStub.DoSomething(anInt));
        someStub.AssertWasCalled(s => s.DoSomething, Repeat.Five.Times);

    This test returns an expectation violation: it expects the stub to be called 5 times, but it was actually called 0 times. Any ideas how we can tell the lambdas to keep track of the thread-local stub object?
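
    Not a Rhino Mocks answer, but a low-tech cross-check that takes the mocking framework out of the picture: a hand-rolled spy whose counter is updated with Interlocked is safe to call from every worker thread, and since Parallel.ForEach blocks until all iterations have finished, the assertion only runs after the calls are complete. A minimal sketch (CountingSpy and the 5-element range are my own stand-ins):

        using System;
        using System.Linq;
        using System.Threading;
        using System.Threading.Tasks;

        class CountingSpy
        {
            private int _calls;
            public int Calls { get { return _calls; } }
            public void DoSomething(int value) { Interlocked.Increment(ref _calls); }
        }

        class Demo
        {
            static void Main()
            {
                var spy = new CountingSpy();
                Parallel.ForEach(Enumerable.Range(1, 5), i => spy.DoSomething(i));
                Console.WriteLine(spy.Calls); // expected: 5
            }
        }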

    Read the article

  • Parallel.For maintain input list order on output list

    - by romeozor
    I'd like some input on keeping the order of a list during heavy-duty operations that I decided to try to do in parallel to see if it boosts performance. (It did!) I came up with a solution, but since this was my first attempt at anything parallel, I'd need someone to slap my hands if I did something very stupid.

    There's a query that returns a list of card owners, sorted by name, then by date of birth. This needs to be rendered in a table on a web page (ASP.NET WebForms). The original coder decided he would construct the table cell by cell (TableCell), add the cells to rows (TableRow), then add each row to the table. So no GridView; allegedly its performance is bad, but the performance was very poor regardless :). The database query returns in no time; most of the time is spent looping through the results and adding table cells, etc. I made the following method to maintain the original order of the list:

        private TableRow[] ComposeRows(List<CardHolder> queryResult)
        {
            int queryElementsCount = queryResult.Count();

            // array with the query's size
            var rowArray = new TableRow[queryElementsCount];

            Parallel.For(0, queryElementsCount, i =>
            {
                var row = new TableRow();
                var cell = new TableCell();

                // various operations, including simple ones such as:
                cell.Text = queryResult[i].Name;
                row.Cells.Add(cell);

                // here I'm adding the current item to its original index
                // to maintain order in the output list
                rowArray[i] = row;
            });

            return rowArray;
        }

    So as you can see, because I'm returning a very different type of data (List<CardHolder> -> TableRow[]), I can't simply omit the ordering from the original query and do it after the operations. I also thought it would be a good idea to Dispose() the objects at the end of each loop, because the query can return a huge list and letting cell and row objects pile up in the heap could impact performance.(?) How badly did I do? Does anyone have a better solution in case mine is flawed?
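
    For comparison (not from the original post), PLINQ can express the same order-preserving projection: AsOrdered() keeps the output in source order even though the Select bodies run in parallel. A sketch assuming the poster's CardHolder type and the System.Web.UI.WebControls table classes:

        // Requires System.Linq and System.Web.UI.WebControls.
        private TableRow[] ComposeRowsOrdered(List<CardHolder> queryResult)
        {
            return queryResult
                .AsParallel()
                .AsOrdered()   // preserve the source order in the output array
                .Select(holder =>
                {
                    var row = new TableRow();
                    row.Cells.Add(new TableCell { Text = holder.Name });
                    return row;
                })
                .ToArray();
        }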

    Read the article

  • Parallel.For(): Update variable outside of loop

    - by TSS
    I'm just looking into the new .NET 4.0 features. With that, I'm attempting a simple calculation using Parallel.For and a normal for(x;x;x) loop. However, I'm getting different results about 50% of the time.

        long sum = 0;
        Parallel.For(1, 10000, y => { sum += y; });
        Console.WriteLine(sum.ToString());

        sum = 0;
        for (int y = 1; y < 10000; y++)
        {
            sum += y;
        }
        Console.WriteLine(sum.ToString());

    My guess is that the threads are trying to update "sum" at the same time. Is there an obvious way around it?
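
    That guess is correct: sum += y is a read-modify-write on shared state, so parallel iterations step on each other. A minimal sketch of one common fix (Interlocked lives in System.Threading) is to make the update atomic:

        long sum = 0;
        // Interlocked.Add performs the read-modify-write atomically.
        Parallel.For(1, 10000, y => Interlocked.Add(ref sum, y));
        Console.WriteLine(sum); // 49995000, matching the sequential loop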

    Read the article

  • Parallel.For(): Different results for simple addition

    - by TSS
    I'm just looking into the new .NET 4.0 features. With that, I'm attempting a simple calculation using Parallel.For and a normal for(x;x;x) loop. However, I'm getting different results about 50% of the time (with no code change).

        long sum = 0;
        Parallel.For(1, 10000, y => { sum += y; });
        Console.WriteLine(sum.ToString());

        sum = 0;
        for (int y = 1; y < 10000; y++)
        {
            sum += y;
        }
        Console.WriteLine(sum.ToString());

    Surely I'm missing something simple, or I don't completely understand the concept. My guess is that the threads are trying to update "sum" at the same time. If so, is there an obvious way around it?
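
    Another pattern worth knowing (a sketch, not the only fix): the thread-local-state overload of Parallel.For gives each worker a private subtotal, so the shared variable is only touched once per worker, in the localFinally delegate (Interlocked is in System.Threading):

        long sum = 0;
        Parallel.For(1, 10000,
            () => 0L,                                  // per-worker subtotal
            (y, loopState, subtotal) => subtotal + y,  // no shared state touched here
            subtotal => Interlocked.Add(ref sum, subtotal));
        Console.WriteLine(sum); // 49995000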

    Read the article

  • Solving embarrassingly parallel problems using Python multiprocessing

    - by gotgenes
    How does one use multiprocessing to tackle embarrassingly parallel problems? Embarrassingly parallel problems typically consist of three basic parts:

    1. Read input data (from a file, database, TCP connection, etc.).
    2. Run calculations on the input data, where each calculation is independent of any other calculation.
    3. Write results of the calculations (to a file, database, TCP connection, etc.).

    We can parallelize the program in two dimensions:

    - Part 2 can run on multiple cores, since each calculation is independent; the order of processing doesn't matter.
    - Each part can run independently. Part 1 can place data on an input queue, part 2 can pull data off the input queue and put results onto an output queue, and part 3 can pull results off the output queue and write them out.

    This seems like the most basic pattern in concurrent programming, but I am still lost in trying to solve it, so let's write a canonical example to illustrate how this is done using multiprocessing. Here is the example problem: given a CSV file with rows of integers as input, compute their sums. Separate the problem into three parts, which can all run in parallel:

    1. Process the input file into raw data (lists/iterables of integers)
    2. Calculate the sums of the data, in parallel
    3. Output the sums

    Below is a traditional, single-process Python program which solves these three tasks:

        #!/usr/bin/env python
        # -*- coding: UTF-8 -*-
        # basicsums.py
        """A program that reads integer values from a CSV file and writes out
        their sums to another CSV file.
        """

        import csv
        import optparse
        import sys

        def make_cli_parser():
            """Make the command line interface parser."""
            usage = "\n\n".join(["python %prog INPUT_CSV OUTPUT_CSV",
                    __doc__,
                    """
        ARGUMENTS:
            INPUT_CSV: an input CSV file with rows of numbers
            OUTPUT_CSV: an output file that will contain the sums\
        """])
            cli_parser = optparse.OptionParser(usage)
            return cli_parser

        def parse_input_csv(csvfile):
            """Parses the input CSV and yields tuples with the index of the row
            as the first element, and the integers of the row as the second
            element. The index is zero-index based.

            :Parameters:
            - `csvfile`: a `csv.reader` instance
            """
            for i, row in enumerate(csvfile):
                row = [int(entry) for entry in row]
                yield i, row

        def sum_rows(rows):
            """Yields a tuple with the index of each input list of integers as
            the first element, and the sum of the list of integers as the
            second element. The index is zero-index based.

            :Parameters:
            - `rows`: an iterable of tuples, with the index of the original row
              as the first element, and a list of integers as the second element
            """
            for i, row in rows:
                yield i, sum(row)

        def write_results(csvfile, results):
            """Writes a series of results to an outfile, where the first column
            is the index of the original row of data, and the second column is
            the result of the calculation. The index is zero-index based.

            :Parameters:
            - `csvfile`: a `csv.writer` instance to which to write results
            - `results`: an iterable of tuples, with the index (zero-based) of
              the original row as the first element, and the calculated result
              from that row as the second element
            """
            for result_row in results:
                csvfile.writerow(result_row)

        def main(argv):
            cli_parser = make_cli_parser()
            opts, args = cli_parser.parse_args(argv)
            if len(args) != 2:
                cli_parser.error("Please provide an input file and output file.")
            infile = open(args[0])
            in_csvfile = csv.reader(infile)
            outfile = open(args[1], 'w')
            out_csvfile = csv.writer(outfile)
            # gets an iterable of rows that's not yet evaluated
            input_rows = parse_input_csv(in_csvfile)
            # sends the rows iterable to sum_rows() for results iterable, but
            # still not evaluated
            result_rows = sum_rows(input_rows)
            # finally evaluation takes place as a chain in write_results()
            write_results(out_csvfile, result_rows)
            infile.close()
            outfile.close()

        if __name__ == '__main__':
            main(sys.argv[1:])

    Let's take this program and rewrite it to use multiprocessing to parallelize the three parts outlined above. Below is a skeleton of this new, parallelized program, which needs to be fleshed out to address the parts in the comments:

        #!/usr/bin/env python
        # -*- coding: UTF-8 -*-
        # multiproc_sums.py
        """A program that reads integer values from a CSV file and writes out
        their sums to another CSV file, using multiple processes if desired.
        """

        import csv
        import multiprocessing
        import optparse
        import sys

        NUM_PROCS = multiprocessing.cpu_count()

        def make_cli_parser():
            """Make the command line interface parser."""
            usage = "\n\n".join(["python %prog INPUT_CSV OUTPUT_CSV",
                    __doc__,
                    """
        ARGUMENTS:
            INPUT_CSV: an input CSV file with rows of numbers
            OUTPUT_CSV: an output file that will contain the sums\
        """])
            cli_parser = optparse.OptionParser(usage)
            cli_parser.add_option('-n', '--numprocs', type='int',
                    default=NUM_PROCS,
                    help="Number of processes to launch [DEFAULT: %default]")
            return cli_parser

        def main(argv):
            cli_parser = make_cli_parser()
            opts, args = cli_parser.parse_args(argv)
            if len(args) != 2:
                cli_parser.error("Please provide an input file and output file.")
            infile = open(args[0])
            in_csvfile = csv.reader(infile)
            outfile = open(args[1], 'w')
            out_csvfile = csv.writer(outfile)

            # Parse the input file and add the parsed data to a queue for
            # processing, possibly chunking to decrease communication between
            # processes.

            # Process the parsed data as soon as any (chunks) appear on the
            # queue, using as many processes as allotted by the user
            # (opts.numprocs); place results on a queue for output.
            #
            # Terminate processes when the parser stops putting data in the
            # input queue.

            # Write the results to disk as soon as they appear on the output
            # queue.

            # Ensure all child processes have terminated.

            # Clean up files.
            infile.close()
            outfile.close()

        if __name__ == '__main__':
            main(sys.argv[1:])

    These pieces of code, as well as another piece of code that can generate example CSV files for testing purposes, can be found on github. I would appreciate any insight here as to how you concurrency gurus would approach this problem. Here are some questions I had when thinking about this problem; bonus points for addressing any or all of them:

    - Should I have child processes for reading in the data and placing it into the queue, or can the main process do this without blocking until all input is read?
    - Likewise, should I have a child process for writing the results out from the processed queue, or can the main process do this without having to wait for all the results?
    - Should I use a process pool for the sum operations? If yes, what method do I call on the pool to get it to start processing the results coming into the input queue, without blocking the input and output processes, too? apply_async()? map_async()? imap()? imap_unordered()?
    - Suppose we didn't need to siphon off the input and output queues as data entered them, but could wait until all input was parsed and all results were calculated (e.g., because we know all the input and output will fit in system memory). Should we change the algorithm in any way (e.g., not run any processes concurrently with I/O)?

    Read the article

  • Perl Parallel::ForkManager wait_all_children() takes excessively long time

    - by zhang18
    I have a script that uses Parallel::ForkManager. However, the wait_all_children() call takes an incredibly long time even after all child processes have completed. The way I know is by printing out some timestamps (see below). Does anyone have any idea what might be causing this (I have 16 CPU cores on my machine)?

        my $pm = Parallel::ForkManager->new(16);

        for my $i (1..16) {
            $pm->start($i) and next;
            # ... do something within the child process ...
            print scalar localtime, " Process $i completed.\n";
            $pm->finish();
        }

        print scalar localtime, " Waiting for some child process to finish.\n";
        $pm->wait_all_children();
        print scalar localtime, " All processes finished.\n";

    Clearly, I'll get the "Waiting for some child process to finish" message first, with a timestamp of, say, 7:08:35. Then I'll get a list of "Process $i completed" messages, with the last one at 7:10:30. However, I don't receive the "All processes finished" message until 7:16:33(!). Why is there a 6-minute delay between 7:10:30 and 7:16:33? Thx!

    Read the article

  • [UNIX] Sort lines of massive file by number of words on line (ideally in parallel)

    - by conradlee
    I am working on a community detection algorithm for analyzing social network data from Facebook. The first task, detecting all cliques in the graph, can be done efficiently in parallel, and leaves me with an output like this:

        17118 17136 17392
        17064 17093 17376
        17118 17136 17356 17318 12345
        17118 17136 17356 17283
        17007 17059 17116

    Each of these lines represents a unique clique (a collection of node ids), and I want to sort these lines in descending order by the number of ids per line. In the case of the example above, here's what the output should look like:

        17118 17136 17356 17318 12345
        17118 17136 17356 17283
        17118 17136 17392
        17064 17093 17376
        17007 17059 17116

    (Ties, i.e. lines with the same number of ids, can be sorted arbitrarily.) What is the most efficient way of sorting these lines? Keep the following points in mind:

    - The file I want to sort could be larger than the physical memory of the machine.
    - Most of the machines that I'm running this on have several processors, so a parallel solution would be ideal.
    - An ideal solution would just be a shell script (probably using sort), but I'm open to simple solutions in Python or Perl (or any language, as long as it makes the task simple).
    - This task is in some sense very easy; I'm not just looking for any old solution, but rather for a simple and above all efficient solution.

    Read the article
