Search Results

Search found 1774 results on 71 pages for 'parallel'.

Page 6/71 | < Previous Page | 2 3 4 5 6 7 8 9 10 11 12 13 | Next Page >

Creating parallel selenium tests in C# and using Nunit as the runner application

- by damianmartin

I am writing a new test suite for the company to test a very complex ASP.NET application which is heavily AJAX driven. We have decided to use Selenium (Grid & Remote Control) and Nunit to run these tests. The actually tests are dynamically created at run time from a spreadsheet. Each Column in an excel spreadsheet relates to a new test and each row relates to a selenium command (but in plain English and the dll converts this into Selenium code). My problem i have at the moment is getting the tests running in parallel. There will be 1000+ tests so it is too time consuming to have 1 test run at a time. Selenium Grid and Selenium Remote Control(s) are setup correctly because I can run there demo. From what i have read I need to use Punit but i can not find any documentation on what a test in punit should look like. Nunit tests are [SetUp] [TearDown] [Test]. Can anyone point me in the right direction. Thanks in advance.

Read the article
How do i write tasks? (parallel code)

- by acidzombie24

I am impressed with intel thread building blocks. I like how i should write task and not thread code and i like how it works under the hood with my limited understanding (task are in a pool, there wont be 100 threads on 4cores, a task is not guaranteed to run because it isnt on its own thread and may be far into the pool. But it may be run with another related task so you cant do bad things like typical thread unsafe code). I wanted to know more about writing task. I like the 'Task-based Multithreading - How to Program for 100 cores' video here http://www.gdcvault.com/sponsor.php?sponsor_id=1 (currently second last link. WARNING it isnt 'great'). My fav part was 'solving the maze is better done in parallel' which is around the 48min mark (you can click the link on the left side. That part is really all you need to watch if any). However i like to see more code examples and some API of how to write task. Does anyone have a good resource? I have no idea how a class or pieces of code may look after pushing it onto a pool or how weird code may look when you need to make a copy of everything and how much of everything is pushed onto a pool.

Read the article
How do i write task? (parallel code)

- by acidzombie24

I am impressed with intel thread building blocks. I like how i should write task and not thread code and i like how it works under the hood with my limited understanding (task are in a pool, there wont be 100 threads on 4cores, a task is not guaranteed to run because it isnt on its own thread and may be far into the pool. But it may be run with another related task so you cant do bad things like typical thread unsafe code). I wanted to know more about writing task. I like the 'Task-based Multithreading - How to Program for 100 cores' video here http://www.gdcvault.com/sponsor.php?sponsor_id=1 (currently second last link. WARNING it isnt 'great'). My fav part was 'solving the maze is better done in parallel' which is around the 48min mark (you can click the link on the left side). However i like to see more code examples and some API of how to write task. Does anyone have a good resource? I have no idea how a class or pieces of code may look after pushing it onto a pool or how weird code may look when you need to make a copy of everything and how much of everything is pushed onto a pool.

Read the article
Parallel Tasking Concurrency with Dependencies on Python like GNU Make

- by Brian Bruggeman

I'm looking for a method or possibly a philosophical approach for how to do something like GNU Make within python. Currently, we utilize makefiles to execute processing because the makefiles are extremely good at parallel runs with changing single option: -j x. In addition, gnu make already has the dependency stacks built into it, so adding a secondary processor or the ability to process more threads just means updating that single option. I want that same power and flexibility in python, but I don't see it. As an example: all: dependency_a dependency_b dependency_c dependency_a: dependency_d stuff dependency_b: dependency_d stuff dependency_c: dependency_e stuff dependency_d: dependency_f stuff dependency_e: stuff dependency_f: stuff If we do a standard single thread operation (-j 1), the order of operation might be: dependency_f -> dependency_d -> dependency_a -> dependency_b -> dependency_e \ -> dependency_c For two threads (-j 2), we might see: 1: dependency_f -> dependency_d -> dependency_a -> dependency_b 2: dependency_e -> dependency_c Does anyone have any suggestions on either a package already built or an approach? I'm totally open, provided it's a pythonic solution/approach. Please and Thanks in advance!

Read the article
Links and code from session on Entity Framework 4, Parallel and C# 4.0 new features

- by Eric Nelson

Last week (12th May 2010) I did a session in the city on lot of the new .NET 4.0 Stuff. My demo code and links below. Code Parallel demos http://gist.github.com/364522 C# 4.0 new features http://gist.github.com/403826 EF4 Links Entity Framework 4 Resources http://bit.ly/ef4resources Entity Framework Team Blog http://blogs.msdn.com/adonet Entity Framework Design Blog http://blogs.msdn.com/efdesign/ Parallel Links Parallel Computing Dev Center http://msdn.com/concurrency Code samples http://code.msdn.microsoft.com/ParExtSamples Managed blog http://blogs.msdn.com/pfxteam Tools blog http://blogs.msdn.com/visualizeparallel C# 4.0 New features http://bit.ly/baq3aU New in .NET 4.0 Coevolution http://bit.ly/axglst New in C# 4.0 http://bit.ly/bG1U2Y

Read the article
How to Implement a Parallel Workflow

- by Paul

I'm trying to implement a parallel split task using a workflow system. I'm using .NET but my process is very simple and I don't want to use WF or anything heavy like that. I've tried using Stateless. So far is was easy to set up and run, but I may be using the wrong tool for the job because I'm not sure how you're supposed to model parallel split workflows, where you have multiple sub-tasks required before you can advance to the next state, but the steps don't require being performed in any particular order. I can easily use the dynamic configuration options to check my data model manually to see if the model is in the correct state (all sub-tasks completed) and can transition to the next state, but this seems to completely break the workflow paradigm. What is the proper, orthodox way to implement a parallel split process? Thanks

Read the article
how to deal with parallel programming

- by nkint

Hi. I know that parallel programming is a big resource in computer graphics, with moder machines, and mayebe a computing model that will be grow up in the near future (is this trend true?). I want to know what is the best way to deal with it. there is some practical general purpose usefulness in studying processor n-dimensional mesh, or bitonic sort in p-ram machines or it's only theory for domain specific hardware used in real particular signal elaborations of scientific simulations? Is this the best way to acquire the know how for how to become acquainted with cuda or opencl? (i'm interested in computer graphics applications) and why functional programming is so important to understand parallel computing? ps: as someone has advice me i have forked this discussion from http://stackoverflow.com/questions/4908677/how-to-deal-with-parallel-programming

Read the article
C# Monte Carlo Incremental Risk Calculation optimisation, random numbers, parallel execution

- by m3ntat

My current task is to optimise a Monte Carlo Simulation that calculates Capital Adequacy figures by region for a set of Obligors. It is running about 10 x too slow for where it will need to be in production and number or daily runs required. Additionally the granularity of the result figures will need to be improved down to desk possibly book level at some stage, the code I've been given is basically a prototype which is used by business units in a semi production capacity. The application is currently single threaded so I'll need to make it multi-threaded, may look at System.Threading.ThreadPool or the Microsoft Parallel Extensions library but I'm constrained to .NET 2 on the server at this bank so I may have to consider this guy's port, http://www.codeproject.com/KB/cs/aforge_parallel.aspx. I am trying my best to get them to upgrade to .NET 3.5 SP1 but it's a major exercise in an organisation of this size and might not be possible in my contract time frames. I've profiled the application using the trial of dotTrace (http://www.jetbrains.com/profiler). What other good profilers exist? Free ones? A lot of the execution time is spent generating uniform random numbers and then translating this to a normally distributed random number. They are using a C# Mersenne twister implementation. I am not sure where they got it or if it's the best way to go about this (or best implementation) to generate the uniform random numbers. Then this is translated to a normally distributed version for use in the calculation (I haven't delved into the translation code yet). Also what is the experience using the following? http://quantlib.org http://www.qlnet.org (C# port of quantlib) or http://www.boost.org Any alternatives you know of? I'm a C# developer so would prefer C#, but a wrapper to C++ shouldn't be a problem, should it? Maybe even faster leveraging the C++ implementations. I am thinking some of these libraries will have the fastest method to directly generate normally distributed random numbers, without the translation step. Also they may have some other functions that will be helpful in the subsequent calculations. Also the computer this is on is a quad core Opteron 275, 8 GB memory but Windows Server 2003 Enterprise 32 bit. Should I advise them to upgrade to a 64 bit OS? Any links to articles supporting this decision would really be appreciated. Anyway, any advice and help you may have is really appreciated.

Read the article
Parallel Task Library WaitAny Design

- by colithium

I've just begun to explore the PTL and have a design question. My Scenario: I have a list of URLs that each refer to an image. I want each image to be downloaded in parallel. As soon as at least one image is downloaded, I want to execute a method that does something with the downloaded image. That method should NOT be parallelized -- it should be serial. I think the following will work but I'm not sure if this is the right way to do it. Because I have separate classes for collecting the images and for doing "something" with the collected images, I end up passing around an array of Tasks which seems wrong since it exposes the inner workings of how images are retrieved. But I don't know a way around it. In reality there is more to both of these methods but that's not important for this. Just know that they really shouldn't be lumped into one large method that both retrieves and does something with the image. Task<Image>[] downloadTasks = collector.RetrieveImages(listOfURLs); for (int i = 0; i < listOfURLs.Count; i++) { //Wait for any of the remaining downloads to complete int completedIndex = Task<Image>.WaitAny(downloadTasks); Image completedImage = downloadTasks[completedIndex].Result; //Now do something with the image (this "something" must happen serially) } /////////////////////////////////////////////////// public Task<Image>[] RetrieveImages(List<string> urls) { Task<Image>[] tasks = new Task<Image>[urls.Count]; int index = 0; foreach (string url in urls) { string lambdaVar = url; //Required... Bleh tasks[index] = Task<Image>.Factory.StartNew(() => { using (WebClient client = new WebClient()) { //TODO: Replace with live image locations string fileName = String.Format("{0}.png", i); client.DownloadFile(lambdaVar, Path.Combine(Application.StartupPath, fileName)); } return Image.FromFile(Path.Combine(Application.StartupPath, fileName)); }, TaskCreationOptions.LongRunning | TaskCreationOptions.AttachedToParent); index++; } return tasks; }

Read the article
parallel computation for an Iterator of elements in Java

- by Brian Harris

I've had the same need a few times now and wanted to get other thoughts on the right way to structure a solution. The need is to perform some operation on many elements on many threads without needing to have all elements in memory at once, just the ones under computation. As in, Iterables.partition is insufficient because it brings all elements into memory up front. Expressing it in code, I want to write a BulkCalc2 that does the same thing as BulkCalc1, just in parallel. Below is sample code that illustrates my best attempt. I'm not satisfied because it's big and ugly, but it does seem to accomplish my goals of keeping threads highly utilized until the work is done, propagating any exceptions during computation, and not having more than numThreads instances of BigThing necessarily in memory at once. I'll accept the answer which meets the stated goals in the most concise way, whether it's a way to improve my BulkCalc2 or a completely different solution. interface BigThing { int getId(); String getString(); } class Calc { // somewhat expensive computation double calc(BigThing bigThing) { Random r = new Random(bigThing.getString().hashCode()); double d = 0; for (int i = 0; i < 100000; i++) { d += r.nextDouble(); } return d; } } class BulkCalc1 { final Calc calc; public BulkCalc1(Calc calc) { this.calc = calc; } public TreeMap<Integer, Double> calc(Iterator<BigThing> in) { TreeMap<Integer, Double> results = Maps.newTreeMap(); while (in.hasNext()) { BigThing o = in.next(); results.put(o.getId(), calc.calc(o)); } return results; } } class SafeIterator<T> { final Iterator<T> in; SafeIterator(Iterator<T> in) { this.in = in; } synchronized T nextOrNull() { if (in.hasNext()) { return in.next(); } return null; } } class BulkCalc2 { final Calc calc; final int numThreads; public BulkCalc2(Calc calc, int numThreads) { this.calc = calc; this.numThreads = numThreads; } public TreeMap<Integer, Double> calc(Iterator<BigThing> in) { ExecutorService e = Executors.newFixedThreadPool(numThreads); List<Future<?>> futures = Lists.newLinkedList(); final Map<Integer, Double> results = new MapMaker().concurrencyLevel(numThreads).makeMap(); final SafeIterator<BigThing> it = new SafeIterator<BigThing>(in); for (int i = 0; i < numThreads; i++) { futures.add(e.submit(new Runnable() { @Override public void run() { while (true) { BigThing o = it.nextOrNull(); if (o == null) { return; } results.put(o.getId(), calc.calc(o)); } } })); } e.shutdown(); for (Future<?> future : futures) { try { future.get(); } catch (InterruptedException ex) { // swallowing is OK } catch (ExecutionException ex) { throw Throwables.propagate(ex.getCause()); } } return new TreeMap<Integer, Double>(results); } }

Read the article
PowerShell Script to Deploy Multiple VM on Azure in Parallel #azure #powershell

- by Marco Russo (SQLBI)

This blog is usually dedicated to Business Intelligence and SQL Server, but I didn’t found easily on the web simple PowerShell scripts to help me deploying a number of virtual machines on Azure that I use for testing and development. Since I need to deploy, start, stop and remove many virtual machines created from a common image I created (you know, Tabular is not part of the standard images provided by Microsoft…), I wanted to minimize the time required to execute every operation from my Windows Azure PowerShell console (but I suggest you using Windows PowerShell ISE), so I also wanted to fire the commands as soon as possible in parallel, without losing the result in the console. In order to execute multiple commands in parallel, I used the Start-Job cmdlet, and using Get-Job and Receive-Job I wait for job completion and display the messages generated during background command execution. This technique allows me to reduce execution time when I have to deploy, start, stop or remove virtual machines. Please note that a few operations on Azure acquire an exclusive lock and cannot be really executed in parallel, but only one part of their execution time is subject to this lock. Thus, you obtain a better response time also in these scenarios (this is the case of the provisioning of a new VM). Finally, when you remove the VMs you still have the disk containing the virtual machine to remove. This cannot be done just after the VM removal, because you have to wait that the removal operation is completed on Azure. So I wrote a script that you have to run a few minutes after VMs removal and delete disks (and VHD) no longer related to a VM. I just check that the disk were associated to the original image name used to provision the VMs (so I don’t remove other disks deployed by other batches that I might want to preserve). These examples are specific for my scenario, if you need more complex configurations you have to change and adapt the code. But if your need is to create multiple instances of the same VM running in a workgroup, these scripts should be good enough. I prepared the following PowerShell scripts: ProvisionVMs: Provision many VMs in parallel starting from the same image. It creates one service for each VM. RemoveVMs: Remove all the VMs in parallel – it also remove the service created for the VM StartVMs: Starts all the VMs in parallel StopVMs: Stops all the VMs in parallel RemoveOrphanDisks: Remove all the disks no longer used by any VMs. Run this script a few minutes after RemoveVMs script. ProvisionVMs # Name of subscription $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription" # Name of storage account (where VMs will be deployed) $StorageAccount = "Copy the Label property you get from Get-AzureStorageAccount" function ProvisionVM( [string]$VmName ) { Start-Job -ArgumentList $VmName { param($VmName) $Location = "Copy the Location property you get from Get-AzureStorageAccount" $InstanceSize = "A5" # You can use any other instance, such as Large, A6, and so on $AdminUsername = "UserName" # Write the name of the administrator account in the new VM $Password = "Password" # Write the password of the administrator account in the new VM $Image = "Copy the ImageName property you get from Get-AzureVMImage" # You can list your own images using the following command: # Get-AzureVMImage | Where-Object {$_.PublisherName -eq "User" } New-AzureVMConfig -Name $VmName -ImageName $Image -InstanceSize $InstanceSize | Add-AzureProvisioningConfig -Windows -Password $Password -AdminUsername $AdminUsername| New-AzureVM -Location $Location -ServiceName "$VmName" -Verbose } } # Set the proper storage - you might remove this line if you have only one storage in the subscription Set-AzureSubscription -SubscriptionName $SubscriptionName -CurrentStorageAccount $StorageAccount # Select the subscription - this line is fundamental if you have access to multiple subscription # You might remove this line if you have only one subscription Select-AzureSubscription -SubscriptionName $SubscriptionName # Every line in the following list provisions one VM using the name specified in the argument # You can change the number of lines - use a unique name for every VM - don't reuse names # already used in other VMs already deployed ProvisionVM "test10" ProvisionVM "test11" ProvisionVM "test12" ProvisionVM "test13" ProvisionVM "test14" ProvisionVM "test15" ProvisionVM "test16" ProvisionVM "test17" ProvisionVM "test18" ProvisionVM "test19" ProvisionVM "test20" # Wait for all to complete While (Get-Job -State "Running") { Get-Job -State "Completed" | Receive-Job Start-Sleep 1 } # Display output from all jobs Get-Job | Receive-Job # Cleanup of jobs Remove-Job * # Displays batch completed echo "Provisioning VM Completed" RemoveVMs # Name of subscription $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription" function RemoveVM( [string]$VmName ) { Start-Job -ArgumentList $VmName { param($VmName) Remove-AzureService -ServiceName $VmName -Force -Verbose } } # Select the subscription - this line is fundamental if you have access to multiple subscription # You might remove this line if you have only one subscription Select-AzureSubscription -SubscriptionName $SubscriptionName # Every line in the following list remove one VM using the name specified in the argument # You can change the number of lines - use a unique name for every VM - don't reuse names # already used in other VMs already deployed RemoveVM "test10" RemoveVM "test11" RemoveVM "test12" RemoveVM "test13" RemoveVM "test14" RemoveVM "test15" RemoveVM "test16" RemoveVM "test17" RemoveVM "test18" RemoveVM "test19" RemoveVM "test20" # Wait for all to complete While (Get-Job -State "Running") { Get-Job -State "Completed" | Receive-Job Start-Sleep 1 } # Display output from all jobs Get-Job | Receive-Job # Cleanup Remove-Job * # Displays batch completed echo "Remove VM Completed" StartVMs # Name of subscription $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription" function StartVM( [string]$VmName ) { Start-Job -ArgumentList $VmName { param($VmName) Start-AzureVM -Name $VmName -ServiceName $VmName -Verbose } } # Select the subscription - this line is fundamental if you have access to multiple subscription # You might remove this line if you have only one subscription Select-AzureSubscription -SubscriptionName $SubscriptionName # Every line in the following list starts one VM using the name specified in the argument # You can change the number of lines - use a unique name for every VM - don't reuse names # already used in other VMs already deployed StartVM "test10" StartVM "test11" StartVM "test11" StartVM "test12" StartVM "test13" StartVM "test14" StartVM "test15" StartVM "test16" StartVM "test17" StartVM "test18" StartVM "test19" StartVM "test20" # Wait for all to complete While (Get-Job -State "Running") { Get-Job -State "Completed" | Receive-Job Start-Sleep 1 } # Display output from all jobs Get-Job | Receive-Job # Cleanup Remove-Job * # Displays batch completed echo "Start VM Completed" StopVMs # Name of subscription $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription" function StopVM( [string]$VmName ) { Start-Job -ArgumentList $VmName { param($VmName) Stop-AzureVM -Name $VmName -ServiceName $VmName -Verbose -Force } } # Select the subscription - this line is fundamental if you have access to multiple subscription # You might remove this line if you have only one subscription Select-AzureSubscription -SubscriptionName $SubscriptionName # Every line in the following list stops one VM using the name specified in the argument # You can change the number of lines - use a unique name for every VM - don't reuse names # already used in other VMs already deployed StopVM "test10" StopVM "test11" StopVM "test12" StopVM "test13" StopVM "test14" StopVM "test15" StopVM "test16" StopVM "test17" StopVM "test18" StopVM "test19" StopVM "test20" # Wait for all to complete While (Get-Job -State "Running") { Get-Job -State "Completed" | Receive-Job Start-Sleep 1 } # Display output from all jobs Get-Job | Receive-Job # Cleanup Remove-Job * # Displays batch completed echo "Stop VM Completed" RemoveOrphanDisks $Image = "Copy the ImageName property you get from Get-AzureVMImage" # You can list your own images using the following command: # Get-AzureVMImage | Where-Object {$_.PublisherName -eq "User" } # Remove all orphan disks coming from the image specified in $ImageName Get-AzureDisk | Where-Object {$_.attachedto -eq $null -and $_.SourceImageName -eq $ImageName} | Remove-AzureDisk -DeleteVHD -Verbose

Read the article
Processing Kinect v2 Color Streams in Parallel

- by Chris Gardner

Originally posted on: http://geekswithblogs.net/freestylecoding/archive/2014/08/20/processing-kinect-v2-color-streams-in-parallel.aspxProcessing Kinect v2 Color Streams in Parallel I've really been enjoying being a part of the Kinect for Windows Developer's Preview. The new hardware has some really impressive capabilities. However, with great power comes great system specs. Unfortunately, my little laptop that could is not 100% up to the task; I've had to get a little creative. The most disappointing thing I've run into is that I can't always cleanly display the color camera stream in managed code. I managed to strip the code down to what I believe is the bear minimum: using( ColorFrame _ColorFrame = e.FrameReference.AcquireFrame() ) { if( null == _ColorFrame ) return; BitmapToDisplay.Lock(); _ColorFrame.CopyConvertedFrameDataToIntPtr( BitmapToDisplay.BackBuffer, Convert.ToUInt32( BitmapToDisplay.BackBufferStride * BitmapToDisplay.PixelHeight ), ColorImageFormat.Bgra ); BitmapToDisplay.AddDirtyRect( new Int32Rect( 0, 0, _ColorFrame.FrameDescription.Width, _ColorFrame.FrameDescription.Height ) ); BitmapToDisplay.Unlock(); } With this snippet, I'm placing the converted Bgra32 color stream directly on the BackBuffer of the WriteableBitmap. This gives me pretty smooth playback, but I still get the occasional freeze for half a second. After a bit of profiling, I discovered there were a few problems. The first problem is the size of the buffer along with the conversion on the buffer. At this time, the raw image format of the data from the Kinect is Yuy2. This is great for direct video processing. It would be ideal if I had a WriteableVideo object in WPF. However, this is not the case. Further digging led me to the real problem. It appears that the SDK is converting the input serially. Let's think about this for a second. The color camera is a 1080p camera. As we should all know, this give us a native resolution of 1920 x 1080. This produces 2,073,600 pixels. Yuy2 uses 4 bytes per 2 pixel, for a buffer size of 4,147,200 bytes. Bgra32 uses 4 bytes per pixel, for a buffer size of 8,294,400 bytes. The SDK appears to be doing this on one thread. I started wondering if I chould do this better myself. I mean, I have 8 cores in my system. Why can't I use them all? The first problem is converting a Yuy2 frame into a Bgra32 frame. It is NOT trivial. I spent a day of research of just how to do this. In the end, I didn't even produce the best algorithm possible, but it did work. After I managed to get that to work, I knew my next step was the get the conversion operation off the UI Thread. This was a simple process of throwing the work into a Task. Of course, this meant I had to marshal the final write to the WriteableBitmap back to the UI thread. Finally, I needed to vectorize the operation so I could run it safely in parallel. This was, mercifully, not quite as hard as I thought it would be. I had my loop return an index to a pair of pixels. From there, I had to tell the loop to do everything for this pair of pixels. If you're wondering why I did it for pairs of pixels, look back above at the specification for the Yuy2 format. I won't go into full detail on why each 4 bytes contains 2 pixels of information, but rest assured that there is a reason why the format is described in that way. The first working attempt at this algorithm successfully turned my poor laptop into a space heater. I very quickly brought and maintained all 8 cores up to about 97% usage. That's when I remembered that obscure option in the Task Parallel Library where you could limit the amount of parallelism used. After a little trial and error, I discovered 4 parallel tasks was enough for most cases. This yielded the follow code: private byte ClipToByte( int p_ValueToClip ) { return Convert.ToByte( ( p_ValueToClip < byte.MinValue ) ? byte.MinValue : ( ( p_ValueToClip > byte.MaxValue ) ? byte.MaxValue : p_ValueToClip ) ); } private void ColorFrameArrived( object sender, ColorFrameArrivedEventArgs e ) { if( null == e.FrameReference ) return; // If you do not dispose of the frame, you never get another one... using( ColorFrame _ColorFrame = e.FrameReference.AcquireFrame() ) { if( null == _ColorFrame ) return; byte[] _InputImage = new byte[_ColorFrame.FrameDescription.LengthInPixels * _ColorFrame.FrameDescription.BytesPerPixel]; byte[] _OutputImage = new byte[BitmapToDisplay.BackBufferStride * BitmapToDisplay.PixelHeight]; _ColorFrame.CopyRawFrameDataToArray( _InputImage ); Task.Factory.StartNew( () => { ParallelOptions _ParallelOptions = new ParallelOptions(); _ParallelOptions.MaxDegreeOfParallelism = 4; Parallel.For( 0, Sensor.ColorFrameSource.FrameDescription.LengthInPixels / 2, _ParallelOptions, ( _Index ) => { // See http://msdn.microsoft.com/en-us/library/windows/desktop/dd206750(v=vs.85).aspx int _Y0 = _InputImage[( _Index << 2 ) + 0] - 16; int _U = _InputImage[( _Index << 2 ) + 1] - 128; int _Y1 = _InputImage[( _Index << 2 ) + 2] - 16; int _V = _InputImage[( _Index << 2 ) + 3] - 128; byte _R = ClipToByte( ( 298 * _Y0 + 409 * _V + 128 ) >> 8 ); byte _G = ClipToByte( ( 298 * _Y0 - 100 * _U - 208 * _V + 128 ) >> 8 ); byte _B = ClipToByte( ( 298 * _Y0 + 516 * _U + 128 ) >> 8 ); _OutputImage[( _Index << 3 ) + 0] = _B; _OutputImage[( _Index << 3 ) + 1] = _G; _OutputImage[( _Index << 3 ) + 2] = _R; _OutputImage[( _Index << 3 ) + 3] = 0xFF; // A _R = ClipToByte( ( 298 * _Y1 + 409 * _V + 128 ) >> 8 ); _G = ClipToByte( ( 298 * _Y1 - 100 * _U - 208 * _V + 128 ) >> 8 ); _B = ClipToByte( ( 298 * _Y1 + 516 * _U + 128 ) >> 8 ); _OutputImage[( _Index << 3 ) + 4] = _B; _OutputImage[( _Index << 3 ) + 5] = _G; _OutputImage[( _Index << 3 ) + 6] = _R; _OutputImage[( _Index << 3 ) + 7] = 0xFF; } ); Application.Current.Dispatcher.Invoke( () => { BitmapToDisplay.WritePixels( new Int32Rect( 0, 0, Sensor.ColorFrameSource.FrameDescription.Width, Sensor.ColorFrameSource.FrameDescription.Height ), _OutputImage, BitmapToDisplay.BackBufferStride, 0 ); } ); } ); } } This seemed to yield a results I wanted, but there was still the occasional stutter. This lead to what I realized was the second problem. There is a race condition between the UI Thread and me locking the WriteableBitmap so I can write the next frame. Again, I'm writing approximately 8MB to the back buffer. Then, I started thinking I could cheat. The Kinect is running at 30 frames per second. The WPF UI Thread runs at 60 frames per second. This made me not feel bad about exploiting the Composition Thread. I moved the bulk of the code from the FrameArrived handler into CompositionTarget.Rendering. Once I was in there, I polled from a frame, and rendered it if it existed. Since, in theory, I'm only killing the Composition Thread every other hit, I decided I was ok with this for cases where silky smooth video performance REALLY mattered. This ode looked like this: private byte ClipToByte( int p_ValueToClip ) { return Convert.ToByte( ( p_ValueToClip < byte.MinValue ) ? byte.MinValue : ( ( p_ValueToClip > byte.MaxValue ) ? byte.MaxValue : p_ValueToClip ) ); } void CompositionTarget_Rendering( object sender, EventArgs e ) { using( ColorFrame _ColorFrame = FrameReader.AcquireLatestFrame() ) { if( null == _ColorFrame ) return; byte[] _InputImage = new byte[_ColorFrame.FrameDescription.LengthInPixels * _ColorFrame.FrameDescription.BytesPerPixel]; byte[] _OutputImage = new byte[BitmapToDisplay.BackBufferStride * BitmapToDisplay.PixelHeight]; _ColorFrame.CopyRawFrameDataToArray( _InputImage ); ParallelOptions _ParallelOptions = new ParallelOptions(); _ParallelOptions.MaxDegreeOfParallelism = 4; Parallel.For( 0, Sensor.ColorFrameSource.FrameDescription.LengthInPixels / 2, _ParallelOptions, ( _Index ) => { // See http://msdn.microsoft.com/en-us/library/windows/desktop/dd206750(v=vs.85).aspx int _Y0 = _InputImage[( _Index << 2 ) + 0] - 16; int _U = _InputImage[( _Index << 2 ) + 1] - 128; int _Y1 = _InputImage[( _Index << 2 ) + 2] - 16; int _V = _InputImage[( _Index << 2 ) + 3] - 128; byte _R = ClipToByte( ( 298 * _Y0 + 409 * _V + 128 ) >> 8 ); byte _G = ClipToByte( ( 298 * _Y0 - 100 * _U - 208 * _V + 128 ) >> 8 ); byte _B = ClipToByte( ( 298 * _Y0 + 516 * _U + 128 ) >> 8 ); _OutputImage[( _Index << 3 ) + 0] = _B; _OutputImage[( _Index << 3 ) + 1] = _G; _OutputImage[( _Index << 3 ) + 2] = _R; _OutputImage[( _Index << 3 ) + 3] = 0xFF; // A _R = ClipToByte( ( 298 * _Y1 + 409 * _V + 128 ) >> 8 ); _G = ClipToByte( ( 298 * _Y1 - 100 * _U - 208 * _V + 128 ) >> 8 ); _B = ClipToByte( ( 298 * _Y1 + 516 * _U + 128 ) >> 8 ); _OutputImage[( _Index << 3 ) + 4] = _B; _OutputImage[( _Index << 3 ) + 5] = _G; _OutputImage[( _Index << 3 ) + 6] = _R; _OutputImage[( _Index << 3 ) + 7] = 0xFF; } ); BitmapToDisplay.WritePixels( new Int32Rect( 0, 0, Sensor.ColorFrameSource.FrameDescription.Width, Sensor.ColorFrameSource.FrameDescription.Height ), _OutputImage, BitmapToDisplay.BackBufferStride, 0 ); } }

Read the article
Parallel Computing Features Tour in VS2010

Just realized that I have not linked from here to a screencast I recorded a couple weeks ago that shows the API, parallel debugger and concurrency visualizer in VS2010. Take a few minutes to watch the VS2010 Parallel Computing Features Tour. Comments about this post welcome at the original blog.

Read the article
Event Driven Behavior Tree: deterministic traversal order with parallel

- by Heisenbug

I've studied several articles and listen some talks about behavior trees (mostly the resources available on AIGameDev by Alex J. Champandard). I'm particularly interested on event driven behavior trees, but I have still some doubts on how to implement them correctly using a scheduler. Just a quick recap: Standard Behavior Tree Each execution tick the tree is traversed from the root in depth-first order The execution order is implicitly expressed by the tree structure. So in the case of behaviors parented to a parallel node, even if both children are executed during the same traversing, the first leaf is always evaluated first. Event Driven BT During the first traversal the nodes (tasks) are enqueued using a scheduler which is responsible for updating only running ones every update The first traversal implicitly produce a depth-first ordered queue in the scheduler Non leaf nodes stays suspended mostly of the time. When a leaf node terminate(either with success or fail status) the parent (observer) is waked up allowing the tree traversing to continue and new tasks will be enqueued in the scheduler Without parallel nodes in the tree there will be up to 1 task running in the scheduler Without parallel nodes, the tasks in the queue(excluding dynamic priority implementation) will be always ordered in a depth-first order (is this right?) Now, from what is my understanding of a possible implementation, there are 2 requirements I think must be respected(I'm not sure though): Now, some requirements I think needs to be guaranteed by a correct implementation are: The result of the traversing should be independent from which implementation strategy is used. The traversing result must be deterministic. I'm struggling trying to guarantee both in the case of parallel nodes. Here's an example: Parallel_1 -->Sequence_1 ---->leaf_A ---->leaf_B -->leaf_C Considering a FIFO policy of the scheduler, before leaf_A node terminates the tasks in the scheduler are: P1(suspended),S1(suspended),leaf_A(running),leaf_C(running) When leaf_A terminate leaf_B will be scheduled (at the end of the queue), so the queue will become: P1(suspended),S1(suspended),leaf_C(running),leaf_B(running) In this case leaf_B will be executed after leaf_C at every update, meanwhile with a non event-driven traversing from the root node, the leaf_B will always be evaluated before leaf_A. So I have a couple of question: do I have understand correctly how event driven BT work? How can I guarantee the depth first order is respected with such an implementation? is this a common issue or am I missing something?

Read the article
Materials from Parallel Programming Pattern Presentation at Charlottesville .NET User Group Meeting

- by John Blumenauer

On Thursday, May 27, I had the privilege of presenting “A Look at Parallel Programming Patterns” at the Charlottesville .NET User Group’s monthly meeting. Those folks in attendance had many great questions and were obviously very interested in what the Parallel Task Library has to offer. The code and slides can be found HERE. Thanks again to CHODOTNET for having me in town to speak. If you experience any problems downloading the slides or code, please let me know.

Read the article
Loop Modifications to Enhance Data-Parallel Performance

In data-parallel applications, the same independent operation is performed repeatedly on different data. Loops are usually the most compute-intensive segments of data parallel applications, so loop optimizations directly impact performance.

Read the article
Parallel Computing Features Tour in VS2010

Just realized that I have not linked from here to a screencast I recorded a couple weeks ago that shows the API, parallel debugger and concurrency visualizer in VS2010. Take a few minutes to watch the VS2010 Parallel Computing Features Tour. Comments about this post welcome at the original blog.

Read the article
Granularity and Parallel Performance

One key to attaining good parallel performance is choosing the right granularity for the application. The goal is to determine the right granularity (usually larger is better) for parallel tasks, while avoiding load imbalance and communication overhead to achieve the best performance.

Read the article
Parallel programming patterns for C#?

- by VoidDweller

With Intel's launch of a Hexa-Core processor for the desktop, it looks like we can no longer wait for Microsoft to make many-core programming "easy". I just order a copy of Joe Duffy's book Concurrent Programming on Windows. This looks like a great place to start, though, I am hoping some of you who have been targeting multi/many core systems would point me to some good resources that have or would have helped on your projects?

Read the article
Parallel port with C#

- by Michael S.

Hello, I am trying to send data to LPT1 port with a C# program, unfortunately with no success.. I am using windows 7 x64. I tried both x86 and x64 (inpoutx64.dll) dll's.. With the x64 dll when I send: Output(888, 255); It just continues the program as everything went ok, but i can't see anything on my multimeter (only the static 0.02V).. I also tried the following with C++: int main () { int val = 0; printf("Enter a value\n"); scanf("%d", &val); _outp(0x378, val); getchar(); _outp(0x378, 0); return 0; } But it throws an exception: Unhandled exception at 0x01281428 in ppac.exe: 0xC0000096: Privileged instruction. I remember once I made something like this work on xp, I hope it's possible on win7 too.. Please help me with this. Thanks.

Read the article
Multithreading or task parallel library

- by Bruce Adams

I have an application which performs 30 independent tasks simultaneously using multithreading, each task retrieves data over http, performs a calculation and returns a result to the ui thread. Can I use tpl to perform the same tasks? Does tpl create 30 new threads and spread them over all the available cores, or does it just split the tasks over the available cores and use one thread per core? Will there be a performance boost using tpl over multithreading in this case?

Read the article
parallel-python error: RuntimeError("Socket connection is broken")

- by user288558

I am using a simple program to send a function: import pp nodes=('mosura02','mosura03','mosura04','mosura05','mosura06', 'mosura09','mosura10','mosura11','mosura12') nodes=('miner:60001',) def pptester(): js=pp.Server(ppservers=nodes) js.set_ncpus(0) tmp=[] for i in range(200): tmp.append(js.submit(ppworktest,(),(),('os',))) return tmp def ppworktest(): import os return os.system("uname -a") the result is: wkerzend@mosura:/home/wkerzend/tmp/ppython_test>ssh miner "source ~/coala_python_setup.sh;ppserver.py -d -p 60001" 2010-04-12 00:50:48,162 - pp - INFO - Creating server instance (pp-1.6.0) 2010-04-12 00:50:52,732 - pp - INFO - pp local server started with 32 workers 2010-04-12 00:50:52,732 - pp - DEBUG - Strarting network server interface=0.0.0.0 port=60001 Exception in thread client_socket: Traceback (most recent call last): File "/usr/lib64/python2.6/threading.py", line 525, in __bootstrap_inner self.run() File "/usr/lib64/python2.6/threading.py", line 477, in run self.__target(*self.__args, **self.__kwargs) File "/home/wkerzend/python_coala/bin/ppserver.py", line 161, in crun ctype = mysocket.receive() File "/home/wkerzend/python_coala/lib/python2.6/site-packages/pptransport.py", line 178, in receive raise RuntimeError("Socket connection is broken") RuntimeError: Socket connection is broken

Read the article
Shared value in parallel python

- by Jonathan

Hey all- I'm using ParallelPython to develop a performance-critical script. I'd like to share one value between the 8 processes running on the system. Please excuse the trivial example but this illustrates my question. def findMin(listOfElements): for el in listOfElements: if el < min: min = el import pp min = 0 myList = range(100000) job_server = pp.Server() f1 = job_server.submit(findMin, myList[0:25000]) f2 = job_server.submit(findMin, myList[25000:50000]) f3 = job_server.submit(findMin, myList[50000:75000]) f4 = job_server.submit(findMin, myList[75000:100000]) The pp docs don't seem to describe a way to share data across processes. Is it possible? If so, is there a standard locking mechanism (like in the threading module) to confirm that only one update is done at a time? l = Lock() if(el < min): l.acquire if(el < min): min = el l.release I understand I could keep a local min and compare the 4 in the main thread once returned, but by sharing the value I can do some better pruning of my BFS binary tree and potentially save a lot of loop iterations. Thanks- Jonathan

Read the article
Parallel Dynamic Programming

- by adk

Are there any good papers discussing how to take a dynamic program and parallelize it?

Read the article
Recommendations for Open Source Parallel programming IDE

- by Andrew Bolster

What are the best IDE's / IDE plugins / Tools, etc for programming with CUDA / MPI etc? I've been working in these frameworks for a short while but feel like the IDE could be doing more heavy lifting in terms of scaling and job processing interactions. (I usually use Eclipse or Netbeans, and usually in C/C++ with occasional Java, and its a vague question but I can't think of any more specific way to put it)

Read the article

< Previous Page | 2 3 4 5 6 7 8 9 10 11 12 13 | Next Page >