Search Results

Search found 3521 results on 141 pages for 'parallel computing'.

Page 82/141 | < Previous Page | 78 79 80 81 82 83 84 85 86 87 88 89  | Next Page >

  • OpenCL: I just need to convert the code to use two work items in the for loop. Currently it uses one

    - by user1660282
        spmv_csr_scalar_kernel(const int num_rows, const int * ptr, const int * indices, const float * data, const float * x, float * y)
        {
            int row = get_global_id(0);
            if (row < num_rows) {
                float dot = 0;
                int row_start = ptr[row];
                int row_end = ptr[row+1];
                for (int jj = row_start; jj < row_end; jj++) {
                    dot += data[jj] * x[indices[jj]];
                }
                y[row] += dot;
            }
        }

    Above is the OpenCL code for multiplying a sparse matrix in CSR format by a column vector. It uses one global work item per for loop (that is, per row). Can anybody help me use two work items for each for loop? I am new to OpenCL and run into a lot of issues if I modify even the smallest thing. This is part of my project. I have made it this parallel, but I want to make it more parallel. Please help me if you can.

    A single work item executes the for loop from row_start to row_end. I want this row's for loop to be divided into two parts, each executed by a single work item. How do I go about accomplishing that?

    This is what I came up with, but it returns the wrong output:

        __kernel void mykernel(__global int* colvector, __global int* val, __global int* result, __global int* index, __global int* rowptr, __global int* sync)
        {
            __global int vals[8]={0,0,0,0,0,0,0,0};
            for(int i=0;i<4;i++) {
                result[i]=0;
            }
            barrier(CLK_GLOBAL_MEM_FENCE);
            int thread_id=get_global_id(0);
            int warp_id=thread_id/2;
            int lane=(thread_id)&1;
            int row=warp_id;
            if(row<4) {
                int row_start = rowptr[row];
                int row_end = rowptr[row+1];
                vals[thread_id]=0;
                for (int i = row_start+lane; i<row_end; i+=2) {
                    vals[thread_id]+=val[i]*colvector[index[i]];
                }
                vals[thread_id]+=vals[thread_id+1];
                if(lane==0){
                    result[row] += vals[thread_id];
                }
            }
        }
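
    The two-work-items-per-row idea described above can be checked on the host before touching the kernel. The following is a minimal sketch in Python, not OpenCL: it simulates the two lanes sequentially, so it only illustrates how the loop is split and how the two partial sums are combined. The function name and the sample matrix are made up for illustration.

```python
def spmv_csr_two_lanes(row_ptr, col_idx, values, x):
    """CSR sparse matrix-vector product with each row split across two simulated lanes."""
    num_rows = len(row_ptr) - 1
    y = [0] * num_rows
    for row in range(num_rows):
        start, end = row_ptr[row], row_ptr[row + 1]
        partial = [0, 0]  # one partial sum per lane
        for lane in (0, 1):
            # Lane 0 takes elements start, start+2, ...; lane 1 takes start+1, start+3, ...
            for jj in range(start + lane, end, 2):
                partial[lane] += values[jj] * x[col_idx[jj]]
        # Combine only after BOTH lanes have finished their share of the row.
        y[row] = partial[0] + partial[1]
    return y


if __name__ == "__main__":
    # Hypothetical 4x4 matrix:
    # [1 0 2 0]
    # [0 3 0 0]
    # [4 0 5 6]
    # [0 0 0 7]
    row_ptr = [0, 2, 3, 6, 7]
    col_idx = [0, 2, 1, 0, 2, 3, 3]
    values = [1, 2, 3, 4, 5, 6, 7]
    print(spmv_csr_two_lanes(row_ptr, col_idx, values, [1, 2, 3, 4]))
```

    The point to carry over is the ordering: the partial sums are added only once both lanes are done. In a real kernel that would presumably mean keeping the partial sums in __local memory, placing both work items of a row in the same work-group, and putting a barrier(CLK_LOCAL_MEM_FENCE) between writing and combining them. That port is an assumption and is not shown here.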


  • Property transfers in soapui

    - by Scobal
    I'm trying to write parallel tests in soapUI and need to transfer properties between the test steps. I currently have 3 test steps:
      1. Execute legacy request
      2. Execute new request
      3. XML diff the two responses in a Groovy script
    I've found a lot of blogs about picking values out with XPaths, but nothing about passing the full response through. My question is: how do I fill out the source and target boxes in the property transfer editor?


  • Overwhelmed by design patterns... where to begin?

    - by Pete
    I am writing simple prototype code to demonstrate and profile I/O schemes (HDF4, HDF5, HDF5 using parallel I/O, NetCDF, etc.) for a physics code. Since the focus is on I/O, the rest of the program is very simple:

        class Grid {
        public:
            floatArray x, y, z;
        };

        class MyModel {
        public:
            MyModel(const int &nip1, const int &njp1, const int &nkp1, const int &numProcs);
            Grid grid;
            map<string, floatArray> plasmaVariables;
        };

    Here floatArray is a simple class that lets me define arrays of arbitrary dimension and do mathematical operations on them (i.e. x+y is point-wise addition). Of course, I could use better encapsulation (write accessors/setters, etc.), but that's not the concept I'm struggling with.

    For the I/O routines, I am envisioning simple inheritance:
      - An abstract I/O class defines read and write functions to fill in the "myModel" object
      - HDF4 derived class
      - HDF5
      - HDF5 using parallel I/O
      - NetCDF
      - etc.

    The code should read data in any of these formats, then write out to any of these formats. In the past, I would add an AbstractIO member to myModel and create/destroy this object depending on which I/O scheme I want. In this way, I could do something like:

        myModelObj.ioObj->read("input.hdf");
        myModelObj.ioObj->write("output.hdf");

    I have a bit of OOP experience but very little on the design patterns front, so I recently acquired the Gang of Four book "Design Patterns: Elements of Reusable Object-Oriented Software".

    OOP designers: which pattern(s) would you recommend I use to integrate I/O with the myModel object? I am interested in answering this for two reasons:
      - To learn more about design patterns in general
      - To apply what I learn to help refactor a large, old, crufty legacy physics code to be more human-readable and extensible

    I am leaning towards applying the Decorator pattern to myModel, so I can attach the I/O responsibilities dynamically to myModel (i.e. whether to use HDF4, HDF5, etc.). However, I don't feel very confident that this is the best pattern to apply. Reading the Gang of Four book cover-to-cover before I start coding feels like a good way to develop an unhealthy caffeine addiction. What patterns do you recommend?
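
    The arrangement the question describes (an abstract I/O class, one subclass per format, and a model that holds whichever one is currently wanted) can be sketched in a few lines. This is only an illustration in Python, not the asker's C++ and not a recommendation of one pattern over another. JsonIO and KeyValueIO are hypothetical stand-ins for the HDF/NetCDF back ends, and they work on strings instead of files to keep the sketch self-contained.

```python
import json
from abc import ABC, abstractmethod


class AbstractIO(ABC):
    """Interface every format-specific reader/writer implements."""

    @abstractmethod
    def read(self, text):
        """Parse serialized text into a dict of variables."""

    @abstractmethod
    def write(self, variables):
        """Serialize a dict of variables to text."""


class JsonIO(AbstractIO):  # hypothetical stand-in for e.g. an HDF5 back end
    def read(self, text):
        return json.loads(text)

    def write(self, variables):
        return json.dumps(variables, sort_keys=True)


class KeyValueIO(AbstractIO):  # hypothetical stand-in for e.g. a NetCDF back end
    def read(self, text):
        variables = {}
        for line in text.splitlines():
            name, value = line.split("=", 1)
            variables[name] = float(value)
        return variables

    def write(self, variables):
        return "\n".join(f"{k}={v}" for k, v in sorted(variables.items()))


class MyModel:
    """Holds the data and delegates all I/O to a swappable AbstractIO object."""

    def __init__(self, io):
        self.io = io
        self.plasma_variables = {}

    def load(self, text):
        self.plasma_variables = self.io.read(text)

    def dump(self):
        return self.io.write(self.plasma_variables)


if __name__ == "__main__":
    model = MyModel(KeyValueIO())
    model.load("density=1.5\ntemp=2.0")
    model.io = JsonIO()  # switch output format at run time
    print(model.dump())
```

    Swapping the io member at run time is what lets the model read one format and write another without the model itself knowing about either.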


  • Ruby LESS gem equivalent in Python

    - by Sean M
    The Ruby LESS gem looks awesome - and I am working on a Python/Pylons web project where it would be highly useful. CSS is, as someone we're all familiar with recently wrote about, clunky in some important ways. So I'd like to make it easier on myself. Is there an existing Python module or library that provides parallel functionality?


  • Collecting system properties using a web browser

    - by vishwa
    Hi, I am building a distributed computing environment. Applications need to be distributed to the different clients connected to the server on the network, so I would like to collect each client's system properties, such as the free memory available on the client's system, so that I can distribute the application efficiently. Kindly suggest some ideas. Thanks in advance.


  • Using MPI under VC++ MFC project?

    - by Mike
    Does anybody know how I can use MS-MPI in my VC++ MFC project? I already have a big MFC project and I only want to use parallel processing in one part of it with MPI. (I know how to use MPI in standalone code, but I don't know how to integrate it with my VC++ MFC project.)


  • How does this pthread actually work?

    - by user289013
    I am working on a compiler project with SMP and want to code with pthreads. I have heard about many parallel technologies, such as Open MPI and so on. To start with: how is a thread allocated to a core when it is created with pthreads? Is there any way to assign threads to different cores using pthreads?


  • a script translatable to JavaScript with callback-hell automatic avoider :-)

    - by m1uan
    I am looking for a "translator" to JavaScript, like CoffeeScript already is, which would work, for example, with forEach (inspired by Groovy):

        myArray.forEach() -> val, idx {
            // do something with value and idx
        }

    translated to JavaScript:

        myArray.forEach(function(val, idx){
            // do something with value and idx
        });

    or something more useful:

        function event(cb){
            foo() -> err, data1;
            bar(data1) -> err, data2;
            cb(data2);
        }

    where the calls are nested like this:

        function event(cb){
            foo(function(err, data1){
                bar(data1, function(err, data2) {
                    cb(data2);
                });
            });
        }

    I want to ask whether a similar "compiler" to JavaScript, or something even better, already exists.

    What would be super cool... my code in Node.js mostly looks like this :-)

        function dealer(cb){
            async.parallel([
                function(pcb){
                    async.waterfall([
                        function(wcb){
                            first(function(a){ wcb(a); });
                        },
                        function(a, wcb){
                            third(a, function(c){ wcb(c); });
                            fourth(a, function(d){
                                // dealing with “a” as well and nobody cares about my result
                            });
                        }
                    ], function(err, array_with_ac){
                        pcb(array_with_ac);
                    });
                },
                function(pcb){
                    second(function(b){ pcb(b); });
                }
            ], function(err, data){
                cb(data[0][0]+data[1]+data[0][1]); // dealing with “a”, “b” and “c”, not with “d”
            });
        }

    But look how beautiful and readable the code could be:

        function dealer(cb){
            first() -> a;
            second() -> b;
            third(a) -> c;  // dealing with “a”
            fourth(a) -> d; // dealing with “a” as well and nobody cares about my result
            cb(a+b+c);      // dealing with “a”, “b” and “c”, not with “d”
        }

    Yes, this is the ideal case, where the translator automatically decides which methods can run in parallel and which must be called after another method finishes. I can imagine it working. Please, do you know of something similar? Thank you for any advice ;-)
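
    For comparison, the dependency structure in the dealer example (first and second are independent; third and fourth both need the result of first; only a, b and c are combined) can be written in sequential-looking form with coroutines. The sketch below uses Python's asyncio rather than JavaScript, and the four functions are hypothetical stand-ins that just sleep briefly and return numbers.

```python
import asyncio


# Hypothetical stand-ins for the question's first/second/third/fourth.
async def first():
    await asyncio.sleep(0.01)
    return 1


async def second():
    await asyncio.sleep(0.01)
    return 2


async def third(a):
    await asyncio.sleep(0.01)
    return a * 10


async def fourth(a):
    await asyncio.sleep(0.01)
    return a * 100  # computed, but the caller ignores it


async def first_then_dependents():
    a = await first()
    # third and fourth both need "a" and can run side by side.
    c, _ignored = await asyncio.gather(third(a), fourth(a))
    return a, c


async def dealer():
    # The chain that starts with first() is independent of second(),
    # so the two are awaited concurrently.
    (a, c), b = await asyncio.gather(first_then_dependents(), second())
    return a + b + c


if __name__ == "__main__":
    print(asyncio.run(dealer()))  # prints 13
```

    Unlike the translator imagined in the question, nothing here is decided automatically: the programmer still states which calls may overlap by grouping them in gather.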


  • Rectangles Covering

    - by den bardadym
    I have N rectangles with sides parallel to the Ox and Oy axes. There is another rectangle (the model). I need to create an algorithm that tells whether the model is covered by the N rectangles, and to code it. I have some ideas. First, I think I need to sort the rectangles by their left side (this can be done in O(n log n)). Then I think I need to use a vertical sweep line. Thanks.
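
    As a baseline to test a sweep-line solution against, the coverage check can also be done by brute force with coordinate compression. Note that this is a different and slower technique than the sweep line proposed above (roughly cubic in N). It is a sketch under stated assumptions: rectangles are given as (x1, y1, x2, y2) tuples with x1 < x2 and y1 < y2, and the model has positive area.

```python
def is_covered(model, rects):
    """Return True if the axis-aligned rectangle `model` lies inside the union of `rects`."""
    mx1, my1, mx2, my2 = model
    xs, ys = {mx1, mx2}, {my1, my2}
    clipped = []
    for x1, y1, x2, y2 in rects:
        # Clip each rectangle to the model; parts outside it are irrelevant.
        cx1, cy1 = max(x1, mx1), max(y1, my1)
        cx2, cy2 = min(x2, mx2), min(y2, my2)
        if cx1 < cx2 and cy1 < cy2:
            clipped.append((cx1, cy1, cx2, cy2))
            xs.update((cx1, cx2))
            ys.update((cy1, cy2))
    xs, ys = sorted(xs), sorted(ys)
    # The distinct coordinates cut the model into grid cells. No rectangle edge
    # passes through a cell's interior, so each cell is either fully inside a
    # given rectangle or shares no area with it.
    for xa, xb in zip(xs, xs[1:]):
        for ya, yb in zip(ys, ys[1:]):
            if not any(x1 <= xa and xb <= x2 and y1 <= ya and yb <= y2
                       for x1, y1, x2, y2 in clipped):
                return False
    return True


if __name__ == "__main__":
    model = (0, 0, 4, 4)
    print(is_covered(model, [(0, 0, 2, 4), (2, 0, 4, 4)]))  # prints True
    print(is_covered(model, [(0, 0, 2, 4), (2, 0, 4, 3)]))  # prints False
```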


  • Flex unit testing with ANT vs Flash Builder 4

    - by peterlindstrom21
    I have just tried setting up unit testing in Flash Builder 4, and it is working nicely. With a parallel test source structure and Flash Builder 4's New TestCase and New TestSuite, I was up and running with some test cases within minutes. But now I want to compile them from an Ant Flex task. Flash Builder generates FlexUnitApplication.mxml and FlexUnitCompilerApplication.mxml. Is there a nice way to build the unit tests with Ant using these? I can't find any sample where this is done.


  • When is the onPreExecute called on an AsyncTask running in parallel or concurrently?

    - by Debarshi Dutta
    I am using Android Honeycomb. I need to execute some tasks in parallel, and I am using AsyncTask's public final AsyncTask executeOnExecutor (Executor exec, Params... params) method. In each separate thread I compute some values that I need to store in an ArrayList. I must then sort all the values in the ArrayList and display them in the UI. My question is: if one of the threads completes earlier than the others, will it immediately call the onPostExecute method, or will onPostExecute be called only after all the background threads have completed? My program's implementation depends on what happens here.
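
    The collect-then-sort requirement described above does not depend on the order in which the tasks finish, as long as sorting happens only after the last result has arrived. The sketch below shows that shape with Python's concurrent.futures as an analogy. It is not Android code, and compute is a hypothetical stand-in for the per-thread work.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def compute(delay, value):
    """Hypothetical background job: wait a little, then return a result."""
    time.sleep(delay)
    return value * value


def run_all(jobs):
    """Run every (delay, value) job in parallel; return results as they arrived and sorted."""
    arrival_order = []
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = [pool.submit(compute, delay, value) for delay, value in jobs]
        # as_completed yields each future as soon as THAT job is done,
        # without waiting for the slower ones.
        for future in as_completed(futures):
            arrival_order.append(future.result())
    # Only here, after the last job has reported, is it safe to sort and display.
    return arrival_order, sorted(arrival_order)


if __name__ == "__main__":
    arrived, ordered = run_all([(0.2, 3), (0.0, 2), (0.1, 1)])
    print(arrived, ordered)
```

    In Android terms, the equivalent would presumably be a counter of finished tasks checked in each onPostExecute, with the sort triggered when it reaches the number of tasks started. That mapping is an assumption, not something stated in the question.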


  • Asynchronous Delegates Vs Thread/ThreadPool?

    - by claws
    Hello, I need to execute 3 parallel tasks, and after each task completes it should call the same function, which prints out the results. I don't understand why .NET has asynchronous calling (delegate.BeginInvoke() and delegate.EndInvoke()) as well as the Thread class. I'm a little confused about which one to use when. In this particular case, what should I use: asynchronous calling or the Thread class? I'm using C#.
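
    The shape of the requirement (three tasks running in parallel, each handing its result to the same function when it finishes) is sketched below with a thread pool and a completion callback. This is Python, shown only as an analogy for the pattern, not the .NET API the question asks about. The names run_tasks and report are made up for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_tasks(tasks):
    """Run each zero-argument callable in parallel; one shared callback collects the results."""
    results = []
    lock = threading.Lock()

    def report(future):
        # Called once per task, as soon as that task finishes.
        # The lock matters because callbacks may run on different worker threads.
        with lock:
            results.append(future.result())

    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        for task in tasks:
            pool.submit(task).add_done_callback(report)
    # Leaving the with-block waits for all workers, so every callback has run.
    return results


if __name__ == "__main__":
    print(sorted(run_tasks([lambda: 1, lambda: 2, lambda: 3])))  # prints [1, 2, 3]
```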


  • Multithreading consulting service

    - by Gustavo Paulillo
    Hello. I am creating a service that needs to perform the following tasks: query bank services and persist the data into a DB. The difficulty is that each process needs to execute in parallel. I think the best choice is to implement a multithreaded service, running each instance on its own thread. But how is that done? Thanks


  • How do you raise a Java BigInteger to the power of a BigInteger without doing modular arithmetic?

    - by angstrom91
    I'm doing some large integer computing, and I need to raise a BigInteger to the power of another BigInteger. The .pow() method does what I want, but takes an int value as an argument. The .modPow method takes a BigInteger as an argument, but I do not want an answer congruent to the value I'm trying to compute. My BigInteger exponent is too large to be represented as an int. Can someone suggest a way to work around this limitation?
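
    One generic workaround is exponentiation by squaring, which reads the exponent one bit at a time and therefore never needs it to fit in a machine int. The sketch below is in Python, whose integers are already arbitrary precision, so it illustrates the loop rather than Java's BigInteger API. The name big_pow is made up for illustration.

```python
def big_pow(base, exponent):
    """Raise base to a non-negative integer exponent by repeated squaring."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1
    while exponent:
        if exponent & 1:       # lowest bit set: this power of base contributes
            result *= base
        exponent >>= 1         # move to the next bit
        if exponent:
            base *= base       # base, base^2, base^4, ...
    return result


if __name__ == "__main__":
    print(big_pow(2, 10))            # prints 1024
    print(big_pow(-1, 2**70 + 1))    # prints -1
```

    Bear in mind that the result has about exponent * log2(|base|) bits, so for any base other than 0, 1 or -1, an exponent too large for an int implies a result of billions of bits. The loop itself is cheap; storing the answer is the real limit.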


  • Does the Java JIT compiler compile at compile time or runtime?

    - by Tony
    From wiki: In computing, just-in-time compilation (JIT), also known as dynamic translation, is a technique for improving the runtime performance of a computer program. So I guess the JVM has another compiler, not javac, that compiles bytecode to machine code only at runtime, while javac compiles sources to bytecode. Is that right?


  • How many processors can I get in a block on a CUDA GPU?

    - by Vickey
    Hi all, I have two questions to ask:
      1) If I create only one block of threads in CUDA and execute my parallel program on it, is it possible that more than one processor would be given to that single block, so that my program gets some benefit from the multiprocessor platform?
      2) Can I synchronize the threads of different blocks? If yes, please give some hints.
    Thanks in advance, since I know I'll get replies, as I always do.


  • List goals/targets in GNU make

    - by BitShifter
    I have a fairly large makefile that creates a number of targets on the fly by computing names from variables (e.g., foo$(VAR) : $(PREREQS)). Is there any way that GNU make can be convinced to spit out a list of targets after it has expanded these variables?


  • OpenMP implementations in VC++ 2008, 2010

    - by John
    Depending on the implementation, OMP can be quite useful for parallelizing fairly arbitrary bits of code (e.g., a parallel section inside a method that calls two independent methods), or it can be bad. It depends on how threads are created/cached, I think. How does the VC++ 2008 implementation work? And is the 2010 implementation significantly different in terms of features and performance/flexibility?


  • What is your favorite NumPy feature?

    - by Gökhan Sever
    Share your favourite NumPy features / tips & tricks. Please try to limit yourself to one feature per line. The question is posted in parallel at ask.scipy.org. We welcome you to join the conversation there, with the main idea of collecting Scientific Python related questions under one roof. Feel free to dual-post or post at your favourite site...


  • run multiple programs in linux

    - by Betamoo
    I am trying to write a .sh file that runs many programs simultaneously. I tried this:

        prog1
        prog2

    But that runs prog1, waits until prog1 ends, and then starts prog2. So how can I run them in parallel? Thanks
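
    In the shell script itself, the usual approach is to end each command with & so it runs in the background, then call wait at the end. The same start-everything-then-wait idea is sketched below in Python with subprocess, in case the launcher does not have to be a .sh file. The function name and the two sample commands are hypothetical stand-ins for prog1 and prog2.

```python
import subprocess
import sys


def run_in_parallel(commands):
    """Start every command first, then wait for all of them; return their exit codes."""
    # Popen returns immediately, so all programs are running before any wait().
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # Hypothetical stand-ins for prog1 and prog2: two short Python one-liners.
    prog1 = [sys.executable, "-c", "import time; time.sleep(0.2)"]
    prog2 = [sys.executable, "-c", "import time; time.sleep(0.2)"]
    print(run_in_parallel([prog1, prog2]))
```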

