Search Results

Search found 813 results on 33 pages for 'concurrency'.

Page 17/33 | < Previous Page | 13 14 15 16 17 18 19 20 21 22 23 24  | Next Page >

  • Limiting object allocation over multiple threads

    - by John
    I have an application which retrieves and caches the results of a clients query. The client then requests different chunks of data and the application sends the relevant results and removes them from the cache. A new requirement for this application is that there needs to be a run-time configurable maximum number of results which may be cached. I've taken the naive approach and implemented this by using a counter under a lock which is incremented every time a result is cached and decremented whenever a result is removed from the cache. Unfortunately, this has drastically reduced the applications performance when processing a large number of concurrent requests. I have tried both a critical section lock and spin-lock; the performance improves a bit with a spin-lock, but is still unacceptably slow. Is there a better way to solve this problem which may improve performance? Right now I have a thread pool that services requests and each request is tied to a Request object which stores that cached results for that particular request. Here is a simplified pseudo code version of my current implementation: void ResultCallback( Result result, Request *request ) { lock totalResultsCached lock cachedLimit if( totalResultsCached + 1 > cachedLimit ) { unlock cachedLimit unlock totalResultsCached //cancel the request return; } ++totalResultsCached; unlock cachedLimit unlock totalResultsCached request.add(result) } void SendResults( int resultsToSend, Request *request ) { while ( resultsToSend > 0 ) { send(request.remove()) lock totalResultsCached --totalResultsCached unlock totalResultsCached --resultsToSend; } }

    Read the article

  • LinkedBlockingQueue limit ignored?

    - by tgguy
    I created a Java LinkedBlockingQueue like new LinkedBlockingQueue(1) to limit the size of the queue to 1. However, in my testing, this seems to be ignored and there is often several things in the queue at any given time. Why is this?

    Read the article

  • is this class thread safe?

    - by flash
    consider this class,with no instance variables and only methods which are non-synchronous can we infer from this info that this class in Thread-safe? public class test{ public void test1{ // do something } public void test2{ // do something } public void test3{ // do something } }

    Read the article

  • How to address thread-safety of service data used for maintaining static local variables in C++?

    - by sharptooth
    Consider the following scenario. We have a C++ function with a static local variable: void function() { static int variable = obtain(); //blahblablah } the function needs to be called from multiple threads concurrently, so we add a critical section to avoid concurrent access to the static local: void functionThreadSafe() { CriticalSectionLockClass lock( criticalSection ); static int variable = obtain(); //blahblablah } but will this be enough? I mean there's some magic that makes the variable being initialized no more than once. So there's some service data maintained by the runtime that indicates whether each static local has already been initialized. Will the critical section in the above code protect that service data as well? Is any extra protection required for this scenario?

    Read the article

  • Real World Examples of read-write in concurrent software

    - by Richard Fabian
    I'm looking for real world examples of needing read and write access to the same value in concurrent systems. In my opinion, many semaphores or locks are present because there's no known alternative (to the implementer,) but do you know of any patterns where mutexes seem to be a requirement? In a way I'm asking for candidates for the standard set of HARD problems for concurrent software in the real world.

    Read the article

  • What scenarios/settings will result in a query on SQL Server (2008) return stale data

    - by s1mm0t
    Most applications rarely need to display 100% accurate data. For example if this stack overflow question displays that there have been 0 views, when there have really been 10, it doesn't really matter. This is one way that the (perceived) performance of applications can be improved, by caching results and therefore sometimes not showing 100% accurate results. There are some cases where the data does need to be 100% accurate though. So if I run the query select * from Foo I want to be sure that the results are not stale. Now depending on how my database is set up, other activity on the database, use of transactions and isolation levels etc this query may or may not be a true reflection of the world. What scenario's and settings can people think of that will result in this query returning stale results or given that another connection is part way through a transaction that has updated this table, how can I guarantee that when the above query returns, the results will be accurate.

    Read the article

  • java threads don't see shared boolean changes

    - by andymur
    Here the code class Aux implements Runnable { private Boolean isOn = false; private String statusMessage; private final Object lock; public Aux(String message, Object lock) { this.lock = lock; this.statusMessage = message; } @Override public void run() { for (;;) { synchronized (lock) { if (isOn && "left".equals(this.statusMessage)) { isOn = false; System.out.println(statusMessage); } else if (!isOn && "right".equals(this.statusMessage)) { isOn = true; System.out.println(statusMessage); } if ("left".equals(this.statusMessage)) { System.out.println("left " + isOn); } } } } } public class Question { public static void main(String [] args) { Object lock = new Object(); new Thread(new Aux("left", lock)).start(); new Thread(new Aux("right", lock)).start(); } } In this code I expect to see: left, right, left right and so on, but when Thread with "left" message changes isOn to false, Thread with "right" message don't see it and I get ("right true" and "left false" console messages), left thread don't get isOn in true, but right Thread can't change it cause it always see old isOn value (true). When i add volatile modifier to isOn nothing changes, but if I change isOn to some class with boolean field and change this field then threads are see changes and it works fine Thanks in advance.

    Read the article

  • Terminating a long-executing thread and then starting a new one in response to user changing parameters via UI in an applet

    - by user1817170
    I have an applet which creates music using the JFugue API and plays it for the user. It allows the user to input a music phrase which the piece will be based on, or lets them choose to have a phrase generated randomly. I had been using the following method (successfully) to simply stop and start the music, which runs in a thread using the Player class from JFugue. I generate the music using my classes and user input from the applet GUI...then... private playerThread pthread; private Thread threadPlyr; private Player player; (from variables declaration) public void startMusic(Pattern p) // pattern is a JFugue object which holds the generated music { if (pthread == null) { pthread = new playerThread(); } else { pthread = null; pthread = new playerThread(); } if (threadPlyr == null) { threadPlyr = new Thread(pthread); } else { threadPlyr = null; threadPlyr = new Thread(pthread); } pthread.setPattern(p); threadPlyr.start(); } class playerThread implements Runnable // plays midi using jfugue Player { private Pattern pt; public void setPattern(Pattern p) { pt = p; } @Override public void run() { try { player.play(pt); // takes a couple mins or more to execute resetGUI(); } catch (Exception exception) { } } } And the following to stop music when user presses the stop/start button while Player.isPlaying() is true: public void stopMusic() { threadPlyr.interrupt(); threadPlyr = null; pthread = null; player.stop(); } Now I want to implement a feature which will allow the user to change parameters while the music is playing, create an updated music pattern, and then play THAT pattern. Basically, the idea is to make it simulate "real time" adjustments to the generated music for the user. Well, I have been beating my head against the wall on this for a couple of weeks. I've read all the standard java documentation, researched, read, and searched forums, and I have tried many different ideas, none of which have succeeded. The problem I've run into with all approaches I've tried is that when I start the new thread with the new, updated musical pattern, all the old threads ALSO start, and there is a cacophony of unintelligible noise instead of my desired output. From what I've gathered, the issue seems to be that all the methods I've come across require that the thread is able to periodically check the value of a "flag" variable and then shut itself down from within its "run" block in response to that variable. However, since my thread makes a call that takes several minutes minimum to execute (playing the music), and I need to terminate it WHILE it is executing this, there is really no safe way to do so. So, I'm wondering if there is something I'm missing when it comes to threads, or if perhaps I can accomplish my goal using a totally different approach. Any ideas or guidance is greatly appreciated! Thank you!

    Read the article

  • definition of wait-free (referring to parallel programming)

    - by tecuhtli
    In Maurice Herlihy paper "Wait-free synchronization" he defines wait-free: "A wait-free implementation of a concurrent data object is one that guarantees that any process can complete any operation in a finite number of steps, regardless the execution speeds on the other processes." www.cs.brown.edu/~mph/Herlihy91/p124-herlihy.pdf Let's take one operation op from the universe. (1) Does the definition mean: "Every process completes a certain operation op in the same finite number n of steps."? (2) Or does it mean: "Every process completes a certain operation op in any finite number of steps. So that a process can complete op in k steps another process in j steps, where k != j."? Just by reading the definition i would understand meaning (2). However this makes no sense to me, since a process executing op in k steps and another time in k + m steps meets the definition, but m steps could be a waiting loop. If meaning (2) is right, can anybody explain to me, why this describes wait-free? In contrast to (2), meaning (1) would guarantee that op is executed in the same number of steps k. So there can't be any additional steps m that are necessary e.g. in a waiting loop. Which meaning is right and why? Thanks a lot, sebastian

    Read the article

  • What to do if one library is not multi-threaded ?

    - by LB
    Hi, I would like to multi-thread an application, however one library i'm using is not multi-thread capable (i don't know what's the right word ? synchronized ?). What are my options ? As far as i know there's nothing in between threads and processes (Runtime.exec) in java (no abstraction in the jvm to have something like an isolated "java process"). How would you deal with that ?

    Read the article

  • Why does java.util.concurrent.ArrayBlockingQueue use 'while' loops instead of 'if' around calls to

    - by theFunkyEngineer
    I have been playing with my own version of this, using 'if', and all seems to be working fine. Of course this will break down horribly if signalAll() is used instead of signal(), but if only one thread at a time is notified, how can this go wrong? Their code here - check out the put() and take() methods; a simpler and more-to-the-point implementation can be seen at the top of the JavaDoc for Condition. Relevant portion of my implementation below. public Object get() { lock.lock(); try { if( items.size() < 1 ) hasItems.await(); Object poppedValue = items.getLast(); items.removeLast(); hasSpace.signal(); return poppedValue; } catch (InterruptedException e) { e.printStackTrace(); return null; } finally { lock.unlock(); } } public void put(Object item) { lock.lock(); try { if( items.size() >= capacity ) hasSpace.await(); items.addFirst(item); hasItems.signal(); return; } catch (InterruptedException e) { e.printStackTrace(); } finally { lock.unlock(); } } P.S. I know that generally, particularly in lib classes like this, one should let the exceptions percolate up.

    Read the article

  • Do these methods have same output?

    - by devrimbaris
    protected synchronized boolean isTimeoutOccured(Duration timeoutDuration) { DateTime now = new DateTime(); if (timeoutOccured == false) { if (new Duration(requestTime.getMillis(), now.getMillis()).compareTo(timeoutDuration) > 0) { timeoutOccured = true; } } return timeoutOccured; } protected boolean isTimeoutOccured2(Duration timeoutDuration) { return atomicTimeOut.compareAndSet(false, new Duration(requestTime.getMillis(), new DateTime().getMillis()).compareTo(timeoutDuration) > 0); }

    Read the article

  • Why is my concurrency capacity so low for my web app on a LAMP EC2 instance?

    - by AMF
    I come from a web developer background and have been humming along building my PHP app, using the CakePHP framework. The problem arose when I began the ab (Apache Bench) testing on the Amazon EC2 instance in which the app resides. I'm getting pretty horrendous average page load times, even though I'm running a c1.medium instance (2 cores, 2GB RAM), and I think I'm doing everything right. I would run: ab -n 200 -c 20 http://localhost/heavy-but-view-cached-page.php Here are the results: Concurrency Level: 20 Time taken for tests: 48.197 seconds Complete requests: 200 Failed requests: 0 Write errors: 0 Total transferred: 392111200 bytes HTML transferred: 392047600 bytes Requests per second: 4.15 [#/sec] (mean) Time per request: 4819.723 [ms] (mean) Time per request: 240.986 [ms] (mean, across all concurrent requests) Transfer rate: 7944.88 [Kbytes/sec] received While the ab test is running, I run VMStat, which shows that Swap stays at 0, CPU is constantly at 80-100% (although I'm not sure I can trust this on a VM), RAM utilization ramps up to about 1.6G (leaving 400M free). Load goes up to about 8 and site slows to a crawl. Here's what I think I'm doing right on the code side: In Chrome browser uncached pages typically load in 800-1000ms, and cached pages load in 300-500ms. Not stunning, but not terrible either. Thanks to view caching, there might be at most one DB query per page-load to write session data. So we can rule out a DB bottleneck. I have APC on. I am using Memcached to serve the view cache and other site caches. xhprof code profiler shows that cached pages take up 10MB-40MB in memory and 100ms - 1000ms in wall time. Pages that would be the worst offenders would look something like this in xhprof: Total Incl. Wall Time (microsec): 330,143 microsecs Total Incl. CPU (microsecs): 320,019 microsecs Total Incl. MemUse (bytes): 36,786,192 bytes Total Incl. PeakMemUse (bytes): 46,667,008 bytes Number of Function Calls: 5,195 My Apache config: KeepAlive On MaxKeepAliveRequests 100 KeepAliveTimeout 3 <IfModule mpm_prefork_module> StartServers 5 MinSpareServers 5 MaxSpareServers 10 MaxClients 120 MaxRequestsPerChild 1000 </IfModule> Is there something wrong with the server? Some gotcha with the EC2? Or is it my code? Some obvious setting I should look into? Too many DNS lookups? What am I missing? I really want to get to 1,000 concurrency capacity, but at this rate, it ain't gonna happen.

    Read the article

  • How to solve concurrency problems in ASP.NET Windows-Workflow and ActiveRecord/NHibernate?

    - by Famous Nerd
    I have found that ActiveRecord uses the Session-Scope object within the ASP.NET application and that if the web-site is read-write we can have a tug-o-war between the Workflow's own Data-Access SessionScope and that of the ASP.NET site. I would really like to have the WindowsWorkflow Runtime use the same object session as the web-site however, they have different lifetimes. Sometimes, a web-request may save a very simple piece of data which would execute quickly however, if the web-site kicks off a workflow process.. how can that workflow make data-modifications while still allowing the Appliaction_EndRequest to dispose the ASP.NET SessionScope ... it's like ownership of the SessionScope should be shared between the workflow runtime and the ASP.NET website. Manual Workflow Scheduler may be the Savior... if a workflow is synchronous and merely uses CallExternalMethod to interact with the Host then we could constrain all the data-access to the host.. then the sessionScope can exist once. This however, won't solve the problem of a delay activity... if this delay fires, we could need to update data... in this case we'd need an isolated Session Scope and concurrency may arise. This however, differs from SharePoint workflows where it seems that the SharePoint workflow can save data from the web and the workflow and that concurrency is handled through other means. Can anyone offer any suggestions on how to allow the workflow to manage data and play nice with ASP.NET web sites?

    Read the article

  • Please suggest good book/website to for Threads and Concurrency?

    - by learner
    I have gone through Head First Java and some other sites but I couldn't find complete stuff related to Threads and additional concurrency packages at one place. Please suggest a book/website which covers complete Threads with more details like Synchronize and locking of objects More detailed about volatile Visibility issues in Threads java.util.concurrent package java.util.concurrent.atomic package

    Read the article

  • how to handle db concurrency in client-server application in C#?

    - by RAJ K
    I am developing an application in C# WPF which will have Client-Server architecture (Client will do products sales billing). I am novice in this area and I asked this question to start my development process. Click here to view question. So, ultimately I have selected MySQl, WCf & WPF. Now I have one silly question. Do i need to handle DB concurrency explicitly in my application (like 3 clients inserting data same time) or MySQl will handle this without any conflict? To accomplish my project i thought, I will create a service in WCf which will do DB queries from client application. Do you have any suggestion to improve my application performance.

    Read the article

  • SQL SERVER – Concurrancy Problems and their Relationship with Isolation Level

    - by pinaldave
    Concurrency is simply put capability of the machine to support two or more transactions working with the same data at the same time. This usually comes up with data is being modified, as during the retrieval of the data this is not the issue. Most of the concurrency problems can be avoided by SQL Locks. There are four types of concurrency problems visible in the normal programming. 1)      Lost Update – This problem occurs when there are two transactions involved and both are unaware of each other. The transaction which occurs later overwrites the transactions created by the earlier update. 2)      Dirty Reads – This problem occurs when a transactions selects data that isn’t committed by another transaction leading to read the data which may not exists when transactions are over. Example: Transaction 1 changes the row. Transaction 2 changes the row. Transaction 1 rolls back the changes. Transaction 2 has selected the row which does not exist. 3)      Nonrepeatable Reads – This problem occurs when two SELECT statements of the same data results in different values because another transactions has updated the data between the two SELECT statements. Example: Transaction 1 selects a row, which is later on updated by Transaction 2. When Transaction A later on selects the row it gets different value. 4)      Phantom Reads – This problem occurs when UPDATE/DELETE is happening on one set of data and INSERT/UPDATE is happening on the same set of data leading inconsistent data in earlier transaction when both the transactions are over. Example: Transaction 1 is deleting 10 rows which are marked as deleting rows, during the same time Transaction 2 inserts row marked as deleted. When Transaction 1 is done deleting rows, there will be still rows marked to be deleted. When two or more transactions are updating the data, concurrency is the biggest issue. I commonly see people toying around with isolation level or locking hints (e.g. NOLOCK) etc, which can very well compromise your data integrity leading to much larger issue in future. Here is the quick mapping of the isolation level with concurrency problems: Isolation Dirty Reads Lost Update Nonrepeatable Reads Phantom Reads Read Uncommitted Yes Yes Yes Yes Read Committed No Yes Yes Yes Repeatable Read No No No Yes Snapshot No No No No Serializable No No No No I hope this 400 word small article gives some quick understanding on concurrency issues and their relation to isolation level. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Are there concurrency problems when using -performSelector:withObject:afterDelay: ?

    - by mystify
    For example, I often use this: [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:someDelay]; Now, lets say I call this 10 times to perform at the exact same delay, like: [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; - (void)doSomethingAfterDelay:(id)someObject { /* access an array, read stuff, write stuff, do different things that would suffer in multithreaded environments .... all operations are nonatomic! */ } I have observed pretty strange behavior when doing things like this. For my understanding, this method schedules a timer to fire on the current thread, so in this case the main thread. But since it doesn't create new threads, it actually should not be possible to run into concurrency problems, right?

    Read the article

  • Are there concurrency problems when using -performSelector:withObject:afterDelay: ?

    - by mystify
    For example, I often use this: [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:someDelay]; Now, lets say I call this 10 times to perform at the exact same delay, like: [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; [self performSelector:@selector(doSomethingAfterDelay:) withObject:someObject afterDelay:2.0]; - (void)doSomethingAfterDelay:(id)someObject { /* access an array, read stuff, write stuff, do different things that would suffer in multithreaded environments .... all operations are nonatomic! */ } I have observed pretty strange behavior when doing things like this. For my understanding, this method schedules a timer to fire on the current thread, so in this case the main thread. But since it doesn't create new threads, it actually should not be possible to run into concurrency problems, right?

    Read the article

  • Matrix Multiplication with C++ AMP

    - by Daniel Moth
    As part of our API tour of C++ AMP, we looked recently at parallel_for_each. I ended that post by saying we would revisit parallel_for_each after introducing array and array_view. Now is the time, so this is part 2 of parallel_for_each, and also a post that brings together everything we've seen until now. The code for serial and accelerated Consider a naïve (or brute force) serial implementation of matrix multiplication  0: void MatrixMultiplySerial(std::vector<float>& vC, const std::vector<float>& vA, const std::vector<float>& vB, int M, int N, int W) 1: { 2: for (int row = 0; row < M; row++) 3: { 4: for (int col = 0; col < N; col++) 5: { 6: float sum = 0.0f; 7: for(int i = 0; i < W; i++) 8: sum += vA[row * W + i] * vB[i * N + col]; 9: vC[row * N + col] = sum; 10: } 11: } 12: } We notice that each loop iteration is independent from each other and so can be parallelized. If in addition we have really large amounts of data, then this is a good candidate to offload to an accelerator. First, I'll just show you an example of what that code may look like with C++ AMP, and then we'll analyze it. It is assumed that you included at the top of your file #include <amp.h> 13: void MatrixMultiplySimple(std::vector<float>& vC, const std::vector<float>& vA, const std::vector<float>& vB, int M, int N, int W) 14: { 15: concurrency::array_view<const float,2> a(M, W, vA); 16: concurrency::array_view<const float,2> b(W, N, vB); 17: concurrency::array_view<concurrency::writeonly<float>,2> c(M, N, vC); 18: concurrency::parallel_for_each(c.grid, 19: [=](concurrency::index<2> idx) restrict(direct3d) { 20: int row = idx[0]; int col = idx[1]; 21: float sum = 0.0f; 22: for(int i = 0; i < W; i++) 23: sum += a(row, i) * b(i, col); 24: c[idx] = sum; 25: }); 26: } First a visual comparison, just for fun: The beginning and end is the same, i.e. lines 0,1,12 are identical to lines 13,14,26. The double nested loop (lines 2,3,4,5 and 10,11) has been transformed into a parallel_for_each call (18,19,20 and 25). The core algorithm (lines 6,7,8,9) is essentially the same (lines 21,22,23,24). We have extra lines in the C++ AMP version (15,16,17). Now let's dig in deeper. Using array_view and extent When we decided to convert this function to run on an accelerator, we knew we couldn't use the std::vector objects in the restrict(direct3d) function. So we had a choice of copying the data to the the concurrency::array<T,N> object, or wrapping the vector container (and hence its data) with a concurrency::array_view<T,N> object from amp.h – here we used the latter (lines 15,16,17). Now we can access the same data through the array_view objects (a and b) instead of the vector objects (vA and vB), and the added benefit is that we can capture the array_view objects in the lambda (lines 19-25) that we pass to the parallel_for_each call (line 18) and the data will get copied on demand for us to the accelerator. Note that line 15 (and ditto for 16 and 17) could have been written as two lines instead of one: extent<2> e(M, W); array_view<const float, 2> a(e, vA); In other words, we could have explicitly created the extent object instead of letting the array_view create it for us under the covers through the constructor overload we chose. The benefit of the extent object in this instance is that we can express that the data is indeed two dimensional, i.e a matrix. When we were using a vector object we could not do that, and instead we had to track via additional unrelated variables the dimensions of the matrix (i.e. with the integers M and W) – aren't you loving C++ AMP already? Note that the const before the float when creating a and b, will result in the underling data only being copied to the accelerator and not be copied back – a nice optimization. A similar thing is happening on line 17 when creating array_view c, where we have indicated that we do not need to copy the data to the accelerator, only copy it back. The kernel dispatch On line 18 we make the call to the C++ AMP entry point (parallel_for_each) to invoke our parallel loop or, as some may say, dispatch our kernel. The first argument we need to pass describes how many threads we want for this computation. For this algorithm we decided that we want exactly the same number of threads as the number of elements in the output matrix, i.e. in array_view c which will eventually update the vector vC. So each thread will compute exactly one result. Since the elements in c are organized in a 2-dimensional manner we can organize our threads in a two-dimensional manner too. We don't have to think too much about how to create the first argument (a grid) since the array_view object helpfully exposes that as a property. Note that instead of c.grid we could have written grid<2>(c.extent) or grid<2>(extent<2>(M, N)) – the result is the same in that we have specified M*N threads to execute our lambda. The second argument is a restrict(direct3d) lambda that accepts an index object. Since we elected to use a two-dimensional extent as the first argument of parallel_for_each, the index will also be two-dimensional and as covered in the previous posts it represents the thread ID, which in our case maps perfectly to the index of each element in the resulting array_view. The kernel itself The lambda body (lines 20-24), or as some may say, the kernel, is the code that will actually execute on the accelerator. It will be called by M*N threads and we can use those threads to index into the two input array_views (a,b) and write results into the output array_view ( c ). The four lines (21-24) are essentially identical to the four lines of the serial algorithm (6-9). The only difference is how we index into a,b,c versus how we index into vA,vB,vC. The code we wrote with C++ AMP is much nicer in its indexing, because the dimensionality is a first class concept, so you don't have to do funny arithmetic calculating the index of where the next row starts, which you have to do when working with vectors directly (since they store all the data in a flat manner). I skipped over describing line 20. Note that we didn't really need to read the two components of the index into temporary local variables. This mostly reflects my personal choice, in some algorithms to break down the index into local variables with names that make sense for the algorithm, i.e. in this case row and col. In other cases it may i,j,k or x,y,z, or M,N or whatever. Also note that we could have written line 24 as: c(idx[0], idx[1])=sum  or  c(row, col)=sum instead of the simpler c[idx]=sum Targeting a specific accelerator Imagine that we had more than one hardware accelerator on a system and we wanted to pick a specific one to execute this parallel loop on. So there would be some code like this anywhere before line 18: vector<accelerator> accs = MyFunctionThatChoosesSuitableAccelerators(); accelerator acc = accs[0]; …and then we would modify line 18 so we would be calling another overload of parallel_for_each that accepts an accelerator_view as the first argument, so it would become: concurrency::parallel_for_each(acc.default_view, c.grid, ...and the rest of your code remains the same… how simple is that? Comments about this post by Daniel Moth welcome at the original blog.

    Read the article

  • hosting simple python scripts in a container to handle concurrency, configuration, caching, etc.

    - by Justin Grant
    My first real-world Python project is to write a simple framework (or re-use/adapt an existing one) which can wrap small python scripts (which are used to gather custom data for a monitoring tool) with a "container" to handle boilerplate tasks like: fetching a script's configuration from a file (and keeping that info up to date if the file changes and handle decryption of sensitive config data) running multiple instances of the same script in different threads instead of spinning up a new process for each one expose an API for caching expensive data and storing persistent state from one script invocation to the next Today, script authors must handle the issues above, which usually means that most script authors don't handle them correctly, causing bugs and performance problems. In addition to avoiding bugs, we want a solution which lowers the bar to create and maintain scripts, especially given that many script authors may not be trained programmers. Below are examples of the API I've been thinking of, and which I'm looking to get your feedback about. A scripter would need to build a single method which takes (as input) the configuration that the script needs to do its job, and either returns a python object or calls a method to stream back data in chunks. Optionally, a scripter could supply methods to handle startup and/or shutdown tasks. HTTP-fetching script example (in pseudocode, omitting the actual data-fetching details to focus on the container's API): def run (config, context, cache) : results = http_library_call (config.url, config.http_method, config.username, config.password, ...) return { html : results.html, status_code : results.status, headers : results.response_headers } def init(config, context, cache) : config.max_threads = 20 # up to 20 URLs at one time (per process) config.max_processes = 3 # launch up to 3 concurrent processes config.keepalive = 1200 # keep process alive for 10 mins without another call config.process_recycle.requests = 1000 # restart the process every 1000 requests (to avoid leaks) config.kill_timeout = 600 # kill the process if any call lasts longer than 10 minutes Database-data fetching script example might look like this (in pseudocode): def run (config, context, cache) : expensive = context.cache["something_expensive"] for record in db_library_call (expensive, context.checkpoint, config.connection_string) : context.log (record, "logDate") # log all properties, optionally specify name of timestamp property last_date = record["logDate"] context.checkpoint = last_date # persistent checkpoint, used next time through def init(config, context, cache) : cache["something_expensive"] = get_expensive_thing() def shutdown(config, context, cache) : expensive = cache["something_expensive"] expensive.release_me() Is this API appropriately "pythonic", or are there things I should do to make this more natural to the Python scripter? (I'm more familiar with building C++/C#/Java APIs so I suspect I'm missing useful Python idioms.) Specific questions: is it natural to pass a "config" object into a method and ask the callee to set various configuration options? Or is there another preferred way to do this? when a callee needs to stream data back to its caller, is a method like context.log() (see above) appropriate, or should I be using yield instead? (yeild seems natural, but I worry it'd be over the head of most scripters) My approach requires scripts to define functions with predefined names (e.g. "run", "init", "shutdown"). Is this a good way to do it? If not, what other mechanism would be more natural? I'm passing the same config, context, cache parameters into every method. Would it be better to use a single "context" parameter instead? Would it be better to use global variables instead? Finally, are there existing libraries you'd recommend to make this kind of simple "script-running container" easier to write?

    Read the article

< Previous Page | 13 14 15 16 17 18 19 20 21 22 23 24  | Next Page >