Search Results

Search found 5026 results on 202 pages for 'blocked threads'.

Page 58/202 | < Previous Page | 54 55 56 57 58 59 60 61 62 63 64 65 | Next Page >

Haskell vs Erlang for web services

- by Zachary K

I am looking to start an experimental project using a functional language and am trying to decide beween Erlang and Haskell, and both have some points that I really like. I like Haskell's strong type system and purity. I have a feeling it will make it easier to write really reliable code. And I think that the power of haskell will make some of what I want to do much easier. On the minus side I get the feeling that some of the Frameworks for doing web stuff on Haskell such as Yesod are not as advanced as their Erlang counter parts. I rather like the Erlang approach to threads and to fault tollerence. I have a feeling that the scalability of Erlang could be a major plus. Which leeds to to my question, what has people's exerience been in implementing web application backends in both Haskell and Erlang. Are there packages for Haskell to provide some of the lightweight threads and actors that one has in Erlang?

Read the article
IBM "per core" comparisons for SPECjEnterprise2010

- by jhenning

I recently stumbled upon a blog entry from Roman Kharkovski (an IBM employee) comparing some SPECjEnterprise2010 results for IBM vs. Oracle. Mr. Kharkovski's blog claims that SPARC delivers half the transactions per core vs. POWER7. Prior to any argument, I should say that my predisposition is to like Mr. Kharkovski, because he says that his blog is intended to be factual; that the intent is to try to avoid marketing hype and FUD tactic; and mostly because he features a picture of himself wearing a bike helmet (me too). Therefore, in a spirit of technical argument, rather than FUD fight, there are a few areas in his comparison that should be discussed. Scaling is not free For any benchmark, if a small system scores 13k using quantity R1 of some resource, and a big system scores 57k using quantity R2 of that resource, then, sure, it's tempting to divide: is 13k/R1 > 57k/R2 ? It is tempting, but not necessarily educational. The problem is that scaling is not free. Building big systems is harder than building small systems. Scoring 13k/R1 on a little system provides no guarantee whatsoever that one can sustain that ratio when attempting to handle more than 4 times as many users. Choosing the denominator radically changes the picture When ratios are used, one can vastly manipulate appearances by the choice of denominator. In this case, lots of choices are available for the resource to be compared (R1 and R2 above). IBM chooses to put cores in the denominator. Mr. Kharkovski provides some reasons for that choice in his blog entry. And yet, it should be noted that the very concept of a core is: arbitrary: not necessarily comparable across vendors; fluid: modern chips shift chip resources in response to load; and invisible: unless you have a microscope, you can't see it. By contrast, one can actually see processor chips with the naked eye, and they are a bit easier to count. If we put chips in the denominator instead of cores, we get: 13161.07 EjOPS / 4 chips = 3290 EjOPS per chip for IBM vs 57422.17 EjOPS / 16 chips = 3588 EjOPS per chip for Oracle The choice of denominator makes all the difference in the appearance. Speaking for myself, dividing by chips just seems to make more sense, because: I can see chips and count them; and I can accurately compare the number of chips in my system to the count in some other vendor's system; and Tthe probability of being able to continue to accurately count them over the next 10 years of microprocessor development seems higher than the probability of being able to accurately and comparably count "cores". SPEC Fair use requirements Speaking as an individual, not speaking for SPEC and not speaking for my employer, I wonder whether Mr. Kharkovski's blog article, taken as a whole, meets the requirements of the SPEC Fair Use rule www.spec.org/fairuse.html section I.D.2. For example, Mr. Kharkovski's footnote (1) begins Results from http://www.spec.org as of 04/04/2013 Oracle SUN SPARC T5-8 449 EjOPS/core SPECjEnterprise2010 (Oracle's WLS best SPECjEnterprise2010 EjOPS/core result on SPARC). IBM Power730 823 EjOPS/core (World Record SPECjEnterprise2010 EJOPS/core result) The questionable tactic, from a Fair Use point of view, is that there is no such metric at the designated location. At www.spec.org, You can find the SPEC metric 57422.17 SPECjEnterprise2010 EjOPS for Oracle and You can also find the SPEC metric 13161.07 SPECjEnterprise2010 EjOPS for IBM. Despite the implication of the footnote, you will not find any mention of 449 nor anything that says 823. SPEC says that you can, under its fair use rule, derive your own values; but it emphasizes: "The context must not give the appearance that SPEC has created or endorsed the derived value." Substantiation and transparency Although SPEC disclaims responsibility for non-SPEC information (section I.E), it says that non-SPEC data and methods should be accurate, should be explained, should be substantiated. Unfortunately, it is difficult or impossible for the reader to independently verify the pricing: Were like units compared to like (e.g. list price to list price)? Were all components (hw, sw, support) included? Were all fees included? Note that when tpc.org shows IBM pricing, there are often items such as "PROCESSOR ACTIVATION" and "MEMORY ACTIVATION". Without the transparency of a detailed breakdown, the pricing claims are questionable. T5 claim for "Fastest Processor" Mr. Kharkovski several times questions Oracle's claim for fastest processor, writing You see, when you publish industry benchmarks, people may actually compare your results to other vendor's results. Well, as we performance people always say, "it depends". If you believe in performance-per-core as the primary way of looking at the world, then yes, the POWER7+ is impressive, spending its chip resources to support up to 32 threads (8 cores x 4 threads). Or, it just might be useful to consider performance-per-chip. Each SPARC T5 chip allows 128 hardware threads to be simultaneously executing (16 cores x 8 threads). The Industry Standard Benchmark that focuses specifically on processor chip performance is SPEC CPU2006. For this very well known and popular benchmark, SPARC T5: provides better performance than both POWER7 and POWER7+, for 1 chip vs. 1 chip, for 8 chip vs. 8 chip, for integer (SPECint_rate2006) and floating point (SPECfp_rate2006), for Peak tuning and for Base tuning. For example, at the 8-chip level, integer throughput (SPECint_rate2006) is: 3750 for SPARC 2170 for POWER7+. You can find the details at the March 2013 BestPerf CPU2006 page SPEC is a trademark of the Standard Performance Evaluation Corporation, www.spec.org. The two specific results quoted for SPECjEnterprise2010 are posted at the URLs linked from the discussion. Results for SPEC CPU2006 were verified at spec.org 1 July 2013, and can be rechecked here.

Read the article
Why lock statements don't scale

- by Alex.Davies

We are going to have to stop using lock statements one day. Just like we had to stop using goto statements. The problem is similar, they're pretty easy to follow in small programs, but code with locks isn't composable. That means that small pieces of program that work in isolation can't necessarily be put together and work together. Of course actors scale fine :) Why lock statements don't scale as software gets bigger Deadlocks. You have a program with lots of threads picking up lots of locks. You already know that if two of your threads both try to pick up a lock that the other already has, they will deadlock. Your program will come to a grinding halt, and there will be fire and brimstone. "Easy!" you say, "Just make sure all the threads pick up the locks in the same order." Yes, that works. But you've broken composability. Now, to add a new lock to your code, you have to consider all the other locks already in your code and check that they are taken in the right order. Algorithm buffs will have noticed this approach means it takes quadratic time to write a program. That's bad. Why lock statements don't scale as hardware gets bigger Memory bus contention There's another headache, one that most programmers don't usually need to think about, but is going to bite us in a big way in a few years. Locking needs exclusive use of the entire system's memory bus while taking out the lock. That's not too bad for a single or dual-core system, but already for quad-core systems it's a pretty large overhead. Have a look at this blog about the .NET 4 ThreadPool for some numbers and a weird analogy (see the author's comment). Not too bad yet, but I'm scared my 1000 core machine of the future is going to go slower than my machine today! I don't know the answer to this problem yet. Maybe some kind of per-core work queue system with hierarchical work stealing. Definitely hardware support. But what I do know is that using locks specifically prevents any solution to this. We should be abstracting our code away from the details of locks as soon as possible, so we can swap in whatever solution arrives when it does. NAct uses locks at the moment. But my advice is that you code using actors (which do scale well as software gets bigger). And when there's a better way of implementing actors that'll scale well as hardware gets bigger, only NAct needs to work out how to use it, and your program will go fast on it's own.

Read the article
A Method for Reducing Contention and Overhead in Worker Queues for Multithreaded Java Applications

- by Janice J. Heiss

A java.net article, rich in practical resources, by IBM India Labs’ Sathiskumar Palaniappan, Kavitha Varadarajan, and Jayashree Viswanathan, explores the challenge of writing code in a way that that effectively makes use of the resources of modern multicore processors and multiprocessor servers.As the article states: “Many server applications, such as Web servers, application servers, database servers, file servers, and mail servers, maintain worker queues and thread pools to handle large numbers of short tasks that arrive from remote sources. In general, a ‘worker queue’ holds all the short tasks that need to be executed, and the threads in the thread pool retrieve the tasks from the worker queue and complete the tasks. Since multiple threads act on the worker queue, adding tasks to and deleting tasks from the worker queue needs to be synchronized, which introduces contention in the worker queue.” The article goes on to explain ways that developers can reduce contention by maintaining one queue per thread. It also demonstrates a work-stealing technique that helps in effectively utilizing the CPU in multicore systems. Read the rest of the article here.

Read the article
Simple Concurrency with Dataflow Variables in C++

What are dataflow variables anyway? Programming - Languages - Dataflow - C++ - Threads

Read the article
matplotlib and python multithread file processing

- by Napseis

I have a large number of files to process. I have written a script that get, sort and plot the datas I want. So far, so good. I have tested it and it gives the desired result. Then I wanted to do this using multithreading. I have looked into the doc and examples on the internet, and using one thread in my program works fine. But when I use more, at some point I get random matplotlib error, and I suspect some conflict there, even though I use a function with names for the plots, and iI can't see where the problem could be. Here is the whole script should you need more comment, i'll add them. Thank you. #!/usr/bin/python import matplotlib matplotlib.use('GTKAgg') import numpy as np from scipy.interpolate import griddata import matplotlib.pyplot as plt import matplotlib.colors as mcl from matplotlib import rc #for latex import time as tm import sys import threading import Queue #queue in 3.2 and Queue in 2.7 ! import pdb #the debugger rc('text', usetex=True)#for latex map=0 #initialize the map index. It will be use to index the array like this: array[map,[x,y]] time=np.zeros(1) #an array to store the time middle_h=np.zeros((0,3)) #x phi c #for the middle of the box current_file=open("single_void_cyl_periodic_phi_c_middle_h_out",'r') for line in current_file: if line.startswith('# === time'): map+=1 np.append(time,[float(line.strip('# === time '))]) elif line.startswith('#'): pass else: v=np.fromstring(line,dtype=float,sep=' ') middle_h=np.vstack( (middle_h,v[[1,3,4]]) ) current_file.close() middle_h=middle_h.reshape((map,-1,3)) #3d array: map, x, phi,c ##### def load_and_plot(): #will load a map file, and plot it along with the corresponding profile loaded before while not exit_flag: print("fecthing work ...") #try: if not tasks_queue.empty(): map_index=tasks_queue.get() print("----> working on map: %s" %map_index) x,y,zp=np.loadtxt("single_void_cyl_growth_periodic_post_map_"+str(map_index),unpack=True, usecols=[1, 2,3]) for i,el in enumerate(zp): if el<0.: zp[i]=0. xv=np.unique(x) yv=np.unique(y) X,Y= np.meshgrid(xv,yv) Z = griddata((x, y), zp, (X, Y),method='nearest') figure=plt.figure(num=map_index,figsize=(14, 8)) ax1=plt.subplot2grid((2,2),(0,0)) ax1.plot(middle_h[map_index,:,0],middle_h[map_index,:,1],'*b') ax1.grid(True) ax1.axis([-15, 15, 0, 1]) ax1.set_title('Profiles') ax1.set_ylabel(r'$\phi$') ax1.set_xlabel('x') ax2=plt.subplot2grid((2,2),(1,0)) ax2.plot(middle_h[map_index,:,0],middle_h[map_index,:,2],'*r') ax2.grid(True) ax2.axis([-15, 15, 0, 1]) ax2.set_ylabel('c') ax2.set_xlabel('x') ax3=plt.subplot2grid((2,2),(0,1),rowspan=2,aspect='equal') sub_contour=ax3.contourf(X,Y,Z,np.linspace(0,1,11),vmin=0.) figure.colorbar(sub_contour,ax=ax3) figure.savefig('single_void_cyl_'+str(map_index)+'.png') plt.close(map_index) tasks_queue.task_done() else: print("nothing left to do, other threads finishing,sleeping 2 seconds...") tm.sleep(2) # except: # print("failed this time: %s" %map_index+". Sleeping 2 seconds") # tm.sleep(2) ##### exit_flag=0 nb_threads=2 tasks_queue=Queue.Queue() threads_list=[] jobs=list(range(map)) #each job is composed of a map print("inserting jobs in the queue...") for job in jobs: tasks_queue.put(job) print("done") #launch the threads for i in range(nb_threads): working_bee=threading.Thread(target=load_and_plot) working_bee.daemon=True print("starting thread "+str(i)+' ...') threads_list.append(working_bee) working_bee.start() #wait for all tasks to be treated tasks_queue.join() #flip the flag, so the threads know it's time to stop exit_flag=1 for t in threads_list: print("waiting for threads %s to stop..."%t) t.join() print("all threads stopped")

Read the article
Parallelism implies concurrency but not the other way round right?

- by Cedric Martin

I often read that parallelism and concurrency are different things. Very often the answerers/commenters go as far as writing that they're two entirely different things. Yet in my view they're related but I'd like some clarification on that. For example if I'm on a multi-core CPU and manage to divide the computation into x smaller computation (say using fork/join) each running in its own thread, I'll have a program that is both doing parallel computation (because supposedly at any point in time several threads are going to run on several cores) and being concurrent right? While if I'm simply using, say, Java and dealing with UI events and repaints on the Event Dispatch Thread plus running the only thread I created myself, I'll have a program that is concurrent (EDT + GC thread + my main thread etc.) but not parallel. I'd like to know if I'm getting this right and if parallelism (on a "single but multi-cores" system) always implies concurrency or not? Also, are multi-threaded programs running on multi-cores CPU but where the different threads are doing totally different computation considered to be using "parallelism"?

Read the article
Number crunching algo for learning multithreading?

- by Austin Henley

I have never really implemented anything dealing with threads; my only experience with them is reading about them in my undergrad. So I want to change that by writing a program that does some number crunching, but splits it up into several threads. My first ideas for this hopefully simple multithreaded program were: Beal's Conjecture brute force based on my SO question. Bailey-Borwein-Plouffe formula for calculating Pi. Prime number brute force search As you can see I have an interest in math and thought it would be fun to incorporate it into this, rather than coding something such as a server which wouldn't be nearly as fun! But the 3 ideas don't seem very appealing and I have already done some work on them in the past so I was curious if anyone had any ideas in the same spirit as these 3 that I could implement?

Read the article
Why isn't reflection on the SCJP / OCJP?

- by Nick Rosencrantz

I read through Kathy Sierra's SCJP study guide and I will read it again more throughly to improve myself as a Java programmer and be able to take the certification either Java 6 or wait for the Java 7 exam (I'm already employed as Java developer so I'm in no hurry to take the exam.) Now I wonder why reflection is not on the exam? The book it seems covers everything that should be on the exam and AFAIK reflection is at least as important as threads if not more used inpractice since many frameworks use reflection. Do you know why reflection is not part of the SCJP? Do you agree that it's at least important to know reflection as threads? Thanks for any answer

Read the article
Why should most logic be in the monitor objects and not in the thread objects when writing concurrent software in Java?

- by refuser

When I took the Realtime and Concurrent programming course our lecturer told us that when writing concurrent programs in Java and using monitors, most of the logic should be in the monitor and as little as possible in the threads that access it. I never really understood why and I really would like to. Let me clarify. In this particular case we had several classes. Lift extends Thread Person extends Thread LiftView Monitor, all methods synchronized. This is nothing we came up with, our task was to implement a lift simulation with persons waiting on different floors, and theses were the class skeletons that were given. Then our lecturer said to implement most of the logic in the monitor (he was talking about class Monitor as THE monitor) and as little as possible in the threads. Why would he make a statement like that?

Read the article
Unnecessary Java context switches

- by Paul Morrison

I have a network of Java Threads (Flow-Based Programming) communicating via fixed-capacity channels - running under WindowsXP. What we expected, based on our experience with "green" threads (non-preemptive), would be that threads would switch context less often (thus reducing CPU time) if the channels were made bigger. However, we found that increasing channel size does not make any difference to the run time. What seems to be happening is that Java decides to switch threads even though channels aren't full or empty (i.e. even though a thread doesn't have to suspend), which costs CPU time for no apparent advantage. Also changing Thread priorities doesn't make any observable difference. My question is whether there is some way of persuading Java not to make unnecessary context switches, but hold off switching until it is really necessary to switch threads - is there some way of changing Java's dispatching logic? Or is it reacting to something I didn't pay attention to?! Or are there other asynchronism mechanisms, e.g. Thread factories, Runnable(s), maybe even daemons (!). The answer appears to be non-obvious, as so far none of my correspondents has come up with an answer (including most recently two CS profs). Or maybe I'm missing something that's so obvious that people can't imagine my not knowing it... I've added the send and receive code here - not very elegant, but it seems to work...;-) In case you are wondering, I thought the goLock logic in 'send' might be causing the problem, but removing it temporarily didn't make any difference. I have added the code for send and receive... public synchronized Packet receive() { if (isDrained()) { return null; } while (isEmpty()) { try { wait(); } catch (InterruptedException e) { close(); return null; } if (isDrained()) { return null; } } if (isDrained()) { return null; } if (isFull()) { notifyAll(); // notify other components waiting to send } Packet packet = array[receivePtr]; array[receivePtr] = null; receivePtr = (receivePtr + 1) % array.length; //notifyAll(); // only needed if it was full usedSlots--; packet.setOwner(receiver); if (null == packet.getContent()) { traceFuncs("Received null packet"); } else { traceFuncs("Received: " + packet.toString()); } return packet; } synchronized boolean send(final Packet packet, final OutputPort op) { sender = op.sender; if (isClosed()) { return false; } while (isFull()) { try { wait(); } catch (InterruptedException e) { indicateOneSenderClosed(); return false; } sender = op.sender; } if (isClosed()) { return false; } try { receiver.goLock.lockInterruptibly(); } catch (InterruptedException ex) { return false; } try { packet.clearOwner(); array[sendPtr] = packet; sendPtr = (sendPtr + 1) % array.length; usedSlots++; // move this to here if (receiver.getStatus() == StatusValues.DORMANT || receiver.getStatus() == StatusValues.NOT_STARTED) { receiver.activate(); // start or wake up if necessary } else { notifyAll(); // notify receiver // other components waiting to send to this connection may also get // notified, // but this is handled by while statement } sender = null; Component.network.active = true; } finally { receiver.goLock.unlock(); } return true; }

Read the article
Achieving more fluent movement

- by Robin92

I'm working on my first OpenGL 2D game and I've just locked the framerate of my game. However, the way objects move is far from satisfying: they tend to lag, which is shown in this video. I've thought how more fluent animation can be achieved and started getting segmentation faults due to accessing the same object by two different threads. I've tried the following threads' setting: Drawing, creating new objects Moving player, moving objects, deleting objects Currently my application uses this setting: Drawing, creating new objects, moving objects, deleting object Moving player Any ideas would be appreciated. EDIT: I've tried increasing the FPS limit but lags are noticeable even at 200 fps.

Read the article
Why is multithreading often preferred for improving performance?

- by user1849534

I have a question, it's about why programmers seems to love concurrency and multi-threaded programs in general. I'm considering 2 main approaches here: an async approach basically based on signals, or just an async approach as called by many papers and languages like the new C# 5.0 for example, and a "companion thread" that manages the policy of your pipeline a concurrent approach or multi-threading approach I will just say that I'm thinking about the hardware here and the worst case scenario, and I have tested this 2 paradigms myself, the async paradigm is a winner at the point that I don't get why people 90% of the time talk about multi-threading when they want to speed up things or make a good use of their resources. I have tested multi-threaded programs and async program on an old machine with an Intel quad-core that doesn't offer a memory controller inside the CPU, the memory is managed entirely by the motherboard, well in this case performances are horrible with a multi-threaded application, even a relatively low number of threads like 3-4-5 can be a problem, the application is unresponsive and is just slow and unpleasant. A good async approach is, on the other hand, probably not faster but it's not worst either, my application just waits for the result and doesn't hangs, it's responsive and there is a much better scaling going on. I have also discovered that a context change in the threading world it's not that cheap in real world scenario, it's in fact quite expensive especially when you have more than 2 threads that need to cycle and swap among each other to be computed. On modern CPUs the situation it's not really that different, the memory controller it's integrated but my point is that an x86 CPUs is basically a serial machine and the memory controller works the same way as with the old machine with an external memory controller on the motherboard. The context switch is still a relevant cost in my application and the fact that the memory controller it's integrated or that the newer CPU have more than 2 core it's not bargain for me. For what i have experienced the concurrent approach is good in theory but not that good in practice, with the memory model imposed by the hardware, it's hard to make a good use of this paradigm, also it introduces a lot of issues ranging from the use of my data structures to the join of multiple threads. Also both paradigms do not offer any security abut when the task or the job will be done in a certain point in time, making them really similar from a functional point of view. According to the X86 memory model, why the majority of people suggest to use concurrency with C++ and not just an async approach ? Also why not considering the worst case scenario of a computer where the context switch is probably more expensive than the computation itself ?

Read the article
Ubuntu and Surface Pro, wifi issues not fixed?

- by Confused or too stupid

Just a quick question, from the numerous other threads I found, is the wifi still a no go straight out of installing Ubuntu? I tried this Dual boot Surface Pro with Ubuntu? with 13.04, the wifi is sporadic in working. From that and other threads I gathered that 13.10 would include the fix, so I tossed that onto the Surface as well, with the same results. Is there anything I am missing? Is there any other distro that has better luck with the Marvell chip?

Read the article
What could be a reason for cross-platform server applications developer to make his app work in multiple processes?

- by Kabumbus

So we consider a server app development - heavily loaded with messing with big data streams.An app will be running on one powerful server. a server app shall be developed in form of crossplatform application - so to work on Windows, Mac OS X and Linux. So same code many platforms for standing alone server architecture. We wonder what benefits does distributing applications not only over threads but over processes as wall would bring to programmers and to server end users and why? Some people sad to me that even having 48 cores, 4 process threads would be shared via OS throe all cores... is it true BTW?

Read the article
Parallel processing via multithreading in Java

- by Robz

There are certain algorithms whose running time can decrease significantly when one divides up a task and gets each part done in parallel. One of these algorithms is merge sort, where a list is divided into infinitesimally smaller parts and then recombined in a sorted order. I decided to do an experiment to test whether or not I could I increase the speed of this sort by using multiple threads. I am running the following functions in Java on a Quad-Core Dell with Windows Vista. One function (the control case) is simply recursive: // x is an array of N elements in random order public int[] mergeSort(int[] x) { if (x.length == 1) return x; // Dividing the array in half int[] a = new int[x.length/2]; int[] b = new int[x.length/2+((x.length%2 == 1)?1:0)]; for(int i = 0; i < x.length/2; i++) a[i] = x[i]; for(int i = 0; i < x.length/2+((x.length%2 == 1)?1:0); i++) b[i] = x[i+x.length/2]; // Sending them off to continue being divided mergeSort(a); mergeSort(b); // Recombining the two arrays int ia = 0, ib = 0, i = 0; while(ia != a.length || ib != b.length) { if (ia == a.length) { x[i] = b[ib]; ib++; } else if (ib == b.length) { x[i] = a[ia]; ia++; } else if (a[ia] < b[ib]) { x[i] = a[ia]; ia++; } else { x[i] = b[ib]; ib++; } i++; } return x; } The other is in the 'run' function of a class that extends thread, and recursively creates two new threads each time it is called: public class Merger extends Thread { int[] x; boolean finished; public Merger(int[] x) { this.x = x; } public void run() { if (x.length == 1) { finished = true; return; } // Divide the array in half int[] a = new int[x.length/2]; int[] b = new int[x.length/2+((x.length%2 == 1)?1:0)]; for(int i = 0; i < x.length/2; i++) a[i] = x[i]; for(int i = 0; i < x.length/2+((x.length%2 == 1)?1:0); i++) b[i] = x[i+x.length/2]; // Begin two threads to continue to divide the array Merger ma = new Merger(a); ma.run(); Merger mb = new Merger(b); mb.run(); // Wait for the two other threads to finish while(!ma.finished || !mb.finished) ; // Recombine the two arrays int ia = 0, ib = 0, i = 0; while(ia != a.length || ib != b.length) { if (ia == a.length) { x[i] = b[ib]; ib++; } else if (ib == b.length) { x[i] = a[ia]; ia++; } else if (a[ia] < b[ib]) { x[i] = a[ia]; ia++; } else { x[i] = b[ib]; ib++; } i++; } finished = true; } } It turns out that function that does not use multithreading actually runs faster. Why? Does the operating system and the java virtual machine not "communicate" effectively enough to place the different threads on different cores? Or am I missing something obvious?

Read the article
What are the pros and cons of a non-fixed-interval update loop?

- by akonsu

I am studying various approaches to implementing a game loop and I have found this article. In the article the author implements a loop which, if the processing falls behind in time, skips frame renderings and just updates the game in a loop (the last variant called "Constant Game Speed independent of Variable FPS"). I do not understand why it is acceptable to call update_game() in a loop without making sure the update function is called at a particular interval. I do not see any value in doing this. I would think that in my game I want to be sure the game is updated periodically with a known period. So maybe it is worthwhile to have two threads, one would call update periodically, and the other one would redraw the game, also periodically? Would this be a good and practical approach? Of course I would need to synchronise the threads.

Read the article
IPC linux huge transaction

- by poly

I'm building and application that requires huge transactions/sec of data and I need to use IPC to for the mutithreaded mutliprocceses communication, I know that there are a lot of methods to be used but not sure which one to choose for this application. This is what the application is gonna have, 4 processes, each process has 4 threads, the data chunk that needs to be transferred between two or more threads is around 400KB. I found that fifo is good choice except that it's 64K which is not that big so i'll need to modify and recompile the kernel but not sure if this is the right thing to do? Anyway, I'm open to any suggestions and I'd like to squeeze your experience in this :) and I appreciate it in advance.

Read the article
What could be a reason for cross-platform server applications developer to make his app work in multiple processes?

- by Kabumbus

We consider a server app development - heavily loaded with messing with big data streams. An app will be running on one powerful server. A server app will be developed in form of crossplatform application - working on Windows, Mac OS X and Linux. So same code, many platforms for stand alone server architecture. We wonder what are the benefits of distributing applications not only over threads but over processes as well, for programmers and server end users? Some people said to me that even having 48 cores, 4 process threads would be shared via OS through all cores, is that true?

Read the article
Performance impact of the new mtmalloc memory allocator

- by nospam(at)example.com (Joerg Moellenkamp)

I wrote at a number of occasions (here or here), that it could be really beneficial to use a different memory allocator for highly-threaded workloads, as the standard allocator is well ... the standard, however not very effective as soon as many threads comes into play. I didn't wrote about this as it was in my phase of silence but there was some change in the allocator area, Solaris 10 got a revamped mtmalloc allocator in version Solaris 10 08/11 (as described in "libmtmalloc Improvements"). The new memory allocator was introduced to Solaris development by the PSARC case 2010/212. But what's the effect of this new allocator and how does it works? Rickey C. Weisner wrote a nice article with "How Memory Allocation Affects Performance in Multithreaded Programs" explaining the inner mechanism of various allocators but he also publishes test results comparing Hoard, mtmalloc, umem, new mtmalloc and the libc malloc. Really interesting read and a must for people running applications on servers with a high number of threads.

Read the article
Game Development Resources? [closed]

- by stas

I want to make an iPhone game and I was wondering if somebody could point me to some resources. Before you rage and close this, I am not looking for just a tutorial on programming, I could just google that. I need to find a tutorial on how to actually make a game, as in how to organize your code, what type of methods to run in separate threads, how to manage these threads in a game, etc. I already know everything I need; I can write a physics engine from scratch, I can write a 3D graphics pipeline from scratch, and so on. What I cannot figure out is how to combine all of this knowledge into correctly and efficiently making a game. Obviously for this one would probably go to college, but seeing as I am still in high school, that is not an option. If anyone would know some tutorials or resources, any pointers would be appreciated.

Read the article
Why C++ people loves multithreading when it comes to performances?

- by user1849534

I have a question, it's about why programmers seems to love concurrency and multi-threaded programs in general. I'm considering 2 main approach here: an async approach basically based on signals, or just an async approach as called by many papers and languages like the new C# 5.0 for example, and a "companion thread" that maanges the policy of your pipeline a concurrent approach or multi-threading approach I will just say that I'm thinking about the hardware here and the worst case scenario, and I have tested this 2 paradigms myself, the async paradigm is a winner at the point that I don't get why people 90% of the time talk about concurrency when they wont to speed up things or make a good use of their resources. I have tested multi-threaded programs and async program on an old machine with an Intel quad-core that doesn't offer a memory controller inside the CPU, the memory is managed entirely by the motherboard, well in this case performances are horrible with a multi-threaded application, even a relatively low number of threads like 3-4-5 can be a problem, the application is unresponsive and is just slow and unpleasant. A good async approach is, on the other hand, probably not faster but it's not worst either, my application just waits for the result and doesn't hangs, it's responsive and there is a much better scaling going on. I have also discovered that a context change in the threading world it's not that cheap in real world scenario, it's infact quite expensive especially when you have more than 2 threads that need to cycle and swap among each other to be computed. On modern CPUs the situation it's not really that different, the memory controller it's integrated but my point is that an x86 CPUs is basically a serial machine and the memory controller works the same way as with the old machine with an external memory controller on the motherboard. The context switch is still a relevant cost in my application and the fact that the memory controller it's integrated or that the newer CPU have more than 2 core it's not bargain for me. For what i have experienced the concurrent approach is good in theory but not that good in practice, with the memory model imposed by the hardware, it's hard to make a good use of this paradigm, also it introduces a lot of issues ranging from the use of my data structures to the join of multiple threads. Also both paradigms do not offer any security abut when the task or the job will be done in a certain point in time, making them really similar from a functional point of view. According to the X86 memory model, why the majority of people suggest to use concurrency with C++ and not just an async aproach ? Also why not considering the worst case scenario of a computer where the context switch is probably more expensive than the computation itself ?

Read the article
process monitoring in c under linux

- by poly

I'm trying to write a multi threaded/processes application and it need to know how to monitor a process from another process all the time. So here is what I have, I have a 2 processes, each with multiple threads that handle the network part, then another 2 process also with multiple threads that interact with DB and with the network processes, what I need to do is that if for example one of the network processes goes down the DB process start sending to the live network process until the second one is up again. I'm using fifo between the DB and the network process. I was thinking of sending messages with message passing all the time but not sure whether this is a good idea or I need to use some other IPC for this issue, or probably neither is good and I need to use entirely something else? Thanks,

Read the article
What sort of things can cause a whole system to appear to hang for 100s-1000s of milliseconds?

- by Ogapo

I am working on a Windows game and while rendering, some computers will experience intermittent pauses ("hitches" for lack of a better term). When profiled they appear in seemingly random places in the code. Eventually I noticed that it wasn't just my process that was affected, but (seemingly) every process on the system. All of the threads in my application hitch at once. The CPU utilization drops during these hitches and it appears as if most processes make no progress. This leads me to believe this may be an Operating System or Driver issue, but it only occurs while playing the game (and only on some systems). What sort of operations might the operating system be doing that would require the kernel to pause all user threads and block. Some kind of I/O? At first I thought of paging but my impression is that would only affect a single process, no? Some systems in use: Windows, DirectX (3d), nVidia cards (unknown if replicates on ATI), using overlapped io for streaming

Read the article
Process monitoring in Linux environment?

- by poly

I'm trying to write a multi threaded/processes application and it need to know how to monitor a process from another process all the time. So here is what I have, I have a 2 processes, each with multiple threads that handle the network part, then another 2 process also with multiple threads that interact with DB and with the network processes, what I need to do is that if for example one of the network processes goes down the DB process start sending to the live network process until the second one is up again. I'm using fifo between the DB and the network process. I was thinking of sending messages with message passing all the time but not sure whether this is a good idea or I need to use some other IPC for this issue, or probably neither is good and I need to use entirely something else?

Read the article

< Previous Page | 54 55 56 57 58 59 60 61 62 63 64 65 | Next Page >