Search Results

Search found 913 results on 37 pages for 'haskell multicore'.

Flowcharting functional programming languages

- by Sadface

Flowcharting. This ancient practice, which has been in use for over 1000 years now, is being forced upon us poor students without any usefulness (or so I think). It might work well with imperative, sequentially running languages, but what about my beloved functional programming? Sadly, I'm forced to create a flow chart for my program (which is written in Haskell). I imagine it being easy for something like this:

    main :: IO ()
    main = do
      someInput <- getLine
      let upped = map toUpper someInput
      putStrLn upped

Which is just 3 sequenced steps: fetching data, uppercasing it, outputting it. Things look worse this time:

    main :: IO ()
    main = do
      someInput <- fmap (map toUpper) getLine
      putStrLn someInput

Or like this:

    main :: IO ()
    main = interact (map toUpper)

Okay, that was IO, and you can handle that like an imperative language. What about pure functions? An actual example:

    onlyMatching :: String -> [FilePath] -> [FilePath]
    onlyMatching ext = filter f
      where
        f name = lower ('.' : ext) == (lower . takeExtension $ name)
        lower  = map toLower

How would you flowchart that last one?

trouble with state monad composition

- by user1308560

I was trying out the example given at http://www.haskell.org/haskellwiki/State_Monad#Complete_and_Concrete_Example_1 How this makes the solution composable is beyond my understanding. Here is what I tried, but I get compile errors as follows:

    Couldn't match expected type `GameValue -> StateT GameState Data.Functor.Identity.Identity b0'
                with actual type `State GameState GameValue'
    In the second argument of `(>>=)', namely `g2'
    In the expression: g1 >>= g2
    In an equation for `g3': g3 = g1 >>= g2
    Failed, modules loaded: none.

Here is the code; see the lines defining g1, g2 and g3:

    module StateGame where

    import Control.Monad.State

    type GameValue = Int
    type GameState = (Bool, Int)

    -- suppose I want to play one game after the other
    g1 = playGame "abcaaacbbcabbab"
    g2 = playGame "abcaaacbbcabb"
    g3 = g1 >>= g2
    m2 = print $ evalState g3 startState

    playGame :: String -> State GameState GameValue
    playGame [] = do
      (_, score) <- get
      return score
    playGame (x:xs) = do
      (on, score) <- get
      case x of
        'a' | on -> put (on, score + 1)
        'b' | on -> put (on, score - 1)
        'c'      -> put (not on, score)
        _        -> put (on, score)
      playGame xs

    startState = (False, 0)

    main str = print $ evalState (playGame str) startState

Monads with Join() instead of Bind()

- by MathematicalOrchid

Monads are usually explained in terms of return and bind. However, I gather you can also implement bind in terms of join (and fmap?). In programming languages lacking first-class functions, bind is excruciatingly awkward to use. join, on the other hand, looks quite easy. I'm not completely sure I understand how join works, however. Obviously, it has the [Haskell] type

    join :: Monad m => m (m x) -> m x

For the list monad, this is trivially and obviously concat. But for a general monad, what, operationally, does this method actually do? I see what it does to the type signatures, but I'm trying to figure out how I'd write something like this in, say, Java or similar. (Actually, that's easy: I wouldn't. Because generics is broken. ;-) But in principle the question still stands...) Oops. It looks like this has been asked before: Monad join function. Could somebody sketch out some implementations of common monads using return, fmap and join? (I.e., not mentioning >>= at all.) I think perhaps that might help it to sink in to my dumb brain...
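Since the question asks how join might look outside Haskell, here is a minimal sketch in Python 3 of two common monads written with return, fmap and join only, with bind derived as join after fmap. All names here (`Just`, `Nothing`, `fmap_maybe`, `join_maybe`, `bind_maybe`, `join_list`, `safe_div`) are made up for this illustration and are not part of any library.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


# --- Maybe: either Just(x) or Nothing (hypothetical names) -----------------

@dataclass(frozen=True)
class Just:
    value: Any


class _Nothing:
    def __repr__(self) -> str:
        return "Nothing"


Nothing = _Nothing()


def return_maybe(x: Any) -> Just:
    """Wrap a plain value in one layer."""
    return Just(x)


def fmap_maybe(f: Callable, m):
    """Apply f to the wrapped value, if there is one."""
    return Just(f(m.value)) if isinstance(m, Just) else Nothing


def join_maybe(mm):
    """Remove one layer: Just(inner) becomes inner; Nothing stays Nothing."""
    return mm.value if isinstance(mm, Just) else Nothing


def bind_maybe(m, f: Callable):
    """bind built only from fmap and join: m >>= f is join (fmap f m)."""
    return join_maybe(fmap_maybe(f, m))


# --- List: join is concatenation -------------------------------------------

def fmap_list(f: Callable, xs: List) -> List:
    return [f(x) for x in xs]


def join_list(xss: List[List]) -> List:
    """Flatten exactly one level of nesting."""
    return [x for xs in xss for x in xs]


def bind_list(xs: List, f: Callable) -> List:
    return join_list(fmap_list(f, xs))


def safe_div(a: int, b: int):
    """Example of a function that returns a wrapped result."""
    return Nothing if b == 0 else Just(a // b)


print(bind_maybe(Just(10), lambda x: safe_div(x, 2)))
print(bind_maybe(Just(10), lambda x: safe_div(x, 0)))
print(bind_list([1, 2, 3], lambda x: [x, x * 10]))
```

Read this way, fmap produces a doubly wrapped value and join collapses the two layers into one; for Maybe that means "fail if either layer failed", for lists it means concatenation.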

Heap Algorithmic Issue

- by OberynMarDELL

I have an algorithmic problem that I want to discuss. It's not about finding a solution but about optimizing the runtime. So here it is: Suppose we have a race track of length L and a total of N cars that participate in the race. The race rules are simple. Once a car overtakes another car, the overtaken car is eliminated from the race. The race ends when no more overtakes are possible. The tricky part is that the k'th car has a starting point x[k] and a velocity v[k]. The starting points are given in ascending order, but the velocities may differ.

What I've done so far: Given that a car can be overtaken only by the car directly behind it, I calculated the time it takes each car to reach the next one, t = (x[i+1] - x[i])/(v[i] - v[i+1]), and I insert these times into a min heap in O(n log n). So in theory I have to pop the first element in O(log n), find its previous, pop it as well, update its time and insert it in the heap once more, much like a priority queue. My main problem is how I can access specific points of a heap in O(log n) or faster in order to keep the complexity at O(n log n). This program should be written in Haskell, so I would like to keep things as simple as possible.

EDIT: I forgot to write the actual point of the race. The goal is to find the order in which cars exit the race.
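One common way to avoid reaching into the middle of a heap is lazy deletion: leave outdated entries in the heap and discard them when they are popped. The sketch below illustrates that idea in Python 3 rather than Haskell, under stated assumptions: all cars start at time 0 with constant speed, positions are strictly ascending, the track length L is ignored, and a catch-up time is computed as (x[j] - x[i]) / (v[i] - v[j]) for a faster car i directly behind car j. The function name `elimination_order` is hypothetical.

```python
import heapq
from typing import List, Optional, Tuple


def elimination_order(x: List[float], v: List[float]) -> List[int]:
    """Return the indices of eliminated cars, in order of elimination."""
    n = len(x)
    # nxt[i] is the surviving car directly ahead of car i, or None.
    nxt: List[Optional[int]] = [i + 1 if i + 1 < n else None for i in range(n)]
    alive = [True] * n
    heap: List[Tuple[float, int, int]] = []

    def schedule(i: int) -> None:
        """Push the time at which car i would catch the car ahead, if ever."""
        j = nxt[i]
        if j is not None and v[i] > v[j]:
            heapq.heappush(heap, ((x[j] - x[i]) / (v[i] - v[j]), i, j))

    for i in range(n):
        schedule(i)

    order: List[int] = []
    while heap:
        _, i, j = heapq.heappop(heap)
        # Lazy deletion: skip entries that no longer describe adjacent survivors.
        if not (alive[i] and alive[j] and nxt[i] == j):
            continue
        alive[j] = False          # car j is overtaken and leaves the race
        order.append(j)
        nxt[i] = nxt[j]           # car i now chases whoever was ahead of j
        schedule(i)
    return order


print(elimination_order([0, 1, 2], [3, 1, 2]))
```

Each elimination pushes at most one new entry, so the heap never holds more than about 2n entries and the total cost stays O(n log n). The same idea should carry over to a persistent priority queue or ordered set in Haskell, since it needs only insert and pop-minimum.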

How can I bind the second argument in a function but not the first (in an elegant way)?

- by Frank Osterfeld

Is there a way in Haskell to bind the second argument but not the first of a function without using lambda functions or defining another "local" function? Example. I have a binary function like:

    sub :: Int -> Int -> Int
    sub x y = x - y

Now if I want to bind the first argument, I can do so easily using (sub someExpression):

    mapSubFrom5 x = map (sub 5) x

    *Main> mapSubFrom5 [1,2,3,4,5]
    [4,3,2,1,0]

That works fine if I want to bind the first n arguments without a "gap". If I want to bind the second argument but not the first, the two options I am aware of are more verbose. Either via another, local, function:

    mapSub5 x = map sub5 x
      where sub5 x = sub x 5

    *Main> mapSub5 [1,2,3,4,5]
    [-4,-3,-2,-1,0]

Or using lambda:

    mapSub5 x = map (\x -> sub x 5) x

While both work fine, I like the elegance of "sub 5" and wonder if there is a similarly elegant way to bind the n-th (n > 1) argument of a function?

NedMalloc / DlMalloc experiences

- by Suma

I am currently evaluating a few scalable memory allocators, namely nedmalloc and ptmalloc (both built on top of dlmalloc), as a replacement for the default malloc / new because of significant contention seen in a multithreaded environment. Their published performance seems to be good; however, I would like to hear the experiences of other people who have really used them. Were your performance goals satisfied? Did you experience any unexpected or hard-to-solve issues (like heap corruption)? If you have tried both ptmalloc and nedmalloc, which of the two would you recommend? Why (ease of use, performance)?

Multi-part question about multi-threading, locks and multi-core processors (multi ^ 3)

- by MusiGenesis

I have a program with two methods. The first method takes two arrays as parameters, and performs an operation in which values from one array are conditionally written into the other, like so:

    void Blend(int[] dest, int[] src, int offset) {
        for (int i = 0; i < src.Length; i++) {
            int rdr = dest[i + offset];
            dest[i + offset] = src[i] > rdr ? src[i] : rdr;
        }
    }

The second method creates two separate sets of int arrays and iterates through them such that each array of one set is Blended with each array from the other set, like so:

    void CrossBlend() {
        int[][] set1 = new int[150][75000]; // we'll pretend this actually compiles
        int[][] set2 = new int[25][10000];  // we'll pretend this actually compiles
        for (int i1 = 0; i1 < set1.Length; i1++) {
            for (int i2 = 0; i2 < set2.Length; i2++) {
                Blend(set1[i1], set2[i2], 0); // or any offset, doesn't matter
            }
        }
    }

First question: Since this approach is an obvious candidate for parallelization, is it intrinsically thread-safe? It seems like no, since I can conceive of a scenario (unlikely, I think) where one thread's changes are lost because of a different thread's ~simultaneous operation. If no, would this:

    void Blend(int[] dest, int[] src, int offset) {
        lock (dest) {
            for (int i = 0; i < src.Length; i++) {
                int rdr = dest[i + offset];
                dest[i + offset] = src[i] > rdr ? src[i] : rdr;
            }
        }
    }

be an effective fix?

Second question: If so, what would be the likely performance cost of using locks like this? I assume that with something like this, if a thread attempts to lock a destination array that is currently locked by another thread, the first thread would block until the lock was released instead of continuing to process something. Also, how much time does it actually take to acquire a lock? Nanosecond scale, or worse than that? Would this be a major issue in something like this?

Third question: How would I best approach this problem in a multi-threaded way that would take advantage of multi-core processors (and this is based on the potentially wrong assumption that a multi-threaded solution would not speed up this operation on a single core processor)? I'm guessing that I would want to have one thread running per core, but I don't know if that's true.
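One possible answer to the locking question is to avoid shared writes altogether: if each destination array is handled by exactly one task, no two threads ever write to the same array, and the source arrays are only read. The sketch below shows that partitioning in Python 3 rather than C#; the names `blend` and `cross_blend` mirror the question, and `workers` is an assumed parameter. In CPython, threads would not speed up a pure-Python loop like this because of the interpreter lock, so this illustrates the ownership pattern only, not a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def blend(dest: List[int], src: List[int], offset: int) -> None:
    """Write src[i] into dest[i + offset] wherever it is larger."""
    for i, value in enumerate(src):
        if value > dest[i + offset]:
            dest[i + offset] = value


def cross_blend(set1: List[List[int]], set2: List[List[int]],
                workers: int = 4) -> None:
    """Blend every array of set2 into every array of set1.

    Work is split by destination array, so each dest has a single writer
    and no lock is needed; set2 is shared but read-only.
    """
    def blend_all_into(dest: List[int]) -> None:
        for src in set2:
            blend(dest, src, 0)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for completion and re-raises any worker exception.
        list(pool.map(blend_all_into, set1))


destinations = [[0] * 5 for _ in range(3)]
sources = [[1, 5, 2], [4, 0, 9]]
cross_blend(destinations, sources, workers=2)
print(destinations[0])
```

The design choice is to parallelize the outer loop (over destinations) rather than the inner one, which keeps the per-element work lock-free.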

Surprising results with .NET multi-threading algorithm

- by Myles J

Hi, I recently wrote a C# console timetabling algorithm that is based on a combination of a genetic algorithm with a few brute force routines thrown in. The initial results were promising, but I figured I could improve the performance by splitting the brute force routines up to run in parallel on multi-processor architectures. To do this I used the well documented Producer/Consumer model (as documented in this fantastic article http://www.albahari.com/threading/part2.aspx#_ProducerConsumerQWaitHandle). I changed my code to create one thread per logical processor during the brute force routines. The performance gains on my workstation were very pleasing. I am running Windows XP on the following hardware:

    Intel Core 2 Quad CPU 2.33 GHz
    3.49 GB RAM

Initial tests indicated average performance gains of approx. 40% when using 4 threads. The next step was to deploy the new multi-threaded version of the algorithm to our higher spec UAT server. Here is the spec of our UAT server:

    Windows 2003 Server R2 Enterprise x64
    8 CPU (Quad-Core) AMD Opteron 2.70 GHz
    255 GB RAM

After running the first round of tests we were all extremely surprised to find that the algorithm actually runs slower on the high spec W2003 server than on my local XP workstation! In fact the tests seem to indicate that it doesn't matter how many threads are generated (tests were run with the app spawning between 2 and 32 threads). The algorithm always runs significantly slower on the UAT W2003 server. How could this be? Surely the app should run faster on an 8 CPU (Quad-Core) server than on my Core 2 Quad workstation? Why are we seeing no performance gains with the multi-threading on the W2003 server whilst the XP workstation tests show gains of up to 40%? Any help or pointers would be appreciated. Regards, Myles

Using Threading on Quad core speed up the code 65%?

- by Ahmed Said

This sample code compares a serial method with a threaded method on a quad-core processor. The code just reads 4 images from the resources. I found that the speed-up is around 65%. Why is it not 75%, given that I have 4 cores and all of them are fully utilized?
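The gap between 65% and 75% can be explored with Amdahl's law. The sketch below, in Python 3, assumes that "speed-up of 65%" means the run time dropped by 65%, and that the work splits into a serial part and a perfectly parallel part; both are assumptions, and reading images may also be limited by I/O or memory rather than CPU. The function names are hypothetical.

```python
def amdahl_time_fraction(serial_fraction: float, cores: int) -> float:
    """Parallel run time as a fraction of serial run time (Amdahl's law)."""
    return serial_fraction + (1.0 - serial_fraction) / cores


def implied_serial_fraction(time_reduction: float, cores: int) -> float:
    """Serial fraction that would explain an observed run-time reduction."""
    remaining = 1.0 - time_reduction
    return (remaining - 1.0 / cores) / (1.0 - 1.0 / cores)


ideal_reduction = 1.0 - amdahl_time_fraction(0.0, 4)
serial = implied_serial_fraction(0.65, 4)
print(f"ideal reduction on 4 cores: {ideal_reduction:.0%}")
print(f"serial fraction implied by a 65% reduction: {serial:.1%}")
```

Under these assumptions, a 75% reduction is only reachable when nothing at all runs serially; a serial share of roughly 13% (thread start-up, joining results, shared I/O) would already account for the observed 65%.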

How are interrupts handled by dual processor machines?

- by jeffD

I have an idea of how interrupts are handled by a dual core CPU. I was wondering how interrupt handling is implemented on a board with more than one physical processor. Is any of the interrupt responsibility determined by the physical board's configuration? Must each processor be able to handle some types of interrupts, like disk I/O, or is there some circuitry to manage and dispatch interrupts to the appropriate processor? My guess is that the scheme must be processor neutral, so that any processor and core can run the interrupt handler. If a core is waiting on a disk read, will that core be the one to run the interrupt handler when the disk is ready?

In a multithreaded app, would a multi-core or multiprocessor arrangement be better?

- by Michael

I've read a lot on this topic already both here (e.g., stackoverflow.com/questions/1713554/threads-processes-vs-multithreading-multi-core-multiprocessor-how-they-are or http://stackoverflow.com/questions/680684/multi-cpu-multi-core-and-hyper-thread) and elsewhere (e.g., ixbtlabs.com/articles2/cpu/rmmt-l2-cache.html or software.intel.com/en-us/articles/multi-core-introduction/), but I still am not sure about a couple things that seem very straightforward. So I thought I'd just ask. (1) Is a multi-core processor in which each core has dedicated cache effectively the same as a multiprocessor system (balanced of course for processor speed, cache size, and so on)? (2) Let's say I have some images to analyze (i.e., computer vision), and I have these images loaded into RAM. My app spawns a thread for each image that needs to be analyzed. Will this app on a shared cache multi-core processor run slower than on a dedicated cache multi-core processor, and would the latter run at the same speed as on an equivalent single-core multiprocessor machine? Thank you for the help!

Do Scala and Erlang use green threads?

- by CHAPa

I've been reading a lot about how Scala and Erlang do lightweight threads and about their concurrency model (actors). However, I have my doubts. Do Scala and Erlang use an approach similar to the old thread model used by Java (green threads)? For example, suppose that there is a machine with 2 cores; will the Scala/Erlang environment fork one thread per processor, with the other threads scheduled in user space by the Scala VM / Erlang VM? Is this correct? Under the hood, how does this really work?

Why is my multithreaded Java program not maxing out all my cores on my machine?

- by James B

Hi, I have a program that starts up and creates an in-memory data model and then creates a (command-line-specified) number of threads to run several string checking algorithms against an input set and that data model. The work is divided amongst the threads along the input set of strings, and then each thread iterates the same in-memory data model instance (which is never updated again, so there are no synchronization issues). I'm running this on a Windows 2003 64-bit server with 2 quad-core processors, and from looking at Windows Task Manager they aren't being maxed out (nor do they look particularly taxed) when I run with 10 threads. Is this normal behaviour? It appears that 7 threads all complete a similar amount of work in a similar amount of time, so would you recommend running with 7 threads instead? Should I run it with more threads? I assume this could be detrimental, as the JVM will do more context switching between the threads. Alternatively, should I run it with fewer threads? Alternatively, what would be the best tool I could use to measure this? Would a profiling tool help me out here, and is one of the several profilers better at detecting bottlenecks (assuming I have one here) than the rest? Note, the server is also running SQL Server 2005 (this may or may not be relevant), but nothing much is happening on that database when I am running my program. Note also, the threads are only doing string matching; they aren't doing any I/O or database work or anything else they may need to wait on. Thanks in advance, -James

Scala/Erlang use something like greenThread or not ?

- by CHAPa

Hi all, I'm reading a lot about how Scala/Erlang do lightweight threads and about their concurrency model (Actor Model). Of course, some doubts appear in my head. Do Scala/Erlang use an approach similar to the old thread model used by Java (green threads)? For example, suppose that there is a machine with 2 cores; will the Scala/Erlang environment fork one thread per processor? The other threads will be scheduled by the user-space (Scala VM / Erlang VM) environment. Is that correct? How does that really work under the hood? Thanks a lot.

What Would You Do With 48 Cores?

- by jeroen.vangoey

The AMD Server team has announced a contest where they are seeking the best essays, videos, or blog posts documenting how you might use 48 cores. They are primarily looking for "what you can do to help society, to help others. That will give you an edge." So, what would you do with 48 cores? Disclaimer: I am not affiliated with AMD (I am even not eligible for the contest because I don't live in the US/Canada) but would love to see what the SO community can come up with.

Python MD5 Hash Faster Calculation

- by balgan

Hi everyone. I will try my best to explain my problem and my line of thought on how I think I can solve it. I use this code

    for root, dirs, files in os.walk(downloaddir):
        for infile in files:
            f = open(os.path.join(root, infile), 'rb')
            filehash = hashlib.md5()
            while True:
                data = f.read(10240)
                if len(data) == 0:
                    break
                filehash.update(data)
            print "FILENAME: " , infile
            print "FILE HASH: " , filehash.hexdigest()

and using

    start = time.time()
    elapsed = time.time() - start

I measure how long it takes to calculate a hash. Pointing my code at a 653 MB file, this is the result:

    root@Mars:/home/tiago# python algorithm-timer.py
    FILENAME: freebsd.iso
    FILE HASH: ace0afedfa7c6e0ad12c77b6652b02ab
    12.624
    root@Mars:/home/tiago# python algorithm-timer.py
    FILENAME: freebsd.iso
    FILE HASH: ace0afedfa7c6e0ad12c77b6652b02ab
    12.373
    root@Mars:/home/tiago# python algorithm-timer.py
    FILENAME: freebsd.iso
    FILE HASH: ace0afedfa7c6e0ad12c77b6652b02ab
    12.540

OK, so roughly 12 seconds on a 653 MB file. My problem is that I intend to use this code in a program that will run through multiple files, some of which might be 4, 5 or 6 GB, and those will take way longer to hash. What I am wondering is whether there is a faster way to calculate the hash of a file. Maybe by doing some multithreading? I used another script to check the CPU usage second by second, and I see that my code is only using 1 of my 2 CPUs, and only at 25% max. Is there any way I can change this? Thank you all in advance for the help.
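A hedged sketch of one direction, written for Python 3 (the question's code is Python 2): read in larger chunks and hash several files at once with a thread pool. MD5 over a single file is inherently sequential, so this only helps when there are multiple files, and a CPU sitting at 25% suggests the disk may be the real limit. The helper names `md5_of_file` and `md5_of_tree`, the 1 MiB chunk size, and the `workers` default are assumptions for illustration. The `hashlib` documentation notes that the interpreter lock is released while hashing larger buffers, which is what lets threads overlap here.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Dict


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash one file in fixed-size chunks so it never sits fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_of_tree(top: str, workers: int = 2) -> Dict[str, str]:
    """Map every file path under top to its MD5 hex digest."""
    paths = [os.path.join(root, name)
             for root, _dirs, files in os.walk(top)
             for name in files]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so zip pairs each path with its hash.
        return dict(zip(paths, pool.map(md5_of_file, paths)))
```

A call such as `md5_of_tree(downloaddir)` would return the digests for the whole directory; timing it with different `workers` and `chunk_size` values on the real disk is the only reliable way to see whether either change helps.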

How do I get Java to use my multi-core processor?

- by Rudiger

I'm using a GZIPInputStream in my program, and I know that the performance would be helped if I could get Java running my program in parallel. In general, is there a command-line option for the standard VM to run on many cores? It's running on just one as it is. Thanks!

Edit: I'm running plain ol' Java SE 6 update 17 on Windows XP. Would putting the GZIPInputStream on a separate thread explicitly help? No! Do not put the GZIPInputStream on a separate thread! Do NOT multithread I/O!

Edit 2: I suppose I/O is the bottleneck, as I'm reading and writing to the same disk... In general, though, is there a way to make GZIPInputStream faster? Or a replacement for GZIPInputStream that runs in parallel?

Edit 3: Code snippet I used:

    GZIPInputStream gzip = new GZIPInputStream(new FileInputStream(INPUT_FILENAME));
    DataInputStream in = new DataInputStream(new BufferedInputStream(gzip));

Is it too early to start designing for Task Parallel Library?

- by Joe Erickson

I have been following the development of the .NET Task Parallel Library (TPL) with great interest since Microsoft first announced it. There is no doubt in my mind that we will eventually take advantage of TPL. What I am questioning is whether it makes sense to start taking advantage of TPL when Visual Studio 2010 and .NET 4.0 are released, or whether it makes sense to wait a while longer.

Why Start Now?

The .NET 4.0 Task Parallel Library appears to be well designed and some relatively simple tests demonstrate that it works well on today's multi-core CPUs. I have been very interested in the potential advantages of using multiple lightweight threads to speed up our software since buying my first quad processor Dell Poweredge 6400 about seven years ago. Experiments at that time indicated that it was not worth the effort, which I attributed largely to the overhead of moving data between each CPU's cache (there was no shared cache back then) and RAM.

Competitive advantage - some of our customers can never get enough performance and there is no doubt that we can build a faster product using TPL today.

It sounds fun. Yes, I realize that some developers would rather poke themselves in the eye with a sharp stick, but we really enjoy maximizing performance.

Why Wait?

Are today's Intel Nehalem CPUs representative of where we are going as multi-core support matures? You can purchase a Nehalem CPU with 4 cores which share a single level 3 cache today, and most likely a 6 core CPU sharing a single level 3 cache by the time Visual Studio 2010 / .NET 4.0 are released. Obviously, the number of cores will go up over time, but what about the architecture? As the number of cores goes up, will they still share a cache? One issue with Nehalem is the fact that, even though there is a very fast interconnect between the cores, they have non-uniform memory access (NUMA) which can lead to lower performance and less predictable results. Will future multi-core architectures be able to do away with NUMA?

Similarly, will the .NET Task Parallel Library change as it matures, requiring modifications to code to fully take advantage of it?

Limitations

Our core engine is 100% C# and has to run without full trust, so we are limited to using .NET APIs.

Node.js or Erlang

- by gotts

I really like these tools when it comes to the level of concurrency they can handle. Erlang looks like a much more stable solution but requires much more learning and a lot of diving into the functional language paradigm. And it looks like Erlang does much better when it comes to multi-core CPUs (correct me if I'm wrong). But which should I choose? Which one is better from a short-term and a long-term perspective?

Any way to make this working dual core in C#?

- by Frantisek

Hi, I have a piece of code that loops through an array and looks for similar and identical strings in it, marking each one as unique or not.

    loop X array for I
    (
        loop X array for Y
        (
            if X is a prefix of Y, do something.
            else if X is the same length as Y and it's a prefix, do something.
        )
        Here is the code to finalize everything for I and the corresponding (found/not found) matches in Y.
    )

I'd like to multithread this so that it uses both cores of a dual-core CPU. To my knowledge it is not possible, but it's highly probable that you may have some idea.
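If each outer iteration only reads the array and produces its own result, the outer loop can be split across workers without any shared writes. The sketch below shows that structure in Python 3 rather than C#, under one reading of the pseudocode that is an assumption: an entry counts as "not unique" when some other entry equals it or starts with it. The names `has_match` and `mark_unique` are hypothetical. In CPython a thread pool will not occupy two cores for pure-Python work, so a process pool (or a parallel loop in C#) would be needed for an actual speed-up.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def has_match(i: int, words: List[str]) -> bool:
    """True if words[i] equals, or is a prefix of, some other entry."""
    word = words[i]
    return any(j != i and other.startswith(word)
               for j, other in enumerate(words))


def mark_unique(words: List[str], workers: int = 2) -> List[bool]:
    """For each entry, True when no other entry equals it or starts with it."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each index is an independent task: it reads words, writes nothing shared.
        matched = pool.map(lambda i: has_match(i, words), range(len(words)))
        return [not found for found in matched]


print(mark_unique(["ab", "abc", "x", "x", "q"]))
```

The per-index "finalize" step from the pseudocode corresponds to building each entry's own result, which is why no locking is required.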

multi-core processing in R on windows XP - via doMC and foreach

- by Jan

Hi guys, I'm posting this question to ask for advice on how to optimize the use of multiple processors from R on a Windows XP machine. At the moment I'm creating 4 scripts (each script with e.g. for (i in 1:100) and (i in 101:200), etc.) which I run in 4 different R sessions at the same time. This seems to use all the available CPU. However, I would like to do this a bit more efficiently. One solution could be to use the "doMC" and "foreach" packages, but this is not possible in R on a Windows machine. E.g.

    library("foreach")
    library("strucchange")
    library("doMC") # would this be possible on a windows machine?
    registerDoMC(2) # for a computer with two cores (processors)

    ## Nile data with one breakpoint: the annual flows drop in 1898
    ## because the first Ashwan dam was built
    data("Nile")
    plot(Nile)

    ## F statistics indicate one breakpoint
    fs.nile <- Fstats(Nile ~ 1)
    plot(fs.nile)
    breakpoints(fs.nile) # , hpc = "foreach" --> It would be great to test this.
    lines(breakpoints(fs.nile))

Any solutions or advice? Thanks, Jan

Parallel Programming. Boost's MPI, OpenMP, TBB, or something else?

- by unknownthreat

Hello, I am a total novice in parallel programming, but I do know how to program in C++. Now I am looking around for a parallel programming library. I just want to give it a try, just for fun, and so far I have found 3 APIs, but I am not sure which one I should stick with. Right now, I see Boost's MPI, OpenMP and TBB. For anyone who has experience with any of these 3 APIs (or any other parallelism API), could you please tell me the differences between them? Are there any factors to consider, like AMD or Intel architecture?

What parallel programming model do you recommend today to take advantage of the manycore processors

- by Doctor J

If you were writing a new application from scratch today, and wanted it to scale to all the cores you could throw at it tomorrow, what parallel programming model/system/language/library would you choose? Why? I am particularly interested in answers along these axes:

- Programmer productivity / ease of use (can mortals successfully use it?)
- Target application domain (what problems is it (not) good at?)
- Concurrency style (does it support tasks, pipelines, data parallelism, messages...?)
- Maintainability / future-proofing (will anybody still be using it in 20 years?)
- Performance (how does it scale on what kinds of hardware?)

I am being deliberately vague on the nature of the application in anticipation of getting good general answers useful for a variety of applications.

some pointer to understanding GCC source code

- by user299570

Hi, I'm a student working on optimizing GCC for multi-core processors. I tried going through the source code, but it is difficult to follow, and I need to add some code to the back end. Can anyone suggest a good resource that explains the code flow through the different phases? Also, please suggest a development environment for debugging GCC, mainly to step through the code. Is that possible on Windows?

What application domains are CPU bound and will tend to benefit from multi-core technologies?

- by Glomek

I hear a lot of people talking about the revolution that is coming in programming due to multi-core processors and parallelism, but I can't shake the feeling that for most of us, CPU cycles aren't the bottleneck. Pretty much all of my programs have been I/O bound in one way or another (database, filesystem, network, user interaction, etc.) for a very long time. Now I can think of a few areas where CPU cycles are a limiting factor, like code breaking, graphics, sound, some forms of simulation (weather, physics, etc.), and some forms of mathematical research, but they all seem like fairly specialized application domains. My general impression is that most programs are still I/O bound and that for most of our industry CPUs have been plenty fast for quite a while now. Am I off my rocker? What other application domains are CPU bound today? Do any of them include a large portion of the programming population? In essence, I'm wondering whether the multi-core CPUs will impact very many of us, and if so, how?

