Search Results

Search found 156 results on 7 pages for 'multicore'.

Page 3/7 | < Previous Page | 1 2 3 4 5 6 7  | Next Page >

  • Surprising results with .NET multi-theading algorithm

    - by Myles J
    Hi, I've recently wrote a C# console time tabling algorithm that is based on a combination of a genetic algorithm with a few brute force routines thrown in. The initial results were promising but I figured I could improve the performance by splitting the brute force routines up to run in parallel on multi processor architectures. To do this I used the well documented Producer/Consumer model (as documented in this fantastic article http://www.albahari.com/threading/part2.aspx#_ProducerConsumerQWaitHandle). I changed my code to create one thread per logical processor during the brute force routines. The performance gains on my work station were very pleasing. I am running Windows XP on the following hardware: Intel Core 2 Quad CPU 2.33 GHz 3.49 GB RAM Initial tests indicated average performance gains of approx 40% when using 4 threads. The next step was to deploy the new multi-threading version of the algorithm to our higher spec UAT server. Here is the spec of our UAT server: Windows 2003 Server R2 Enterprise x64 8 cpu (Quad-Core) AMD Opteron 2.70 GHz 255 GB RAM After running the first round of tests we were all extremely surprised to find that the algorithm actually runs slower on the high spec W2003 server than on my local XP work station! In fact the tests seem to indicate that it doesn't matter how many threads are generated (tests were ran with the app spawning between 2 to 32 threads). The algorithm always runs significantly slower on the UAT W2003 server? How could this be? Surely the app should run faster on a 8 cpu (Quad-Core) than my 2 Quad work station? Why are we seeing no performance gains with the multi-threading on the W2003 server whilst the XP workstation tests show gains of up to 40%? Any help or pointers would be appreciated. Regards Myles

    Read the article

  • Do Scala and Erlang use green threads?

    - by CHAPa
    I've been reading a lot about how Scala and Erlang does lightweight threads and their concurrency model (actors). However, I have my doubts. Do Scala and Erlang use an approach similar to the old thread model used by Java (green threads) ? For example, suppose that there is a machine with 2 cores, so the Scala/Erlang environment will fork one thread per processor? The other threads will be scheduled by user-space (Scala VM / Erlang VM ) environment. Is this correct? Under the hood, how does this really work?

    Read the article

  • Why is my multithreaded Java program not maxing out all my cores on my machine?

    - by James B
    Hi, I have a program that starts up and creates an in-memory data model and then creates a (command-line-specified) number of threads to run several string checking algorithms against an input set and that data model. The work is divided amongst the threads along the input set of strings, and then each thread iterates the same in-memory data model instance (which is never updated again, so there are no synchronization issues). I'm running this on a Windows 2003 64-bit server with 2 quadcore processors, and from looking at Windows task Manager they aren't being maxed-out, (nor are they looking like they are being particularly taxed) when I run with 10 threads. Is this normal behaviour? It appears that 7 threads all complete a similar amount of work in a similar amount of time, so would you recommend running with 7 threads instead? Should I run it with more threads?...Although I assume this could be detrimental as the JVM will do more context switching between the threads. Alternatively, should I run it with fewer threads? Alternatively, what would be the best tool I could use to measure this?...Would a profiling tool help me out here - indeed, is one of the several profilers better at detecting bottlenecks (assuming I have one here) than the rest? Note, the server is also running SQL Server 2005 (this may or may not be relevant), but nothing much is happening on that database when I am running my program. Note also, the threads are only doing string matching, they aren't doing any I/O or database work or anything else they may need to wait on. Thanks in advance, -James

    Read the article

  • Scala/Erlang use something like greenThread or not ?

    - by CHAPa
    Hi all, Im reading a lot about how scala/Erlang does lightweight threads and your concurrency model ( Actor Model ). Off course, some doubts appear in my head. Scala/Erlang use a approach similar to the old thread model used by java (greenThread) ? for example, suppose that there is a machine with 2 cores, so the scala/erlang environment will fork one thread per processor ? The other threads will be scheduled by user-space( scala VM / erlang vm ) environment. is it correct ? how under the hood that really work ? thanks a lot.

    Read the article

  • What Would You Do With 48 Cores?

    - by jeroen.vangoey
    The AMD Server team has announced a contest where they are seeking the best essays, videos, or blog posts documenting how you might use 48 cores. They are primarily looking for "what you can do to help society, to help others. That will give you an edge." So, what would you do with 48 cores? Disclaimer: I am not affiliated with AMD (I am even not eligible for the contest because I don't live in the US/Canada) but would love to see what the SO community can come up with.

    Read the article

  • Python MD5 Hash Faster Calculation

    - by balgan
    Hi everyone. I will try my best to explain my problem and my line of thought on how I think I can solve it. I use this code for root, dirs, files in os.walk(downloaddir): for infile in files: f = open(os.path.join(root,infile),'rb') filehash = hashlib.md5() while True: data = f.read(10240) if len(data) == 0: break filehash.update(data) print "FILENAME: " , infile print "FILE HASH: " , filehash.hexdigest() and using start = time.time() elapsed = time.time() - start I measure how long it takes to calculate an hash. Pointing my code to a file with 653megs this is the result: root@Mars:/home/tiago# python algorithm-timer.py FILENAME: freebsd.iso FILE HASH: ace0afedfa7c6e0ad12c77b6652b02ab 12.624 root@Mars:/home/tiago# python algorithm-timer.py FILENAME: freebsd.iso FILE HASH: ace0afedfa7c6e0ad12c77b6652b02ab 12.373 root@Mars:/home/tiago# python algorithm-timer.py FILENAME: freebsd.iso FILE HASH: ace0afedfa7c6e0ad12c77b6652b02ab 12.540 Ok now 12 seconds +- on a 653mb file, my problem is I intend to use this code on a program that will run through multiple files, some of them might be 4/5/6Gb and it will take wayy longer to calculate. What am wondering is if there is a faster way for me to calculate the hash of the file? Maybe by doing some multithreading? I used a another script to check the use of the CPU second by second and I see that my code is only using 1 out of my 2 CPUs and only at 25% max, any way I can change this? Thank you all in advance for the given help.

    Read the article

  • How do I get Java to use my multi-core processor?

    - by Rudiger
    I'm using a GZIPInputStream in my program, and I know that the performance would be helped if I could get Java running my program in parallel. In general, is there a command-line option for the standard VM to run on many cores? It's running on just one as it is. Thanks! Edit I'm running plain ol' Java SE 6 update 17 on Windows XP. Would putting the GZIPInputStream on a separate thread explicitly help? No! Do not put the GZIPInputStream on a separate thread! Do NOT multithread I/O! Edit 2 I suppose I/O is the bottleneck, as I'm reading and writing to the same disk... In general, though, is there a way to make GZIPInputStream faster? Or a replacement for GZIPInputStream that runs parallel? Edit 3 Code snippet I used: GZIPInputStream gzip = new GZIPInputStream(new FileInputStream(INPUT_FILENAME)); DataInputStream in = new DataInputStream(new BufferedInputStream(gzip));

    Read the article

  • Node.js or Erlang

    - by gotts
    I really like these tools when it comes to the concurrency level it can handle. Erlang looks like much more stable solution but requires much more learning and a lot of diving into functional language paradigm. And it looks like Erlang makes it much better when it comes to multi cores CPUs(fix me if I'm wrong). But which should I choose? Which one is better in the short/long term perspective?

    Read the article

  • Is it too early to start designing for Task Parallel Library?

    - by Joe Erickson
    I have been following the development of the .NET Task Parallel Library (TPL) with great interest since Microsoft first announced it. There is no doubt in my mind that we will eventually take advantage of TPL. What I am questioning is whether it makes sense to start taking advantage of TPL when Visual Studio 2010 and .NET 4.0 are released, or whether it makes sense to wait a while longer. Why Start Now? The .NET 4.0 Task Parallel Library appears to be well designed and some relatively simple tests demonstrate that it works well on today's multi-core CPUs. I have been very interested in the potential advantages of using multiple lightweight threads to speed up our software since buying my first quad processor Dell Poweredge 6400 about seven years ago. Experiments at that time indicated that it was not worth the effort, which I attributed largely to the overhead of moving data between each CPU's cache (there was no shared cache back then) and RAM. Competitive advantage - some of our customers can never get enough performance and there is no doubt that we can build a faster product using TPL today. It sounds fun. Yes, I realize that some developers would rather poke themselves in the eye with a sharp stick, but we really enjoy maximizing performance. Why Wait? Are today's Intel Nehalem CPUs representative of where we are going as multi-core support matures? You can purchase a Nehalem CPU with 4 cores which share a single level 3 cache today, and most likely a 6 core CPU sharing a single level 3 cache by the time Visual Studio 2010 / .NET 4.0 are released. Obviously, the number of cores will go up over time, but what about the architecture? As the number of cores goes up, will they still share a cache? One issue with Nehalem is the fact that, even though there is a very fast interconnect between the cores, they have non-uniform memory access (NUMA) which can lead to lower performance and less predictable results. Will future multi-core architectures be able to do away with NUMA? Similarly, will the .NET Task Parallel Library change as it matures, requiring modifications to code to fully take advantage of it? Limitations Our core engine is 100% C# and has to run without full trust, so we are limited to using .NET APIs.

    Read the article

  • Any way to make this working dual core in C#?

    - by Frantisek
    Hi, I got a piece of code that loops through the array and looks for the similar and same strings in it - marking it whether it's unique or not. loop X array for I ( loop X array for Y ( If X is prefix of Y do. else if x is same length as Y and it's prefix do something. ) Here is the code to finilize everything for I and corresponding (found/not found) matches in Y. ) I'd like to make this for dual-core to multithread it. To my knowledge it is not possible, but it's highly probable that you may have some idea.

    Read the article

  • What parallel programming model do you recommend today to take advantage of the manycore processors

    - by Doctor J
    If you were writing a new application from scratch today, and wanted it to scale to all the cores you could throw at it tomorrow, what parallel programming model/system/language/library would you choose? Why? I am particularly interested in answers along these axes: Programmer productivity / ease of use (can mortals successfully use it?) Target application domain (what problems is it (not) good at?) Concurrency style (does it support tasks, pipelines, data parallelism, messages...?) Maintainability / future-proofing (will anybody still be using it in 20 years?) Performance (how does it scale on what kinds of hardware?) I am being deliberately vauge on the nature of the application in anticipation of getting good general answers useful for a variety of applications.

    Read the article

  • How difficult is Haskell multi-threading?

    - by mvid
    I have heard that in Haskell, creating a multi-threaded application is as easy as taking a standard Haskell application and compiling it with the -threaded flag. Other cases, however, have described the use of a par command within the actual source code. What is the state of Haskell multi-threading? How easy is it to introduce into programs? Is there a good multi-threading tutorial that goes over these different commands and their uses?

    Read the article

  • Parallel Programming. Boost's MPI, OpenMP, TBB, or something else?

    - by unknownthreat
    Hello, I am totally a novice in parallel programming, but I do know how to program C++. Now, I am looking around for parallel programming library. I just want to give it a try, just for fun, and right now, I found 3 APIs, but I am not sure which one should I stick with. Right now, I see Boost's MPI, OpenMP and TBB. For anyone who have experienced with any of these 3 API (or any other parallelism API), could you please tell me the difference between these? Are there any factor to consider, like AMD or Intel architecture?

    Read the article

  • Multi-Core Programming. Boost's MPI, OpenMP, TBB, or something else?

    - by unknownthreat
    Hello, I am totally a novice in Multi-Core Programming, but I do know how to program C++. Now, I am looking around for Multi-Core Programming library. I just want to give it a try, just for fun, and right now, I found 3 APIs, but I am not sure which one should I stick with. Right now, I see Boost's MPI, OpenMP and TBB. For anyone who have experienced with any of these 3 API (or any other API), could you please tell me the difference between these? Are there any factor to consider, like AMD or Intel architecture?

    Read the article

  • multi-core processing in R on windows XP - via doMC and foreach

    - by Jan
    Hi guys, I'm posting this question to ask for advice on how to optimize the use of multiple processors from R on a Windows XP machine. At the moment I'm creating 4 scripts (each script with e.g. for (i in 1:100) and (i in 101:200), etc) which I run in 4 different R sessions at the same time. This seems to use all the available cpu. I however would like to do this a bit more efficient. One solution could be to use the "doMC" and the "foreach" package but this is not possible in R on a Windows machine. e.g. library("foreach") library("strucchange") library("doMC") # would this be possible on a windows machine? registerDoMC(2) # for a computer with two cores (processors) ## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) ## F statistics indicate one breakpoint fs.nile <- Fstats(Nile ~ 1) plot(fs.nile) breakpoints(fs.nile) # , hpc = "foreach" --> It would be great to test this. lines(breakpoints(fs.nile)) Any solutions or advice? Thanks, Jan

    Read the article

  • some pointer to understanding GCC source code

    - by user299570
    hi, I'm student working on optimizing GCC for multi-core processor. I tried going through the source code, it is difficult to follow through it since I need to add some code to the back end. Can anyone suggest some good resource which explains the code flow through the different phases. Also suggest some development environment for debugging GCC mainly to step through the code. Is it possible on windows?

    Read the article

  • What application domains are CPU bound and will tend to benefit from multi-core technologies?

    - by Glomek
    I hear a lot of people talking about the revolution that is coming in programming due to multi-core processors and parallelism, but I can't shake the feeling that for most of us, CPU cycles aren't the bottleneck. Pretty much all of my programs have been I/O bound in one way or another (database, filesystem, network, user interaction, etc.) for a very long time. Now I can think of a few areas where CPU cycles are a limiting factor, like code breaking, graphics, sound, some forms of simulation (weather, physics, etc.), and some forms of mathematical research, but they all seem like fairly specialized application domains. My general impression is that most programs are still I/O bound and that for most of our industry CPUs have been plenty fast for quite a while now. Am I off my rocker? What other application domains are CPU bound today? Do any of them include a large portion of the programming population? In essence, I'm wondering whether the multi-core CPUs will impact very many of us, and if so, how?

    Read the article

  • How can I use my multiple cored dedicated server to run my java application?

    - by Delta
    I have a game built in a java environment and I use JVM. I have 4 cores @ 2.4Ghz and my server is only using one of those cores... I've tried and searched and I still have no guides to setup multiple cores to run the game like, say 1 core for running the character saving + loading, and 1 core for the server itself, and 1 core for a helper to help other cores that need more power. I don't even know if this is possible but this is all in java the operating machine is windows server 2003 and I've tried so hard I just don't know what to do. May someone please help me! Thank you so much!

    Read the article

  • Can processor cores thrash each other's caches?

    - by Jørgen Fogh
    If more than one core on a processor is accessing the same memory address, will they thrash each other's caches or will some snooping protocol allow each to keep the data in L1-cache? I am interested in a general answer as well as answers for specific processors. How many layers of cache are invalidated? Will accessing another address within the same cache-line invalidate the entire line? What can you do to alleviate these problems?

    Read the article

  • Why does this Java code not utilize all CPU cores?

    - by ReneS
    The attached simple Java code should load all available cpu core when starting it with the right parameters. So for instance, you start it with java VMTest 8 int 0 and it will start 8 threads that do nothing else than looping and adding 2 to an integer. Something that runs in registers and not even allocates new memory. The problem we are facing now is, that we do not get a 24 core machine loaded (AMD 2 sockets with 12 cores each), when running this simple program (with 24 threads of course). Similar things happen with 2 programs each 12 threads or smaller machines. So our suspicion is that the JVM (Sun JDK 6u20 on Linux x64) does not scale well. Did anyone see similar things or has the ability to run it and report whether or not it runs well on his/her machine (= 8 cores only please)? Ideas? I tried that on Amazon EC2 with 8 cores too, but the virtual machine seems to run different from a real box, so the loading behaves totally strange. package com.test; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import java.util.concurrent.Future; import java.util.concurrent.TimeUnit; public class VMTest { public class IntTask implements Runnable { @Override public void run() { int i = 0; while (true) { i = i + 2; } } } public class StringTask implements Runnable { @Override public void run() { int i = 0; String s; while (true) { i++; s = "s" + Integer.valueOf(i); } } } public class ArrayTask implements Runnable { private final int size; public ArrayTask(int size) { this.size = size; } @Override public void run() { int i = 0; String[] s; while (true) { i++; s = new String[size]; } } } public void doIt(String[] args) throws InterruptedException { final String command = args[1].trim(); ExecutorService executor = Executors.newFixedThreadPool(Integer.valueOf(args[0])); for (int i = 0; i < Integer.valueOf(args[0]); i++) { Runnable runnable = null; if (command.equalsIgnoreCase("int")) { runnable = new IntTask(); } else if (command.equalsIgnoreCase("string")) { runnable = new StringTask(); } Future<?> submit = executor.submit(runnable); } executor.awaitTermination(1, TimeUnit.HOURS); } public static void main(String[] args) throws InterruptedException { if (args.length < 3) { System.err.println("Usage: VMTest threadCount taskDef size"); System.err.println("threadCount: Number 1..n"); System.err.println("taskDef: int string array"); System.err.println("size: size of memory allocation for array, "); System.exit(-1); } new VMTest().doIt(args); } }

    Read the article

  • The way cores, processes, and threads work exactly?

    - by unknownthreat
    I need a bit of an advice for understanding how this whole procedure work exactly. If I am incorrect in any part described below, please correct me. In a single core CPU, it runs each process in the OS, jumping around from one process to another to utilize the best of itself. A process can also have many threads, in which the CPU core runs through these threads when it is running on the respective process. Now, on a multiple core CPU, Do the cores run in every process together, or can the cores run separately in different processes at one particular point of time? For instance, you have program A running two threads, can a duo core CPU run both threads of this program? I think the answer should be yes if we are using something like OpenMP. But while the cores are running in this OpenMP-embedded process, can one of the core simply switch to other process? For programs that are created for single core, when running at 100%, why the CPU utilization of each core are distributed? (ex. A duo core CPU of 80% and 20%. The utilization percentage of all cores always add up to 100% for this case.) Do the cores try help each other run each thread of each process in some ways? Frankly, I'm not sure how this works exactly. Any advice is appreciated.

    Read the article

  • How to use all the cores in Windows 7?

    - by Anon
    I am not sure if this belongs to Stackoverflow or Superuser but I thought I would ask here. I have a console based application written in C which currently takes about an hour to terminate in Windows 7 64-bit OS. The task manager reports that the application is using only 25% of the available CPU. I would like to reduce the run time by increasing cpu usage. Is there any way to let the application use all four cores (the laptop has Core i5) instead of just one? I am assuming that task manager reports 25% because only one core is allocated to the program.

    Read the article

  • Fast inter-process (inter-threaded) communications IPC on large multi-cpu system.

    - by IPC
    What would be the fastest portable bi-directional communication mechanism for inter-process communication where threads from one application need to communicate to multiple threads in another application on the same computer, and the communicating threads can be on different physical CPUs). I assume that it would involve a shared memory and a circular buffer and shared synchronization mechanisms. But shared mutexes are very expensive (and there are limited number of them too) to synchronize when threads are running on different physical CPUs.

    Read the article

< Previous Page | 1 2 3 4 5 6 7  | Next Page >