Search Results

Search found 168 results on 7 pages for 'computational linguistics'.

Page 6/7 | < Previous Page | 2 3 4 5 6 7  | Next Page >

  • What technologies to use for a particle system with enormous calculation demand?

    - by Amir Rezaei
    I have a particle system with X particles. Each particle tests for collision with other particles. This gives X*X = X^2 collision tests per frame. For 60f/s, this corresponds to 60*X^2 collision detection per second. What is the best technological approach for these intensive calculations? Should I use F#, C, C++ or C#, or something else? The following are constraints The code is written in C# with the latest XNA Multi-threaded may be considered No special algorithm that tests the collision with the nearest neighbors or that reduces the problem The last constraint may be strange, so let me explain. Regardless constraint 3, given a problem with enormous computational requirement what would be the best approach to solve the problem. An algorithm reduces the problem; still the same algorithm may behave different depending on technology. Consider pros and cons of CLR vs native C.

    Read the article

  • Interactive Charts for web application

    - by user227290
    We are working on a web based application (implemented in JAVA) on commodity prices and one part of it is interactive charting. I provide a simplified example here. We have a table in Mysql database where we have information on commodity prices in US states and counties. One aspect of the application is to create interactive plots based on user choice. For example, if the user needs to see the price density in Oregon and Linn county then she chooses it from the menu in a webpage and it is rendered on fly with accompanying quantile information in a table. As the user changes state and county these plots and table change on fly.For our computational need we are using R (and use rjava to integrate it to our web application) and I know that if interactivity is not an issue this is a piece of cake in ggplot2, but I am not aware of any interactive version of R graphics framework (like lattice, ggplot2). We are exploring google visualization API but I am not sure we can have the statistical power we need in some of the plots.Please help.

    Read the article

  • ASP.NET Asynchronous Pages and when to use them

    - by rajbk
    There have been several articles posted about using  asynchronous pages in ASP.NET but none of them go into detail as to when you should use them. I finally found a great post by Thomas Marquardt that explains the process in depth. He addresses a key misconception also: So, in your ASP.NET application, when should you perform work asynchronously instead of synchronously? Well, only 1 thread per CPU can execute at a time.  Did you catch that?  A lot of people seem to miss this point...only one thread executes at a time on a CPU. When you have more than this, you pay an expensive penalty--a context switch. However, if a thread is blocked waiting on work...then it makes sense to switch to another thread, one that can execute now.  It also makes sense to switch threads if you want work to be done in parallel as opposed to in series, but up until a certain point it actually makes much more sense to execute work in series, again, because of the expensive context switch. Pop quiz: If you have a thread that is doing a lot of computational work and using the CPU heavily, and this takes a while, should you switch to another thread? No! The current thread is efficiently using the CPU, so switching will only incur the cost of a context switch. Ok, well, what if you have a thread that makes an HTTP or SOAP request to another server and takes a long time, should you switch threads? Yes! You can perform the HTTP or SOAP request asynchronously, so that once the "send" has occurred, you can unwind the current thread and not use any threads until there is an I/O completion for the "receive". Between the "send" and the "receive", the remote server is busy, so locally you don't need to be blocking on a thread, but instead make use of the asynchronous APIs provided in .NET Framework so that you can unwind and be notified upon completion. Again, it only makes sense to switch threads if the benefit from doing so out weights the cost of the switch. Read more about it in these posts: Performing Asynchronous Work, or Tasks, in ASP.NET Applications http://blogs.msdn.com/tmarq/archive/2010/04/14/performing-asynchronous-work-or-tasks-in-asp-net-applications.aspx ASP.NET Thread Usage on IIS 7.0 and 6.0 http://blogs.msdn.com/tmarq/archive/2007/07/21/asp-net-thread-usage-on-iis-7-0-and-6-0.aspx   PS: I generally do not write posts that simply link to other posts but think it is warranted in this case.

    Read the article

  • A* navigational mesh path finding

    - by theguywholikeslinux
    So I've been making this top down 2D java game in this framework called Greenfoot [1] and I've been working on the AI for the guys you are gonna fight. I want them to be able to move around the world realistically so I soon realized, amongst a couple of other things, I would need some kind of pathfinding. I have made two A* prototypes. One is grid based and then I made one that works with waypoints so now I need to work out a way to get from a 2d "map" of the obstacles/buildings to a graph of nodes that I can make a path from. The actual pathfinding seems fine, just my open and closed lists could use a more efficient data structure, but I'll get to that if and when I need to. I intend to use a navigational mesh for all the reasons out lined in this post on ai-blog.net [2]. However, the problem I have faced is that what A* thinks is the shortest path from the polygon centres/edges is not necessarily the shortest path if you travel through any part of the node. To get a better idea you can see the question I asked on stackoverflow [3]. I got a good answer concerning a visibility graph. I have since purchased the book (Computational Geometry: Algorithms and Applications [4]) and read further into the topic, however I am still in favour of a navigational mesh (See "Managing Complexity" [5] from Amit’s Notes about Path-Finding [6]). (As a side note, maybe I could possibly use Theta* to convert multiple waypoints into one straight line if the first and last are not obscured. Or each time I move back check to the waypoint before last to see if I can go straight from that to this) So basically what I want is a navigational mesh where once I have put it through a funnel algorithm (e.g. this one from Digesting Duck [7]) I will get the true shortest path, rather than get one that is the shortest path following node to node only, but not the actual shortest given that you can go through some polygons and skip nodes/edges. Oh and I also want to know how you suggest storing the information concerning the polygons. For the waypoint prototype example I made I just had each node as an object and stored a list of all the other nodes you could travel to from that node, I'm guessing that won't work with polygons? and how to I tell if a polygon is open/traversable or if it is a solid object? How do I store which nodes make up the polygon? Finally, for the record: I do want to programme this by myself from scratch even though there are already other solutions available and I don't intend to be (re) using this code in anything other than this game so it does not matter that it will inevitably be poor quality. http://greenfoot.org http://www.ai-blog.net/archives/000152.html http://stackoverflow.com/q/7585515/ http://www.cs.uu.nl/geobook/ http://theory.stanford.edu/~amitp/GameProgramming/MapRepresentations.html http://theory.stanford.edu/~amitp/GameProgramming/ http://digestingduck.blogspot.com/2010/03/simple-stupid-funnel-algorithm.html

    Read the article

  • Is commented out code really always bad?

    - by nikie
    Practically every text on code quality I've read agrees that commented out code is a bad thing. The usual example is that someone changed a line of code and left the old line there as a comment, apparently to confuse people who read the code later on. Of course, that's a bad thing. But I often find myself leaving commented out code in another situation: I write a computational-geometry or image processing algorithm. To understand this kind of code, and to find potential bugs in it, it's often very helpful to display intermediate results (e.g. draw a set of points to the screen or save a bitmap file). Looking at these values in the debugger usually means looking at a wall of numbers (coordinates, raw pixel values). Not very helpful. Writing a debugger visualizer every time would be overkill. I don't want to leave the visualization code in the final product (it hurts performance, and usually just confuses the end user), but I don't want to loose it, either. In C++, I can use #ifdef to conditionally compile that code, but I don't see much differnce between this: /* // Debug Visualization: draw set of found interest points for (int i=0; i<count; i++) DrawBox(pts[i].X, pts[i].Y, 5,5); */ and this: #ifdef DEBUG_VISUALIZATION_DRAW_INTEREST_POINTS for (int i=0; i<count; i++) DrawBox(pts[i].X, pts[i].Y, 5,5); #endif So, most of the time, I just leave the visualization code commented out, with a comment saying what is being visualized. When I read the code a year later, I'm usually happy I can just uncomment the visualization code and literally "see what's going on". Should I feel bad about that? Why? Is there a superior solution? Update: S. Lott asks in a comment Are you somehow "over-generalizing" all commented code to include debugging as well as senseless, obsolete code? Why are you making that overly-generalized conclusion? I recently read Robert Glass' "Clean Code", which says: Few practices are as odious as commenting-out code. Don't do this!. I've looked at the paragraph in the book again (p. 68), there's no qualification, no distinction made between different reasons for commenting out code. So I wondered if this rule is over-generalizing (or if I misunderstood the book) or if what I do is bad practice, for some reason I didn't know.

    Read the article

  • Windows Azure Recipe: High Performance Computing

    - by Clint Edmonson
    One of the most attractive ways to use a cloud platform is for parallel processing. Commonly known as high-performance computing (HPC), this approach relies on executing code on many machines at the same time. On Windows Azure, this means running many role instances simultaneously, all working in parallel to solve some problem. Doing this requires some way to schedule applications, which means distributing their work across these instances. To allow this, Windows Azure provides the HPC Scheduler. This service can work with HPC applications built to use the industry-standard Message Passing Interface (MPI). Software that does finite element analysis, such as car crash simulations, is one example of this type of application, and there are many others. The HPC Scheduler can also be used with so-called embarrassingly parallel applications, such as Monte Carlo simulations. Whatever problem is addressed, the value this component provides is the same: It handles the complex problem of scheduling parallel computing work across many Windows Azure worker role instances. Drivers Elastic compute and storage resources Cost avoidance Solution Here’s a sketch of a solution using our Windows Azure HPC SDK: Ingredients Web Role – this hosts a HPC scheduler web portal to allow web based job submission and management. It also exposes an HTTP web service API to allow other tools (including Visual Studio) to post jobs as well. Worker Role – typically multiple worker roles are enlisted, including at least one head node that schedules jobs to be run among the remaining compute nodes. Database – stores state information about the job queue and resource configuration for the solution. Blobs, Tables, Queues, Caching (optional) – many parallel algorithms persist intermediate and/or permanent data as a result of their processing. These fast, highly reliable, parallelizable storage options are all available to all the jobs being processed. Training Here is a link to online Windows Azure training labs where you can learn more about the individual ingredients described above. (Note: The entire Windows Azure Training Kit can also be downloaded for offline use.) Windows Azure HPC Scheduler (3 labs)  The Windows Azure HPC Scheduler includes modules and features that enable you to launch and manage high-performance computing (HPC) applications and other parallel workloads within a Windows Azure service. The scheduler supports parallel computational tasks such as parametric sweeps, Message Passing Interface (MPI) processes, and service-oriented architecture (SOA) requests across your computing resources in Windows Azure. With the Windows Azure HPC Scheduler SDK, developers can create Windows Azure deployments that support scalable, compute-intensive, parallel applications. See my Windows Azure Resource Guide for more guidance on how to get started, including links web portals, training kits, samples, and blogs related to Windows Azure.

    Read the article

  • ArchBeat Link-o-Rama for 2012-08-28

    - by Bob Rhubart
    You may be tempted by IaaS, but you should PaaS on that or your database cloud journey will be a short one "The better option [to IaaS] is to rationalize the deployment stack so that VMs are needed only for exceptional cases," says B. R. Clouse. "By settling on a standard operating system and patch level, you create an infrastructure that potentially all of your databases can share. Now, the building block will be database instances or possibly schemas within databases. These components are the platforms on which you will deploy workloads, hence this is known as Platform as a Service (PaaS)." 'Shadow IT' can be the cloud's best friend | David Linthicum "I do not advocate that IT give up control and allow business units to adopt any old technology they want," says Infoworld cloud computing blogger David Linthicum. "However, IT needs to face reality: For the past three decades or so, corporate IT has been slow on the uptake around the use of productive new technologies." Do you agree? 9 ways cloud will impact IT employment | ZDNet ZDNet blogger Joe McKendrick condenses information from a recent report on how cloud computing will impact IT jobs. Number one on the list: New categories of jobs arising from cloud computing, which include "private cloud developers and administrators, departmental liaisons, integration specialists, cloud architects, and compliance specialists." Yeah, that's right, cloud architects. For more on cloud architects, including what you need to up your game to thrive in the cloud, check out "The Role of the Cloud Architect" on the OTN ArchBeat Podcast. Decisions, Decisions: The art, science, and politics of technology selection "When the time comes for a solution architect to make the final decision about the technologies, standards, and other elements that are to be incorporated into a particular project, what factors weigh most heavily on that decision? It comes as no surprise that among the architects I contacted, business needs top the list." Managing Oracle Exalogic Elastic Cloud with Oracle Enterprise Manager Ops Center Anand Akela's byline is on this post, but "Dr. Jürgen Fleischer, Oracle Enterprise Manager Ops Center Engineering" appears at the end of the post, so it's anybody's guess as to who wrote this thing. But the content includes a complete listing of the Exalogic 2.0.1 Tea Break Snippets series written by a member of the Exalogic team who goes by the name "The Old Toxophilist." So maybe the best thing to do here is ignore the names and focus on the very useful conent. Boost your infrastructure with Coherence into the Cloud | Nino Guarnacci Nino Guarnacci describes a use case that involved managing a variety of data caches that process complex queries and parallel computational operations, in order to maintain the caches in a consistent state on different server instances. Thought for the Day "No one hates software more than software developers." — Jeff Atwood Source: SoftwareQuotes

    Read the article

  • design a model for a system of dependent variables

    - by dbaseman
    I'm dealing with a modeling system (financial) that has dozens of variables. Some of the variables are independent, and function as inputs to the system; most of them are calculated from other variables (independent and calculated) in the system. What I'm looking for is a clean, elegant way to: define the function of each dependent variable in the system trigger a re-calculation, whenever a variable changes, of the variables that depend on it A naive way to do this would be to write a single class that implements INotifyPropertyChanged, and uses a massive case statement that lists out all the variable names x1, x2, ... xn on which others depend, and, whenever a variable xi changes, triggers a recalculation of each of that variable's dependencies. I feel that this naive approach is flawed, and that there must be a cleaner way. I started down the path of defining a CalculationManager<TModel> class, which would be used (in a simple example) something like as follows: public class Model : INotifyPropertyChanged { private CalculationManager<Model> _calculationManager = new CalculationManager<Model>(); // each setter triggers a "PropertyChanged" event public double? Height { get; set; } public double? Weight { get; set; } public double? BMI { get; set; } public Model() { _calculationManager.DefineDependency<double?>( forProperty: model => model.BMI, usingCalculation: (height, weight) => weight / Math.Pow(height, 2), withInputs: model => model.Height, model.Weight); } // INotifyPropertyChanged implementation here } I won't reproduce CalculationManager<TModel> here, but the basic idea is that it sets up a dependency map, listens for PropertyChanged events, and updates dependent properties as needed. I still feel that I'm missing something major here, and that this isn't the right approach: the (mis)use of INotifyPropertyChanged seems to me like a code smell the withInputs parameter is defined as params Expression<Func<TModel, T>>[] args, which means that the argument list of usingCalculation is not checked at compile time the argument list (weight, height) is redundantly defined in both usingCalculation and withInputs I am sure that this kind of system of dependent variables must be common in computational mathematics, physics, finance, and other fields. Does someone know of an established set of ideas that deal with what I'm grasping at here? Would this be a suitable application for a functional language like F#? Edit More context: The model currently exists in an Excel spreadsheet, and is being migrated to a C# application. It is run on-demand, and the variables can be modified by the user from the application's UI. Its purpose is to retrieve variables that the business is interested in, given current inputs from the markets, and model parameters set by the business.

    Read the article

  • How employable am I as a programmer?

    - by dsimcha
    I'm currently a Ph.D. student in Biomedical Engineering with a concentration in computational biology and am starting to think about what I want to do after graduate school. I feel like I've accumulated a lot of programming skills while in grad school, but taken a very non-traditional path to learning all this stuff. I'm wondering whether I would have an easy time getting hired as a programmer and could fall back on that if I can't find a good job directly in my field, and if so whether I would qualify for a more prestigious position than "code monkey". Things I Have Going For Me Approximately 4 years of experience programming as part of my research. I believe I have a solid enough grasp of the fundamentals that I could pick up new languages and technologies pretty fast, and could demonstrate this in an interview. Good math and statistics skills. An extensive portfolio of open source work (and the knowledge that working on these projects implies): I wrote a statistics library in D, mostly from scratch. I wrote a parallelism library (parallel map, reduce, foreach, task parallelism, pipelining, etc.) that is currently in review for adoption by the D standard library. I wrote a 2D plotting library for D against the GTK Cairo backend. I currently use it for most of the figures I make for my research. I've contributed several major performance optimizations to the D garbage collector. (Most of these were low-hanging fruit, but it still shows my knowledge of low-level issues like memory management, pointers and bit twiddling.) I've contributed lots of miscellaneous bug fixes to the D standard library and could show the change logs to prove it. (This demonstrates my ability read other people's code.) Things I Have Going Against Me Most of my programming experience is in D and Python. I have very little to virtually no experience in the more established, "enterprise-y" languages like Java, C# and C++, though I have learned a decent amount about these languages from small, one-off projects and discussions about language design in the D community. In general I have absolutely no knowledge of "enterprise-y" technlogies. I've never used a framework before, possibly because most reusable code for scientific work and for D tends to call itself a "library" instead. I have virtually no formal computer science/software engineering training. Almost all of my knowledge comes from talking to programming geek friends, reading blogs, forums, StackOverflow, etc. I have zero professional experience with the official title of "developer", "software engineer", or something similar.

    Read the article

  • Join mp4 files in linux

    - by Jose Armando
    I want to join two mp4 files to create a single one. The video streams are encoded in h264 and the audio in aac. I can not re-encode the videos to another format due to computational reasons. Also, I cannot use any gui programs, all processing must be performed with linux command line utilities. FFmpeg cannot do this for mpeg4 files so instead I used MP4Box e.g. MP4Box -add video1.mp4 -cat video2.mp4 newvideo.mp4 unfortunately the audio gets all mixed up. I thought that the problem was that the audio was in aac so I transcoded it in mp3 and used again MP4Box. In this case the audio is fine for the first half of newvideo.mp4 (corresponding to video1.mp4) but then their is no audio and I cannot navigate in the video also. My next thought was that the audio and video streams had some small discrepancies in their lengths that I should fix. So for each input video I splitted the video and audio streams and then joined them with the -shortest option in ffmpeg. thus for the first video I ran avconv -y -i video1.mp4 -c copy -map 0:0 videostream1.mp4 avconv -y -i video1.mp4 -c copy -map 0:1 audiostream1.m4a avconv -y -i videostream1.mp4 -i audiostream1.m4a -c copy -shortest video1_aligned.mp4 similarly for the second video and then used MP4Box as previously. Unfortunately this didn't work either. The only success I had was when I joined the video streams separetely (i.e. videostream1.mp4 and videostream2.mp4) and the audio streams (i.e. audiostream1.m4a and audiostream2.m4a) and then joined the video and audio in a final file. However, the synchronization is lost for the second half of the video. Concretelly, there is a 1 sec delay of audio and video. Any suggestions are really welcome.

    Read the article

  • How to quickly check if two columns in Excel are equivalent in value?

    - by mindless.panda
    I am interested in taking two columns and getting a quick answer on whether they are equivalent in value or not. Let me show you what I mean: So its trivial to make another column (EQUAL) that does a simple compare for each pair of cells in the two columns. It's also trivial to use conditional formatting on one of the two, checking its value against the other. The problem is both of these methods require scanning the third column or the color of one of the columns. Often I am doing this for columns that are very, very long, and visual verification would take too long and neither do I trust my eyes. I could use a pivot table to summarize the EQUAL column and see if any FALSE entries occur. I could also enable filtering and click on the filter on EQUAL and see what entries are shown. Again, all of these methods are time consuming for what seems to be such a simple computational task. What I'm interested in finding out is if there is a single cell formula that answers the question. I attempted one above in the screenshot, but clearly it doesn't do what I expected, since A10 does not equal B10. Anyone know of one that works or some other method that accomplishes this?

    Read the article

  • What is the fastest cyclic synchronization in Java (ExecutorService vs. CyclicBarrier vs. X)?

    - by Alex Dunlop
    Which Java synchronization construct is likely to provide the best performance for a concurrent, iterative processing scenario with a fixed number of threads like the one outlined below? After experimenting on my own for a while (using ExecutorService and CyclicBarrier) and being somewhat surprised by the results, I would be grateful for some expert advice and maybe some new ideas. Existing questions here do not seem to focus primarily on performance, hence this new one. Thanks in advance! The core of the app is a simple iterative data processing algorithm, parallelized to the spread the computational load across 8 cores on a Mac Pro, running OS X 10.6 and Java 1.6.0_07. The data to be processed is split into 8 blocks and each block is fed to a Runnable to be executed by one of a fixed number of threads. Parallelizing the algorithm was fairly straightforward, and it functionally works as desired, but its performance is not yet what I think it could be. The app seems to spend a lot of time in system calls synchronizing, so after some profiling I wonder whether I selected the most appropriate synchronization mechanism(s). A key requirement of the algorithm is that it needs to proceed in stages, so the threads need to sync up at the end of each stage. The main thread prepares the work (very low overhead), passes it to the threads, lets them work on it, then proceeds when all threads are done, rearranges the work (again very low overhead) and repeats the cycle. The machine is dedicated to this task, Garbage Collection is minimized by using per-thread pools of pre-allocated items, and the number of threads can be fixed (no incoming requests or the like, just one thread per CPU core). V1 - ExecutorService My first implementation used an ExecutorService with 8 worker threads. The program creates 8 tasks holding the work and then lets them work on it, roughly like this: // create one thread per CPU executorService = Executors.newFixedThreadPool( 8 ); ... // now process data in cycles while( ...) { // package data into 8 work items ... // create one Callable task per work item ... // submit the Callables to the worker threads executorService.invokeAll( taskList ); } This works well functionally (it does what it should), and for very large work items indeed all 8 CPUs become highly loaded, as much as the processing algorithm would be expected to allow (some work items will finish faster than others, then idle). However, as the work items become smaller (and this is not really under the program's control), the user CPU load shrinks dramatically: blocksize | system | user | cycles/sec 256k 1.8% 85% 1.30 64k 2.5% 77% 5.6 16k 4% 64% 22.5 4096 8% 56% 86 1024 13% 38% 227 256 17% 19% 420 64 19% 17% 948 16 19% 13% 1626 Legend: - block size = size of the work item (= computational steps) - system = system load, as shown in OS X Activity Monitor (red bar) - user = user load, as shown in OS X Activity Monitor (green bar) - cycles/sec = iterations through the main while loop, more is better The primary area of concern here is the high percentage of time spent in the system, which appears to be driven by thread synchronization calls. As expected, for smaller work items, ExecutorService.invokeAll() will require relatively more effort to sync up the threads versus the amount of work being performed in each thread. But since ExecutorService is more generic than it would need to be for this use case (it can queue tasks for threads if there are more tasks than cores), I though maybe there would be a leaner synchronization construct. V2 - CyclicBarrier The next implementation used a CyclicBarrier to sync up the threads before receiving work and after completing it, roughly as follows: main() { // create the barrier barrier = new CyclicBarrier( 8 + 1 ); // create Runable for thread, tell it about the barrier Runnable task = new WorkerThreadRunnable( barrier ); // start the threads for( int i = 0; i < 8; i++ ) { // create one thread per core new Thread( task ).start(); } while( ... ) { // tell threads about the work ... // N threads + this will call await(), then system proceeds barrier.await(); // ... now worker threads work on the work... // wait for worker threads to finish barrier.await(); } } class WorkerThreadRunnable implements Runnable { CyclicBarrier barrier; WorkerThreadRunnable( CyclicBarrier barrier ) { this.barrier = barrier; } public void run() { while( true ) { // wait for work barrier.await(); // do the work ... // wait for everyone else to finish barrier.await(); } } } Again, this works well functionally (it does what it should), and for very large work items indeed all 8 CPUs become highly loaded, as before. However, as the work items become smaller, the load still shrinks dramatically: blocksize | system | user | cycles/sec 256k 1.9% 85% 1.30 64k 2.7% 78% 6.1 16k 5.5% 52% 25 4096 9% 29% 64 1024 11% 15% 117 256 12% 8% 169 64 12% 6.5% 285 16 12% 6% 377 For large work items, synchronization is negligible and the performance is identical to V1. But unexpectedly, the results of the (highly specialized) CyclicBarrier seem MUCH WORSE than those for the (generic) ExecutorService: throughput (cycles/sec) is only about 1/4th of V1. A preliminary conclusion would be that even though this seems to be the advertised ideal use case for CyclicBarrier, it performs much worse than the generic ExecutorService. V3 - Wait/Notify + CyclicBarrier It seemed worth a try to replace the first cyclic barrier await() with a simple wait/notify mechanism: main() { // create the barrier // create Runable for thread, tell it about the barrier // start the threads while( ... ) { // tell threads about the work // for each: workerThreadRunnable.setWorkItem( ... ); // ... now worker threads work on the work... // wait for worker threads to finish barrier.await(); } } class WorkerThreadRunnable implements Runnable { CyclicBarrier barrier; @NotNull volatile private Callable<Integer> workItem; WorkerThreadRunnable( CyclicBarrier barrier ) { this.barrier = barrier; this.workItem = NO_WORK; } final protected void setWorkItem( @NotNull final Callable<Integer> callable ) { synchronized( this ) { workItem = callable; notify(); } } public void run() { while( true ) { // wait for work while( true ) { synchronized( this ) { if( workItem != NO_WORK ) break; try { wait(); } catch( InterruptedException e ) { e.printStackTrace(); } } } // do the work ... // wait for everyone else to finish barrier.await(); } } } Again, this works well functionally (it does what it should). blocksize | system | user | cycles/sec 256k 1.9% 85% 1.30 64k 2.4% 80% 6.3 16k 4.6% 60% 30.1 4096 8.6% 41% 98.5 1024 12% 23% 202 256 14% 11.6% 299 64 14% 10.0% 518 16 14.8% 8.7% 679 The throughput for small work items is still much worse than that of the ExecutorService, but about 2x that of the CyclicBarrier. Eliminating one CyclicBarrier eliminates half of the gap. V4 - Busy wait instead of wait/notify Since this app is the primary one running on the system and the cores idle anyway if they're not busy with a work item, why not try a busy wait for work items in each thread, even if that spins the CPU needlessly. The worker thread code changes as follows: class WorkerThreadRunnable implements Runnable { // as before final protected void setWorkItem( @NotNull final Callable<Integer> callable ) { workItem = callable; } public void run() { while( true ) { // busy-wait for work while( true ) { if( workItem != NO_WORK ) break; } // do the work ... // wait for everyone else to finish barrier.await(); } } } Also works well functionally (it does what it should). blocksize | system | user | cycles/sec 256k 1.9% 85% 1.30 64k 2.2% 81% 6.3 16k 4.2% 62% 33 4096 7.5% 40% 107 1024 10.4% 23% 210 256 12.0% 12.0% 310 64 11.9% 10.2% 550 16 12.2% 8.6% 741 For small work items, this increases throughput by a further 10% over the CyclicBarrier + wait/notify variant, which is not insignificant. But it is still much lower-throughput than V1 with the ExecutorService. V5 - ? So what is the best synchronization mechanism for such a (presumably not uncommon) problem? I am weary of writing my own sync mechanism to completely replace ExecutorService (assuming that it is too generic and there has to be something that can still be taken out to make it more efficient). It is not my area of expertise and I'm concerned that I'd spend a lot of time debugging it (since I'm not even sure my wait/notify and busy wait variants are correct) for uncertain gain. Any advice would be greatly appreciated.

    Read the article

  • This Week in Geek History: Gmail Goes Public, Deep Blue Wins at Chess, and the Birth of Thomas Edison

    - by Jason Fitzpatrick
    Every week we bring you a snapshot of the week in Geek History. This week we’re taking a peek at the public release of Gmail, the first time a computer won against a chess champion, and the birth of prolific inventor Thomas Edison. Gmail Goes Public It’s hard to believe that Gmail has only been around for seven years and that for the first three years of its life it was invite only. In 2007 Gmail dropped the invite only requirement (although they would hold onto the “beta” tag for another two years) and opened its doors for anyone to grab a username @gmail. For what seemed like an entire epoch in internet history Gmail had the slickest web-based email around with constant innovations and features rolling out from Gmail Labs. Only in the last year or so have major overhauls at competitors like Hotmail and Yahoo! Mail brought other services up to speed. Can’t stand reading a Week in Geek History entry without a random fact? Here you go: gmail.com was originally owned by the Garfield franchise and ran a service that delivered Garfield comics to your email inbox. No, we’re not kidding. Deep Blue Proves Itself a Chess Master Deep Blue was a super computer constructed by IBM with the sole purpose of winning chess matches. In 2011 with the all seeing eye of Google and the amazing computational abilities of engines like Wolfram Alpha we simply take powerful computers immersed in our daily lives for granted. The 1996 match against reigning world chest champion Garry Kasparov where in Deep Blue held its own, but ultimately lost, in a  4-2 match shook a lot of people up. What did it mean if something that was considered such an elegant and quintessentially human endeavor such as chess was so easy for a machine? A series of upgrades helped Deep Blue outright win a match against Kasparov in 1997 (seen in the photo above). After the win Deep Blue was retired and disassembled. Parts of Deep Blue are housed in the National Museum of History and the Computer History Museum. Birth of Thomas Edison Thomas Alva Edison was one of the most prolific inventors in history and holds an astounding 1,093 US Patents. He is responsible for outright inventing or greatly refining major innovations in the history of world culture including the phonograph, the movie camera, the carbon microphone used in nearly every telephone well into the 1980s, batteries for electric cars (a notion we’d take over a century to take seriously), voting machines, and of course his enormous contribution to electric distribution systems. Despite the role of scientist and inventor being largely unglamorous, Thomas Edison and his tumultuous relationship with fellow inventor Nikola Tesla have been fodder for everything from books, to comics, to movies, and video games. Other Notable Moments from This Week in Geek History Although we only shine the spotlight on three interesting facts a week in our Geek History column, that doesn’t mean we don’t have space to highlight a few more in passing. This week in Geek History: 1971 – Apollo 14 returns to Earth after third Lunar mission. 1974 – Birth of Robot Chicken creator Seth Green. 1986 – Death of Dune creator Frank Herbert. Goodnight Dune. 1997 – Simpsons becomes longest running animated show on television. Have an interesting bit of geek trivia to share? Shoot us an email to [email protected] with “history” in the subject line and we’ll be sure to add it to our list of trivia. Latest Features How-To Geek ETC Here’s a Super Simple Trick to Defeating Fake Anti-Virus Malware How to Change the Default Application for Android Tasks Stop Believing TV’s Lies: The Real Truth About "Enhancing" Images The How-To Geek Valentine’s Day Gift Guide Inspire Geek Love with These Hilarious Geek Valentines RGB? CMYK? Alpha? What Are Image Channels and What Do They Mean? Clean Up Google Calendar’s Interface in Chrome and Iron The Rise and Fall of Kramerica? [Seinfeld Video] GNOME Shell 3 Live CDs for OpenSUSE and Fedora Available for Testing Picplz Offers Special FX, Sharing, and Backup of Your Smartphone Pics BUILD! An Epic LEGO Stop Motion Film [VIDEO] The Lingering Glow of Sunset over a Winter Landscape Wallpaper

    Read the article

  • Windows Azure Use Case: Web Applications

    - by BuckWoody
    This is one in a series of posts on when and where to use a distributed architecture design in your organization's computing needs. You can find the main post here: http://blogs.msdn.com/b/buckwoody/archive/2011/01/18/windows-azure-and-sql-azure-use-cases.aspx  Description: Many applications have a requirement to be located outside of the organization’s internal infrastructure control. For instance, the company website for a brick-and-mortar retail company may want to post not only static but interactive content to be available to their external customers, and not want the customers to have access inside the organization’s firewall. There are also cases of pure web applications used for a great many of the internal functions of the business. This allows for remote workers, shared customer/employee workloads and data and other advantages. Some firms choose to host these web servers internally, others choose to contract out the infrastructure to an “ASP” (Application Service Provider) or an Infrastructure as a Service (IaaS) company. In any case, the design of these applications often resembles the following: In this design, a server (or perhaps more than one) hosts the presentation function (http or https) access to the application, and this same system may hold the computational aspects of the program. Authorization and Access is controlled programmatically, or is more open if this is a customer-facing application. Storage is either placed on the same or other servers, hosted within an RDBMS or NoSQL database, or a combination of the options, all coded into the application. High-Availability within this scenario is often the responsibility of the architects of the application, and by purchasing more hosting resources which must be built, licensed and configured, and manually added as demand requires, although some IaaS providers have a partially automatic method to add nodes for scale-out, if the architecture of the application supports it. Disaster Recovery is the responsibility of the system architect as well. Implementation: In a Windows Azure Platform as a Service (PaaS) environment, many of these architectural considerations are designed into the system. The Azure “Fabric” (not to be confused with the Azure implementation of Application Fabric - more on that in a moment) is designed to provide scalability. Compute resources can be added and removed programmatically based on any number of factors. Balancers at the request-level of the Fabric automatically route http and https requests. The fabric also provides High-Availability for storage and other components. Disaster recovery is a shared responsibility between the facilities (which have the ability to restore in case of catastrophic failure) and your code, which should build in recovery. In a Windows Azure-based web application, you have the ability to separate out the various functions and components. Presentation can be coded for multiple platforms like smart phones, tablets and PC’s, while the computation can be a single entity shared between them. This makes the applications more resilient and more object-oriented, and lends itself to a SOA or Distributed Computing architecture. It is true that you could code up a similar set of functionality in a traditional web-farm, but the difference here is that the components are built into the very design of the architecture. The API’s and DLL’s you call in a Windows Azure code base contains components as first-class citizens. For instance, if you need storage, it is simply called within the application as an object.  Computation has multiple options and the ability to scale linearly. You also gain another component that you would either have to write or bolt-in to a typical web-farm: the Application Fabric. This Windows Azure component provides communication between applications or even to on-premise systems. It provides authorization in either person-based or claims-based perspectives. SQL Azure provides relational storage as another option, and can also be used or accessed from on-premise systems. It should be noted that you can use all or some of these components individually. Resources: Design Strategies for Scalable Active Server Applications - http://msdn.microsoft.com/en-us/library/ms972349.aspx  Physical Tiers and Deployment  - http://msdn.microsoft.com/en-us/library/ee658120.aspx

    Read the article

  • VirtualBox 3.2 is released! A Red Letter Day?

    - by Fat Bloke
    Big news today! A new release of VirtualBox packed full of innovation and improvements. Over the next few weeks we'll take a closer look at some of these new features in a lot more depth, but today we'll whet your appetite with the headline descriptions. To start with, we should point out that this is the first Oracle-branded version which makes today a real Red-letter day ;-)  Oracle VM VirtualBox 3.2 Version 3.2 moves VirtualBox forward in 3 main areas ( handily, all beginning with "P" ) : performance, power and supported guest operating system platforms.  Let's take a look: Performance New Latest Intel hardware support - Harnessing the latest in chip-level support for virtualization, VirtualBox 3.2 supports new Intel Core i5 and i7 processor and Intel Xeon processor 5600 Series support for Unrestricted Guest Execution bringing faster boot times for everything from Windows to Solaris guests; New Large Page support - Reducing the size and overhead of key system resources, Large Page support delivers increased performance by enabling faster lookups and shorter table creation times. New In-hypervisor Networking - Significant optimization of the networking subsystem has reduced context switching between guests and host, increasing network throughput by up to 25%. New New Storage I/O subsystem - VirtualBox 3.2 offers a completely re-worked virtual disk subsystem which utilizes asynchronous I/O to achieve high-performance whilst maintaining high data integrity; New Remote Video Acceleration - The unique built-in VirtualBox Remote Display Protocol (VRDP), which is primarily used in virtual desktop infrastructure deployments, has been enhanced to deliver video acceleration. This delivers a rich user experience coupled with reduced computational expense, which is vital when servers are running hundreds of virtual machines; Power New Page Fusion - Traditional Page Sharing techniques have suffered from long and expensive cache construction as pages are scrutinized as candidates for de-duplication. Taking a smarter approach, VirtualBox Page Fusion uses intelligence in the guest virtual machine to determine much more rapidly and accurately those pages which can be eliminated thereby increasing the capacity or vm density of the system; New Memory Ballooning- Ballooning provides another method to increase vm density by allowing the memory of one guest to be recouped and made available to others; New Multiple Virtual Monitors - VirtualBox 3.2 now supports multi-headed virtual machines with up to 8 virtual monitors attached to a guest. Each virtual monitor can be a host window, or be mapped to the hosts physical monitors; New Hot-plug CPU's - Modern operating systems such Windows Server 2008 x64 Data Center Edition or the latest Linux server platforms allow CPUs to be dynamically inserted into a system to provide incremental computing power while the system is running. Version 3.2 introduces support for Hot-plug vCPUs, allowing VirtualBox virtual machines to be given more power, with zero-downtime of the guest; New Virtual SAS Controller - VirtualBox 3.2 now offers a virtual SAS controller, enabling it to run the most demanding of high-end guests; New Online Snapshot Merging - Snapshots are powerful but can eat up disk space and need to be pruned from time to time. Historically, machines have needed to be turned off to delete or merge snapshots but with VirtualBox 3.2 this operation can be done whilst the machines are running. This allows sophisticated system management with minimal interruption of operations; New OVF Enhancements - VirtualBox has supported the OVF standard for virtual machine portability for some time. Now with 3.2, VirtualBox specific configuration data is also stored in the standard allowing richer virtual machine definitions without compromising portability; New Guest Automation - The Guest Automation APIs allow host-based logic to drive operations in the guest; Platforms New USB Keyboard and Mouse - Support more guests that require USB input devices; New Oracle Enterprise Linux 5.5 - Support for the latest version of Oracle's flagship Linux platform; New Ubuntu 10.04 ("Lucid Lynx") - Support for both the desktop and server version of the popular Ubuntu Linux distribution; And as a man once said, "just one more thing" ... New Mac OS X (experimental) - On Apple hardware only, support for creating virtual machines run Mac OS X. All in all this is a pretty powerful release packed full of innovation and speedups. So what are you waiting for?  -FB 

    Read the article

  • newline-ignoring diff / diff across multiple lines / reflow-ignoring diff

    - by Adam
    Does anybody know of a diff-like tool that can show me the changes between two text files, but ignore changes in whitespace including newlines? Here's an example: the quick brown fox jumped over the lazy bear. the quick brown fox jumped over the lazy bear. the quick brown fox jumped over the lazy bear. the quick brown fox jumped over the lazy bear. quick brown fox jumped over the lazy bear. the quick brown fox jumped over the lazy bear. the quick brown fox jumped over the lazy bear. the quick brown fox jumped over the lazy bear. All I did was delete one word and reflow it, but "diff -b" detects a change on every line (as it should; I'm not saying this is a bug in diff). But for large LaTeX files this is a major problem; change one word in a long paragraph and the diff you get back is basically useless. By the way, I'm aware that this requires way more computational power than the usual lines-are-atomic diff. I'm only doing this on small human-generated files and am happy to wait a long time if I have to.

    Read the article

  • Computer science textbooks

    - by Barrett Conrad
    I would like to try the book question a little bit differently. My goal is to know what the community thinks are the quintessential computer science textbooks. <beginsadstory>I lost all of my computer science and math books from college in Hurricane Katrina in 2005. I greatly miss having my familiar tomes to refer to when topics and problems come up, so I am looking to rebuild my library.<endsadstory> What are your recommendations for the best examples of academic caliber books for the field of computer science and its associated mathematics? I am looking for books on subjects like computational theory, programming languages, compilers, operating systems, algorithms and so on. Don't limit your suggestions to your textbooks only. If there is a book you have read that covers computer science or a related math in a formal way, but is not strictly a textbook, I would be love to hear about them as well. Finally, for the sake of creating a good reference for all of us, may I suggest posting one book per answer so they can be rated individually.

    Read the article

  • Would OpenID or OAuth work for authorization/authentication on a distributed web service?

    - by David Eyk
    We're in the early stages of designing a RESTful/resource-oriented web service API for a computational lingustics application. Because many of the resources we plan to serve are rights-encumbered, a key design decision has been to specify the platform so that each resource provider can expose their own web service that complies with the API spec. This way, the rights owner maintains control over their content (and thus the ability to throttle or deny access at will) and a direct relationship with the consumer, while still being able to participate in in the collaborative network. At the same time, to simplify the job of writing a client for this service, we want to allow a client access to the distributed service through one end-point, with the server handling content negotiation and retrieval from the appropriate providers. Right now, we're at an impasse on authentication/authorization schemes. One of our number has argued for the (technical) simplicity of a central authentication registry, but others are concerned about the organizational complexity of such a scheme. It seems to me, based on an albeit limited understanding of the technologies, that a combination of OpenID and OAuth would do the trick, with a client authenticating with the end-point via OpenID, and the server taking action on the user's behalf with the various content providers using OAuth. I've only ever seen implementations (e.g. stackoverflow, twitter, etc.) where a human was present to intervene, and I still need to do more research on these technologies. Would a scheme like this work for an automated web service, or would it make the client too difficult to implement and operate?

    Read the article

  • Scalable / Parallel Large Graph Analysis Library?

    - by Joel Hoff
    I am looking for good recommendations for scalable and/or parallel large graph analysis libraries in various languages. The problems I am working on involve significant computational analysis of graphs/networks with 1-100 million nodes and 10 million to 1+ billion edges. The largest SMP computer I am using has 256 GB memory, but I also have access to an HPC cluster with 1000 cores, 2 TB aggregate memory, and MPI for communication. I am primarily looking for scalable, high-performance graph libraries that could be used in either single or multi-threaded scenarios, but parallel analysis libraries based on MPI or a similar protocol for communication and/or distributed memory are also of interest for high-end problems. Target programming languages include C++, C, Java, and Python. My research to-date has come up with the following possible solutions for these languages: C++ -- The most viable solutions appear to be the Boost Graph Library and Parallel Boost Graph Library. I have looked briefly at MTGL, but it is currently slanted more toward massively multithreaded hardware architectures like the Cray XMT. C - igraph and SNAP (Small-world Network Analysis and Partitioning); latter uses OpenMP for parallelism on SMP systems. Java - I have found no parallel libraries here yet, but JGraphT and perhaps JUNG are leading contenders in the non-parallel space. Python - igraph and NetworkX look like the most solid options, though neither is parallel. There used to be Python bindings for BGL, but these are now unsupported; last release in 2005 looks stale now. Other topics here on SO that I've looked at have discussed graph libraries in C++, Java, Python, and other languages. However, none of these topics focused significantly on scalability. Does anyone have recommendations they can offer based on experience with any of the above or other library packages when applied to large graph analysis problems? Performance, scalability, and code stability/maturity are my primary concerns. Most of the specialized algorithms will be developed by my team with the exception of any graph-oriented parallel communication or distributed memory frameworks (where the graph state is distributed across a cluster).

    Read the article

  • Autodetect Presence of CSV Headers in a File

    - by banzaimonkey
    Short question: How do I automatically detect whether a CSV file has headers in the first row? Details: I've written a small CSV parsing engine that places the data into an object that I can access as (approximately) an in-memory database. The original code was written to parse third-party CSV with a predictable format, but I'd like to be able to use this code more generally. I'm trying to figure out a reliable way to automatically detect the presence of CSV headers, so the script can decide whether to use the first row of the CSV file as keys / column names or start parsing data immediately. Since all I need is a boolean test, I could easily specify an argument after inspecting the CSV file myself, but I'd rather not have to (go go automation). I imagine I'd have to parse the first 3 to ? rows of the CSV file and look for a pattern of some sort to compare against the headers. I'm having nightmares of three particularly bad cases in which: The headers include numeric data for some reason The first few rows (or large portions of the CSV) are null There headers and data look too similar to tell them apart If I can get a "best guess" and have the parser fail with an error or spit out a warning if it can't decide, that's OK. If this is something that's going to be tremendously expensive in terms of time or computation (and take more time than it's supposed to save me) I'll happily scrap the idea and go back to working on "important things". I'm working with PHP, but this strikes me as more of an algorithmic / computational question than something that's implementation-specific. If there's a simple algorithm I can use, great. If you can point me to some relevant theory / discussion, that'd be great, too. If there's a giant library that does natural language processing or 300 different kinds of parsing, I'm not interested.

    Read the article

  • Haskell Linear Algebra Matrix Library for Arbitrary Element Types

    - by Johannes Weiß
    I'm looking for a Haskell linear algebra library that has the following features: Matrix multiplication Matrix addition Matrix transposition Rank calculation Matrix inversion is a plus and has the following properties: arbitrary element (scalar) types (in particular element types that are not Storable instances). My elements are an instance of Num, additionally the multiplicative inverse can be calculated. The elements mathematically form a finite field (??2256). That should be enough to implement the features mentioned above. arbitrary matrix sizes (I'll probably need something like 100x100, but the matrix sizes will depend on the user's input so it should not be limited by anything else but the memory or the computational power available) as fast as possible, but I'm aware that a library for arbitrary elements will probably not perform like a C/Fortran library that does the work (interfaced via FFI) because of the indirection of arbitrary (non Int, Double or similar) types. At least one pointer gets dereferenced when an element is touched (written in Haskell, this is not a real requirement for me, but since my elements are no Storable instances the library has to be written in Haskell) I already tried very hard and evaluated everything that looked promising (most of the libraries on Hackage directly state that they wont work for me). In particular I wrote test code using: hmatrix, assumes Storable elements Vec, but the documentation states: Low Dimension : Although the dimensionality is limited only by what GHC will handle, the library is meant for 2,3 and 4 dimensions. For general linear algebra, check out the excellent hmatrix library and blas bindings I looked into the code and the documentation of many more libraries but nothing seems to suit my needs :-(. Update Since there seems to be nothing, I started a project on GitHub which aims to develop such a library. The current state is very minimalistic, not optimized for speed at all and only the most basic functions have tests and therefore should work. But should you be interested in using or helping out developing it: Contact me (you'll find my mail address on my web site) or send pull requests.

    Read the article

  • Small-o(n^2) implementation of Polynomial Multiplication

    - by AlanTuring
    I'm having a little trouble with this problem that is listed at the back of my book, i'm currently in the middle of test prep but i can't seem to locate anything regarding this in the book. Anyone got an idea? A real polynomial of degree n is a function of the form f(x)=a(n)x^n+?+a1x+a0, where an,…,a1,a0 are real numbers. In computational situations, such a polynomial is represented by a sequence of its coefficients (a0,a1,…,an). Assuming that any two real numbers can be added/multiplied in O(1) time, design an o(n^2)-time algorithm to compute, given two real polynomials f(x) and g(x) both of degree n, the product h(x)=f(x)g(x). Your algorithm should **not** be based on the Fast Fourier Transform (FFT) technique. Please note it needs to be small-o(n^2), which means it complexity must be sub-quadratic. The obvious solution that i have been finding is indeed the FFT, but of course i can't use that. There is another method that i have found called convolution, where if you take polynomial A to be a signal and polynomial B to be a filter. A passed through B yields a shifted signal that has been "smoothed" by A and the resultant is A*B. This is supposed to work in O(n log n) time. Of course i am completely unsure of implementation. If anyone has any ideas of how to achieve a small-o(n^2) implementation please do share, thanks.

    Read the article

  • A problem with connected points and determining geometry figures based on points' location analysis

    - by StolePopov
    In school we have a really hard problem, and still no one from the students has solved it yet. Take a look at the picture below: http://d.imagehost.org/0422/mreza.gif That's a kind of a network of connected points, which doesn't end and each point has its own number representing it. Let say the numbers are like this: 1-23-456-78910-etc. etc.. (You can't see the number 5 or 8,9... on the picture but they are there and their position is obvious, the point in middle of 4 and 6 is 5 and so on). 1 is connected to 2 and 3, 2 is connected to 1,3,5 and 4 etc. The numbers 1-2-3 indicate they represent a triangle on the picture, but the numbers 1-4-6 do not because 4 is not directly connected with 6. Let's look at 2-3-4-5, that's a parallelogram (you know why), but 4-6-7-9 is NOT a parallelogram because the in this problem there's a rule which says all the sides must be equal for all the figures - triangles and parallelograms. Also there are hexagons, for ex. 4-5-7-9-13-12 is a hexagon - all sides must be equal here too. 12345 - that doesn't represent anything, so we ignore it. I think i explained the problem well. The actual problem which is given to us by using an input of numbers like above to determine if that's a triangle/parallelogram/hexagon(according to the described rules). For ex: 1 2 3 - triangle 11 13 24 26 -parallelogram 1 2 3 4 5 - nothing 11 23 13 25 - nothing 3 2 5 - triangle I was reading computational geometry in order to solve this, but i gave up quickly, nothing seems to help here. One friend told me this site so i decided to give it a try. If you have any ideas about how to solve this, please reply, you can use pseudo code or c++ whatever. Thank you very much.

    Read the article

  • Floating point precision in Visual C++

    - by Luigi Giaccari
    HI, I am trying to use the robust predicates for computational geometry from Jonathan Richard Shewchuk. I am not a programmer, so I am not even sure of what I am saying, I may be doing some basic mistake. The point is the predicates should allow for precise aritmthetic with adaptive floating point precision. On my computer: Asus pro31/S (Core Due Centrino Processor) they do not work. The problem may stay in the fact the my computer may use some improvements in the floating point precision taht conflicts with the one used by Shewchuk. The author says: /* On some machines, the exact arithmetic routines might be defeated by the / / use of internal extended precision floating-point registers. Sometimes / / this problem can be fixed by defining certain values to be volatile, / / thus forcing them to be stored to memory and rounded off. This isn't / / a great solution, though, as it slows the arithmetic down. */ Now what I would like to know is that there is a way, maybe some compiler option, to turn off the internal extended precision floating-point registers. I really appriaciate your help

    Read the article

  • What is faster- Java or C# (Or good old C)?

    - by Rexsung
    I'm currently deciding on a platform to build a scientific computational product on, and am deciding on either C#, Java, or plain C with Intels compiler on Core2 Quad CPU's. It's mostly integer arithmetic. My benchmarks so far show Java and C are about on par with each other, and dotNET/C# trails by about 5%- however a number of my coworkers are claiming that dotNET with the right optimizations will beat both of these given enough time for the JIT to do its work. I always assume that the JIT would have done it's job within a few minutes of the app starting (Probably a few seconds in my case, as it's mostly tight loops), so I'm not sure whether to believe them Can anyone shed any light on the situation? Would dotNET beat Java? (Or am I best just sticking with C at this point?). The code is highly multithreaded and data sets are several terabytes in size. Haskell/erlang etc are not options in this case as there is a significant quantity of existing legacy C code that will be ported to the new system, and porting C to Java/C# is a lot simpler than to Haskell or Erlang. (Unless of course these provide a significant speedup). Edit: We are considering moving to C# or Java because they may, in theory, be faster. Every percent we can shave off our processing time saves us tens of thousands of dollars per year. At this point we are just trying to evaluate whether C, Java, or c# would be faster.

    Read the article

< Previous Page | 2 3 4 5 6 7  | Next Page >