Search Results

Search found 3311 results on 133 pages for 'computing theory'.

Page 34/133 | < Previous Page | 30 31 32 33 34 35 36 37 38 39 40 41  | Next Page >

  • Why isn't Hadoop implemented using MPI?

    - by artif
    Correct me if I'm wrong, but my understanding is that Hadoop does not use MPI for communication between different nodes. What are the technical reasons for this? I could hazard a few guesses, but I do not know enough of how MPI is implemented "under the hood" to know whether or not I'm right. Come to think of it, I'm not entirely familiar with Hadoop's internals either. I understand the framework at a conceptual level (map/combine/shuffle/reduce and how that works at a high level) but I don't know the nitty gritty implementation details. I've always assumed Hadoop was transmitting serialized data structures (perhaps GPBs) over a TCP connection, eg during the shuffle phase. Let me know if that's not true.

    Read the article

  • cheap way to scale a rails application

    - by VP
    I have an application, that is becoming big, but until now, its not giving me a good revenue. That means, short money to re-invest on that. In this scenario, i found a way to make a "cheap distributed rails" deployment. I've got 4 VPS. All of them are in the same physical server. I added a load balance server running HAproxy in one dedicated VPS. There i pointed my virtual ip address where my domain name is associated. Behind this HAproxy i have more two VPS running my rails APP, passenger and memcache. Both apps servers are looking to the same database server, my 4th VPS. So with $44/month, i mounted a distributed environment. It won't be my final choice, but now, that the budget is short, is that a good way to deploy a rails application? Any pros or cons? It worth my $44/month?

    Read the article

  • PVM terminates after Adding Host

    - by Tyug
    On Ubuntu 9.10 using PVM 3.4.5-12 (the PVM package when you use apt-get) The program terminates after adding a host. laptop> pvm pvm> add bowtie-slave add bowtie-slave terminated laptop> Current Configuration only $PVM_RSH = bin/usr/ssh I can ssh perfectly fine into the slave without a password, and run commands on it. Any ideas? Thanks in advance! Here are the sample logs: Laptop log [t80040000] 02/11 10:23:32 laptop (127.0.1.1:xxxxx) LINUX 3.4.5 [t80040000] 02/11 10:23:32 ready Thu Feb 11 10:23:32 2010 [t80040000] 02/11 10:23:32 netoutput() sendto: errno=22 [t80040000] 02/11 10:23:32 em=0x2c24f0 [t80040000] 02/11 10:23:32 [49/à][6e/à][76/à][61/à][6c/à][69/à][64/à][20/à][61/à][72/à] [t80040000] 02/11 10:23:32 netoutput() sendto: Invalid argument [t80040000] 02/11 10:23:32 pvmbailout(0) bowtie-log [t80080000] 02/11 10:23:25 bowtie-slave (xxx.x.x.xxx:xxxxx) LINUX64 3.4.5 [t80080000] 02/11 10:23:25 ready Thu Feb 11 10:23:25 2010 [t80080000] 02/11 10:28:26 work() run = STARTUP, timed out waiting for master [t80080000] 02/11 10:28:26 pvmbailout(0)

    Read the article

  • Are functional programming languages good for practical tasks?

    - by Clueless
    It seems to me from my experimenting with Haskell, Erlang and Scheme that functional programming languages are a fantastic way to answer scientific questions. For example, taking a small set of data and performing some extensive analysis on it to return a significant answer. It's great for working through some tough Project Euler questions or trying out the Google Code Jam in an original way. At the same time it seems that by their very nature, they are more suited to finding analytical solutions than actually performing practical tasks. I noticed this most strongly in Haskell, where everything is evaluated lazily and your whole program boils down to one giant analytical solution for some given data that you either hard-code into the program or tack on messily through Haskell's limited IO capabilities. Basically, the tasks I would call 'practical' such as Aceept a request, find and process requested data, and return it formatted as needed seem to translate much more directly into procedural languages. The most luck I have had finding a functional language that works like this is Factor, which I would liken to a reverse-polish-notation version of Python. So I am just curious whether I have missed something in these languages or I am just way off the ball in how I ask this question. Does anyone have examples of functional languages that are great at performing practical tasks or practical tasks that are best performed by functional languages?

    Read the article

  • C++ Numerical truncation error

    - by Andrew
    Hello everyone, sorry if dumb but could not find an answer. #include <iostream> using namespace std; int main() { double a(0); double b(0.001); cout << a - 0.0 << endl; for (;a<1.0;a+=b); cout << a - 1.0 << endl; for (;a<10.0;a+=b); cout << a - 10.0 << endl; cout << a - 10.0-b << endl; return 0; } Output: 0 6.66134e-16 0.001 -1.03583e-13 Tried compiling it with MSVC9, MSVC10, Borland C++ 2010. All of them arrive in the end to the error of about 1e-13. Is it normal to have such a significant error accumulation over only a 1000, 10000 increments?

    Read the article

  • open source gossip-based membership protocol?

    - by Aaron
    I am looking for a library which I can plug into a distributed application which implements any gossip-based membership protocol. Such a library would allow me to send/receive membership lists, merge received membership lists, etc... Even better would be if the library implemented a protocol with performance O(logn) performance guarantees. Does anyone know of any open source library like this? It doesn't need to meet all of the aforementioned requirements; even something partially implemented would be helpful.

    Read the article

  • Whats the difference between Paxos and W+R>=N in Cassandra?

    - by user1128016
    Dynamo-like databases (e.g. Cassandra) provide ability to enforce consistency by means of quorum, i.e. a number of synchronously written replicas (W) and a number of replicas to read (R) should be chosen in such a way that W+RN where N is a replication factor. On the other hand, PAXOS-based systems like Zookeeper are also used as a consistent fault-tolerant storage. What is the difference between these two approaches? Does PAXOS provide guarantees that are not provided by W+RN schema?

    Read the article

  • What should i do for accomodating large scale data storage and retrieval?

    - by kailashbuki
    There's two columns in the table inside mysql database. First column contains the fingerprint while the second one contains the list of documents which have that fingerprint. It's much like an inverted index built by search engines. An instance of a record inside the table is shown below; 34 "doc1, doc2, doc45" The number of fingerprints is very large(can range up to trillions). There are basically following operations in the database: inserting/updating the record & retrieving the record accoring to the match in fingerprint. The table definition python snippet is: self.cursor.execute("CREATE TABLE IF NOT EXISTS `fingerprint` (fp BIGINT, documents TEXT)") And the snippet for insert/update operation is: if self.cursor.execute("UPDATE `fingerprint` SET documents=CONCAT(documents,%s) WHERE fp=%s",(","+newDocId, thisFP))== 0L: self.cursor.execute("INSERT INTO `fingerprint` VALUES (%s, %s)", (thisFP,newDocId)) The only bottleneck i have observed so far is the query time in mysql. My whole application is web based. So time is a critical factor. I have also thought of using cassandra but have less knowledge of it. Please suggest me a better way to tackle this problem.

    Read the article

  • For distributed applications, which to use, ASIO vs. MPI?

    - by Rhubarb
    I am a bit confused about this. If you're building a distributed application, which in some cases may perform parallel operations (although not necessarily mathematical), should you use ASIO or something like MPI? I take it MPI is a higher level than ASIO, but it's not clear where in the stack one would begin.

    Read the article

  • Expanding Git SHA1 information into a checkin without archiving?

    - by Tim Lin
    Is there a way to include git commit hashes inside a file everytime I commit? I can only find out how to do this during archiving but I haven't been able to find out how to do this for every commit. I'm doing scientific programming with git as revision control, so this kind of functionality would be very helpful for reproducibility reasons (i.e., have the git hash automatically included in all result files and figures).

    Read the article

  • SDCC and malloc() - allocating much less memory than is available

    - by Duncan Bayne
    When I run compile this code with SDCC 3.1.0, and run it on an Amstrad CPC 464 (under emulation, with WinCPC 0.9.26 running on Wine): void _test_malloc() { long idx = 0; while (1) { if (malloc(5)) { printf("%ld\r\n", ++idx); } else { printf("done"); break; } } } ... it consistently taps out at 92 malloc()s. I make that 460 bytes, which leads me to a couple of questions: What is malloc() doing on this system? I was sort of hoping for an order of magnitude more storage even on a 64kB system The behaviour is consistent on 64kB systems and 128kB systems; do I have to perform some sort of magic to access the additional memory, like manual bank switching?

    Read the article

  • Are batch mutations atomic in Cassandra?

    - by user317459
    The Cassandra API supports batch mutations: batch_mutate(keyspace, mutation_map, consistency_level): Executes the specified mutations on the keyspace. mutation_map is a map; the outer map maps the key to the inner map, which maps the column family to the Mutation; can be read as: map. To be more specific, the outer map key is a row key, the inner map key is the column family name. A Mutation specifies either columns to insert or columns to delete. See Mutation and Deletion above for more details. Are all mutations that are executed in a batch executed atomically? So if one of the mutations fails, do the others fail too?

    Read the article

  • Preallocate memory for a program in Linux before it gets started

    - by Fyg
    Hi, folks, I have a program that repeatedly solves large systems of linear equations using cholesky decomposition. Characterising is that I sometimes need to store the complete factorisation which can exceed about 20 GB of memory. The factorisation happens inside a library that I call. Furthermore, this matrix and the resulting factorisation changes quite frequently and as such the memory requirements as well. I am not the only person to use this compute-node. Therefore, is there a way to start the program under Linux and preallocate free memory for the process? Something like: $: prealloc -m 25G ./program

    Read the article

  • Efficiency of manually written loops vs operator overloads (C++)

    - by Sagekilla
    Hi all, in the program I'm working on I have 3-element arrays, which I use as mathematical vectors for all intents and purposes. Through the course of writing my code, I was tempted to just roll my own Vector class with simple +, -, *, /, etc overloads so I can simplify statements like: for (int i = 0; i < 3; i++) r[i] = r1[i] - r2[i]; // becomes: r = r1 - r2; Which should be more or less identical in generated code. But when it comes to more complicated things, could this really impact my performance heavily? One example that I have in my code is this: Manually written version: for (int j = 0; j < 3; j++) { p.vel[j] = p.oldVel[j] + (p.oldAcc[j] + p.acc[j]) * dt2 + (p.oldJerk[j] - p.jerk[j]) * dt12; p.pos[j] = p.oldPos[j] + (p.oldVel[j] + p.vel[j]) * dt2 + (p.oldAcc[j] - p.acc[j]) * dt12; } Using a Vector class with operator overloads: p.vel = p.oldVel + (p.oldAcc + p.acc) * dt2 + (p.oldJerk - p.jerk) * dt12; p.pos = p.oldPos + (p.oldVel + p.vel) * dt2 + (p.oldAcc - p.acc) * dt12; I am compiling my code for maximum possible speed, as it's extremely important that this code runs quickly and calculates accurately. So will me relying on my Vector's for these sorts of things really affect me? For those curious, this is part of some numerical integration code which is not trivial to run in my program. Any insight would be appreciated, as would any idioms or tricks I'm unaware of.

    Read the article

  • What are the advantages / disadvantages of a Cloud-based / Web-based IDE?

    - by Gabe
    I'm writing this as DevConnections in Las Vegas is happening. Visual Studio 2010 has been released and I now have this 3GB beast installed to my machine. (I'll admit, it has some nice features.) However, while the install was monopolizing my computer's resources I began to wish that my IDE worked more like Google Documents (instantly available, available anywhere, easy to share, easy to collaborate, naturally versioned). A few Google (and StackOverflow) searches led me to : Coderun Bespin I'm well aware that these IDE's are missing a lot of what exists in VS 2010. However, that isn't my question. Instead, I'm wondering what benefits a web-based IDE might have? Assuming a company invests the time to create the missing features, what is the downside?

    Read the article

  • Recommendations for Open Source Parallel programming IDE

    - by Andrew Bolster
    What are the best IDE's / IDE plugins / Tools, etc for programming with CUDA / MPI etc? I've been working in these frameworks for a short while but feel like the IDE could be doing more heavy lifting in terms of scaling and job processing interactions. (I usually use Eclipse or Netbeans, and usually in C/C++ with occasional Java, and its a vague question but I can't think of any more specific way to put it)

    Read the article

  • Problem with Boost::Asio for C++

    - by Martin Lauridsen
    Hi there, For my bachelors thesis, I am implementing a distributed version of an algorithm for factoring large integers (finding the prime factorisation). This has applications in e.g. security of the RSA cryptosystem. My vision is, that clients (linux or windows) will download an application and compute some numbers (these are independant, thus suited for parallelization). The numbers (not found very often), will be sent to a master server, to collect these numbers. Once enough numbers have been collected by the master server, it will do the rest of the computation, which cannot be easily parallelized. Anyhow, to the technicalities. I was thinking to use Boost::Asio to do a socket client/server implementation, for the clients communication with the master server. Since I want to compile for both linux and windows, I thought windows would be as good a place to start as any. So I downloaded the Boost library and compiled it, as it said on the Boost Getting Started page: bootstrap .\bjam It all compiled just fine. Then I try to compile one of the tutorial examples, client.cpp, from Asio, found (here.. edit: cant post link because of restrictions). I am using the Visual C++ compiler from Microsoft Visual Studio 2008, like this: cl /EHsc /I D:\Downloads\boost_1_42_0 client.cpp But I get this error: /out:client.exe client.obj LINK : fatal error LNK1104: cannot open file 'libboost_system-vc90-mt-s-1_42.lib' Anyone have any idea what could be wrong, or how I could move forward? I have been trying pretty much all week, to get a simple client/server socket program for c++ working, but with no luck. Serious frustration kicking in. Thank you in advance.

    Read the article

  • Higher speed options for executing very large (20 GB) .sql file in MySQL

    - by Jonogan
    My firm was delivered a 20+ GB .sql file in reponse to a request for data from the gov't. I don't have many options for getting the data in a different format, so I need options for how to import it in a reasonable amount of time. I'm running it on a high end server (Win 2008 64bit, MySQL 5.1) using Navicat's batch execution tool. It's been running for 14 hours and shows no signs of being near completion. Does anyone know of any higher speed options for such a transaction? Or is this what I should expect given the large file size? Thanks

    Read the article

< Previous Page | 30 31 32 33 34 35 36 37 38 39 40 41  | Next Page >