Search Results

Search found 46833 results on 1874 pages for 'distributed system'.

Page 107/1874 | < Previous Page | 103 104 105 106 107 108 109 110 111 112 113 114  | Next Page >

  • Java Meta Search Engine API

    - by Loki
    I'm currently researching Java libraries to help in building a meta type search engine in the sense of being able to replace any given search engine in the back-end of the application or to simultaneously search using multiple search engines. I'm not interested in the GUI part here, just the generalization of search engine APIs and usage. I'd like to know about the common libraries used to achieve this task and if there are any common patterns used in this case. I imagined that this problem is common enough to be able to find plenty of stuff on Google, but it seems like search is a very proprietary domain and not much information is fed back to the community.

    Read the article

  • How do you deal with denormalization / secondary indexes in database sharding?

    - by Continuation
    Say I have a "message" table with 2 secondary indexes: "recipient_id" "sender_id" I want to shard the "message" table by "recipient_id". That way to retrieve all messages sent to a certain recipient I only need to query one shard. But at the same time, I want to be able to make a query that ask for all messages sent by a certain sender. Now I don't want to send that query to every single shard of the "message" table. One way to do this is to duplicate the data and have a "message_by_sender" table sharded by "sender_id". The problem with that approach is that every time a message has been sent, I need to insert the message into both "message" and "message_by_sender" tables. But what if after inserting into "message" the insertion into "message_by_sender" fail? In that case the message exists in "message" but not in "message_by_sender". How do I make sure that if a message exists in "message" then it also exists in "message_by_sender" without resorting to 2 phase commit? This must be a very common issue for anyone who shards their databases. How do you deal woth it?

    Read the article

  • Erlang: How to view output of io:format/2 calls in processes spawned on remote nodes.

    - by jkndrkn
    Hello, I am working on a decentralized Erlang application. I am currently working on a single PC and creating multiple nodes by initializing erl with the -sname flag. When I spawn a process using spawn/4 on its home node, I can see output generated by calls io:format/2 within that process in its home erl instance. When I spawn a process remotely by using spawn/4 in combination with register_name, output of io:format/2 is sometimes redirected back to the erl instance where the remote spawn/4 call was made, and sometimes remains completely invisible. Similarly, when I use rpc:call/4, output of io:format/2 calls is redirected back to the erl instance where the `rpc:call/4' call is made. How do you get a process to emit debugging output back to its parent erl instance?

    Read the article

  • Which DHT algorithm to use (if I want to join two separate DHTs)?

    - by webdreamer
    I've been looking into some DHT systems, specially Pastry and Chord. I've read some concerns about Chord's reaction to churn, though I believe that won't be a problem for the task I have at hands. I'm implementing some sort of social network service that doesn't rely on any central servers for a course project. I need the DHT for the lookups. Now I don't know of all the servers in the network in the beginning. As I've stated, there's no main tracker server. It works this way: each client has three dedicated servers. The three servers have the profile of the client, and it's wall, it's personal info, replicated. I only get to know about other group of servers when the user adds a friend (inputing the client's address). So I would create two separate DHTs on the two groups of three servers and when they friend each other I would like to join the DHTs. I would like to this consistently. I haven't had a lot of time to get all that familiar with the protocols, so I would like to know which one is better if I want to join the two separate DHTs?

    Read the article

  • Mercurial local branching and pushing to shared repository

    - by Steve Horn
    I created a branch on my local Mercurial repository. I want to push to the shared repository so my work can be backed up, but I don't want other project members to see the branch. What's the standard operating procedure in this case? I'd like to avoid having the repository get full of developer branches that I don't need to see.

    Read the article

  • Best strategy for moving data between physical tiers in ASP.net

    - by Pete Lunenfeld
    Building a new ASP.net application, and planning to separate DB, 'service' tier and Web/UI tier into separate physical layers. What is the best/easiest strategy to move serialized objects between the service tier and the UI tier? I was considering serializing POCOs into JSON using simple ASP.net pages to serve the middle tier. Meaning that the UI/Web tier will request data from a (hidden to the outside user) web server that will return a JSON string. This kind of JSON 'emitter' seems easily testable. It also seems easily compressible for efficiently moving data over the WAN between tiers. I know that some folks use .asmx webservices for this kind of task, but this seems like there is excess overhead with SOAP, and the package is not as human readable (testable) as POCOs serialized as JSON. Others are using more complex technology like WCF which we have never used. Does anyone have advice for choosing a method for moving data/objects between the data (db) tier and the web (UI) tier over the WAN using .net technologies? Thanks!!!

    Read the article

  • How to Profile R Code that Includes SNOW Cluster

    - by James
    Hi, I have a nested loop that I'm using foreach, DoSNOW, and a SNOW socket cluster to solve for. How should I go about profiling the code to make sure I'm not doing something grossly inefficient. Also is there anyway to measure the data flows going between the master and nodes in a Snow cluster? Thanks, James

    Read the article

  • Cross-database transactions from one SP

    - by Michael Bray
    I need to update multiple databases with a few simple SQL statement. The databases are configurared in SQL using 'Linked Servers', and the SQL versions are mixed (SQL 2008, SQL 2005, and SQL 2000). I intend to write a stored procedure in one of the databases, but I would like to do so using a transaction to make sure that each database gets updated consistently. Which of the following is the most accurate: Will a single BEGIN/COMMIT TRANSACTION work to guarantee that all statements across all databases are successful? Will I need multiple BEGIN TRANSACTIONS for each individual set of commands on a database? Are transactions even supported when updating remote databases? I would need to execute a remote SP with embedded transaction support. Note that I don't care about any kind of cross-database referential integrity; I'm just trying to update multiple databases at the same time from a single stored procedure if possible. Any other suggestions are welcome as well. Thanks!

    Read the article

  • Why isn't Hadoop implemented using MPI?

    - by artif
    Correct me if I'm wrong, but my understanding is that Hadoop does not use MPI for communication between different nodes. What are the technical reasons for this? I could hazard a few guesses, but I do not know enough of how MPI is implemented "under the hood" to know whether or not I'm right. Come to think of it, I'm not entirely familiar with Hadoop's internals either. I understand the framework at a conceptual level (map/combine/shuffle/reduce and how that works at a high level) but I don't know the nitty gritty implementation details. I've always assumed Hadoop was transmitting serialized data structures (perhaps GPBs) over a TCP connection, eg during the shuffle phase. Let me know if that's not true.

    Read the article

  • Is AMQP suitable as both an intra and inter-machine software bus?

    - by Bwooce
    I'm trying to get my head around AMQP. It looks great for inter-machine (cluster, LAN, WAN) communication between applications but I'm not sure if it is suitable (in architectural, and current implementation terms) for use as a software bus within one machine. Would it be worth pulling out a current high performance message passing framework to replace it with AMQP, or is this falling into the same trap as RPC by blurring the distinction between local and non-local communication? I'm also wary of the performance impacts of using a WAN technology for intra-machine communications, although this may be more of an implementation concern than architecture. War stories would be appreciated.

    Read the article

  • Is there an use case for non-blocking receive when I have threads?

    - by Gabriel Šcerbák
    I know non-blocking receive is not used as much in message passing, but still some intuition tells me, it is needed. Take for example GUI event driven applications, you need some way to wait for a message in a non-blocking way, so your program can execute some computations. One of the ways to solve this is to have a special thread with message queue. Is there some use case, where you would really need non-blocking receive even if you have threads?

    Read the article

  • Decentralized synchronized secure data storage

    - by Alberich
    Introduction Hi, I am going to ask a question which seems utopic for me, but I need to know if there is a way to achieve what I need. And if not, I need to know why not. The idea Suppose I have a database structure, in MySql. I want to create some solution to allow anyone (no matter who, no matter where) to have a synchronized copy (updated clone) of this database (with its content) Well, and it is not going to be just one synchronized copy, it could (and should) be a multiple replication (supposing the basic, this means, for example, ten copies all over the world) And, the most important thing: It must be secure. By secure I mean only real-accepted transactions will be synchronized with all the others (no matter how many) database copies/clones. Note: Since it would be quite difficult to make the synchronization in real-time, I will design everything to make this feature dispensable. So it is not required. My auto-suggestion This is how I am thinking to manage it: Time identifiers and Updates checking: Every action (insert, update, delete...) will be stored as the action instruction itself, associated to the time identifier. [I think better than a DATETIME field, it'll be an INT one, with the number of miliseconds passed from 1st january 2013 on, for example]. So each copy is going to ask to the "neighbour copy" for new actions done since last update, and execute them after checking they are allowed. Problem 1: the "neighbour copy" could be outdated too. Solution 1: do not ask just one neighbour, create a random list with some of the copies/clones and ask them for news (I could avoid the list and ask ALL the clones for updates, but this will be inefficient if clones number ascends too much). Problem 2: Real-time global synchronization is not active. What if... Someone at CLONE_ENTERPRISING inserts a row into TABLE. ... this row goes to every clone ... Someone at CLONE_FIXEMALL deletes this row. ... and at the same time, somewhere in an outdated clone ... Someone at CLONE_DROPOUT edits this row (now inexistent at the other clones) Solution 2: easy stuff, force a GLOBAL synchronization before doing any new "depending-on-third-data action" (edit, for example). This global synch. will be unnecessary when making an INSERT, for instance. Note: Well, someone could have some fun, and make the same insert in two clones... since they're not getting updated in real-time, this row will exist twice. But, it's the same as when we have one single database, in some needed cases we check if there is an existing same-row before doing the final action. Not a problem. Problem 3: It is possible to edit the code and do not filter actions, so someone could spread instructions to delete everything, or just make some trolling activity. This is not a problem, since good clones will always be somewhere. Those who got bad won't interest anymore. I really appreciate if you read. I know this is not the perfect solution, it has possibly hundred of holes, but it is my basic start. I will now appreciate anything you can teach me now. Thanks a lot. PS.: It could be that all this I am trying already exists and has its own name. Sorry for asking then (I'd anyway thank this name, if it exists)

    Read the article

  • How can I get a rails server to use the same databse that cucumber uses during a test?

    - by James
    The cucumber test first makes an entry in the database and posts a form to a second server. This second server does some processing in background and then hits the first app (where the test is being run) with some data that the cucumber test needs to know about. I've tried running the main server via script/server and script/server -e test while the cucumber test is running, but I can't seem to force the server to use the same database that cucumber is using when it runs its step definitions. That is, when the second server pushes some data to a controller in the main server, the main server doesn't know about any entries that cucumber has made in the database. How can I get cucumber and the main server to use the same database?

    Read the article

  • Connecting to an RMI object without registry

    - by Mark Probst
    I think I need to connect to a remote RMI object without going through the registry, but I don't know how. My situation is this: I'm implementing a simple job distribution service which consists of one distributor and multiple workers. The distributor has a registered RMI object to which clients connect to send jobs, and workers connect to accept jobs. Unfortunately the distributor and worker hosts are behind a firewall. To get to the distributor host I am tunneling two ports (one for the registry, one for the distributor object) via SSH, so I can get to the registry and the distributor from outside the firewall. To make that work I have to set "-Djava.rmi.server.hostname=localhost" on the distributor JVM so that the clients connect to their local, tunneled port, instead of the port on the actual distributor host, which is blocked. This creates a problem for the workers, though, because they need to connect to the distributor directly, but because of the "localhost" redirection they behave like clients and try to connect to a port on their own host, which is not available, because I'm not tunneling on the workers (it is impractical). Now, if I could connect to a remote object directly by giving the hostname and port, I could do away both with the registry on the distributor and the "localhost" hack, and make the workers connect properly. How do I do that? Or is there a different solution to this problem?

    Read the article

  • Whats the difference between Paxos and W+R>=N in Cassandra?

    - by user1128016
    Dynamo-like databases (e.g. Cassandra) provide ability to enforce consistency by means of quorum, i.e. a number of synchronously written replicas (W) and a number of replicas to read (R) should be chosen in such a way that W+RN where N is a replication factor. On the other hand, PAXOS-based systems like Zookeeper are also used as a consistent fault-tolerant storage. What is the difference between these two approaches? Does PAXOS provide guarantees that are not provided by W+RN schema?

    Read the article

  • What should i do for accomodating large scale data storage and retrieval?

    - by kailashbuki
    There's two columns in the table inside mysql database. First column contains the fingerprint while the second one contains the list of documents which have that fingerprint. It's much like an inverted index built by search engines. An instance of a record inside the table is shown below; 34 "doc1, doc2, doc45" The number of fingerprints is very large(can range up to trillions). There are basically following operations in the database: inserting/updating the record & retrieving the record accoring to the match in fingerprint. The table definition python snippet is: self.cursor.execute("CREATE TABLE IF NOT EXISTS `fingerprint` (fp BIGINT, documents TEXT)") And the snippet for insert/update operation is: if self.cursor.execute("UPDATE `fingerprint` SET documents=CONCAT(documents,%s) WHERE fp=%s",(","+newDocId, thisFP))== 0L: self.cursor.execute("INSERT INTO `fingerprint` VALUES (%s, %s)", (thisFP,newDocId)) The only bottleneck i have observed so far is the query time in mysql. My whole application is web based. So time is a critical factor. I have also thought of using cassandra but have less knowledge of it. Please suggest me a better way to tackle this problem.

    Read the article

  • wanting a good memory + disk caching solution

    - by brofield
    I'm currently storing generated HTML pages in a memcached in-memory cache. This works great, however I am wanting to increase the storage capacity of the cache beyond available memory. What I would really like is: memcached semantics (i.e. not reliable, just a cache) memcached api preferred (but not required) large in-memory first level cache (MRU) huge on-disk second level cache (main) evicted from on-disk cache at maximum storage using LRU or LFU proven implementation In searching for a solution I've found the following solutions but they all miss my marks in some way. Does anyone know of either: other options that I haven't considered a way to make memcachedb do evictions Already considered are: memcachedb best fit but doesn't do evictions: explicitly "not a cache" can't see any way to do evictions (either manual or automatic) tugela cache abandoned, no support don't want to recommend it to customers nmdb doesn't use memcache api new and unproven don't want to recommend it to customers

    Read the article

  • How do you export or release a Mac OS X program made in Xcode? Program does not load on other comput

    - by SolidSnake4444
    I made a program in Xcode, being a simple calculator that takes a first number, and a second number, and then either adds,subtracts,multiplies, or divides depending on the radio button. I build and run and the program comes up and works fine. When I went to show my friends on their macs, when you double click on the program the program pops in the tray for like .05 seconds and then disappears and we never can actually run the program. It still works perfect however on my computer. What am I doing wrong? How can I take the program I made, and run it on different macs? I have the release set to 10.5 but the active SDK to 10.6. It runs in both 10.5 and 10.6 simulators. One friend has 10.6.3 like me and the other has 10.5.x(cant remember the last part). To get the app, I changed from debug to release and set active SDK to 10.5. Then in the release folder I found the app and sent that over iChat. I feel this will be a problem in the future if I ever make a legit application to distribute. Thank you!

    Read the article

< Previous Page | 103 104 105 106 107 108 109 110 111 112 113 114  | Next Page >