Search Results

Search found 8824 results on 353 pages for 'cloud virtualization vmware density scalability'.

Page 114/353 | < Previous Page | 110 111 112 113 114 115 116 117 118 119 120 121  | Next Page >

  • Partitioning requests in code among several servers

    - by Jacques René Mesrine
    I have several forum servers (what they are is irrelevant) which store posts from users, and I want to be able to partition requests among these servers. I'm currently leaning towards partitioning them by geographic location. To improve the locality of data, users will be separated into regions, e.g. North America, South America and so on. Is there a design pattern for implementing the function that maps the partitioning property to the server, so that this piece of code is highly available and does not become a single point of failure? f(Region) -> Server IP
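
    One way to keep f(Region) -> Server IP from becoming a single point of failure is to ship the mapping as static configuration to every application node, so the lookup is a local function call rather than a call to a central service. The sketch below assumes this; the region names, IP addresses, the `server_for` function and the `is_healthy` callback are all hypothetical.

```python
import zlib

# Hypothetical static configuration, deployed to every application node.
# Each region maps to a list of interchangeable server IPs.
REGION_SERVERS = {
    "north_america": ["10.0.1.10", "10.0.1.11"],
    "south_america": ["10.0.2.10", "10.0.2.11"],
}
DEFAULT_SERVERS = ["10.0.9.10"]  # used for regions missing from the table


def server_for(region, user_id, is_healthy=lambda ip: True):
    """Map a region (and user) to a server IP, skipping unhealthy servers."""
    candidates = REGION_SERVERS.get(region, DEFAULT_SERVERS)
    # A stable hash spreads the users of one region over its servers and
    # always sends the same user to the same starting server.
    start = zlib.crc32(str(user_id).encode()) % len(candidates)
    for offset in range(len(candidates)):
        ip = candidates[(start + offset) % len(candidates)]
        if is_healthy(ip):
            return ip
    raise RuntimeError("no healthy server for region %r" % region)
```

    Because the function depends only on local data, any node can answer the lookup; the trade-off is that a change to the table has to be rolled out to all nodes.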

    Read the article

  • Scalable Database Tagging Schema

    - by Longpoke
    EDIT: To people building tagging systems: don't read this. It is not what you are looking for. I asked this when I wasn't aware that RDBMSs all have their own optimization methods; just use a simple many-to-many scheme.
    I have a posting system that has millions of posts. Each post can have an unlimited number of tags associated with it. Users can create tags which have notes, date created, owner, etc. A tag is almost like a post itself, because people can post notes about the tag. Each tag association has an owner and date, so we can see who added the tag and when. My question is: how can I implement this? Searching posts by tag, or tags by post, has to be fast. Also, users can add tags to posts by typing the name into a field; like the Google search bar, it has to fill in the rest of the tag name for you. I have 3 solutions at the moment, but I'm not sure which is the best, or if there is a better way. Note that I'm not showing the layout of notes, since it will be trivial once I get a proper solution for tags.
    Method 1. Linked list
    tagId in post points to a linked list in tag_assoc; the application must traverse the list until flink=0.
      post: id, content, ownerId, date, tagId, notesId
      tag_assoc: id, tagId, ownerId, flink
      tag: id, name, notesId
    Method 2. Denormalization
    tags is simply a VARCHAR or TEXT field containing a tab-delimited array of tagId:ownerId. It cannot be a fixed size.
      post: id, content, ownerId, date, tags, notesId
      tag: id, name, notesId
    Method 3. Toxi (from: http://www.pui.ch/phred/archives/2005/04/tags-database-schemas.html, also the same thing here: http://stackoverflow.com/questions/20856/how-do-you-recommend-implementing-tags-or-tagging)
      post: id, content, ownerId, date, notesId
      tag_assoc: ownerId, tagId, postId
      tag: id, name, notesId
    Method 3 raises the question: how fast will it be to iterate through every single row in tag_assoc? Methods 1 and 2 should be fast for returning tags by post, but for posts by tag, another lookup table must be made. The last thing I have to worry about is optimizing searching tags by name; I have not worked that out yet. I made an ASCII diagram here: http://pastebin.com/f1c4e0e53
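
    The worry about iterating over every row of tag_assoc in Method 3 usually goes away once the association table is indexed in both directions. The following is a minimal sketch of that idea using SQLite from Python; the column set is trimmed to the essentials, and the index name, helper functions and sample data are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE post (id INTEGER PRIMARY KEY, content TEXT);
CREATE TABLE tag  (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE tag_assoc (
    postId  INTEGER NOT NULL REFERENCES post(id),
    tagId   INTEGER NOT NULL REFERENCES tag(id),
    ownerId INTEGER NOT NULL,
    PRIMARY KEY (postId, tagId)   -- index used for "tags by post"
);
-- second index, used for "posts by tag"
CREATE INDEX tag_assoc_by_tag ON tag_assoc (tagId, postId);
""")

# Illustrative sample data.
conn.executemany("INSERT INTO post VALUES (?, ?)",
                 [(1, "first"), (2, "second"), (3, "third")])
conn.executemany("INSERT INTO tag VALUES (?, ?)",
                 [(1, "python"), (2, "sql"), (3, "java")])
conn.executemany("INSERT INTO tag_assoc VALUES (?, ?, ?)",
                 [(1, 1, 10), (1, 2, 10), (2, 1, 11), (3, 3, 12)])


def posts_by_tag(name):
    rows = conn.execute(
        "SELECT a.postId FROM tag t JOIN tag_assoc a ON a.tagId = t.id "
        "WHERE t.name = ? ORDER BY a.postId", (name,))
    return [r[0] for r in rows]


def tags_by_post(post_id):
    rows = conn.execute(
        "SELECT t.name FROM tag_assoc a JOIN tag t ON t.id = a.tagId "
        "WHERE a.postId = ? ORDER BY t.name", (post_id,))
    return [r[0] for r in rows]


def complete_tag(prefix):
    # Prefix search for the "fill in the rest of the tag name" field.
    rows = conn.execute(
        "SELECT name FROM tag WHERE name LIKE ? ORDER BY name", (prefix + "%",))
    return [r[0] for r in rows]
```

    With both indexes in place, each lookup touches only the matching association rows rather than the whole table.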

    Read the article

  • How to make write operation idempotent?

    - by Morgan Cheng
    I'm reading an article about the recently released Gizzard sharding framework by Twitter (http://engineering.twitter.com/2010/04/introducing-gizzard-framework-for.html). It mentions that all write operations must be idempotent to ensure high reliability. According to Wikipedia, "Idempotent operations are operations that can be applied multiple times without changing the result." But, IMHO, in Gizzard's case, idempotent write operations should be operations whose order doesn't matter. Now, my question is: how do I make a write operation idempotent? The only thing I can imagine is to have a version number attached to each write. For example, in a blog system, each blog must have a $blog_id and $content. At the application level, we always write blog content like this: write($blog_id, $content, $version). The $version is guaranteed to be unique at the application level. So, if the application first tries to set one blog to "Hello world" and then wants it to be "Goodbye", the write is idempotent. We have two such write operations: write($blog_id, "Hello world", 1); write($blog_id, "Goodbye", 2); These two operations are supposed to change two different records in the DB. So, no matter how many times and in what sequence these two operations are executed, the results are the same. This is just my understanding. Please correct me if I'm wrong.
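
    The versioned-write idea can be sketched as follows. Note that this is a "highest version wins" variant that keeps one record per blog, which differs slightly from the one-record-per-version layout described above; the function names and the in-memory dict standing in for the database are assumptions for illustration, and this is not taken from Gizzard itself.

```python
import itertools


def apply_write(store, blog_id, content, version):
    """Apply a write only if it is newer than what is already stored."""
    current = store.get(blog_id)
    if current is None or version > current[0]:
        store[blog_id] = (version, content)


def replay(writes):
    """Apply a sequence of (blog_id, content, version) writes to an empty store."""
    store = {}
    for blog_id, content, version in writes:
        apply_write(store, blog_id, content, version)
    return store


WRITES = [(7, "Hello world", 1), (7, "Goodbye", 2)]

# Deliver each write twice and try every possible ordering; collect the
# distinct final states of blog 7.
final_states = {replay(order)[7] for order in itertools.permutations(WRITES * 2)}
print(final_states)
```

    Since a stale or repeated write is simply ignored, retries and reordering leave a single final state.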

    Read the article

  • Architecture for database analytics

    - by David Cournapeau
    Hi, we have an architecture where we provide each customer with Business Intelligence-like services for their website (internet merchant). Now, I need to analyze that data internally (for algorithmic improvement, performance tracking, etc.), and the analyses are potentially quite heavy: we have up to millions of rows / customer / day, and I may want to know how many queries we had in the last month, compared week by week, etc. That is on the order of billions of entries, if not more. The way it is currently done is quite standard: daily scripts which scan the databases and generate big CSV files. I don't like this solution for several reasons: (1) as is typical with those kinds of scripts, they fall into the write-once and never-touched-again category; (2) tracking things in "real-time" is necessary (we have a separate toolset to query the last few hours ATM); (3) this is slow and non-"agile". Although I have some experience in dealing with huge datasets for scientific usage, I am a complete beginner as far as traditional RDBMSs go. It seems that using a column-oriented database for analytics could be a solution (the analytics don't need most of the data we have in the app database), but I would like to know what other options are available for this kind of issue.

    Read the article

  • Searching Natural Language Sentence Structure

    - by Cerin
    What's the best way to store and search a database of natural language sentence structure trees? Using OpenNLP's English Treebank Parser, I can get fairly reliable sentence structure parsings for arbitrary sentences. What I'd like to do is create a tool that can extract all the doc strings from my source code, generate these trees for all sentences in the doc strings, store these trees and their associated function name in a database, and then allow a user to search the database using natural language queries. So, given the sentence "This uploads files to a remote machine." for the function upload_files(), I'd have the tree: (TOP (S (NP (DT This)) (VP (VBZ uploads) (NP (NNS files)) (PP (TO to) (NP (DT a) (JJ remote) (NN machine)))) (. .))) If someone entered the query "How can I upload files?", equating to the tree: (TOP (SBARQ (WHADVP (WRB How)) (SQ (MD can) (NP (PRP I)) (VP (VB upload) (NP (NNS files)))) (. ?))) how would I store and query these trees in a SQL database? I've written a simple proof-of-concept script that can perform this search using a mix of regular expressions and network graph parsing, but I'm not sure how I'd implement this in a scalable way. And yes, I realize my example would be trivial to retrieve using a simple keyword search. The idea I'm trying to test is how I might take advantage of grammatical structure, so I can weed-out entries with similar keywords, but a different sentence structure. For example, with the above query, I wouldn't want to retrieve the entry associated with the sentence "Checks a remote machine to find a user that uploads files." which has similar keywords, but is obviously describing a completely different behavior.

    Read the article

  • Can in-memory SQLite databases scale with concurrency?

    - by Kent Boogaart
    In order to prevent a SQLite in-memory database from being cleaned up, one must use the same connection to access the database. However, using the same connection causes SQLite to synchronize access to the database. Thus, if I have many threads performing reads against an in-memory database, it is slower on a multi-core machine than the exact same code running against a file-backed database. Is there any way to get the best of both worlds? That is, an in-memory database that permits multiple, concurrent calls to the database?
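
    SQLite documents a shared-cache URI form for in-memory databases, which lets several connections open the same in-memory database as long as at least one connection to it stays open. The sketch below uses it from Python's sqlite3 module; the database name `shared_mem_db` and the table are made up for illustration. It addresses the lifetime problem only: shared-cache connections still coordinate through locks, so whether reads actually become faster than with a file-backed database would have to be measured.

```python
import sqlite3
import threading

# Hypothetical database name; mode=memory plus cache=shared makes every
# connection that uses this URI see the same in-memory database.
URI = "file:shared_mem_db?mode=memory&cache=shared"

# This connection stays open for the life of the program so that the
# in-memory database is not discarded.
keeper = sqlite3.connect(URI, uri=True)
keeper.execute("CREATE TABLE t (n INTEGER)")
keeper.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])
keeper.commit()

results = []


def reader():
    # Each thread opens its own connection instead of sharing one.
    conn = sqlite3.connect(URI, uri=True)
    try:
        results.append(conn.execute("SELECT SUM(n) FROM t").fetchone()[0])
    finally:
        conn.close()


threads = [threading.Thread(target=reader) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```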

    Read the article

  • Cache layer for MVC - Model or controller?

    - by Industrial
    Hi everyone, I am having some second thoughts about where to implement the caching part. Where do you think is the most appropriate place to implement it: inside every model, or in the controller?

    Approach 1 (pseudo-code):

        // mycontroller.php
        MyController extends Controller_class {
            function index() {
                $data = $this->model->getData();
                echo $data;
            }
        }

        // myModel.php
        MyModel extends Model_Class {
            function getData() {
                $data = memcached->get('data');
                if (!$data) {
                    // Cache miss: run the query and store the result
                    $data = $query->SQL_QUERY("Do query!");
                    memcached->set('data', $data);
                }
                return $data;
            }
        }

    Approach 2:

        // mycontroller.php
        MyController extends Controller_class {
            function index() {
                $dataArray = $this->memcached->getMulti('data', 'data2');
                foreach ($dataArray as $key => $data) {
                    if (!$data) {
                        // Cache miss for this key: load from the model and cache it
                        $data = $this->model->getData();
                        $this->memcached->set($key, $data);
                    }
                    echo $data;
                }
            }
        }

        // myModel.php
        MyModel extends Model_Class {
            function getData() {
                $data = $query->SQL_QUERY("Do query!");
                return $data;
            }
        }

    Thoughts:
    Approach 1: No multiget/multiset, so returning a high number of keys causes overhead. Easier to maintain: all database/cache handling is in each model.
    Approach 2: Better performance-wise, since multiset/multiget is used. More code required. Harder to maintain.
    Tell me what you think!

    Read the article

  • Using AMQP to collect events

    - by synapse
    Does AMQP have any advantages over an ad-hoc implementation for a simple stats gathering scenario? It works like this: clients send events (more than we care to put into persistent storage) to (several) web workers, and the workers aggregate them and write to a single database. I don't think I should consider using AMQP for this, because I'll still need web workers to receive events from clients through HTTP and to publish them. Am I missing something?

    Read the article

  • Why does this Java code not utilize all CPU cores?

    - by ReneS
    The attached simple Java code should load all available CPU cores when started with the right parameters. For instance, you start it with java VMTest 8 int 0 and it will start 8 threads that do nothing but loop and add 2 to an integer: something that runs in registers and does not even allocate new memory. The problem we are facing now is that we cannot get a 24-core machine fully loaded (AMD, 2 sockets with 12 cores each) when running this simple program (with 24 threads, of course). Similar things happen with 2 programs of 12 threads each, or on smaller machines. So our suspicion is that the JVM (Sun JDK 6u20 on Linux x64) does not scale well. Has anyone seen similar things, or can anyone run it and report whether or not it runs well on his/her machine (= 8 cores only please)? Ideas? I tried it on Amazon EC2 with 8 cores too, but the virtual machine seems to behave differently from a real box, so the load behaves very strangely.

        package com.test;

        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Executors;
        import java.util.concurrent.Future;
        import java.util.concurrent.TimeUnit;

        public class VMTest {

            // Pure integer arithmetic in an endless loop; no allocation.
            public class IntTask implements Runnable {
                @Override
                public void run() {
                    int i = 0;
                    while (true) {
                        i = i + 2;
                    }
                }
            }

            // Builds a new String on every iteration.
            public class StringTask implements Runnable {
                @Override
                public void run() {
                    int i = 0;
                    String s;
                    while (true) {
                        i++;
                        s = "s" + Integer.valueOf(i);
                    }
                }
            }

            // Allocates a new String array of the given size on every iteration.
            public class ArrayTask implements Runnable {
                private final int size;

                public ArrayTask(int size) {
                    this.size = size;
                }

                @Override
                public void run() {
                    int i = 0;
                    String[] s;
                    while (true) {
                        i++;
                        s = new String[size];
                    }
                }
            }

            public void doIt(String[] args) throws InterruptedException {
                final String command = args[1].trim();
                final int threadCount = Integer.valueOf(args[0]);
                ExecutorService executor = Executors.newFixedThreadPool(threadCount);
                for (int i = 0; i < threadCount; i++) {
                    Runnable runnable;
                    if (command.equalsIgnoreCase("int")) {
                        runnable = new IntTask();
                    } else if (command.equalsIgnoreCase("string")) {
                        runnable = new StringTask();
                    } else if (command.equalsIgnoreCase("array")) {
                        // The array size comes from the third argument
                        runnable = new ArrayTask(Integer.valueOf(args[2]));
                    } else {
                        throw new IllegalArgumentException("Unknown taskDef: " + command);
                    }
                    Future<?> submit = executor.submit(runnable);
                }
                executor.awaitTermination(1, TimeUnit.HOURS);
            }

            public static void main(String[] args) throws InterruptedException {
                if (args.length < 3) {
                    System.err.println("Usage: VMTest threadCount taskDef size");
                    System.err.println("threadCount: Number 1..n");
                    System.err.println("taskDef: int string array");
                    System.err.println("size: size of memory allocation for array, ");
                    System.exit(-1);
                }
                new VMTest().doIt(args);
            }
        }

    Read the article

  • Architecture - centralized location for different modules (CMS, web applications, ...) - best practice

    - by NicoJuicy
    Let's just say that I want to create a CMS plus other online applications. I want them all to integrate into a central location, but they also have to be available separately (not everyone wants more than the CMS solution). Would I create a huge central application that contains the whole database and communicates through a web service with the "standalone - integrated" modules? Or would I create them separately, so that the only thing the "central" application does is sync the information (e.g. the CMS and another solution can have the same tables, such as clients or employees)? Or do you have another idea? (I know I'm a little vague, but I can't give a lot of details because of my work contract.) If someone has all the "packages", it should be possible for the central application to integrate all the modules in one place! Or if someone has more than one module, it should combine them on the website. What I thought would be best is that the central location contains only the users and their rights (e.g. CMS - all rights, ...), and the information gets synced with every change. (Module CMS, adding a new client: store locally and send the data to the central location; central location: send to the modules, so the clients table is updated everywhere.) This way, if someone only "bought" one module, they can sync it easily through the complete architecture. I hope I made myself clear!

    Read the article

  • Scheduling jobs from a web environment on Linux

    - by Anders Feder
    Hi. I am developing an application in PHP on Linux/Apache. I want to be able to schedule PHP jobs (scripts) for execution at some specific time in the future from within the application. I know that many people will recommend cron and at, but first of all I don't need recurrence (cron), and secondly and most importantly, I need the solution to be able to scale. at was not designed with race conditions in mind, and if two users try to add a job at the same time, one or both may fail. It's also important that jobs are executed at their specified time, and not just 'polled' once per minute or so. Can anyone please suggest solutions for this task? Thank you.

    Read the article

  • Application Engineering and Number of Users

    - by Kramii
    Apart from performance concerns, should web-based applications be built differently according to the number of (concurrent) users? If so, what are the main differences for (say) 4, 40, 400 and 4000 users? I'm particularly interested in how logging, error handling, design patterns etc. would be used according to the number of concurrent users.

    Read the article

  • Web services or shared database for (game) server communication?

    - by jaaronfarr
    We have 2 server clusters: the first is made up of typical web applications backed by SQL databases. The second consists of highly optimized multiplayer game servers which keep all data in memory. Both clusters communicate with clients via HTTP (Ajax with JSON). There are a few cases in which we need to share data between the two server types, for example, reporting back and storing the results of a game (which should ultimately end up in the database). We're considering several approaches for inter-server communication: (1) just sharing the MySQL databases between clusters (introducing SQL to the game servers); (2) sharing data in a distributed key-value store like Memcache, Redis, etc.; (3) using an RPC technology like Google ProtoBufs or Apache Thrift; (4) using RESTful web services (the game server would POST back to the web servers, for example). At the moment, we're leaning towards web services or just sharing the database. Sharing the database seems easy, but we're concerned this adds extra memory and a new dependency to the game servers. Web services provide good separation of concerns and fit with the existing Ajax we use, but add complexity, overhead and many more ways for communication to fail. Are there any other good reasons not to use one or the other approach? Which would be easier to scale?

    Read the article

  • How do I write a Java text file viewer for big log files

    - by Hannes de Jager
    I am working on a software product with an integrated log file viewer. The problem is that it's slow and unstable for really large files, because it reads the whole file into memory when you view a log file. I want to write a new log file viewer that addresses this problem. What are the best practices for writing viewers for large text files? How do editors like Notepad++ and Vim accomplish this? I was thinking of using a buffered bi-directional text stream reader together with Java's TableModel. Am I thinking along the right lines, and are such stream implementations available for Java?
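
    A common alternative to loading the whole file is to scan it once, remember the byte offset at which each line starts, and then read only the lines currently visible. The sketch below shows that idea in Python rather than Java; the `LineIndex` class and its method names are hypothetical, and it makes no claim about how Notepad++ or Vim work internally.

```python
class LineIndex:
    """Byte offsets of line starts, so any window of lines can be read on demand."""

    def __init__(self, path):
        self.path = path
        self.offsets = [0]
        # One sequential pass; only the offsets are kept in memory, not the text.
        with open(path, "rb") as f:
            for raw in f:
                self.offsets.append(self.offsets[-1] + len(raw))
        self.offsets.pop()  # the last entry is the end of file, not a line start

    @property
    def line_count(self):
        return len(self.offsets)

    def window(self, first, count):
        """Return up to `count` lines starting at zero-based line `first`."""
        if first < 0 or first >= len(self.offsets):
            return []
        lines = []
        with open(self.path, "rb") as f:
            f.seek(self.offsets[first])
            for _ in range(count):
                raw = f.readline()
                if not raw:
                    break
                lines.append(raw.decode("utf-8", errors="replace").rstrip("\r\n"))
        return lines
```

    A table model backed by such an index would call `window(first_visible_row, visible_rows)` when the user scrolls. For extremely large files the index itself can be thinned out, for example by storing only every 100th offset and scanning forward from there.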

    Read the article

  • Improve SQL strategy - denormalize in object-children-images case

    - by fesja
    Hi, I have a Tour object which has many Place objects. For the list of tours, I want to show some info about the tour, the number of places in the tour, and three of its Places' images. Right now my SQL queries are (I'm using Doctrine with Symfony on MySQL): get Tour; get Tour 1 places; get Tour 2 places; get Tour 3 places; ...; get Tour n places. With a list of three tours it's not so bad, but I'm sure it can get bad with a list of 10-20 tours. So, thinking about how to improve the queries, I've thought of a couple of measures: (1) having a place count cache; (2) storing the URLs of three images in a new tour field. The problem with 2 is that if I change an image, I have to check all the tours to replace that image with another one. Which solution do you think is best for scaling the system in the near future? Any other suggestions are welcome. Thanks!
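
    Before denormalizing, the one-query-per-tour pattern can often be reduced to two queries in total: one that returns the tours with their place counts, and one that fetches the places of all listed tours at once. The sketch below illustrates this with SQLite from Python instead of Doctrine on MySQL; the table and column names (`tour`, `place`, `tour_id`, `image_url`) and the `tour_list` function are assumptions.

```python
import sqlite3
from collections import defaultdict


def tour_list(conn, max_images=3):
    """Return tours with place counts and up to `max_images` image URLs each."""
    # Query 1: every tour with its number of places.
    tours = conn.execute(
        "SELECT t.id, t.name, COUNT(p.id) "
        "FROM tour t LEFT JOIN place p ON p.tour_id = t.id "
        "GROUP BY t.id, t.name ORDER BY t.id").fetchall()
    if not tours:
        return []

    # Query 2: places of all listed tours in a single round trip.
    ids = [row[0] for row in tours]
    placeholders = ",".join("?" * len(ids))
    images = defaultdict(list)
    for tour_id, url in conn.execute(
            "SELECT tour_id, image_url FROM place "
            "WHERE tour_id IN (%s) ORDER BY tour_id, id" % placeholders, ids):
        if len(images[tour_id]) < max_images:
            images[tour_id].append(url)

    return [{"id": tid, "name": name, "place_count": count,
             "images": images.get(tid, [])}
            for tid, name, count in tours]
```

    The number of queries then stays constant as the list grows, which may make the place count cache and the extra image field unnecessary.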

    Read the article

  • I want to build a Google-friendly web app, where should I start?

    - by ronii
    I have only very basic experience with HTML/CSS and quite a bit of experience with testing software and web apps from a consumer perspective. I'd love to launch a web application that plays nicely with Google services, similar to some of the apps you'd find on the Google Apps Marketplace, such as ManyMoon, time to note, Socialwok, etc. I'm a huge Google fan and would like to build something that's well integrated with other Google services. If you were a total beginner and wanted to build a complex app like one of the examples above (project management, CRM, etc.), where would you start? If you worked your ass off 18 hours a day, 24/7, how fast could you do it? I've dabbled in various languages and development frameworks, and read about which apps are using which languages, but it's hard to figure out what would be most beneficial to jump into: Ruby on Rails, PHP, Google Web Toolkit, AppEngine. The list goes on and on. I want to be able to build and launch my own scalable web app. Thanks.

    Read the article

  • Retrieving Large Lists of Objects Using Java EE

    - by hallidave
    Is there a generally-accepted way to return a large list of objects using Java EE? For example, if you had a database ResultSet that had millions of objects how would you return those objects to a (remote) client application? Another example -- that is closer to what I'm actually doing -- would be to aggregate data from hundreds of sources, normalize it, and incrementally transfer it to a client system as a single "list". Since all the data cannot fit in memory, I was thinking that a combination of a stateful SessionBean and some sort of custom Iterator that called back to the server would do the trick. How have you successfully solved this problem in the past?
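
    One stateless alternative to a stateful bean with a server-side iterator is keyset pagination: the client asks for "the next N rows after id X" and the server keeps no per-client state between calls. The sketch below shows only the query pattern, in Python with SQLite rather than Java EE; the `item` table, its columns and the function names are hypothetical.

```python
import sqlite3


def fetch_page(conn, after_id=0, page_size=1000):
    """Return one page of rows with id > after_id, plus the cursor for the next call."""
    rows = conn.execute(
        "SELECT id, payload FROM item WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size)).fetchall()
    next_after = rows[-1][0] if rows else None
    return rows, next_after


def iter_all(conn, page_size=1000):
    """Client-side loop: stream every row while holding one page in memory at a time."""
    after_id = 0
    while True:
        rows, next_after = fetch_page(conn, after_id, page_size)
        if not rows:
            return
        for row in rows:
            yield row
        after_id = next_after
```

    In a remote setup `fetch_page` would sit behind the service interface and `iter_all` would run on the client. This relies on a stable, indexed sort key; data aggregated from many sources would first need such a key assigned.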

    Read the article

  • We failed trying database-per-customer installation. Plan to recover?

    - by Fedyashev Nikita
    There is a web application which has been in production for 3 years or so. Historically, for various reasons, a decision was made to use a database-per-customer installation. We have now found that deployments are very slow. Should we consider moving all the databases back into a single one to reduce environment complexity? Or is that too risky an idea? The problem I see is that it's very hard to merge these databases while preserving referential integrity (the primary keys of the different databases' tables obviously cannot be told apart). The databases are not that big, so we don't gain much from the reduced load of having multiple databases.

    Read the article

  • Optimizing landing pages

    - by Oleg Shaldybin
    In my current project (Rails 2.3) we have a collection of 1.2 million keywords, and each of them is associated with a landing page, which is effectively a search results page for a given keyword. Each of those pages is pretty complicated, so it can take a long time to generate (up to 2 seconds with a moderate load, even longer during traffic spikes, with current hardware). The problem is that 99.9% of visits to those pages are new visits (via search engines), so it doesn't help a lot to cache a page on the first visit: it will still be slow for that visit, and the next visit could be several weeks later. I'd really like to make those pages faster, but I don't have too many ideas on how to do it. A couple of things that come to mind: (1) build a cache for all keywords beforehand (with a very long TTL, a month or so); however, building and maintaining this cache can be a real pain, and the search results on the page might be outdated, or even no longer accessible; (2) given the volatile nature of this data, don't try to cache anything at all, and just try to scale out to keep up with traffic. I'd really appreciate any feedback on this problem.

    Read the article

  • What goes in to making a web site that needs to scale?

    - by samoz
    I am planning to build an application that will get a large amount of traffic. (Please don't say I won't get traffic; this is for an internal network, so the traffic will be there. I'm just trying to avoid the 'You won't get that much traffic, don't worry about it' answers.) What sorts of things do I need to do so that it doesn't simply crash under the load of a large number of users? What become the limiting factors? Database stuff? I/O with the front end? I've never really developed a serious web app before and am looking for some help.

    Read the article

  • Chroot within chroot

    - by Andy
    I'm using CentOS 5.2, and when I try to make a chroot jail using the script, I get: Copying libraries for /usr/bin/scp. (0x00007fff17bfe000) cp: cannot stat `(0x00007fff17bfe000)': No such file or directory ... I am currently on a Rackspace cloud server, so I suspect that these dependencies are outside of my own root. Does anyone have a better idea for jailing the sftp server on a cloud server using CentOS 5.2?
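
    The error looks consistent with the jail script treating a field of every `ldd` output line as a file path: on x86_64 Linux the virtual `linux-vdso.so.1` entry has no file behind it and is printed with only an address such as `(0x00007fff17bfe000)`. That is an interpretation of the message, not something confirmed from the script itself. The sketch below shows one way to keep only real paths when parsing `ldd` output; the sample text and the `library_paths` function are illustrative and were not taken from the machine in question.

```python
# Illustrative ldd output; the addresses and library versions are made up.
SAMPLE_LDD_OUTPUT = """\
\tlinux-vdso.so.1 =>  (0x00007fff17bfe000)
\tlibc.so.6 => /lib64/libc.so.6 (0x0000003a1be00000)
\t/lib64/ld-linux-x86-64.so.2 (0x0000003a1ba00000)
"""


def library_paths(ldd_output):
    """Extract only the entries of `ldd` output that name a real file path."""
    paths = []
    for line in ldd_output.splitlines():
        line = line.strip()
        if not line:
            continue
        if "=>" in line:
            # "name => /path (address)" or "name =>  (address)" for the vDSO
            fields = line.split("=>", 1)[1].split()
        else:
            # "/path (address)", e.g. the dynamic loader
            fields = line.split()
        if fields and fields[0].startswith("/"):
            paths.append(fields[0])
    return paths


print(library_paths(SAMPLE_LDD_OUTPUT))
```

    Only the returned paths would then be copied into the jail; the vDSO entry is provided by the kernel and needs no file.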

    Read the article

  • vCenter appliance won't use mail relay server

    - by Safado
    tl;dr: sendmail is configured to use a relay server but still insists on using 127.0.0.1 as the relay, which results in mail not being sent. We have the open source vCenter appliance (v 5.0) managing our ESXi cluster. When connected to it via vSphere Client, you can configure the SMTP relay server to use by going to Administration > vCenter Server Settings > MAIL. There you can set the SMTP Server value. I looked through their documentation and also confirmed on the phone with support that all you have to do to configure mail is to put the relay IP or FQDN in that box and hit OK. Well, I had done that and mail still wasn't sending. So I SSH'd into the server (which is SuSE) and looked at /var/log/mail, and it looks like it's trying to relay the email through 127.0.0.1, which is rejecting it. Looking through the config files, I see there's /etc/sendmail.cf and /etc/mail/submit.cf. You can configure items in /etc/sysconfig/sendmail and run SuSEconfig --module sendmail to generate those two .cf files based on what's in /etc/sysconfig/sendmail. Playing around, I see that when you set the SMTP Server value in the vCenter GUI, all it does is change the "DS" line in /etc/mail/submit.cf to DS[myrelayserver.com]. Looking on the internet, it would appear that the DS line is really the only thing you need to change in order to use a relay server. I got on the phone with VMware support and spent 2 hours trying to modify ANY setting that had anything to do with relays, and we couldn't get it to NOT use 127.0.0.1 as the relay. Just to note, any time we made any sort of configuration change, we restarted the sendmail service. Does anyone know what's going on? Any ideas on how I can fix this?

    Read the article

  • No VMKernel Dump File on PSOD for ESXi 4

    - by user66481
    On PSOD no VMKernel Dump File is written to disk and no message is written to screen (the screen is either blank or full of dashes). I need this data to understand why the system crashes; any help as to how to fix this to write a dump file would be appreciated. Thanks. Notes: VMKCore partition exists, is active, and is configured (esxcfg-dumppart -l). esxcfg-advcfg -g /Misc/PsodOnCosPanic = 1. esxcfg-advcfg -g /Misc/CosCoreFile = /var/core. esxcfg-dumppart -C -D /vmfs/devices/disks/ = "Error running command. Unable to copy the dump partition: Couldn't find a valid VMKernel dump file. Dump partition might be uninitialized." Hardware diagnostics (Dell) checks okay. Hardware: VMWare ESXi 4.1.0 (VMKernel Release Build 320137) Dell Inc. Optiplex 960 (2 Drives) Intel Core 2 Quad CPU Q9400 2.66GHz Configuration: 2 Virtual Machines: Windows Server 2003 R2 Enterprise Edition SP2 (1 on each drive) VM 1: Executes Batch Jobs (Has Internet Information Services 6) VM 2: Database Server (Has SQL Server 2000)

    Read the article
