Search Results

Search found 874 results on 35 pages for 'scalability'.

Page 29/35 | < Previous Page | 25 26 27 28 29 30 31 32 33 34 35 | Next Page >

Webcast On-Demand: Building Java EE Apps That Scale

- by jeckels

With some awesome work by one of our architects, Randy Stafford, we recently completed a webcast on scaling Java EE apps efficiently. Did you miss it? No problem. We have a replay available on-demand for you. Just hit the '+' sign drop-down for access.Topics include: Domain object caching Service response caching Session state caching JSR-107 HotCache and more! Further, we had several interesting questions asked by our audience, and we thought we'd share a sampling of those here for you - just in case you had the same queries yourself. Enjoy! What is the largest Coherence deployment out there? We have seen deployments with over 500 JVMs in the Coherence cluster, and deployments with over 1000 JVMs using the Coherence jar file, in one system. On the management side there is an ecosystem of monitoring tools from Oracle and third parties with dashboards graphing values from Coherence's JMX instrumentation. For lifecycle management we have seen a lot of custom scripting over the years, but we've also integrated closely with WebLogic to leverage its management ecosystem for deploying Coherence-based applications and managing process life cycles. That integration introduces a new Java EE archive type, the Grid Archive or GAR, which embeds in an EAR and can be seen by a WAR in WebLogic. That integration also doesn't require any extra WebLogic licensing if Coherence is licensed. How is Coherence different from a NoSQL Database like MongoDB? Coherence can be considered a NoSQL technology. It pre-dates the NoSQL movement, having been first released in 2001 whereas the term "NoSQL" was coined in 2009. Coherence has a key-value data model primarily but can also be used for document data models. Coherence manages data in memory currently, though disk persistence is in a future release currently in beta testing. Where the data is managed yields a few differences from the most well-known NoSQL products: access latency is faster with Coherence, though well-known NoSQL databases can manage more data. Coherence also has features that well-known NoSQL database lack, such as grid computing, eventing, and data source integration. Finally Coherence has had 15 years of maturation and hardening from usage in mission-critical systems across a variety of industries, particularly financial services. Can I use Coherence for local caching? Yes, you get additional features beyond just a java.util.Map: you get expiration capabilities, size-limitation capabilities, eventing capabilites, etc. Are there APIs available for GoldenGate HotCache? It's mostly a black box. You configure it, and it just puts objects into your caches. However you can treat it as a glass box, and use Coherence event interceptors to enhance its behavior - and there are use cases for that. Are Coherence caches updated transactionally? Coherence provides several mechanisms for concurrency control. If a project insists on full-blown JTA / XA distributed transactions, Coherence caches can participate as resources. But nobody does that because it's a performance and scalability anti-pattern. At finer granularity, Coherence guarantees strict ordering of all operations (reads and writes) against a single cache key if the operations are done using Coherence's "EntryProcessor" feature. And Coherence has a unique feature called "partition-level transactions" which guarantees atomic writes of multiple cache entries (even in different caches) without requiring JTA / XA distributed transaction semantics.

Read the article
"cloud architecture" concepts in a system architecture diagrams

- by markus

If you design a distributed application for easy scale-out, or you just want to make use of any of the new “cloud computing” offerings by Amazon, Google or Microsoft, there are some typical concepts or components you usually end up using: distributed blob storage (aka S3) asynchronous, durable message queues (aka SQS) non-Relational-/non-transactional databases (like SimpleDB, Google BigTable, Azure SQL Services) distributed background worker pool load-balanced, edge-service processes handling user requests (often virtualized) distributed caches (like memcached) CDN (content delivery network like Akamai) Now when it comes to design and sketch an architecture that makes use of such patterns, are there any commonly used symbols I could use? Or even a download with some cool Visio stencils? :) It doesn’t have to be a formal system like UML but I think it would be great if there were symbols that everyone knows and understands, like we have commonly used shapes for databases or a documents, for example. I think it would be important to not mix it up with traditional concepts like a normal file system (local or network server/SAN), or a relational database. Simply speaking, I want to be able to draw some conclusions about an application’s scalability or data consistency issues by just looking at the system architecture overview diagram. Update: Thank you very much for your answers. I like the idea of putting a small "cloud symbol" on the traditional symbols. However I leave this thread open just in case someone will find specific symbols (maybe in a book or so) - or uploaded some pimped up Visio stencils ;)

Read the article
Newbie, deciding Python or Erlang

- by Joe

Hi Guys, I'm a Administrator (unix, Linux and some windows apps such as Exchange) by experience and have never worked on any programming language besides C# and scripting on Bash and lately on powershell. I'm starting out as a service provider and using multiple network/server monitoring tools based on open source (nagios, opennms etc) in order to monitor them. At this moment, being inspired by a design that I came up with, to do more than what is available with the open source at this time, I would like to start programming and test some of these ideas. The requirement is that a server software that captures a stream of data and store them in a database(CouchDB or MongoDB preferably) and the client side (agent installed on a server) would be sending this stream of data on a schedule of every 10 minutes or so. For these two core ideas, I have been reading about Python and Erlang besides ruby. I do plan to use either Amazon or Rackspace where the server platform would run. This gives me the scalability needed when we have more customers with many servers. For that reason alone, I thought Erlang was a better fit(I could be totally wrong, new to this game) and I understand that Erlang has limited support in some ways compared to Ruby or Python. But also I'm totally new to the programming realm of things and any advise would be appreciated grately. Jo

Read the article
Accessing Yahoo realtime stock quotes

- by DVK

There's a fairly easy way of retrieving 15-minute delayed quotes off of Yahoo! Finance web site ("quotes.csv" API). However, so far I was unable to find any info on how to access real-time quotes. The hang-ups with real-time quotes are: Only available to logged-in user No API Non-obvious how to scrape the info - I'm somewhat convinced they are placed on the page by some weird Ajax call. So I was wondering if anyone had managed to develop a publically available solution to retrieve real-time quotes for a stock from Yahoo! Finance. Notes: Implementation language/framework need is flexible but Perl or Excel is highly preferred. Assume that security is not an issue - I'm willing to supply yahoo userid and pasword, even in cleartext. I'm not 100% hung up on Yahoo - they are merely the only provider of free realtime stock quotes I'm familiar with. if the same thing can be done with Google Finance, I'd be just as happy. This is for a personal project, so scalability/fault tolerance/etc... are not important. I'm looking for a "do the whole retrieval" library ideally, but if I'm pointed to partial solutions (e.g. how to retrieve info from Yahoo's user-logged-in pages; how to scrape realtime quotes from Yahoo's page) I can fill in the blanks. I saw Finance::YahooQuote but it does not seem to allow you to supply log-in information and appears to use the lagging quotes.csv API Thanks!

Read the article
Implementation review for a MVC.NET app with custom membership

- by mrjoltcola

I'd like to hear if anyone sees any problems with how I implemented the security in this Oracle based MVC.NET app, either security issues, concurrency issues or scalability issues. First, I implemented a CustomOracleMembershipProvider to handle the database interface to the membership store. I implemented a custom Principal named User which implements IPrincipal, and it has a hashtable of Roles. I also created a separate class named AuthCache which has a simple cache for User objects. Its purpose is simple to avoid return trips to the database, while decoupling the caching from either the web layer or the data layer. (So I can share the cache between MVC.NET, WCF, etc.) The MVC.NET stock MembershipService uses the CustomOracleMembershipProvider (configured in web.config), and both MembershipService and FormsService share access to the singleton AuthCache. My AccountController.LogOn() method: 1) Validates the user via the MembershipService.Validate() method, also loads the roles into the User.Roles container and then caches the User in AuthCache. 2) Signs the user into the Web context via FormsService.SignIn() which accesses the AuthCache (not the database) to get the User, sets HttpContext.Current.User to the cached User Principal. In global.asax.cs, Application_AuthenticateRequest() is implemented. It decrypts the FormsAuthenticationTicket, accesses the AuthCache by the ticket.Name (Username) and sets the Principal by setting Context.User = user from the AuthCache. So in short, all these classes share the AuthCache, and I have, for thread synchronization, a lock() in the cache store method. No lock in the read method. The custom membership provider doesn't know about the cache, the MembershipService doesn't know about any HttpContext (so could be used outside of a web app), and the FormsService doesn't use any custom methods besides accessing the AuthCache to set the Context.User for the initial login, so it isn't dependent on a specific membership provider. The main thing I see now is that the AuthCache will be sharing a User object if a user logs in from multiple sessions. So I may have to change the key from just UserId to something else (maybe using something in the FormsAuthenticationTicket for the key?).

Read the article
Could CouchDB benefit significantly from the use of BERT instead of JSON?

- by Victor Rodrigues

I appreciate a lot CouchDB attempt to use universal web formats in everything it does: RESTFUL HTTP methods in every interaction, JSON objects, javascript code to customize database and documents. CouchDB seems to scale pretty well, but the individual cost to make a request usually makes 'relational' people afraid of. Many small business applications should deal with only one machine and that's all. In this case the scalability talk doesn't say too much, we need more performance per request, or people will not use it. BERT (Binary ERlang Term http://bert-rpc.org/ ) has proven to be a faster and lighter format than JSON and it is native for Erlang, the language in which CouchDB is written. Could we benefit from that, using BERT documents instead of JSON ones? I'm not saying just for retrieving in views, but for everything CouchDB does, including syncing. And, as a consequence of it, use Erlang functions instead of javascript ones. This would modify some original CouchDB principles, because today it is very web oriented. Considering I imagine few people would make their database API public and usually its data is accessed by the users through an application, it would be a good deal to have the ability to configure CouchDB for working faster. HTTP+JSON calls could still be handled by CouchDB, considering an extra cost in these cases because of parsing.

Read the article
Crosstab/Cube/Pivot Components for Delphi

- by Anagoge

I'm looking for a Delphi VCL crosstab/cube/pivotcube/olap grid component for Delphi 2009, 2010, or XE. I'm willing to sacrifice advanced features to get something open/free (or very cheap if I must) to make it easier to collaborate with any future developers without anyone having to purchase more components than I already use, since this will just be used in one screen. If there isn't anything appropriate out there, I may try to implement something simple on my own. I can live with some fairly basic features: drag and drop to configure dimensions, sort by a column, allow totals/min/max for a column, and (optionally) expand/collapse or drill down to sub-categories. Blazing performance and enterprise scalability are not required, since there should be less than 2000 source rows. There appear to be several decent options in the commercial space (ExpressPivotCube, FastCube, HierCube), but they are all a few hundred dollars. This project already uses existing installations of Excel 2007 and SQL Server 2005/2008, so I might consider leveraging those, though I'd prefer a native Delphi component, if possible. There are also the very old Decision Cube components included in Delphi's Source\xtab directory, but they apparently no longer support unicode compilers (Delphi 2009+), since I got dozens of unicode-related compilation errors while test compiling that source in Delphi XE. Those components also still link to the long-deprecated BDE! Has anyone modified Decision Cube to support unicode/pure-TDataSet? The online tutorials I found were incomplete and silent on the dozens of BDE/unicode compilation errors I see, so I might have to tackle that on my own. Does anyone have suggestions where to start for a free/cheap basic crosstab/pivot grid component?

Read the article
what webserver / mod / technique should I use to serve everything from memory?

- by reinier

I've lots of lookuptables from which I'll generate my webresponse. I think IIS with Asp.net enables me to keep static lookuptables in memory which I can use to serve up my responses very fast. Are there however also non .net solutions which can do the same? I've looked at fastcgi, but I think this starts X processes, of which anyone can handle Y requests. But the processes are by definition shielded from eachother. I could configure fastcgi to use just 1 process, but does this have scalability implications? anything using PHP or any other interpreted language won't fly because it is also cgi or fastcgi bound right? I understand memcache could be an option, though this would require another (local) socket connection which I'd rather avoid since everything in memory would be much faster. The solution can work under WIndows or Unix... it doesn't matter too much. The only thing which matters is that there will be a lot of requests (100/sec now and growing to 500/sec in a year), and I want to reduce the amount of webservers needed to process it. The current solution is done using PHP and memcache (and the occasional hit to the SQL server backend). Although it is fast (for php anyway), Apache has real problems when the 50/sec is passed. I've put a bounty on this question since I've not seen enough responses to make a wise choice. At the moment I'm considering either Asp.net or fastcgi with C(++).

Read the article
Using Active Directory to authenticate users in a WWW facing website

- by Basiclife

Hi, I'm looking at starting a new web app which needs to be secure (if for no other reason than that we'll need PCI accreditation at some point). From previous experience working with PCI (on a domain), the preferred method is to use integrated windows authentication which is then passed all the way through the app to the database. This allows for better auditing as well as object-level permissions (ie an end user can't read the credit card table). There are advantages in that even if someone compromises the webserver, they won't be able to glean any additional information from the database. Also, the webserver isn't storing any database credentials (beyond perhaps a simple anonymous user with very few permissions) So, now I'm looking at the new web app which will be on the public internet. One suggestion is to have a Active Directory server and create windows accounts on the AD for each user of the site. These users will then be placed into the appropriate NT groups to decide which DB permissions they should have (and which pages they can access). ASP already provides the AD membership provider and role provider so this should be fairly simple to implement. There are a number of questions around this - Scalability, reliability, etc... and I was wondering if there is anyone out there with experience of this approach or, even better, some good reasons why to do it / not to do it. Any input appreciated Regards Basiclife

Read the article
Building a News-feed that comprises posts "created by user's connections" && "on the topics user is following"

- by aklin81

I am working on a project of Questions & Answers website that allows a user to follow questions on certain topics from his network. A user's news-feed wall comprises of only those questions that have been posted by his connections and tagged on the topics that he is following(his expertise topics). I am confused what database's datamodel would be most fitting for such an application. The project needs to consider the future provisions for scalability and high performance issues. I have been looking at Cassandra and MySQL solutions as of now. After my study of Cassandra I realized that Simple news-feed design that shows all the posts from network would be easy to design using Cassandra by executing fast writes to all followers of a user about the post from user. But for my kind of application where there is an additional filter of 'followed topics', (ie, the user receives posts "created by his network" && "on topics user is following"), I could not convince myself with a good schema design in Cassandra. I hope if I missed something because of my short understanding of cassandra, perhaps, can you please help me out with your suggestions of how this news-feed could be implemented in Cassandra ? Looking for a great project with Cassandra ! Edit: There are going to be maximum 5 tags allowed for tagging the question (ie, max 5 topics can be tagged on a question).

Read the article
"Work stealing" vs. "Work shrugging"?

- by John

Why is it that I can find lots of information on "work stealing" and nothing on "work shrugging" as a dynamic load-balancing strategy? By "work-shrugging" I mean busy processors pushing excessive work towards less loaded neighbours rather than idle processors pulling work from busy neighbours ("work-stealing"). I think the general scalability should be the same for both strategies. However I believe that it is much more efficient for busy processors to wake idle processors if and when there is definitely work for them to do than having idle processors spinning or waking periodically to speculatively poll all neighbours for possible work. Anyway a quick google didn't show up anything under the heading of "Work Shrugging" or similar so any pointers to prior-art and the jargon for this strategy would be welcome. Clarification/Confession In more detail:- By "Work Shrugging" I actually envisage the work submitting processor (which may or may not be the target processor) being responsible for looking around the immediate locality of the preferred target processor (based on data/code locality) to decide if a near neighbour should be given the new work instead because they don't have as much work to do. I am talking about an atomic read of the immediate (typically 2 to 4) neighbours' estimated q length here. I do not think this is any more coupling than implied by the thieves polling & stealing from their neighbours - just much less often - or rather - only when it makes economic sense to do so. (I am assuming "lock-free, almost wait-free" queue structures in both strategies). Thanks.

Read the article
Execute javascript on IIS server

- by James Westgate

I have the following situation. A customer uses JavaScript with jQuery to create a complex website. We would like to use JavaScript and jQuery on the server (IIS) for the following reasons: Skills transfer - we would like to use JavaScript and jQuery on the server and not have to use eg VB Script. / classic asp. .Net framework/Java etc is ruled out because of this. Improved options for search/accessibility. We would like to be able to use jQuery as a templating system, but this isn't viable for search engines and users with js turned off - unless we can selectively run this code on the server. There is significant investment in IIS and Windows Server, so changing that is not an option. I know you can run jScript on IIS using windows Script host, but am unsure of the scalability and the process surrounding this. I am also unsure whether this would have access to the DOM. Here is a diagram that hopefully explains the situation. I was wondering if anyone has done anything similar?

Read the article
Design Practice Retrieve Multiple Users - PHP

- by pws5068

Greetings all, I'm looking for a more efficient way to grab multiple users without too much code redundancy. I have a Users class, (here's a highly condensed representation) Class Users { function __construct() { ... ... } private static function getNew($id) { // This is all only example code $sql = "SELECT * FROM Users WHERE User_ID=?"; $user = new User(); $user->setID($id); .... return $user; } public static function getNewestUsers($count) { $sql = "SELECT * FROM Users ORDER BY Join_Date LIMIT ?"; $users = array(); // Prepared Statement Data Here while($stmt->fetch()) { $users[] = Users::getNew($id); ... } return $users; } // These have more sanity/security checks, demonstration only. function setID($id) { $this->id = $id; } function setName($name) { $this->name = $name; } ... } So clearly calling getNew() in a loop is inefficient, because it makes $count number of calls to the database each time. I would much rather select all users in a single SQL statement. The problem is that I have probably a dozen or so methods for getting arrays of users, so all of these would now need to know the names of the database columns which is repeating a lot of structure that could always change later. What are some other ways that I could approach this problem? My goals include efficiency, flexibility, scalability, and good design practice.

Read the article
Cache consistency & spawning a thread

- by Dave Keck

Background I've been reading through various books and articles to learn about processor caches, cache consistency, and memory barriers in the context of concurrent execution. So far though, I have been unable to determine whether a common coding practice of mine is safe in the strictest sense. Assumptions The following pseudo-code is executed on a two-processor machine: int sharedVar = 0; myThread() { print(sharedVar); } main() { sharedVar = 1; spawnThread(myThread); sleep(-1); } main() executes on processor 1 (P1), while myThread() executes on P2. Initially, sharedVar exists in the caches of both P1 and P2 with the initial value of 0 (due to some "warm-up code" that isn't shown above.) Question Strictly speaking – preferably without assuming any particular CPU – is myThread() guaranteed to print 1? With my newfound knowledge of processor caches, it seems entirely possible that at the time of the print() statement, P2 may not have received the invalidation request for sharedVar caused by P1's assignment in main(). Therefore, it seems possible that myThread() could print 0. References These are the related articles and books I've been reading. (It wouldn't allow me to format these as links because I'm a new user - sorry.) Shared Memory Consistency Models: A Tutorial hpl.hp.com/techreports/Compaq-DEC/WRL-95-7.pdf Memory Barriers: a Hardware View for Software Hackers rdrop.com/users/paulmck/scalability/paper/whymb.2009.04.05a.pdf Linux Kernel Memory Barriers kernel.org/doc/Documentation/memory-barriers.txt Computer Architecture: A Quantitative Approach amazon.com/Computer-Architecture-Quantitative-Approach-4th/dp/0123704901/ref=dp_ob_title_bk

Read the article
When is a good time to start thinking about scaling?

- by Slokun

I've been designing a site over the past couple days, and been doing some research into different aspects of scaling a site horizontally. If things go as planned, in a few months (years?) I know I'd need to worry about scaling the site up and out, since the resources it would end up consuming would be huge. So, this got me to thinking, when is the best time to start thinking about, and designing for, scalability? If you start too early on, you could easily over complicate your design, and make it impossible to actually build. You could also get too caught up in the details, the architecture, whatever, and wind up getting nothing done. Also, if you do get it working, but the site never takes off, you may have wasted a good chunk of extra effort. On the other hand, you could be saving yourself a ton of effort down the road. Designing it from the ground up to be big would make it much easier later on to let it grow big, with very little rewriting going on. I know for what I'm working on, I've decided to make at least a few choices now on the side of scaling, but I'm not going to do a complete change of thinking to get it to scale completely. Notably, I've redesigned my database from a conventional relational design to one similar to what was suggested on the Reddit site linked below, and I'm going to give memcache a try. So, the basic question, when is a good time to start thinking or worrying about scaling, and what are some good designs, tips, etc. for when doing so? A couple of things I've been reading, for those who are interested: http://www.codinghorror.com/blog/2009/06/scaling-up-vs-scaling-out-hidden-costs.html http://highscalability.com/blog/2010/5/17/7-lessons-learned-while-building-reddit-to-270-million-page.html http://developer.yahoo.com/performance/rules.html

Read the article
When to choose which machine learning classifier?

- by LM

Suppose I'm working on some classification problem. (Fraud detection and comment spam are two problems I'm working on right now, but I'm curious about any classification task in general.) How do I know which classifier I should use? (Decision tree, SVM, Bayesian, logistic regression, etc.) In which cases is one of them the "natural" first choice, and what are the principles for choosing that one? Examples of the type of answers I'm looking for (from Manning et al.'s "Introduction to Information Retrieval book": http://nlp.stanford.edu/IR-book/html/htmledition/choosing-what-kind-of-classifier-to-use-1.html): a. If your data is labeled, but you only have a limited amount, you should use a classifier with high bias (for example, Naive Bayes). [I'm guessing this is because a higher-bias classifier will have lower variance, which is good because of the small amount of data.] b. If you have a ton of data, then the classifier doesn't really matter so much, so you should probably just choose a classifier with good scalability. What are other guidelines? Even answers like "if you'll have to explain your model to some upper management person, then maybe you should use a decision tree, since the decision rules are fairly transparent" are good. I care less about implementation/library issues, though. Also, for a somewhat separate question, besides standard Bayesian classifiers, are there 'standard state-of-the-art' methods for comment spam detection (as opposed to email spam)? [Not sure if stackoverflow is the best place to ask this question, since it's more machine learning than actual programming -- if not, any suggestions for where else?]

Read the article
How to achieve high availability?

- by tanyehzheng

My boss wants to have a system that takes into concern of continent wide catastrophic event. He wants to have two servers in US and two servers in Asia (1 login server and 1 worker server in each continent). In the event that earthquake breaks the connection between the two continents, both should work alone. When the connection is revived, they should sync each other back to normal. External cloud system not allowed as he has no confidence. The system should take into account of scalability which means addition of new servers should be easy to configure. The servers should be load balanced. The connection between the servers should be very secure(encrypted and send through SSL although SSL takes care of encryption). The system should let one and only one user log in with one account. (beware of latency between continent and two users sharing account may reach both login server at the same time) Please help. I'm already at the end of my wit. Thank you in advance.

Read the article
To implement a remote desktop sharing solution

- by Cameigons

Hi, I'm on planning/modeling phase to develop a remote desktop sharing solution, which must be web browser based. In other words: an user will be able to see and interact with someone's remote desktop using his web-browser. Everything the user who wants to share his desktop will need, besides his browser, is installing an add-in, which he's going to be prompted about when necessary. The add-in is required since (afaik) no browser technology allows desktop control from an app running within the browser alone. The add-in installation process must be as simple and transparent as possible to the user (similar to AdobeConnectNow, in case anyone's acquainted with it). The user can share his desktop with lots of people at the same time, but concede desktop control to only one of them at a time(makes no sense being otherwise). Project requirements: All technology employed must be open-source license compatible Both front ends are going to be in flash (browser) Must work on Linux, Windows XP(and later) and MacOSX. Must work at least with IE7(and later) and Firefox3.0(and later). At the very least, once the sharer's stream hits the server from where it'll be broadcast, hereon it must be broadcasted in flv (so I'm thinking whether to do the encoding at the client's machine (the one sharing the desktop) or send it in some other format to the server and encode it there). Performance and scalability are important: It must be able to handle hundreds of dozens of users(one desktop sharer, the rest viewers) We'll definitely be using red5. My doubts concern mostly implementing the desktop publisher side (add-in and streamer): 1) Are you aware of other projects that I could look into for ideas? (I'm aware of bigbluebutton.org and code.google.com/p/openmeetings) 2) Should I base myself on VNC ? 3) Bearing in mind the need to have it working cross-platform, what language should I go with? (My team is very used with java and I have some knowledge of C/C++, but anything goes really). 4) Any other advices are appreciated.

Read the article
Best practices, PHP, tracking millions of impressions per day.

- by John

What do I have to do to make 20k mysql inserts per second possible (during peak hours around 1k/sec during slower times)? I've been doing some research and I've seen the "INSERT DELAYED" suggestion, writing to a flat file, "fopen(file,'a')", and then running a chron job to dump the "needed" data into mysql, etc. I've also heard you need multiple servers and "load balancers" which I've never heard of, to make something like this work. I've also been looking at these "cloud server" thing-a-ma-jigs, and their automatic scalability, but not sure about what's actually scalable. The application is just a tracker script, so if I have 100 websites that get 3 million page loads a day, there will be around 300 million inserts a day. The data will be ran through a script that will run every 15-30 minutes which will normalize the data and insert it into another mysql table. How do the big dogs do it? How do the little dogs do it? I can't afford a huge server anymore so any intuitive ways, if there are multiple ways of going at it, you smart people can think of.. please let me know :)

Read the article
Django: Applying Calculations To A Query Set

- by TheLizardKing

I have a QuerySet that I wish to pass to a generic view for pagination: links = Link.objects.annotate(votes=Count('vote')).order_by('-created')[:300] This is my "hot" page which lists my 300 latest submissions (10 pages of 30 links each). I want to now sort this QuerySet by an algorithm that HackerNews uses: (p - 1) / (t + 2)^1.5 p = votes minus submitter's initial vote t = age of submission in hours Now because applying this algorithm over the entire database would be pretty costly I am content with just the last 300 submissions. My site is unlikely to be the next digg/reddit so while scalability is a plus it is required. My question is now how do I iterate over my QuerySet and sort it by the above algorithm? For more information, here are my applicable models: class Link(models.Model): category = models.ForeignKey(Category, blank=False, default=1) user = models.ForeignKey(User) created = models.DateTimeField(auto_now_add=True) modified = models.DateTimeField(auto_now=True) url = models.URLField(max_length=1024, unique=True, verify_exists=True) name = models.CharField(max_length=512) def __unicode__(self): return u'%s (%s)' % (self.name, self.url) class Vote(models.Model): link = models.ForeignKey(Link) user = models.ForeignKey(User) created = models.DateTimeField(auto_now_add=True) def __unicode__(self): return u'%s vote for %s' % (self.user, self.link) Notes: I don't have "downvotes" so just the presence of a Vote row is an indicator of a vote or a particular link by a particular user.

Read the article
What's the most efficient query?

- by Aaron Carlino

I have a table named Projects that has the following relationships: has many Contributions has many Payments In my result set, I need the following aggregate values: Number of unique contributors (DonorID on the Contribution table) Total contributed (SUM of Amount on Contribution table) Total paid (SUM of PaymentAmount on Payment table) Because there are so many aggregate functions and multiple joins, it gets messy do use standard aggregate functions the the GROUP BY clause. I also need the ability to sort and filter these fields. So I've come up with two options: Using subqueries: SELECT Project.ID AS PROJECT_ID, (SELECT SUM(PaymentAmount) FROM Payment WHERE ProjectID = PROJECT_ID) AS TotalPaidBack, (SELECT COUNT(DISTINCT DonorID) FROM Contribution WHERE RecipientID = PROJECT_ID) AS ContributorCount, (SELECT SUM(Amount) FROM Contribution WHERE RecipientID = PROJECT_ID) AS TotalReceived FROM Project; Using a temporary table: DROP TABLE IF EXISTS Project_Temp; CREATE TEMPORARY TABLE Project_Temp (project_id INT NOT NULL, total_payments INT, total_donors INT, total_received INT, PRIMARY KEY(project_id)) ENGINE=MEMORY; INSERT INTO Project_Temp (project_id,total_payments) SELECT `Project`.ID, IFNULL(SUM(PaymentAmount),0) FROM `Project` LEFT JOIN `Payment` ON ProjectID = `Project`.ID GROUP BY 1; INSERT INTO Project_Temp (project_id,total_donors,total_received) SELECT `Project`.ID, IFNULL(COUNT(DISTINCT DonorID),0), IFNULL(SUM(Amount),0) FROM `Project` LEFT JOIN `Contribution` ON RecipientID = `Project`.ID GROUP BY 1 ON DUPLICATE KEY UPDATE total_donors = VALUES(total_donors), total_received = VALUES(total_received); SELECT * FROM Project_Temp; Tests for both are pretty comparable, in the 0.7 - 0.8 seconds range with 1,000 rows. But I'm really concerned about scalability, and I don't want to have to re-engineer everything as my tables grow. What's the best approach?

Read the article
In which domains are message oriented middleware like AMQP useful?

- by cocotwo

What problem do MOM (Message Oriented Middleware) solve? Scalability? Integration? In which domain are they typically used and in which domains are they typically not used? For example, say, is Google using such solution for it's main search engine or to power GMail? What about big websites like Walmart, eBay, FedEx (pretty much a Java shop) and buy.com (pretty much an MS shop)? Does MOM solve a need there? Does it make any sense when you're writing a Webapp where you control the server-side and have an homogenous environment (say tens of Amazon EC2 instances all running Linux + Java JVMs) there and where the clients are, well, Web browsers? Does it make sense for desktop apps that need to communicate with a server? Or is it 'only' for big enterprise stuff where you typically have a happy mix of countless of different systems that needs to communicate in a way or another? I'm a bit confused as to what they're useful for and I think that with example of where they're appropriate and where they're not appropriate I could better understand their use.

Read the article
asp.net mvc 2 - return JavaScript with View

- by Tomaszewski

Hi, using ASP.NET MVC 2 I have a navigation menu inside my Master Page. In the navigation menu, I am trying add a class to the that the current page relates to (i.e., home page will add class="active" to the Home button). I'm trying to consider scalability and the fact that I don't want to change individual pages if the navigation changes later. The only way I can think of doing this is: Add JavaScript to each individual View that will add the class when the DOM is ready Return JavaScript when return View() occurs on point (2), I am unsure how to do. Thusfar I have been doing the following in my controller: public ActionResult Index() { ViewData["messege"] = JavaScript("<script type='text/javascript' language='javascript'> $(document).ready(function () { console.log('hi hi hi'); }); </script>"); return View(); } but in my view, when I call: <%: ViewData["messege"] %> I get: System.Web.Mvc.JavaScriptResult as the result Would you guys have any ideas on How to solve the navigation menu probelem, other than the solutions I've listed return JavaScript along with your view from the Controller Thanks, in advanced!

Read the article
On the search for my next great .Net Read

- by user127954

Just got done with "The art of unit testing". It was a great read and i think everyone should go buy a copy. With that said i think the next book I'm like to read would be a architecture / Design type book that would focus heavily on building your objects / software in such a way that it would be: Low Coupling High Cohesion Easily Maintainable / Extended Easy to test Easy to Navigate / Debug The above characteristcs are the most important ones but also maybe it would also include (but not necessary) designing for: Performance - Don't want to design a system at at the end find out its dog slow :) Scalability - Again don't want to design something at the end find out it won't scale. I'd also prefer (but not necessary again): Something newer - Architectural principles seem to gradually evolve / improve over time and id like something with current thinking. .Net as illustrating language - like i said above its not mandatory but since its what i use every day id prefer it to be in .net. Doesn't really matter if its in vb.net or c# Some of the topics that would be talked about its how to minimize dependencies and using interfaces throughout your solution rather than concrete classes. Maybe it would constract /compare some of the newest design principles like DDD, Repository Pattern, Ect... I already have "Clean Code" (don't know if its this type of book or not) and "Working effectively with legacy code" on my radar but id like to read a book based upon the topic i talked about above first. Is there such a book?

Read the article
Best practices for (over)using Azure queues

- by John

Hi, I'm in the early phases of designing an Azure-based application. One of the things that attracts me to Azure is the scalability, given the variability of the demand I'm likely to expect. As such I'm trying to keep things loosely coupled so I can add instances when I need to. The recommendations I've seen for architecting an application for Azure include keeping web role logic to a minimum, and having processing done in worker roles, using queues to communicate and some sort of back-end store like SQL Azure or Azure Tables. This seems like a good idea to me as I can scale up either or both parts of the application without any issue. However I'm curious if there are any best practices (or if anyone has any experiences) for when it's best to just have the web role talk directly to the data store vs. sending data by the queue? I'm thinking of the case where I have a simple insert to do from the web role - while I could set this up as a message, send it on the queue, and have a worker role pick it up and do the insert, it seems like a lot of double-handling. However I also appreciate that it may be the case that this is better in the long run, in case the web role gets overwhelmed or more complex logic ends up being required for the insert. I realise this might be a case where the answer is "it depends entirely on the situation, check your perf metrics" - but if anyone has any thoughts I'd be very appreciative! Thanks John

Read the article

Search Results

Search found 874 results on 35 pages for 'scalability'.

Page 29/35 | < Previous Page | 25 26 27 28 29 30 31 32 33 34 35 | Next Page >

- by jeckels

- by markus

- by Joe

- by DVK

- by mrjoltcola

- by Victor Rodrigues

- by Anagoge

- by reinier

- by Basiclife

- by aklin81

- by John

- by James Westgate

- by pws5068

- by Dave Keck

- by Slokun

- by LM

- by tanyehzheng

- by Cameigons

- by John

- by TheLizardKing

- by Aaron Carlino

- by cocotwo

- by Tomaszewski

- by user127954

- by John

< Previous Page | 25 26 27 28 29 30 31 32 33 34 35 | Next Page >