Search Results

Search found 4580 results on 184 pages for 'faster'.

Page 155/184 | < Previous Page | 151 152 153 154 155 156 157 158 159 160 161 162  | Next Page >

  • Initial capacity of collection types, i.e. Dictionary, List

    - by Neil N
    Certain collection types in .Net have an optional "Initial Capacity" constructor param. i.e. Dictionary<string, string> something = new Dictionary<string,string>(20); List<string> anything = new List<string>(50); I can't seem to find what the default initial capacity is for these objects on MSDN. If I know I will only be storing 12 or so items in a dictionary, doesn't it make sense to set the initial capacity to something like 20? My reasoning is, assuming the capacity grows like it does for a StringBuiler, which doubles each time the capacity is hit, and each re-allocation is costly, why not pre-set the size to something you know will hold your data, with some extra room just in case? If the initial capacity is 100, and I know I will only need a dozen or so, it seems as though the rest of that allocated RAM is allocated for nothing. Please spare me the "premature optimization" speil for the O(n^n)th time. I know it won't make my apps any faster or save any meaningful amount of memory, this is mostly out of curiosity.

    Read the article

  • one two-directed tcp socket OR two one-directed? (linux, high volume, low latency)

    - by osgx
    Hello I need to send (interchange) a high volume of data periodically with the lowest possible latency between 2 machines. The network is rather fast (e.g. 1Gbit or even 2G+). Os is linux. Is it be faster with using 1 tcp socket (for send and recv) or with using 2 uni-directed tcp sockets? The test for this task is very like NetPIPE network benchmark - measure latency and bandwidth for sizes from 2^1 up to 2^13 bytes, each size sent and received 3 times at least (in teal task the number of sends is greater. both processes will be sending and receiving, like ping-pong maybe). The benefit of 2 uni-directed connections come from linux: http://lxr.linux.no/linux+v2.6.18/net/ipv4/tcp_input.c#L3847 3847/* 3848 * TCP receive function for the ESTABLISHED state. 3849 * 3850 * It is split into a fast path and a slow path. The fast path is 3851 * disabled when: ... 3859 * - Data is sent in both directions. Fast path only supports pure senders 3860 * or pure receivers (this means either the sequence number or the ack 3861 * value must stay constant) ... 3863 * 3864 * When these conditions are not satisfied it drops into a standard 3865 * receive procedure patterned after RFC793 to handle all cases. 3866 * The first three cases are guaranteed by proper pred_flags setting, 3867 * the rest is checked inline. Fast processing is turned on in 3868 * tcp_data_queue when everything is OK. All other conditions for disabling fast path is false. And only not-unidirected socket stops kernel from fastpath in receive

    Read the article

  • Box2D in Flash runs quicker when drawing debug data than not

    - by bowdengm
    I've created a small game with Box2d for AS3 - I have sprites attached to the stage that take their position from the underlying Box2d world. These sprites are mostly PNGs. When the game runs with DrawDebugData() bening called every update, it runs nice and smoothly. However when I comment this out, it runs choppily. In both cases all my sprites are being rendered. So it seems that it's running faster when it's drawing the debug data additionaly (i.e. my sprites are on the screen in both cases!) What's going on? Does drawing the debug data flick some sort of 'render quick' switch? If so, what's the switch!? I can't see it in the Box2D code. function Update(e){ m_world.Step(m_timeStep, m_velocityIterations, m_positionIterations); // draw debug? m_world.DrawDebugData(); // with the above line in, I get 27fps, without it, I get 19fps. // that's the only change that's causing such a huge difference. doStuff(); } Interestingly, If i set the debug draw scale to something different to my world scale, it slows down to 19fps. So there's something happening when it draws the boxes under my sprites causing it to run quicker.. Cheers, Guy

    Read the article

  • Saving multiple items per single database cell...

    - by eugeneK
    Hi, i have a countries list. Each user can check multiple countries. Once saved, this "user country list" will be used to get whether other users fit into countries certain user chose. Question is what would be the most efficient approach to this problem... I have one, one to save user selection as delimited list like Canada,USA,France ... in single varchar(max) field but problem with it would be that once user from Germany enters page i perform this check on. To search for Germany i would be needed to get all items and un-delimit each field to check against value or to use sql 'like' which again is pretty damn slow.. If you have better solution or some tips i would be glad to hear. Just to make sure, many users will have their own selections of countries from which and only they want to have users to land on their page. While millions of users will reach those pages. So the faster approach will be the better. technology, MSSQL and ASP.NET thanks

    Read the article

  • java.math.BigInteger pow(exponent) question

    - by Jan Kraus
    Hi, I did some tests on pow(exponent) method. Unfortunately, my math skills are not strong enough to handle the following problem. I'm using this code: BigInteger.valueOf(2).pow(var); Results: var | time in ms 2000000 | 11450 2500000 | 12471 3000000 | 22379 3500000 | 32147 4000000 | 46270 4500000 | 31459 5000000 | 49922 See? 2,500,000 exponent is calculated almost as fast as 2,000,000. 4,500,000 is calculated much faster then 4,000,000. Why is that? To give you some help, here's the original implementation of BigInteger.pow(exponent): public BigInteger pow(int exponent) { if (exponent < 0) throw new ArithmeticException("Negative exponent"); if (signum==0) return (exponent==0 ? ONE : this); // Perform exponentiation using repeated squaring trick int newSign = (signum<0 && (exponent&1)==1 ? -1 : 1); int[] baseToPow2 = this.mag; int[] result = {1}; while (exponent != 0) { if ((exponent & 1)==1) { result = multiplyToLen(result, result.length, baseToPow2, baseToPow2.length, null); result = trustedStripLeadingZeroInts(result); } if ((exponent >>>= 1) != 0) { baseToPow2 = squareToLen(baseToPow2, baseToPow2.length, null); baseToPow2 = trustedStripLeadingZeroInts(baseToPow2); } } return new BigInteger(result, newSign); }

    Read the article

  • Need help tuning a SQL statement

    - by jeffself
    I've got a table that has two fields (custno and custno2) that need to be searched from a query. I didn't design this table, so don't scream at me. :-) I need to find all records where either the custno or custno2 matches the value returned from a query on the same table based on a titleno. In other words, the user types in 1234 for the titleno. My query searches the table to find the custno associated with the titleno. It also looks for the custno2 for that titleno. Then it needs to do a search on the same table for all other records that have either the custno or custno2 returned in the previous search in the custno or custno2 fields for those other records. Here is what I've come up with: SELECT BILLYR, BILLNO, TITLENO, VINID, TAXPAID, DUEDATE, DATEPIF, PROPDESC FROM TRCDBA.BILLSPAID WHERE CUSTNO IN (select custno from trcdba.billspaid where titleno = '1234' union select custno2 from trcdba.billspaid where titleno = '1234' and custno2 != '') OR CUSTNO2 IN (select custno from trcdba.billspaid where titleno = '1234' union select custno2 from trcdba.billspaid where titleno = '1234' and custno2 != '') The query takes about 5-10 seconds to return data. Can it be rewritten to work faster?

    Read the article

  • Static Vs Non-Static Method Performance C#

    - by dotnetguts
    Hello All, I have few global methods declared in public class in my asp.net web application. I have habbit of declaring all global methods in public class in following format public static string MethodName(parameters) { } I want to know how it would impact on performance point of view? 1) Which one is Better? Static Method or Non-Static Method? 2) Reason why it is better? Following link shows Non-Static methods are good because, static methods are using locks to be Thread-safe. The always do internally a Monitor.Enter() and Monitor.exit() to ensure Thread-safety. http://bytes.com/topic/c-sharp/answers/231701-static-vs-non-static-function-performance And Following link shows Static Methods are good static methods are normally faster to invoke on the call stack than instance methods. There are several reasons for this in the C# programming language. Instance methods actually use the 'this' instance pointer as the first parameter, so an instance method will always have that overhead. Instance methods are also implemented with the callvirt instruction in the intermediate language, which imposes a slight overhead. Please note that changing your methods to static methods is unlikely to help much on ambitious performance goals, but it can help a tiny bit and possibly lead to further reductions. http://dotnetperls.com/static-method I am little confuse which one to use? Thanks

    Read the article

  • What hash algorithms are paralellizable? Optimizing the hashing of large files utilizing on mult-co

    - by DanO
    I'm interested in optimizing the hashing of some large files (optimizing wall clock time). The I/O has been optimized well enough already and the I/O device (local SSD) is only tapped at about 25% of capacity, while one of the CPU cores is completely maxed-out. I have more cores available, and in the future will likely have even more cores. So far I've only been able to tap into more cores if I happen to need multiple hashes of the same file, say an MD5 AND a SHA256 at the same time. I can use the same I/O stream to feed two or more hash algorithms, and I get the faster algorithms done for free (as far as wall clock time). As I understand most hash algorithms, each new bit changes the entire result, and it is inherently challenging/impossible to do in parallel. Are any of the mainstream hash algorithms parallelizable? Are there any non-mainstream hashes that are parallelizable (and that have at least a sample implementation available)? As future CPUs will trend toward more cores and a leveling off in clock speed, is there any way to improve the performance of file hashing? (other than liquid nitrogen cooled overclocking?) or is it inherently non-parallelizable?

    Read the article

  • R: Are there any alternatives to loops for subsetting from an optimization standpoint?

    - by Adam
    A recurring analysis paradigm I encounter in my research is the need to subset based on all different group id values, performing statistical analysis on each group in turn, and putting the results in an output matrix for further processing/summarizing. How I typically do this in R is something like the following: data.mat <- read.csv("...") groupids <- unique(data.mat$ID) #Assume there are then 100 unique groups results <- matrix(rep("NA",300),ncol=3,nrow=100) for(i in 1:100) { tempmat <- subset(data.mat,ID==groupids[i]) #Run various stats on tempmat (correlations, regressions, etc), checking to #make sure this specific group doesn't have NAs in the variables I'm using #and assign results to x, y, and z, for example. results[i,1] <- x results[i,2] <- y results[i,3] <- z } This ends up working for me, but depending on the size of the data and the number of groups I'm working with, this can take up to three days. Besides branching out into parallel processing, is there any "trick" for making something like this run faster? For instance, converting the loops into something else (something like an apply with a function containing the stats I want to run inside the loop), or eliminating the need to actually assign the subset of data to a variable?

    Read the article

  • What is the optimum way to select the most dissimilar individuals from a population?

    - by Aaron D
    I have tried to use k-means clustering to select the most diverse markers in my population, for example, if we want to select 100 lines I cluster the whole population to 100 clusters then select the closest marker to the centroid from each cluster. The problem with my solution is it takes too much time (probably my function needs optimization), especially when the number of markers exceeds 100000. So, I will appreciate it so much if anyone can show me a new way to select markers that maximize diversity in my population and/or help me optimize my function to make it work faster. Thank you # example: library(BLR) data(wheat) dim(X) mdf<-mostdiff(t(X), 100,1,nstart=1000) Here is the mostdiff function that i used: mostdiff <- function(markers, nClust, nMrkPerClust, nstart=1000) { transposedMarkers <- as.array(markers) mrkClust <- kmeans(transposedMarkers, nClust, nstart=nstart) save(mrkClust, file="markerCluster.Rdata") # within clusters, pick the markers that are closest to the cluster centroid # turn the vector of which markers belong to which clusters into a list nClust long # each element of the list is a vector of the markers in that cluster clustersToList <- function(nClust, clusters) { vecOfCluster <- function(whichClust, clusters) { return(which(whichClust == clusters)) } return(apply(as.array(1:nClust), 1, vecOfCluster, clusters)) } pickCloseToCenter <- function(vecOfCluster, whichClust, transposedMarkers, centers, pickHowMany) { clustSize <- length(vecOfCluster) # if there are fewer than three markers, the center is equally distant from all so don't bother if (clustSize < 3) return(vecOfCluster[1:min(pickHowMany, clustSize)]) # figure out the distance (squared) between each marker in the cluster and the cluster center distToCenter <- function(marker, center){ diff <- center - marker return(sum(diff*diff)) } dists <- apply(transposedMarkers[vecOfCluster,], 1, distToCenter, center=centers[whichClust,]) return(vecOfCluster[order(dists)[1:min(pickHowMany, clustSize)]]) } }

    Read the article

  • Multithreaded linked list traversal

    - by Rob Bryce
    Given a (doubly) linked list of objects (C++), I have an operation that I would like multithread, to perform on each object. The cost of the operation is not uniform for each object. The linked list is the preferred storage for this set of objects for a variety of reasons. The 1st element in each object is the pointer to the next object; the 2nd element is the previous object in the list. I have solved the problem by building an array of nodes, and applying OpenMP. This gave decent performance. I then switched to my own threading routines (based off Windows primitives) and by using InterlockedIncrement() (acting on the index into the array), I can achieve higher overall CPU utilization and faster through-put. Essentially, the threads work by "leap-frog'ing" along the elements. My next approach to optimization is to try to eliminate creating/reusing the array of elements in my linked list. However, I'd like to continue with this "leap-frog" approach and somehow use some nonexistent routine that could be called "InterlockedCompareDereference" - to atomically compare against NULL (end of list) and conditionally dereference & store, returning the dereferenced value. I don't think InterlockedCompareExchangePointer() will work since I cannot atomically dereference the pointer and call this Interlocked() method. I've done some reading and others are suggesting critical sections or spin-locks. Critical sections seem heavy-weight here. I'm tempted to try spin-locks but I thought I'd first pose the question here and ask what other people are doing. I'm not convinced that the InterlockedCompareExchangePointer() method itself could be used like a spin-lock. Then one also has to consider acquire/release/fence semantics... Ideas? Thanks!

    Read the article

  • Updating section in ConfigParser (or an alternative)

    - by lyrae
    I am making a plugin for another program and so I am trying to make thing as lightweight as possible. What i need to do is be able to update the name of a section in the ConfigParser's config file. [project name] author:john doe email: [email protected] year: 2010 I then have text fields where user can edit project's name, author, email and year. I don't think changing [project name] is possible, so I have thought of two solutions: 1 -Have my config file like this: [0] projectname: foobar author:john doe email: [email protected] year: 2010 that way i can change project's name just like another option. But the problem is, i would need the section # to be auto incremented. And to do this i would have to get every section, sort of, and figure out what the next number should be. The other option would be to delete the entire section and its value, and re-add it with the updated values which would require a little more work as well, such as passing a variable that holds the old section name through functions, etc, but i wouldn't mind if it's faster. Which of the two is best? or is there another way? I am willing to go with the fastest/lightweight solution possible, doesn't matter if it requires more work or not.

    Read the article

  • How can I accelerate the generation of the an MD5 Checksum within vb.net?

    - by Richard
    I'm working with some very large files residing on P2 (Panasonic) cards. Part of the process we employ is to first generate a checksum of the file we are going to copy, then copy the file, then run a checksum on the file to confirm that it copied OK. The problem is, is that files are large (70 GB+) and take a long time to complete. It's an issue since we will eventually be dealing with thousands of these files. I would like to find a faster way to generate the checksum other than using the System.Security.Cryptography.MD5CryptoServiceProvider I don't care if this means using a specialized hardware card, provided it works and is not to ungodly expensive. I would prefer to have a method of encoding that provided some feedback as to how far the process has gone along so I can display it like I do now. The application is written in vb.net. I would prefer to be able to use it as component, library, reference within my application, but I'm willing to call an outside application if there is enough improvement in the speed of generating the checksum. Needless to say, the checksum must be consistent and correct. :-) Thank you in advance for your time and efforts, Richard

    Read the article

  • Vector [] vs copying

    - by sak
    What is faster and/or generally better? vector<myType> myVec; int i; myType current; for( i = 0; i < 1000000; i ++ ) { current = myVec[ i ]; doSomethingWith( current ); doAlotMoreWith( current ); messAroundWith( current ); checkSomeValuesOf( current ); } or vector<myType> myVec; int i; for( i = 0; i < 1000000; i ++ ) { doSomethingWith( myVec[ i ] ); doAlotMoreWith( myVec[ i ] ); messAroundWith( myVec[ i ] ); checkSomeValuesOf( myVec[ i ] ); } I'm currently using the first solution. There are really millions of calls per second and every single bit comparison/move is performance-problematic.

    Read the article

  • How to hold a queue of messages and have a group of working threads without polling?

    - by Mark
    I have a workflow that I want to looks something like this: / Worker 1 \ =Request Channel= - [Holding Queue|||] - Worker 2 - =Response Channel= \ Worker 3 / That is: Requests come in and they enter a FIFO queue Identical workers then pick up tasks from the queue At any given time any worker may work only one task When a worker is free and the holding queue is non-empty the worker should immediately pick up another task When tasks are complete, a worker places the result on the Response Channel I know there are QueueChannels in Spring Integration, but these channels require polling (which seems suboptimal). In particular, if a worker can be busy, I'd like the worker to be busy. Also, I've considered avoiding the queue altogether and simply letting tasks round-robin to all workers, but it's preferable to have a single waiting line as some tasks may be accomplished faster than others. Furthermore, I'd like insight into how many jobs are remaining (which I can get from the queue) and the ability to cancel all or particular jobs. How can I implement this message queuing/work distribution pattern while avoiding a polling? Edit: It appears I'm looking for the Message Dispatcher pattern -- how can I implement this using Spring/Spring Integration?

    Read the article

  • Simple question about the lunarlander example.

    - by Smills
    I am basing my game off the lunarlander example. This is the run loop I am using (very similar to what is used in lunarlander). I am getting considerable performance issues associated with my drawing, even if I draw almost nothing. I noticed the below method. Why is the canvas being created and set to null each cycle? @Override public void run() { while (mRun) { Canvas c = null; try { c = mSurfaceHolder.lockCanvas();//null synchronized (mSurfaceHolder) { updatePhysics(); doDraw(c); } } finally { // do this in a finally so that if an exception is thrown // during the above, we don't leave the Surface in an // inconsistent state if (c != null) { mSurfaceHolder.unlockCanvasAndPost(c); } } } } Most of the times I have read anything about canvases it is more along the lines of: mField = new Bitmap(...dimensions...); Canvas c = new Canvas(mField); My question is: why is Google's example done that way (null canvas), what are the benefits of this, and is there a faster way to do it?

    Read the article

  • Partial class or "chained inheritance"

    - by Charlie boy
    Hi From my understanding partial classes are a bit frowned upon by professional developers, but I've come over a bit of an issue; I have made an implementation of the RichTextBox control that uses user32.dll calls for faster editing of large texts. That results in quite a bit of code. Then I added spellchecking capabilities to the control, this was made in another class inheriting RichTextBox control as well. That also makes up a bit of code. These two functionalities are quite separate but I would like them to be merged so that I can drop one control on my form that has both fast editing capabilities and spellchecking built in. I feel that simply adding the code form one class to the other would result in a too large code file, especially since there are two very distinct areas of functionality, so I seem to need another approach. Now to my question; To merge these two classes should I make the spellchecking RichTextBox inherit from the fast edit one, that in turn inherits RichTextBox? Or should I make the two classes partials of a single class and thus making them more “equal” so to speak? This is more of a question of OO principles and exercise on my part than me trying to reinvent the wheel, I know there are plenty of good text editing controls out there. But this is just a hobby for me and I just want to know how this kind of solution would be managed by a professional. Thanks!

    Read the article

  • libcurl (c api) READFUNCTION for http PUT blocking forever

    - by Duane
    I am using libcurl for a RESTful library. I am having two problems with a PUT message, I am just trying to send a small content like "hello" via put. My READFUNCTION for PUT's blocks for a very large amount of time (minutes) when I follow the manual at curl.haxx.se and return a 0 indicating I have finished the content. (on os X) When I return something 0 this succeeds much faster (< 1 sec) When I run this on my linux machine (ubuntu 10.4) this blocking event appears to NEVER return when I return 0, if I change the behavior to return the size written libcurl appends all the data in the http body sending way more data and it fails with a "too much data" message from the server. my readfunction is below, any help would be greatly appreciated. I am using libcurl 7.20.1 typedef struct{ void *data; int body_size; int bytes_remaining; int bytes_written; } postdata; size_t readfunc(void *ptr, size_t size, size_t nmemb, void *stream) { if(stream) { postdata ud = (postdata)stream; if(ud->bytes_remaining) { if(ud->body_size > size*nmemb) { memcpy(ptr, ud->data+ud->bytes_written, size*nmemb); ud->bytes_written+=size+nmemb; ud->bytes_remaining = ud->body_size-size*nmemb; return size*nmemb; } else { memcpy(ptr, ud->data+ud->bytes_written, ud->bytes_remaining); ud->bytes_remaining=0; return 0; } }

    Read the article

  • STL vector performance

    - by iAdam
    STL vector class stores a copy of the object using copy constructor each time I call push_back. Wouldn't it slow down the program? I can have a custom linkedlist kind of class which deals with pointers to objects. Though it would not have some benefits of STL but still should be faster. See this code below: #include <vector> #include <iostream> #include <cstring> using namespace std; class myclass { public: char* text; myclass(const char* val) { text = new char[10]; strcpy(text, val); } myclass(const myclass& v) { cout << "copy\n"; //copy data } }; int main() { vector<myclass> list; myclass m1("first"); myclass m2("second"); cout << "adding first..."; list.push_back(m1); cout << "adding second..."; list.push_back(m2); cout << "returning..."; myclass& ret1 = list.at(0); cout << ret1.text << endl; return 0; } its output comes out as: adding first...copy adding second...copy copy The output shows the copy constructor is called both times when adding and when retrieving the value even then. Does it have any effect on performance esp when we have larger objects?

    Read the article

  • Fastest XML parser for small, simple documents in Java

    - by Varkhan
    I have to objectify very simple and small XML documents (less than 1k, and it's almost SGML: no namespaces, plain UTF-8, you name it...), read from a stream, in Java. I am using JAXP to process the data from my stream into a Document object. I have tried Xerces, it's way too big and slow... I am using Dom4j, but I am still spending way too much time in org.dom4j.io.SAXReader. Does anybody out there have any suggestion on a faster, more efficient implementation, keeping in mind I have very tough CPU and memory constraints? [Edit 1] Keep in mind that my documents are very small, so the overhead of staring the parser can be important. For instance I am spending as much time in org.xml.sax.helpers.XMLReaderFactory.createXMLReader as in org.dom4j.io.SAXReader.read [Edit 2] The result has to be in Dom format, as I pass the document to decision tools that do arbitrary processing on it, like switching code based on the value of arbitrary XPaths, but also extracting lists of values packed as children of a predefined node. [Edit 3] In any case I eventually need to load/parse the complete document, since all the information it contains is going to be used at some point. (This question is related to, but different from, http://stackoverflow.com/questions/373833/best-xml-parser-for-java )

    Read the article

  • How would I instruct extconf.rb to use additional g++ optimization flags, and which are advisable?

    - by mohawkjohn
    I'm using Rice to write a C++ extension for a Ruby gem. The extension is in the form of a shared object (.so) file. This requires 'mkmf-rice' instead of 'mkmf', but the two (AFAIK) are pretty similar. By default, the compiler uses the flags -g -O2. Personally, I find this kind of silly, since it's hard to debug with any optimization enabled. I've resorted to editing the Makefile to take out the flags I don't like (e.g., removing -fPIC -shared when I need to debug using main() instead of Ruby's hooks). But I figure there's got to be a better way. I know I can just do $CPPFLAGS += " -DRICE" to add additional flags. But how do I remove things without editing the Makefile directly? A secondary question: what optimizations are safe for shared objects loaded by Ruby? Can I do things like -funroll-loops? What do you all recommend? It's a scientific computing project, so the faster the better. Memory is not much of an issue. Many thanks!

    Read the article

  • Issue with GCD and too many threads

    - by dariaa
    I have an image loader class which provided with NSURL loads and image from the web and executes completion block. Code is actually quite simple - (void)downloadImageWithURL:(NSString *)URLString completion:(BELoadImageCompletionBlock)completion { dispatch_async(_queue, ^{ // dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0), ^{ UIImage *image = nil; NSURL *URL = [NSURL URLWithString:URLString]; if (URL) { image = [UIImage imageWithData:[NSData dataWithContentsOfURL:URL]]; } dispatch_async(dispatch_get_main_queue(), ^{ completion(image, URLString); }); }); } When I replace dispatch_async(_queue, ^{ with commented out dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0), ^{ Images are loading much faster, wich is quite logical (before that images would be loaded one at a time, now a bunch of them are loading simultaneously). My issue is that I have perhaps 50 images and I call downloadImageWithURL:completion: method for all of them and when I use global queue instead of _queue my app eventually crashes and I see there are 85+ threads. Can the problem be that my calling dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0) 50 times in a row makes GCD create too many threads? I thought that gcd handles all the treading and makes sure the number of threads is not huge, but if it's not the case is there any way I can influence number of threads?

    Read the article

  • g-wan - reproducing the performance claims

    - by user2603628
    Using gwan_linux64-bit.tar.bz2 under Ubuntu 12.04 LTS unpacking and running gwan then pointing wrk at it (using a null file null.html) wrk --timeout 10 -t 2 -c 100 -d20s http://127.0.0.1:8080/null.html Running 20s test @ http://127.0.0.1:8080/null.html 2 threads and 100 connections Thread Stats Avg Stdev Max +/- Stdev Latency 11.65s 5.10s 13.89s 83.91% Req/Sec 3.33k 3.65k 12.33k 75.19% 125067 requests in 20.01s, 32.08MB read Socket errors: connect 0, read 37, write 0, timeout 49 Requests/sec: 6251.46 Transfer/sec: 1.60MB .. very poor performance, in fact there seems to be some kind of huge latency issue. During the test gwan is 200% busy and wrk is 67% busy. Pointing at nginx, wrk is 200% busy and nginx is 45% busy: wrk --timeout 10 -t 2 -c 100 -d20s http://127.0.0.1/null.html Thread Stats Avg Stdev Max +/- Stdev Latency 371.81us 134.05us 24.04ms 91.26% Req/Sec 72.75k 7.38k 109.22k 68.21% 2740883 requests in 20.00s, 540.95MB read Requests/sec: 137046.70 Transfer/sec: 27.05MB Pointing weighttpd at nginx gives even faster results: /usr/local/bin/weighttp -k -n 2000000 -c 500 -t 3 http://127.0.0.1/null.html weighttp - a lightweight and simple webserver benchmarking tool starting benchmark... spawning thread #1: 167 concurrent requests, 666667 total requests spawning thread #2: 167 concurrent requests, 666667 total requests spawning thread #3: 166 concurrent requests, 666666 total requests progress: 9% done progress: 19% done progress: 29% done progress: 39% done progress: 49% done progress: 59% done progress: 69% done progress: 79% done progress: 89% done progress: 99% done finished in 7 sec, 13 millisec and 293 microsec, 285172 req/s, 57633 kbyte/s requests: 2000000 total, 2000000 started, 2000000 done, 2000000 succeeded, 0 failed, 0 errored status codes: 2000000 2xx, 0 3xx, 0 4xx, 0 5xx traffic: 413901205 bytes total, 413901205 bytes http, 0 bytes data The server is a virtual 8 core dedicated server (bare metal), under KVM Where do I start looking to identify the problem gwan is having on this platform ? I have tested lighttpd, nginx and node.js on this same OS, and the results are all as one would expect. The server has been tuned in the usual way with expanded ephemeral ports, increased ulimits, adjusted time wait recycling etc.

    Read the article

  • Splitting a set of object into several subsets of 'similar' objects

    - by doublep
    Suppose I have a set of objects, S. There is an algorithm f that, given a set S builds certain data structure D on it: f(S) = D. If S is large and/or contains vastly different objects, D becomes large, to the point of being unusable (i.e. not fitting in allotted memory). To overcome this, I split S into several non-intersecting subsets: S = S1 + S2 + ... + Sn and build Di for each subset. Using n structures is less efficient than using one, but at least this way I can fit into memory constraints. Since size of f(S) grows faster than S itself, combined size of Di is much less than size of D. However, it is still desirable to reduce n, i.e. the number of subsets; or reduce the combined size of Di. For this, I need to split S in such a way that each Si contains "similar" objects, because then f will produce a smaller output structure if input objects are "similar enough" to each other. The problems is that while "similarity" of objects in S and size of f(S) do correlate, there is no way to compute the latter other than just evaluating f(S), and f is not quite fast. Algorithm I have currently is to iteratively add each next object from S into one of Si, so that this results in the least possible (at this stage) increase in combined Di size: for x in S: i = such i that size(f(Si + {x})) - size(f(Si)) is min Si = Si + {x} This gives practically useful results, but certainly pretty far from optimum (i.e. the minimal possible combined size). Also, this is slow. To speed up somewhat, I compute size(f(Si + {x})) - size(f(Si)) only for those i where x is "similar enough" to objects already in Si. Is there any standard approach to such kinds of problems? I know of branch and bounds algorithm family, but it cannot be applied here because it would be prohibitively slow. My guess is that it is simply not possible to compute optimal distribution of S into Si in reasonable time. But is there some common iteratively improving algorithm?

    Read the article

  • Creating and appending a big DOM with javascript - most optimized way?

    - by fenderplayer
    I use the following code to append a big dom on a mobile browser (webkit): 1. while(someIndex--) // someIndex ranges from 10 to possibly 1000 2. { 3. var html01 = ['<div class="test">', someVal,'</div>', 4. '<div><p>', someTxt.txt1, someTxt.txt2, '</p></div>', 5. // lots of html snippets interspersed with variables 6. // on average ~40 to 50 elements in this array 7. ].join(''); 8. var fragment = document.createDocumentFragment(), 9. div = fragment.appendChild(document.createElement('div')); 10. div.appendChild(jQuery(html01)[0]); 11. jQuery('#screen1').append(fragment); 12. } //end while loop 13. // similarly i create 'html02' till 'html15' to append in other screen divs Is there a better or faster way to do the above? Do you see any problems with the code? I am a little worried about line 10 where i wrap in jquery and then take it out.

    Read the article

< Previous Page | 151 152 153 154 155 156 157 158 159 160 161 162  | Next Page >