Search Results

Search found 14282 results on 572 pages for 'performance counter'.

Page 157/572 | < Previous Page | 153 154 155 156 157 158 159 160 161 162 163 164  | Next Page >

  • Javascript scope chain

    - by Geromey
    Hi, I am trying to optimize my program. I think I understand the basics of closure. I am confused about the scope chain though. I know that in general you want a low scope (to access variables quickly). Say I have the following object: var my_object = (function(){ //private variables var a_private = 0; return{ //public //public variables a_public : 1, //public methods some_public : function(){ debugger; alert(this.a_public); alert(a_private); }; }; })(); My understanding is that if I am in the some_public method I can access the private variables faster than the public ones. Is this correct? My confusion comes with the scope level of this. When the code is stopped at debugger, firebug shows the public variable inside the this keyword. The this word is not inside a scope level. How fast is accessing this? Right now I am storing any this.properties as another local variable to avoid accessing it multiple times. Thanks very much!

    Read the article

  • 2 approaches for tracking online users with Redis. Which one is faster?

    - by Stanislav
    Recently I found an nice blog post presenting 2 approaches for tracking online users of a web site with the help of Redis. 1) Smart-keys and setting their expiration http://techno-weenie.net/2010/2/3/where-s-waldo-track-user-locations-with-node-js-and-redis 2) Set-s and intersects http://www.lukemelia.com/blog/archives/2010/01/17/redis-in-practice-whos-online/ Can you judge which one should be faster and why?

    Read the article

  • Perf4J Logging Config Help

    - by manyxcxi
    I currently have a long running process that I am trying to analyze with Perf4J. I currently have it writing results in CSV format to its own log file using the AsyncCoalescingStatisticsAppender and a StatisticsCsvLayout on the file appender. My question is; when I try and use the --graph option from the command line (using the perf4j jar) it isn't populating the data points- it isn't populating anything. Are my appenders set incorrectly? The log file contains hundreds (sometimes thousands) of data points of about 10 different tag names. <appender name="perfAppender" class="org.apache.log4j.FileAppender"> <param name="File" value="perfStats.log"/> <layout class="org.perf4j.log4j.StatisticsCsvLayout"> </layout> </appender> <appender name="CoalescingStatistics" class="org.perf4j.log4j.AsyncCoalescingStatisticsAppender"> <!-- The TimeSlice option is used to determine the time window for which all received StopWatch logs are aggregated to create a single GroupedTimingStatistics log. Here we set it to 10 seconds, overriding the default of 30000 ms --> <param name="TimeSlice" value="10000"/> <appender-ref ref="ConsoleAppender"/> <appender-ref ref="CompositeRollingFileAppender"/> <appender-ref ref="perfAppender"/> </appender>

    Read the article

  • Image 8-connectivity without excessive branching?

    - by shoosh
    I'm writing a low level image processing algorithm which needs to do alot of 8-connectivity checks for pixels. For every pixel I often need to check the pixels above it, below it and on its sides and diagonals. On the edges of the image there are special cases where there are only 5 or 3 neighbors instead of 8 neighbors for a pixels. The naive way to do it is for every access to check if the coordinates are in the right range and if not, return some default value. I'm looking for a way to avoid all these checks since they introduce a large overhead to the algorithm. Are there any tricks to avoid it altogether?

    Read the article

  • Why is numpy's einsum faster than numpy's built in functions?

    - by Ophion
    Lets start with three arrays of dtype=np.double. Timings are performed on a intel CPU using numpy 1.7.1 compiled with icc and linked to intel's mkl. A AMD cpu with numpy 1.6.1 compiled with gcc without mkl was also used to verify the timings. Please note the timings scale nearly linearly with system size and are not due to the small overhead incurred in the numpy functions if statements these difference will show up in microseconds not milliseconds: arr_1D=np.arange(500,dtype=np.double) large_arr_1D=np.arange(100000,dtype=np.double) arr_2D=np.arange(500**2,dtype=np.double).reshape(500,500) arr_3D=np.arange(500**3,dtype=np.double).reshape(500,500,500) First lets look at the np.sum function: np.all(np.sum(arr_3D)==np.einsum('ijk->',arr_3D)) True %timeit np.sum(arr_3D) 10 loops, best of 3: 142 ms per loop %timeit np.einsum('ijk->', arr_3D) 10 loops, best of 3: 70.2 ms per loop Powers: np.allclose(arr_3D*arr_3D*arr_3D,np.einsum('ijk,ijk,ijk->ijk',arr_3D,arr_3D,arr_3D)) True %timeit arr_3D*arr_3D*arr_3D 1 loops, best of 3: 1.32 s per loop %timeit np.einsum('ijk,ijk,ijk->ijk', arr_3D, arr_3D, arr_3D) 1 loops, best of 3: 694 ms per loop Outer product: np.all(np.outer(arr_1D,arr_1D)==np.einsum('i,k->ik',arr_1D,arr_1D)) True %timeit np.outer(arr_1D, arr_1D) 1000 loops, best of 3: 411 us per loop %timeit np.einsum('i,k->ik', arr_1D, arr_1D) 1000 loops, best of 3: 245 us per loop All of the above are twice as fast with np.einsum. These should be apples to apples comparisons as everything is specifically of dtype=np.double. I would expect the speed up in an operation like this: np.allclose(np.sum(arr_2D*arr_3D),np.einsum('ij,oij->',arr_2D,arr_3D)) True %timeit np.sum(arr_2D*arr_3D) 1 loops, best of 3: 813 ms per loop %timeit np.einsum('ij,oij->', arr_2D, arr_3D) 10 loops, best of 3: 85.1 ms per loop Einsum seems to be at least twice as fast for np.inner, np.outer, np.kron, and np.sum regardless of axes selection. The primary exception being np.dot as it calls DGEMM from a BLAS library. So why is np.einsum faster that other numpy functions that are equivalent? The DGEMM case for completeness: np.allclose(np.dot(arr_2D,arr_2D),np.einsum('ij,jk',arr_2D,arr_2D)) True %timeit np.einsum('ij,jk',arr_2D,arr_2D) 10 loops, best of 3: 56.1 ms per loop %timeit np.dot(arr_2D,arr_2D) 100 loops, best of 3: 5.17 ms per loop The leading theory is from @sebergs comment that np.einsum can make use of SSE2, but numpy's ufuncs will not until numpy 1.8 (see the change log). I believe this is the correct answer, but have not been able to confirm it. Some limited proof can be found by changing the dtype of input array and observing speed difference and the fact that not everyone observes the same trends in timings.

    Read the article

  • yuicompressor error, not sure what is wrong?

    - by mrblah
    Hi, Very confused here, trying out the yuicompressor on a simple javascript file. My js file looks like: function splitText(text) { return text.split('-')[1]; } The error is: [INFO] Using charset Cp1252 [Error] 1:20:illegal character [Error] 1:20:syntax error [Error] 1:40:illegal character [Error] 1:49:missing ; before statement [Error] 1:50:illegal character .. .. [Error] 7:3:missing | in compound statement [error] 1:0:compilation produced 38 syntax errors ... Can someone please explain to me what is wrong?

    Read the article

  • Loading animation Memory leak

    - by Ayaz Alavi
    Hi, I have written network class that is managing all network calls for my application. There are two methods showLoadingAnimationView and hideLoadingAnimationView that will show UIActivityIndicatorView in a view over my current viewcontroller with fade background. I am getting memory leaks somewhere on these two methods. Here is the code -(void)showLoadingAnimationView { textmeAppDelegate *textme = (textmeAppDelegate *)[[UIApplication sharedApplication] delegate]; [[UIApplication sharedApplication] setNetworkActivityIndicatorVisible:YES]; if(wrapperLoading != nil) { [wrapperLoading release]; } wrapperLoading = [[UIView alloc] initWithFrame:CGRectMake(0.0, 0.0, 320.0, 480.0)]; wrapperLoading.backgroundColor = [UIColor clearColor]; wrapperLoading.alpha = 0.8; UIView *_loadingBG = [[UIView alloc] initWithFrame:CGRectMake(0.0, 0.0, 320.0, 480.0)]; _loadingBG.backgroundColor = [UIColor blackColor]; _loadingBG.alpha = 0.4; circlingWheel = [[UIActivityIndicatorView alloc] initWithActivityIndicatorStyle:UIActivityIndicatorViewStyleWhiteLarge]; CGRect wheelFrame = circlingWheel.frame; circlingWheel.frame = CGRectMake(((320.0 - wheelFrame.size.width) / 2.0), ((480.0 - wheelFrame.size.height) / 2.0), wheelFrame.size.width, wheelFrame.size.height); [wrapperLoading addSubview:_loadingBG]; [wrapperLoading addSubview:circlingWheel]; [circlingWheel startAnimating]; [textme.window addSubview:wrapperLoading]; [_loadingBG release]; [circlingWheel release]; } -(void)hideLoadingAnimationView { [[UIApplication sharedApplication] setNetworkActivityIndicatorVisible:NO]; wrapperLoading.alpha = 0.0; [self.wrapperLoading removeFromSuperview]; //[NSTimer scheduledTimerWithTimeInterval:0.8 target:wrapperLoading selector:@selector(removeFromSuperview) userInfo:nil repeats:NO]; } Here is how I am calling these two methods [NSThread detachNewThreadSelector:@selector(showLoadingAnimationView) toTarget:self withObject:nil]; and then somewhere later in the code i am using following function call to hide animation. [self hideLoadingAnimationView]; I am getting memory leaks when I call showLoadingAnimationView function. Anything wrong in the code or is there any better technique to show loading animation when we do network calls?

    Read the article

  • High CPU usage when running several "java -version" in parallel

    - by Prateesh
    This is just out of curiosity to understand i have a small shell script for ((i = 0; i < 50; i++)) do java -version & done when i run this my CPU usage report by sar is as below 07:51:25 PM CPU %user %nice %system %iowait %steal %idle 07:51:30 PM all 6.98 0.00 1.75 1.00 0.00 90.27 07:51:31 PM all 43.00 0.00 12.00 0.00 0.00 45.00 07:51:32 PM all 86.28 0.00 13.72 0.00 0.00 0.00 07:51:33 PM all 5.25 0.00 1.75 0.50 0.00 92.50 As you can see, on the third line the CPU is at 100% My java version is 1.5.0_22-b03.

    Read the article

  • has anyone used simile timeline with large amounts of data

    - by oo
    i am using this simile timeline with large amounts of data and i keep getting firefox popping up saying "a script has appeared to no longer be running, do you want to kill it"? is there a limit to the amount of json you can send back to it. I have about 1000 different timeline points with dates, descriptions, etc.

    Read the article

  • Tuning MySQL to take advantage of a 4GB VPS

    - by alistair.mp
    Hello, We're running a large site at the moment which has a dedicated VPS for it's database server which is running MySQL and nothing else. At the moment all four CPU cores are running at close to 100% all of the time but the memory usage sticks at around 268MB out of an available 4096MB. I'm wondering what we can do to better utilise the memory and reduce the CPU load by tweaking MySQL's settings? Here is what we currently have in my.cnf: http://pastie.org/private/hxeji9o8n3u9up9mvtinbq Thanks

    Read the article

  • Make compiler copy characters using movsd

    - by Suma
    I would like to copy a relatively short sequence of memory (less than 1 KB, typically 2-200 bytes) in a time critical function. The best code for this on CPU side seems to be rep movsd. However I somehow cannot make my compiler to generate this code. I hoped (and I vaguely remember seeing so) using memcpy would do this using compiler built-in instrinsic, but based on disassembly and debugging it seems compiler is using call to memcpy/memmove library implementation instead. I also hoped the compiler might be smart enough to recognize following loop and use rep movsd on its own, but it seems it does not. char *dst; const char *src; // ... for (int r=size; --r>=0; ) *dst++ = *src++; Is there some way to make the Visual Studio compiler to generate rep movsd sequence other than using inline assembly?

    Read the article

  • Practical approach to concurrency control

    - by Industrial
    Hi everyone, I'd read this article recently and are very interested on how to make a practical approach to Concurrency control on a web server. The server will run CentOS + PHP + mySQL with Memcached. How would you set it up to work? http://saasinterrupted.com/2010/02/05/high-availability-principle-concurrency-control/ Thanks!

    Read the article

  • Bulk inserts into sqlite db on the iphone...

    - by akaii
    I'm inserting a batch of 100 records, each containing a dictonary containing arbitrarily long HTML strings, and by god, it's slow. On the iphone, the runloop is blocking for several seconds during this transaction. Is my only recourse to use another thread? I'm already using several for acquiring data from HTTP servers, and the sqlite documentation explicitly discourages threading with the database, even though it's supposed to be thread-safe... Is there something I'm doing extremely wrong that if fixed, would drastically reduce the time it takes to complete the whole operation? NSString* statement; statement = @"BEGIN EXCLUSIVE TRANSACTION"; sqlite3_stmt *beginStatement; if (sqlite3_prepare_v2(database, [statement UTF8String], -1, &beginStatement, NULL) != SQLITE_OK) { printf("db error: %s\n", sqlite3_errmsg(database)); return; } if (sqlite3_step(beginStatement) != SQLITE_DONE) { sqlite3_finalize(beginStatement); printf("db error: %s\n", sqlite3_errmsg(database)); return; } NSTimeInterval timestampB = [[NSDate date] timeIntervalSince1970]; statement = @"INSERT OR REPLACE INTO item (hash, tag, owner, timestamp, dictionary) VALUES (?, ?, ?, ?, ?)"; sqlite3_stmt *compiledStatement; if(sqlite3_prepare_v2(database, [statement UTF8String], -1, &compiledStatement, NULL) == SQLITE_OK) { for(int i = 0; i < [items count]; i++){ NSMutableDictionary* item = [items objectAtIndex:i]; NSString* tag = [item objectForKey:@"id"]; NSInteger hash = [[NSString stringWithFormat:@"%@%@", tag, ownerID] hash]; NSInteger timestamp = [[item objectForKey:@"updated"] intValue]; NSData *dictionary = [NSKeyedArchiver archivedDataWithRootObject:item]; sqlite3_bind_int( compiledStatement, 1, hash); sqlite3_bind_text( compiledStatement, 2, [tag UTF8String], -1, SQLITE_TRANSIENT); sqlite3_bind_text( compiledStatement, 3, [ownerID UTF8String], -1, SQLITE_TRANSIENT); sqlite3_bind_int( compiledStatement, 4, timestamp); sqlite3_bind_blob( compiledStatement, 5, [dictionary bytes], [dictionary length], SQLITE_TRANSIENT); while(YES){ NSInteger result = sqlite3_step(compiledStatement); if(result == SQLITE_DONE){ break; } else if(result != SQLITE_BUSY){ printf("db error: %s\n", sqlite3_errmsg(database)); break; } } sqlite3_reset(compiledStatement); } timestampB = [[NSDate date] timeIntervalSince1970] - timestampB; NSLog(@"Insert Time Taken: %f",timestampB); // COMMIT statement = @"COMMIT TRANSACTION"; sqlite3_stmt *commitStatement; if (sqlite3_prepare_v2(database, [statement UTF8String], -1, &commitStatement, NULL) != SQLITE_OK) { printf("db error: %s\n", sqlite3_errmsg(database)); } if (sqlite3_step(commitStatement) != SQLITE_DONE) { printf("db error: %s\n", sqlite3_errmsg(database)); } sqlite3_finalize(beginStatement); sqlite3_finalize(compiledStatement); sqlite3_finalize(commitStatement);

    Read the article

  • How well do zippers perform in practice, and when should they be used?

    - by Rob
    I think that the zipper is a beautiful idea; it elegantly provides a way to walk a list or tree and make what appear to be local updates in a functional way. Asymptotically, the costs appear to be reasonable. But traversing the data structure requires memory allocation at each iteration, where a normal list or tree traversal is just pointer chasing. This seems expensive (please correct me if I am wrong). Are the costs prohibitive? And what under what circumstances would it be reasonable to use a zipper?

    Read the article

  • Impact of ordering of correlated subqueries within a projection

    - by Michael Petito
    I'm noticing something a bit unexpected with how SQL Server (SQL Server 2008 in this case) treats correlated subqueries within a select statement. My assumption was that a query plan should not be affected by the mere order in which subqueries (or columns, for that matter) are written within the projection clause of the select statement. However, this does not appear to be the case. Consider the following two queries, which are identical except for the ordering of the subqueries within the CTE: --query 1: subquery for Color is second WITH vw AS ( SELECT p.[ID], (SELECT TOP(1) [FirstName] FROM [Preference] WHERE p.ID = ID AND [FirstName] IS NOT NULL ORDER BY [LastModified] DESC) [FirstName], (SELECT TOP(1) [Color] FROM [Preference] WHERE p.ID = ID AND [Color] IS NOT NULL ORDER BY [LastModified] DESC) [Color] FROM Person p ) SELECT ID, Color, FirstName FROM vw WHERE Color = 'Gray'; --query 2: subquery for Color is first WITH vw AS ( SELECT p.[ID], (SELECT TOP(1) [Color] FROM [Preference] WHERE p.ID = ID AND [Color] IS NOT NULL ORDER BY [LastModified] DESC) [Color], (SELECT TOP(1) [FirstName] FROM [Preference] WHERE p.ID = ID AND [FirstName] IS NOT NULL ORDER BY [LastModified] DESC) [FirstName] FROM Person p ) SELECT ID, Color, FirstName FROM vw WHERE Color = 'Gray'; If you look at the two query plans, you'll see that an outer join is used for each subquery and that the order of the joins is the same as the order the subqueries are written. There is a filter applied to the result of the outer join for color, to filter out rows where the color is not 'Gray'. (It's odd to me that SQL would use an outer join for the color subquery since I have a non-null constraint on the result of the color subquery, but OK.) Most of the rows are removed by the color filter. The result is that query 2 is significantly cheaper than query 1 because fewer rows are involved with the second join. All reasons for constructing such a statement aside, is this an expected behavior? Shouldn't SQL server opt to move the filter as early as possible in the query plan, regardless of the order the subqueries are written?

    Read the article

  • atomic operation cost

    - by osgx
    Hello What is the cost of the atomic operation? How much cycles does it consume? Will it pause other processors on SMP or NUMA, or will it block memory accesses? Will it flush reorder buffer in out-of-order CPU? What effects will be on the cache? Thanks.

    Read the article

  • Pre-generating GUIDs for use in python?

    - by rjuiaa1
    I have a python program that needs to generate several guids and hand them back with some other data to a client over the network. It may be hit with a lot of requests in a short time period and I would like the latency to be as low as reasonably possible. Ideally, rather than generating new guids on the fly as the client waits for a response, I would rather be bulk-generating a list of guids in the background that is continually replenished so that I always have pre-generated ones ready to hand out. I am using the uuid module in python on linux. I understand that this is using the uuidd daemon to get uuids. Does uuidd already take care of pre-genreating uuids so that it always has some ready? From the documentation it appears that it does not. Is there some setting in python or with uuidd to get it to do this automatically? Is there a more elegant approach then manually creating a background thread in my program that maintains a list of uuids?

    Read the article

  • Python Sets vs Lists

    - by mvid
    In Python, which data structure is more efficient/speedy? Assuming that order is not important to me and I would be checking for duplicates anyway, is a Python set slower than a Python list?

    Read the article

  • How to fire server-side methods with jQuery

    - by Nasser Hajloo
    I have a large application and I'm going to enabling short-cut key for it. I'd find 2 JQuery plug-ins (demo plug-in 1 - Demo plug-in 2) that do this for me. you can find both of them in this post in StackOverFlow My application is a completed one and I'm goining to add some functionality to it so I don't want towrite code again. So as a short-cut is just catching a key combination, I'm wonder how can I call the server methods which a short-cut key should fire? So How to use either of these plug-ins, by just calling the methods I'd written before? Actually How to fire Server methods with Jquery? You can also find a good article here, by Dave Ward Update: here is the scenario. When User press CTRL+Del the GridView1_OnDeleteCommand so I have this protected void grdDocumentRows_DeleteCommand(object source, System.Web.UI.WebControls.DataGridCommandEventArgs e) { try { DeleteRow(grdDocumentRows.DataKeys[e.Item.ItemIndex].ToString()); clearControls(); cmdSaveTrans.Text = Hajloo.Portal.Common.Constants.Accounting.Documents.InsertClickText; btnDelete.Visible = false; grdDocumentRows.EditItemIndex = -1; BindGrid(); } catch (Exception ex) { Page.AddMessage(GetLocalResourceObject("AProblemAccuredTryAgain").ToString(), MessageControl.TypeEnum.Error); } } private void BindGrid() { RefreshPage(); grdDocumentRows.DataSource = ((DataSet)Session[Hajloo.Portal.Common.Constants.Accounting.Session.AccDocument]).Tables[AccDocument.TRANSACTIONS_TABLE]; grdDocumentRows.DataBind(); } private void RefreshPage() { Creditors = (decimal)((AccDocument)Session[Hajloo.Portal.Common.Constants.Accounting.Session.AccDocument]).Tables[AccDocument.ACCDOCUMENT_TABLE].Rows[0][AccDocument.ACCDOCUMENT_CREDITORS_SUM_FIELD]; Debtors = (decimal)((AccDocument)Session[Hajloo.Portal.Common.Constants.Accounting.Session.AccDocument]).Tables[AccDocument.ACCDOCUMENT_TABLE].Rows[0][AccDocument.ACCDOCUMENT_DEBTORS_SUM_FIELD]; if ((Creditors - Debtors) != 0) labBalance.InnerText = GetLocalResourceObject("Differentiate").ToString() + "?" + (Creditors - Debtors).ToString(Hajloo.Portal.Common.Constants.Common.Documents.CF) + "?"; else labBalance.InnerText = GetLocalResourceObject("Balance").ToString(); lblSumDebit.Text = Debtors.ToString(Hajloo.Portal.Common.Constants.Common.Documents.CF); lblSumCredit.Text = Creditors.ToString(Hajloo.Portal.Common.Constants.Common.Documents.CF); if (grdDocumentRows.EditItemIndex == -1) clearControls(); } Th other scenario are the same. How to enable short-cut for these kind of code (using session , NHibernate, etc)

    Read the article

  • Of these 3 methods for reading linked lists from shared memory, why is the 3rd fastest?

    - by Joseph Garvin
    I have a 'server' program that updates many linked lists in shared memory in response to external events. I want client programs to notice an update on any of the lists as quickly as possible (lowest latency). The server marks a linked list's node's state_ as FILLED once its data is filled in and its next pointer has been set to a valid location. Until then, its state_ is NOT_FILLED_YET. I am using memory barriers to make sure that clients don't see the state_ as FILLED before the data within is actually ready (and it seems to work, I never see corrupt data). Also, state_ is volatile to be sure the compiler doesn't lift the client's checking of it out of loops. Keeping the server code exactly the same, I've come up with 3 different methods for the client to scan the linked lists for changes. The question is: Why is the 3rd method fastest? Method 1: Round robin over all the linked lists (called 'channels') continuously, looking to see if any nodes have changed to 'FILLED': void method_one() { std::vector<Data*> channel_cursors; for(ChannelList::iterator i = channel_list.begin(); i != channel_list.end(); ++i) { Data* current_item = static_cast<Data*>(i->get(segment)->tail_.get(segment)); channel_cursors.push_back(current_item); } while(true) { for(std::size_t i = 0; i < channel_list.size(); ++i) { Data* current_item = channel_cursors[i]; ACQUIRE_MEMORY_BARRIER; if(current_item->state_ == NOT_FILLED_YET) { continue; } log_latency(current_item->tv_sec_, current_item->tv_usec_); channel_cursors[i] = static_cast<Data*>(current_item->next_.get(segment)); } } } Method 1 gave very low latency when then number of channels was small. But when the number of channels grew (250K+) it became very slow because of looping over all the channels. So I tried... Method 2: Give each linked list an ID. Keep a separate 'update list' to the side. Every time one of the linked lists is updated, push its ID on to the update list. Now we just need to monitor the single update list, and check the IDs we get from it. void method_two() { std::vector<Data*> channel_cursors; for(ChannelList::iterator i = channel_list.begin(); i != channel_list.end(); ++i) { Data* current_item = static_cast<Data*>(i->get(segment)->tail_.get(segment)); channel_cursors.push_back(current_item); } UpdateID* update_cursor = static_cast<UpdateID*>(update_channel.tail_.get(segment)); while(true) { if(update_cursor->state_ == NOT_FILLED_YET) { continue; } ::uint32_t update_id = update_cursor->list_id_; Data* current_item = channel_cursors[update_id]; if(current_item->state_ == NOT_FILLED_YET) { std::cerr << "This should never print." << std::endl; // it doesn't continue; } log_latency(current_item->tv_sec_, current_item->tv_usec_); channel_cursors[update_id] = static_cast<Data*>(current_item->next_.get(segment)); update_cursor = static_cast<UpdateID*>(update_cursor->next_.get(segment)); } } Method 2 gave TERRIBLE latency. Whereas Method 1 might give under 10us latency, Method 2 would inexplicably often given 8ms latency! Using gettimeofday it appears that the change in update_cursor-state_ was very slow to propogate from the server's view to the client's (I'm on a multicore box, so I assume the delay is due to cache). So I tried a hybrid approach... Method 3: Keep the update list. But loop over all the channels continuously, and within each iteration check if the update list has updated. If it has, go with the number pushed onto it. If it hasn't, check the channel we've currently iterated to. void method_three() { std::vector<Data*> channel_cursors; for(ChannelList::iterator i = channel_list.begin(); i != channel_list.end(); ++i) { Data* current_item = static_cast<Data*>(i->get(segment)->tail_.get(segment)); channel_cursors.push_back(current_item); } UpdateID* update_cursor = static_cast<UpdateID*>(update_channel.tail_.get(segment)); while(true) { for(std::size_t i = 0; i < channel_list.size(); ++i) { std::size_t idx = i; ACQUIRE_MEMORY_BARRIER; if(update_cursor->state_ != NOT_FILLED_YET) { //std::cerr << "Found via update" << std::endl; i--; idx = update_cursor->list_id_; update_cursor = static_cast<UpdateID*>(update_cursor->next_.get(segment)); } Data* current_item = channel_cursors[idx]; ACQUIRE_MEMORY_BARRIER; if(current_item->state_ == NOT_FILLED_YET) { continue; } found_an_update = true; log_latency(current_item->tv_sec_, current_item->tv_usec_); channel_cursors[idx] = static_cast<Data*>(current_item->next_.get(segment)); } } } The latency of this method was as good as Method 1, but scaled to large numbers of channels. The problem is, I have no clue why. Just to throw a wrench in things: if I uncomment the 'found via update' part, it prints between EVERY LATENCY LOG MESSAGE. Which means things are only ever found on the update list! So I don't understand how this method can be faster than method 2. The full, compilable code (requires GCC and boost-1.41) that generates random strings as test data is at: http://pastebin.com/e3HuL0nr

    Read the article

< Previous Page | 153 154 155 156 157 158 159 160 161 162 163 164  | Next Page >