Search Results

Search found 90546 results on 3622 pages for 'code optimization'.


  • Error: Can't find common super class of ...

    - by PatlaDJ
    I am trying to process a MS Windows desktop application (Java 6 SE, using the SWT library provided by Eclipse) with ProGuard, and I get the following critical error: Unexpected error while performing partial evaluation: Class = [org/eclipse/swt/widgets/DateTime] Method = [<init>(Lorg/eclipse/swt/widgets/Composite;I)V] Exception = [java.lang.IllegalArgumentException] (Can't find common super class of [java/lang/StringBuffer] and [org/eclipse/swt/internal/win32/TCHAR]) Error: Can't find common super class of [java/lang/StringBuffer] and [org/eclipse/swt/internal/win32/TCHAR] ---------------------------- When I tried to Google the error, it turned up in only two places on the entire web, which astonished me greatly. I am a newbie with ProGuard and Java code optimization tools in general. Any thoughts and suggestions on how to fix this will be appreciated. Thanks in advance.

  • What is the best way to generate fake data for a classification problem?

    - by Berkay
    I'm working on a project and I have a subset of a user's keystroke timing data. The user makes n attempts, and I will use these recorded attempt times in various kinds of classification algorithms so that future login attempts can be verified as coming from the user or from some other person (simply put, this is biometrics). I have timing data from 3 different login attempts by the user; of course, this is a subset of the infinite data. Up to this point it is an easy classification problem. I decided to use WEKA, but as far as I understand I have to create some fake data to feed the classification algorithm. Can I use some optimization algorithms? Or is there any way to create this fake data so as to minimize false positives? Thanks

  • Is there any way to get MSVC to pass struct arguments in registers on x64?

    - by Luke
    For a function with signature: struct Pair { void *v1, *v2; }; void f(Pair p); compiled for x64, I would like Pair's fields to be passed via registers, as if the function were: void f(void *v1, void *v2); Compiling a test with gcc 4.2.1 for x86_64 on OSX 10.6, I can see this is exactly what happens by examining the disassembly. However, compiling with MSVC 2008 for x64 on Windows, the disassembly shows that Pair is passed on the stack. I understand that platform ABIs can prevent this optimization; does anyone know of any MSVC-specific annotations, calling conventions, flags, or other hacks that can get this to work? Thank you!
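
    As far as I can tell there is no documented MSVC switch that overrides this part of the Win64 calling convention (aggregates that are not 1, 2, 4 or 8 bytes wide are passed by address), so the usual workaround is to flatten the signature and keep Pair only at the call sites. A rough C++ sketch of that idea, where f_flat is a made-up name:

    ```cpp
    struct Pair { void *v1, *v2; };

    // On Win64 a 16-byte struct is passed as a pointer to a temporary copy,
    // so f(Pair) cannot receive the fields in registers.  Exposing a flat
    // signature sidesteps the issue.
    void f_flat(void* v1, void* v2) {   // v1 -> RCX, v2 -> RDX under the Win64 ABI
        (void)v1; (void)v2;             // ... real work here ...
    }

    inline void f(Pair p) { f_flat(p.v1, p.v2); }   // thin shim, normally inlined away

    int main() {
        int a = 1, b = 2;
        Pair p = { &a, &b };
        f(p);
    }
    ```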

  • How to optimize MATLAB loops?

    - by striglia
    I have been working lately on a number of iterative algorithms in MATLAB, and been getting hit hard by MATLAB's performance (or lack thereof) when it comes to loops. I'm aware of the benefit of vectorizing code when possible, but are there any tools for optimization when you need the loop for your algorithm? I am aware of the MEX-file option to write small subroutines in C/C++, although given my algorithms, this can be a very painful option given the data structures required. I mainly use MATLAB for the simplicity and speed of prototyping, so a syntactically complex, statically typed language is not ideal for my situation. Are there any other suggestions? Even other languages (python?) which have relatively painless matrix tools are an option.
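
    For reference, a MEX subroutine does not have to be elaborate; when a kernel is genuinely loop-bound, even a tiny gateway function can pay off. A minimal C++ sketch (hotloop is a made-up name, the running-sum body is only a stand-in for a real iterative kernel, and it would be built from MATLAB with "mex hotloop.cpp"):

    ```cpp
    #include "mex.h"

    // y = hotloop(x): placeholder for an iterative computation that is slow
    // as an interpreted MATLAB for-loop.  Here it just forms a prefix sum.
    void mexFunction(int nlhs, mxArray* plhs[], int nrhs, const mxArray* prhs[]) {
        (void)nlhs;
        if (nrhs != 1 || !mxIsDouble(prhs[0]))
            mexErrMsgTxt("hotloop: expected one double array");

        const mwSize n = mxGetNumberOfElements(prhs[0]);
        const double* x = mxGetPr(prhs[0]);

        plhs[0] = mxCreateDoubleMatrix(n, 1, mxREAL);
        double* y = mxGetPr(plhs[0]);

        double acc = 0.0;
        for (mwSize i = 0; i < n; ++i) {   // the loop that was expensive in MATLAB
            acc += x[i];
            y[i] = acc;
        }
    }
    ```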

  • Tips for optimizing C#/.NET programs

    - by Bob
    It seems like optimization is a lost art these days. Wasn't there a time when all programmers squeezed every ounce of efficiency from their code? Often doing so while walking 5 miles in the snow? In the spirit of bringing back a lost art, what are some tips that you know of for simple (or perhaps complex) changes to optimize C#/.NET code? Since it's such a broad thing that depends on what one is trying to accomplish it'd help to provide context with your tip. For instance: When concatenating many strings together use StringBuilder instead. If you're only concatenating a handful of strings it's ok to use the + operator. Use string.Compare to compare 2 strings instead of doing something like string1.ToLower() == string2.ToLower()

  • Lucene.Net memory consumption and slow search when too many clauses used

    - by Umer
    I have a DB holding text file attributes and text file primary key IDs, and I have indexed around 1 million text files along with their IDs (the primary keys in the DB). Now, I am searching at two levels. The first is a straightforward DB search, where I get primary keys as the result (roughly 2 or 3 million IDs). Then I build a Boolean query, for instance like the following: +Text:"test*" +(pkID:1 pkID:4 pkID:100 pkID:115 pkID:1041 .... ) and run it against my index file. The problem is that such a query (having 2 million clauses) takes far too much time to return a result and consumes far too much memory. Is there any optimization solution for this problem?

  • C++ defines for a 'better' Release mode build in VS

    - by darid
    I currently use the following preprocessor defines, and various optimization settings: WIN32_LEAN_AND_MEAN VC_EXTRALEAN NOMINMAX _CRT_SECURE_NO_WARNINGS _SCL_SECURE_NO_WARNINGS _SECURE_SCL=0 _HAS_ITERATOR_DEBUGGING=0 My question is: what other things do fellow SOers use, add, or define in order to get a Release mode build from VS C++ (2008, 2010) to be as performant as possible? BTW, I've tried PGO etc.; it does help a bit, but nothing that reaches parity. Also, I'm not using streams; the C++ I'm talking about is more like C, but making use of templates and STL algorithms. As it stands now, very simple code segments flop when compared to what GCC produces on, say, an equivalent x86 machine running Linux (2.6+ kernel) using -O2. Side note: I believe a lot of the issues relate directly to the STL version (Dinkumware) provided by MS. Could people please elaborate on their experiences using STLport etc. with VS C++?
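
    One detail worth calling out with that list: the macros only take effect if they are seen before the first Windows or standard-library header, so they are commonly collected in one header that every translation unit includes first (or passed project-wide with /D). A sketch of such a header, using only the defines from the question:

    ```cpp
    // build_config.h (hypothetical name) -- include before <windows.h> and any
    // standard-library header, otherwise these switches have no effect.
    #pragma once

    #define WIN32_LEAN_AND_MEAN
    #define VC_EXTRALEAN
    #define NOMINMAX
    #define _CRT_SECURE_NO_WARNINGS
    #define _SCL_SECURE_NO_WARNINGS
    #define _SECURE_SCL 0
    #define _HAS_ITERATOR_DEBUGGING 0
    ```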

  • jQuery Optimizations

    - by aepheus
    I've just come to the end of a large development project. We were on a tight timeline, so a lot of optimization was "deferred". Now that we have met our deadline, we're going back and trying to optimize things. My question is this: what are some of the most important things you look for when optimizing jQuery web sites? Alternatively, I'd love to hear of sites/lists that have particularly good advice for optimizing jQuery. I've already read a few articles; http://www.tvidesign.co.uk/blog/improve-your-jquery-25-excellent-tips.aspx was an especially good read.

  • Is it possible to shrink rt.jar with ProGuard?

    - by PatlaDJ
    Is there a procedure by which you can optimize/shrink/select/obfuscate only the classes/methods/fields from Sun's rt.jar that are actually used by your app, using some optimization software like ProGuard (or maybe something else)? Then you would actually be able to minimize the download size of your application considerably and make it much more secure, right? Related questions: Do you know if Sun's "Project Jigsaw", which is expected to come out, is intended to handle this particular issue automatically? Has anybody yet managed to form an opinion about the Avian Java alternative? Please share it here. Thank you.

  • In C, would !~b ever be faster than b == 0xff ?

    - by James Morris
    From a long time ago I have a memory which has stuck with me that says comparisons against zero are faster than against any other value (ahem, Z80). In some C code I'm writing I want to skip values which have all their bits set. Currently the type of these values is char but may change. I have two different alternatives to perform the test: if (!~b) /* skip */ and if (b == 0xff) /* skip */ Apart from the latter making the assumption that b is an 8-bit char whereas the former does not, would the former ever be faster due to the old compare-to-zero optimization trick, or are the CPUs of today way beyond this kind of thing?
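
    One wrinkle worth spelling out before speed: both spellings depend on whether plain char is signed, because b is promoted to int before ~ or == is applied. A small C++ sketch of the pitfall and a promotion-proof test (with optimization on, mainstream compilers typically turn any of these into a single compare, so correctness is the bigger concern):

    ```cpp
    #include <climits>
    #include <cstdio>

    // Promotion-proof "all bits set" test.
    static bool all_bits_set(unsigned char b) { return b == UCHAR_MAX; }

    int main() {
        char b = (char)0xff;

        // b is promoted to int before the operator is applied:
        //  - if plain char is signed,   b promotes to -1:  !~b is true,  b == 0xff is false
        //  - if plain char is unsigned, b promotes to 255: !~b is false, b == 0xff is true
        std::printf("!~b          : %d\n", !~b);
        std::printf("b == 0xff    : %d\n", b == 0xff);
        std::printf("all_bits_set : %d\n", all_bits_set((unsigned char)b));  // true either way
        return 0;
    }
    ```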

  • Optimizing a MySQL statement with a lot of count(row) and sum(row+row2)...

    - by Zombies
    I need to use the InnoDB storage engine on a table with about 1 million or so records in it at any given time. Records are inserted into it at a very fast rate and are then dropped within a few days, maybe a week. The ping table has about a million rows, whereas the website table has only about 10,000. My statement is this: select url from website ws, ping pi where ws.idproxy = pi.idproxy and pi.entrytime > curdate() - 3 and contentping+tcpping is not null group by url having sum(contentping+tcpping)/(count(*)-count(errortype)) < 500 and count(*) > 3 and count(errortype)/count(*) < .15 order by sum(contentping+tcpping)/(count(*)-count(errortype)) asc; I added an index on entrytime, yet no dice. Can anyone throw me a bone as to what I should look into for basic optimization of this query? The result set is only around 200 rows, so I'm not getting killed there.

  • Compile-time trigonometry in C

    - by lhahne
    I currently have code that looks like while (very_long_loop) { ... y1 = getSomeValue(); ... x1 = y1*cos(PI/2); x2 = y2*cos(SOME_CONSTANT); ... outputValues(x1, x2, ...); } The obvious optimization would be to compute the cosines ahead of time. I could do this by filling an array with the values, but I was wondering: would it be possible to make the compiler compute these at compile time?
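
    The question is about C, where the usual route is to generate a lookup table ahead of time; the same idea expressed in C++14 terms is a constexpr approximation that the compiler evaluates entirely at compile time. A minimal sketch (the Taylor series is only for illustration, and 0.25 stands in for SOME_CONSTANT):

    ```cpp
    // Compile-time cosine via a short Taylor series around 0 (illustrative only).
    constexpr double cos_approx(double x) {
        double term = 1.0, sum = 1.0;
        for (int n = 1; n <= 10; ++n) {
            term *= -x * x / ((2.0 * n - 1.0) * (2.0 * n));  // next series term
            sum += term;
        }
        return sum;
    }

    constexpr double PI = 3.141592653589793;

    // Evaluated entirely by the compiler; at run time these are just constants.
    constexpr double kCosHalfPi       = cos_approx(PI / 2);
    constexpr double kCosSomeConstant = cos_approx(0.25);

    static_assert(kCosHalfPi > -1e-9 && kCosHalfPi < 1e-9, "cos(pi/2) should be ~0");

    int main() {
        // In the original loop this becomes plain multiplication:
        //   x1 = y1 * kCosHalfPi;
        //   x2 = y2 * kCosSomeConstant;
    }
    ```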

  • How do you design a database to allow fast multicolumn searching?

    - by Fletcher Moore
    I am creating a real estate search from RETS data, but this is a general question. When you have a variety of columns that you would like the user to be able to filter their search result by, how do you optimize this? For example, http://www.charlestonrealestateguide.com/listings.php has 16 or so optional filters. Granted, he only has up to 11,000 entries (I have the same data), but I don't imagine the search is performed with just a giant WHERE AND AND AND ... clause. Or is this typically accomplished with one giant multicolumn index? Newegg, Amazon, and countless others also have cool & fast filtering systems for large amounts of data. How do they do it? And is there a database optimization reason for the tendency to provide ranges instead of empty inputs, or is that merely for user convenience?

  • Django: Set foreign key using integer?

    - by User
    Is there a way to set a foreign key relationship using the integer id of a model? This would be for optimization purposes. For example, suppose I have an Employee model: class Employee(models.Model): first_name = models.CharField(max_length=100) last_name = models.CharField(max_length=100) type = models.ForeignKey('EmployeeType') and class EmployeeType(models.Model): type = models.CharField(max_length=100) I want the flexibility of having unlimited employee types, but in the deployed application there will likely be only a single type, so I'm wondering if there is a way to hardcode the id and set the relationship this way. That way I can avoid a db call to get the EmployeeType object first.

  • Should I make my MutexLock volatile?

    - by sje397
    I have some code in a function that goes something like this: void foo() { { // scope the locker MutexLocker locker(&mutex); // do some stuff.. } bar(); } The function call bar() also locks the mutex. I am having an issue whereby the program crashes (for someone else, who has not as yet provided a stack trace or more details) unless the mutex lock inside bar is disabled. Is it possible that some optimization is messing around with the way I have scoped the locker instance, and if so, would making it volatile fix it? Is that a bad idea? Thanks.
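
    For what it's worth, the destructor of a scoped locker is required to run at the closing brace, so the optimizer is not allowed to keep the mutex held across the call to bar(), and volatile would not change that either way. A minimal sketch of the same pattern with the standard std::lock_guard standing in for the custom MutexLocker (assuming MutexLocker behaves the same way):

    ```cpp
    #include <mutex>

    std::mutex m;

    void bar() {
        // Locks the same mutex again; this is fine only because foo() has
        // already released it before calling bar().
        std::lock_guard<std::mutex> lock(m);
        // ...
    }

    void foo() {
        {   // scope the locker
            std::lock_guard<std::mutex> lock(m);
            // ... do some stuff under the lock ...
        }   // the unlock is guaranteed to happen here, before bar() runs

        bar();
    }

    int main() { foo(); }
    ```

    If the mutex were somehow still held when bar() tried to lock it, a non-recursive mutex would typically deadlock rather than crash, which is one reason to look at what happens inside the locked sections before suspecting the scoping.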

  • PHP speed optimisation.

    - by Petah
    Hi, I'm wondering about speed optimization in PHP. I have a series of files that are requested on every page load. On average there are 20 files. Each file must be read and parsed if it has changed. And this is excluding the standard files required for a web page (HTML, CSS, images, etc). E.g.: client requests page - server outputs html, css, images - server outputs dynamic content (20 +/- files, combined and minified). What would be the best way to serve these files as fast as possible?

  • How to check for C++ copy elision

    - by Steve
    I ran across this article on copy elision in C++, and I've seen comments about it in the Boost library. This is appealing, as I prefer my functions to look like verylargereturntype DoSomething(...) rather than void DoSomething(..., verylargereturntype& retval) So, I have two questions about this: Google has virtually no documentation on this at all, so how real is it? And how can I check that this optimization is actually occurring? I assume it involves looking at the assembly, but let's just say that isn't my strong suit. If anyone can give a very basic example of what successful elision looks like, that would be very useful. I won't be using copy elision just to prettify things, but if I can be guaranteed that it works, it sounds pretty useful.
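
    One low-tech way to see whether elision happened, without reading assembly, is to instrument the copy constructor and count how often it fires. A minimal sketch reusing the names from the question:

    ```cpp
    #include <cstdio>

    struct verylargereturntype {
        int payload[1024];
        verylargereturntype()                           { std::puts("default ctor"); }
        verylargereturntype(const verylargereturntype&) { std::puts("copy ctor"); }  // payload copy omitted for brevity
    };

    verylargereturntype DoSomething() {
        verylargereturntype r;   // with NRVO this is built directly in the caller's storage
        r.payload[0] = 42;
        return r;
    }

    int main() {
        verylargereturntype v = DoSomething();
        // With elision the program prints only "default ctor"; every "copy ctor"
        // line corresponds to a copy the compiler did not remove.
        return v.payload[0];
    }
    ```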

  • Defined variables and arrays vs functions in php

    - by Frank Presencia Fandos
    Introduction

    I have some values that I might want to access several times each time a page is loaded. I can take two different approaches to accessing them, but I'm not sure which one is 'better'. Three already-implemented examples are the options for the Language, URI and displayed text that I describe here:

    Language

    Right now it is configured this way: lang() is a function that returns different values depending on the argument. Example: lang("full") returns the current language, "English", while lang() returns the abbreviation of the current language, "en". There are many more options, like lang("select"), lang("selectact"), etc., that return different things. The code is too long and irrelevant for this question, so if anyone wants it, just ask for it.

    Url

    The $Url array also returns different values depending on the request. The whole array is fully defined at the beginning of the page and is used to get shorter but accurate links for the current page. Example: $Url['full'] would return "http://mypage.org/path/to/file.php?page=1" and $Url['file'] would return "file.php". It's useful for action="" within forms and many other things. There are more values, like $Url['folder'], $Url['file'], etc. Same thing about the code: if wanted, just request it.

    Text [You can skip this section]

    There's another array called $Text that is defined in the same way as $Url. The whole array is defined at the beginning, making a MySQL call and defining every $Text[$i] for the current page with a while loop. I'm not sure if this is more efficient than multiple calls for single MySQL cells. Example: $Text['54'] returns "This is just a test array!", which could just as well be implemented with a function like text(54).

    Question

    With these 3 examples you can see that I use different methods to do almost the same function (no pun intended), but I'm not sure which one should become the standard for my code. I could create a function called url() and another called text() to output what I want. I think that working with functions in these cases is better, but I'm not sure why. So I'd really appreciate your opinions and advice. Should I mix arrays and functions in the way I described, or should I just use functions? Please base your answer on this: the source needs to be readable and reusable by other developers; resource consumption (processing, time and memory) matters; the shorter the code the better; and the more you explain the reasons the better. Thank you. PS: now I know the differences between $Url and $Uri.

  • In Python, is there a way to call a method on every item of an iterable? [closed]

    - by Thane Brimhall
    Possible Duplicate: Is there a map without result in python? I often come to a situation in my programs when I want to quickly/efficiently call an in-place method on each of the items contained by an iterable. (Quickly meaning the overhead of a for loop is unacceptable). A good example would be a list of sprites when I want to call draw() on each of the Sprite objects. I know I can do something like this: [sprite.draw() for sprite in sprite_list] But I feel like the list comprehension is misused since I'm not using the returned list. The same goes for the map function. Stone me for premature optimization, but I also don't want the overhead of the return value. What I want to know is if there's a method in Python that lets me do what I just explained, perhaps like the hypothetical function I suggest below: do_all(sprite_list, draw)

  • SQL: Optimize intensive SELECTs on DateTime fields

    - by Fedyashev Nikita
    I have an application for scheduling certain events, and all these events must be reviewed after each scheduled time. So basically we have 3 tables: items(id, name) scheduled_items(id, item_id, execute_at - datetime) - the item_id column has an index. reviewed_items(id, item_id, created_at - datetime) - the item_id column has an index. The core function of the application is "give me any items (which are not yet reviewed) for the current moment". How can I optimize this solution for speed (because it is a very core business feature, not a micro-optimization)? I suppose that adding an index to the datetime fields doesn't make any sense, because the cardinality/uniqueness of those fields is very high and an index won't give any(?) speed-up. Is that correct? What would you recommend? Should I try NoSQL? -- mysql -V 5.075 I use caching where it makes sense.

  • Issuing Current Time Increments in StreamInsight (A Practical Example)

    The issuing of a Current Time Increment, Cti, in StreamInsight is very definitely one of the most important concepts to learn if you want your streams to be responsive. A full discussion of how to issue Ctis is beyond the scope of this article, but a very good explanation, in addition to Books Online, can be found in these three articles by a member of the StreamInsight team at Microsoft, Ciprian Gerea. Time in StreamInsight Series: http://blogs.msdn.com/b/streaminsight/archive/2010/07/23/time-in-streaminsight-i.aspx http://blogs.msdn.com/b/streaminsight/archive/2010/07/30/time-in-streaminsight-ii.aspx http://blogs.msdn.com/b/streaminsight/archive/2010/08/03/time-in-streaminsight-iii.aspx A lot of the problems I see with unresponsive or stuck streams on the MSDN Forums are to do with how Ctis are enqueued or, in a lot of cases, not enqueued. If you enqueue events and never enqueue a Cti then StreamInsight will be perfectly happy. You, on the other hand, will never see data on the output, as you have not told StreamInsight to flush the stream. This article deals with a specific implementation problem I had recently whilst working on a StreamInsight project. I look at some possible options and discuss why they would not work before showing the way I solved the problem. The stream of data I was dealing with on this project was very bursty; that is to say, when events were flowing they came through very quickly and in large numbers (1000 events/sec), but when the stream calmed down it could be a few seconds between each event. When enqueuing events into the StreamInsight engine it is best practice to do so with a StartTime that is given to you by the system producing the event. StreamInsight processes events, and it doesn't matter whether those events are being pushed into the engine by a source system or are being read from something like a flat file in a directory somewhere. You can apply the same logic and temporal algebra to both situations. Reading from a file is an excellent example of where the time of the event on the source itself is very important. We could be reading that file a long time after it was written. Being able to read the StartTime from the events allows us to define windows that will hold the correct sets of events. I was able to do this with my stream, but this is where my problems started. Below is a very simple script to create a SQL Server table and populate it with sample data that will show exactly the problem I had. CREATE TABLE [dbo].[t] ( [c1] [int] PRIMARY KEY, [c2] [datetime] NULL ) INSERT t VALUES (1,'20100810'),(2,'20100810'),(3,'20100810') Column c2 defines the StartTime of the event on the source and, as you can see, the values in all 3 rows of data are the same. If we read Ciprian's articles we know that we can define how Ctis get injected into the stream in 3 different places: the Stream Definition, the Input Factory, and the Input Adapter. I personally have always been a fan of enqueuing Ctis through the factory.
Below is code typical of what I would use to do this On the class itself I do some inheriting public class SimpleInputFactory : ITypedInputAdapterFactory<SimpleInputConfig>, ITypedDeclareAdvanceTimeProperties<SimpleInputConfig> And then I implement the following function public AdapterAdvanceTimeSettings DeclareAdvanceTimeProperties<TPayload>(SimpleInputConfig configInfo, EventShape eventShape) { return new AdapterAdvanceTimeSettings( new AdvanceTimeGenerationSettings(configInfo.CtiFrequency, TimeSpan.FromTicks(-1)), AdvanceTimePolicy.Adjust); } The configInfo .CtiFrequency property is a value I pass through to define after how many events I want a Cti to be injected and this in turn will flush through the stream of data. I usually pass a value of 1 for this setting. The second parameter determines the CTI timestamp in terms of a delay relative to the events. -1 ticks in the past results in 1 tick in the future, i.e., ahead of the event. The problem with this method though is that if consecutive events have the same StartTime then only one of those events will be enqueued. In this example I use the following to define how I assign the StartTime of my events currEvent.StartTime = (DateTimeOffset)dt.c2; If I go ahead and run my StreamInsight process with this configuration i can see on the output adapter that two events have been removed To see this in a little more depth I can use the StreamInsight Debugger and see what happens internally. What is happening here is that the first event arrives and a Cti is injected with a time of 1 tick after the StartTime of that event (Also the EndTime of the event). The second event arrives and it has a StartTime of before the Cti and even though we specified AdvanceTimePolicy.Adjust on the factory we know that a point event can never be adjusted like this and the event is dropped. The same happens for the third event as well (The second and third events get trumped by the Cti). For a more detailed discussion of why this happens look here http://www.sqlis.com/sqlis/post/AdvanceTimePolicy-and-Point-Event-Streams-In-StreamInsight.aspx We end up with a single event being pushed into the output adapter and our result now makes sense. The next way I tried to solve this problem by changing the value of the second parameter to TimeSpan.Zero Here is how my factory code now looks public AdapterAdvanceTimeSettings DeclareAdvanceTimeProperties<TPayload>(SimpleInputConfig configInfo, EventShape eventShape) { return new AdapterAdvanceTimeSettings( new AdvanceTimeGenerationSettings(configInfo.CtiFrequency, TimeSpan.Zero), AdvanceTimePolicy.Adjust); } What I am doing here is declaring a policy that says inject a Cti together with every event and stamp it with a StartTime that is equal to the start time of the event itself (TimeSpan.Zero). This method has plus points as well as a downside. The upside is that no events will be lost by having the same StartTime as previous events. The Downside is that because the Cti is declared with the StartTime of the event itself then it does not actually flush that particular event because in the StreamInsight algebra, a Cti commits only those events that occurred strictly before them. To flush the events we need a Cti to be enqueued with a greater StartTime than the events themselves. Here is what happened when I ran this configuration As you can see all we got through was the Cti and none of the events. The debugger output shows the stamps on the Cti and the events themselves. 
    Because the Cti issued has the same timestamp (StartTime) as the events, none of the events get flushed. I was nearly there but not quite. Because my stream was bursty it was possible that the next event would not come along for a few seconds, and this was far too long for an event to be enqueued and not be flushed to the output adapter. I needed another solution. Two possible solutions crossed my mind, although only one of them made sense when I explored it some more: (1) where multiple events have the same StartTime I could add 1 tick to the first event, two to the second, three to the third, etc., thereby giving them unique StartTime values; (2) add a timer to manually inject Ctis. The problem with the first implementation is that I would be giving the events a new StartTime. This would cause me the following problems: if I want to define windows over the stream then some events may not be captured in the right windows, and therefore any calculations I did on those windows would be wrong. And what would happen if we had 10,000 events with the same StartTime? I would enqueue them with StartTime + n ticks. Along comes a genuine event with a StartTime of the very first event + 1 tick. It is now too far in the past as far as my stream is concerned and it would be dropped. Not what I would want to do at all. I decided then to look at the timer-based solution. I created a timer on my input adapter that elapsed every 200ms. private Timer tmr; public SimpleInputAdapter(SimpleInputConfig configInfo) { ctx = new SimpleTimeExtractDataContext(configInfo.ConnectionString); this.configInfo = configInfo; tmr = new Timer(200); tmr.Elapsed += new ElapsedEventHandler(t_Elapsed); tmr.Enabled = true; } void t_Elapsed(object sender, ElapsedEventArgs e) { ts = DateTime.Now - dtCtiIssued; if (ts.TotalMilliseconds >= 200 && TimerIssuedCti == false) { EnqueueCtiEvent(System.DateTime.Now.AddTicks(-100)); TimerIssuedCti = true; } } In the t_Elapsed event handler I find out the difference in time between now and when the last event was processed (dtCtiIssued). I then check to see whether that is greater than or equal to 200ms and whether the last Cti was issued by the timer or by a genuine event (TimerIssuedCti). If I didn't do this check then I would enqueue a Cti every time the timer elapsed, which is not something I wanted. If the difference between the two times is greater than or equal to 200ms and the last Cti was issued by a real event, then I issue a Cti through the timer to flush the event queue; otherwise I do nothing. When I enqueue the Ctis into my stream in my ProduceEvents method I also set the values of dtCtiIssued and TimerIssuedCti: currEvent = CreateInsertEvent(); currEvent.StartTime = (DateTimeOffset)dt.c2; TimerIssuedCti = false; dtCtiIssued = currEvent.StartTime; If I go ahead and run this configuration I see the following in my output. As we can see, the first Cti gets enqueued as before, but then another is enqueued by the timer, and because this has a later timestamp it flushes the enqueued events through the engine. Conclusion: Hopefully this has shown how the enqueuing of Ctis can have a dramatic effect on the responsiveness of your output in StreamInsight. Understanding the temporal nature of the product is for me one of the most important things you can learn. I have attached my solution for the demos. It is all in one project, and testing each variation is a simple matter of commenting and un-commenting the parts of the code we have been dealing with here.

  • Tiered Design With Analytical Widgets - Is This a Code Smell?

    - by Repo Man
    The idea I'm playing with right now is having a multi-leveled "tier" system of analytical objects which perform a certain computation on a common object and then create a new set of analytical objects depending on their outcome. The newly created analytical objects will then get their own turn to run and optionally create more analytical objects, and so on and so on. The point being that the child analytical objects will always execute after the objects that created them, which is relatively important. The whole apparatus will be called by a single thread so I'm not concerned with thread safety at the moment. As long as a certain base condition is met, I don't see this being an unstable design but I'm still a little bit queasy about it. Is this some serious code smell or should I go ahead and implement it this way? Is there a better way? Here is a sample implementation: namespace WidgetTier { public class Widget { private string _name; public string Name { get { return _name; } } private TierManager _tm; private static readonly Random random = new Random(); static Widget() { } public Widget(string name, TierManager tm) { _name = name; _tm = tm; } public void DoMyThing() { if (random.Next(1000) > 1) { _tm.Add(); } } } //NOT thread-safe! public class TierManager { private Dictionary<int, List<Widget>> _tiers; private int _tierCount = 0; private int _currentTier = -1; private int _childCount = 0; public TierManager() { _tiers = new Dictionary<int, List<Widget>>(); } public void Add() { if (_currentTier + 1 >= _tierCount) { _tierCount++; _tiers.Add(_currentTier + 1, new List<Widget>()); } _tiers[_currentTier + 1].Add(new Widget(string.Format("({0})", _childCount), this)); _childCount++; } //Dangerous? public void Sweep() { _currentTier = 0; while (_currentTier < _tierCount) //_tierCount will start at 1 but keep increasing because child objects will keep adding more tiers. { foreach (Widget w in _tiers[_currentTier]) { w.DoMyThing(); } _currentTier++; } } public void PrintAll() { for (int t = 0; t < _tierCount; t++) { Console.Write("Tier #{0}: ", t); foreach (Widget w in _tiers[t]) { Console.Write(w.Name + " "); } Console.WriteLine(); } } } class Program { static void Main(string[] args) { TierManager tm = new TierManager(); for (int c = 0; c < 10; c++) { tm.Add(); //create base widgets; } tm.Sweep(); tm.PrintAll(); Console.ReadLine(); } } }

  • Accessing local variable doesn't improve performance

    - by NicMagnier
    The short version Why is this code: var index = (Math.floor(y / scale) * img.width + Math.floor(x / scale)) * 4; More performant than this one? var index = Math.floor(ref_index) * 4; The long version This week, the author of Impact js published an article about some rendering issue: http://www.phoboslab.org/log/2012/09/drawing-pixels-is-hard In the article there was the source of a function to scale an image by accessing pixels in the canvas. I wanted to suggest some traditional ways to optimize this kind of code so that the scaling would be shorter at loading time. But after testing it my result was most of the time worst that the original function. Guessing this was the JavaScript engine that was doing some smart optimization I tried to understand a bit more what was going on so I did a bunch of test. But my results are quite confusing and I would need some help to understand what's going on. I have a test page here: http://www.mx981.com/stuff/resize_bench/test.html jsPerf: http://jsperf.com/local-variable-due-to-the-scope-lookup To start the test, click the picture and the results will appear in the console. There are three different versions: The original code: for( var y = 0; y < heightScaled; y++ ) { for( var x = 0; x < widthScaled; x++ ) { var index = (Math.floor(y / scale) * img.width + Math.floor(x / scale)) * 4; var indexScaled = (y * widthScaled + x) * 4; scaledPixels.data[ indexScaled ] = origPixels.data[ index ]; scaledPixels.data[ indexScaled+1 ] = origPixels.data[ index+1 ]; scaledPixels.data[ indexScaled+2 ] = origPixels.data[ index+2 ]; scaledPixels.data[ indexScaled+3 ] = origPixels.data[ index+3 ]; } } jsPerf: http://jsperf.com/so-accessing-local-variable-doesn-t-improve-performance One of my attempt to optimize it: var ref_index = 0; var ref_indexScaled = 0 var ref_step = 1 / scale; for( var y = 0; y < heightScaled; y++ ) { for( var x = 0; x < widthScaled; x++ ) { var index = Math.floor(ref_index) * 4; scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index ]; scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index+1 ]; scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index+2 ]; scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index+3 ]; ref_index+= ref_step; } } jsPerf: http://jsperf.com/so-accessing-local-variable-doesn-t-improve-performance The same optimized code but with recalculating the index variable each time (Hybrid) var ref_index = 0; var ref_indexScaled = 0 var ref_step = 1 / scale; for( var y = 0; y < heightScaled; y++ ) { for( var x = 0; x < widthScaled; x++ ) { var index = (Math.floor(y / scale) * img.width + Math.floor(x / scale)) * 4; scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index ]; scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index+1 ]; scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index+2 ]; scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index+3 ]; ref_index+= ref_step; } } jsPerf: http://jsperf.com/so-accessing-local-variable-doesn-t-improve-performance The only difference in the two last one is the calculation of the 'index' variable. And to my surprise the optimized version is slower in most browsers (except opera). 
    Results of personal testing (not the jsPerf tests):

    Opera:   Original 8668ms, Optimized 932ms, Hybrid 8696ms
    Chrome:  Original 139ms,  Optimized 145ms, Hybrid 136ms
    Safari:  Original 433ms,  Optimized 853ms, Hybrid 451ms
    Firefox: Original 343ms,  Optimized 422ms, Hybrid 350ms

    After digging around, it seems a common good practice is to mainly access local variables because of scope lookup. Because the Optimized version only reads one local variable, it should be faster than the Hybrid code, which touches several variables and objects in addition to the various operations involved. So why is the "optimized" version slower? I thought it might be because some JavaScript engines don't optimize the Optimized version because it is not hot enough, but after using --trace-opt in Chrome, it seems all versions are properly compiled by V8. At this point I am a bit clueless and wonder whether somebody knows what is going on. I also put some more test cases on this page: http://www.mx981.com/stuff/resize_bench/index.html
