Accessing local variable doesn't improve performance
- by NicMagnier
The short version
Why is this code:
var index = (Math.floor(y / scale) * img.width + Math.floor(x / scale)) * 4;
More performant than this one?
var index = Math.floor(ref_index) * 4;
The long version
This week, the author of Impact js published an article about a rendering issue:
http://www.phoboslab.org/log/2012/09/drawing-pixels-is-hard
The article included the source of a function that scales an image by accessing pixels in the canvas. I wanted to suggest some traditional ways to optimize this kind of code so that the scaling would take less time at load. But after testing, my results were worse than the original function most of the time.
Guessing that the JavaScript engine was doing some smart optimization, I tried to understand a bit more about what was going on, so I ran a bunch of tests. But my results are quite confusing, and I need some help to understand them.
I have a test page here:
http://www.mx981.com/stuff/resize_bench/test.html
jsPerf: http://jsperf.com/local-variable-due-to-the-scope-lookup
To start the test, click the picture and the results will appear in the console.
There are three different versions:
The original code:
for( var y = 0; y < heightScaled; y++ ) {
for( var x = 0; x < widthScaled; x++ ) {
var index = (Math.floor(y / scale) * img.width + Math.floor(x / scale)) * 4;
var indexScaled = (y * widthScaled + x) * 4;
scaledPixels.data[ indexScaled ] = origPixels.data[ index ];
scaledPixels.data[ indexScaled+1 ] = origPixels.data[ index+1 ];
scaledPixels.data[ indexScaled+2 ] = origPixels.data[ index+2 ];
scaledPixels.data[ indexScaled+3 ] = origPixels.data[ index+3 ];
}
}
jsPerf: http://jsperf.com/so-accessing-local-variable-doesn-t-improve-performance
One of my attempts to optimize it:
var ref_index = 0;
var ref_indexScaled = 0;
var ref_step = 1 / scale;
for( var y = 0; y < heightScaled; y++ ) {
for( var x = 0; x < widthScaled; x++ ) {
var index = Math.floor(ref_index) * 4;
scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index ];
scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index+1 ];
scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index+2 ];
scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index+3 ];
ref_index+= ref_step;
}
}
The same optimized code, but recalculating the index variable each time (Hybrid):
var ref_index = 0;
var ref_indexScaled = 0;
var ref_step = 1 / scale;
for( var y = 0; y < heightScaled; y++ ) {
for( var x = 0; x < widthScaled; x++ ) {
var index = (Math.floor(y / scale) * img.width + Math.floor(x / scale)) * 4;
scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index ];
scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index+1 ];
scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index+2 ];
scaledPixels.data[ ref_indexScaled++ ] = origPixels.data[ index+3 ];
ref_index+= ref_step;
}
}
The only difference between the last two is the calculation of the 'index' variable.
And to my surprise, the optimized version is slower in most browsers (except Opera).
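Before comparing timings, it may be worth checking that the two ways of computing 'index' actually read the same source pixels. The sketch below replays both formulas without a canvas and reports the first position where they disagree. The function names and the plain-number inputs are assumptions made for illustration; they are not part of the test page.

```javascript
// Hypothetical helpers: they only replay the index arithmetic of the
// Original and Optimized snippets, no canvas or ImageData is involved.

// Source indices as computed by the Original code.
function originalIndices(width, height, scale) {
  var widthScaled = width * scale;
  var heightScaled = height * scale;
  var out = [];
  for (var y = 0; y < heightScaled; y++) {
    for (var x = 0; x < widthScaled; x++) {
      out.push((Math.floor(y / scale) * width + Math.floor(x / scale)) * 4);
    }
  }
  return out;
}

// Source indices as computed by the Optimized code (running float counter).
function incrementalIndices(width, height, scale) {
  var widthScaled = width * scale;
  var heightScaled = height * scale;
  var refIndex = 0;
  var refStep = 1 / scale;
  var out = [];
  for (var y = 0; y < heightScaled; y++) {
    for (var x = 0; x < widthScaled; x++) {
      out.push(Math.floor(refIndex) * 4);
      refIndex += refStep;
    }
  }
  return out;
}

// Position of the first differing entry, or -1 if both lists match.
function firstMismatch(a, b) {
  for (var i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return i;
  }
  return -1;
}
```

For a 4x2 source scaled by 2, firstMismatch(originalIndices(4, 2, 2), incrementalIndices(4, 2, 2)) reports a difference at the start of the second output row, and the running counter ends up past the end of the 32-entry source buffer. That does not by itself explain the timings, but it suggests the two variants are not doing equivalent work.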
Results of personal testing (not the jsPerf tests):
Opera
Original: 8668ms
Optimized: 932ms
Hybrid: 8696ms
Chrome
Original: 139ms
Optimized: 145ms
Hybrid: 136ms
Safari
Original: 433ms
Optimized: 853ms
Hybrid: 451ms
Firefox
Original: 343ms
Optimized: 422ms
Hybrid: 350ms
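For anyone who wants to reproduce numbers like these outside the test page, a minimal timing harness could look like the sketch below. It is an assumption about how such a measurement might be done, not the code used for the results above; timeIt is a hypothetical name.

```javascript
// Hypothetical helper, not taken from the test page.
// Runs fn once as a warm-up, then measures `runs` timed calls.
function timeIt(fn, runs) {
  // Warm-up call so the engine has a chance to compile fn before timing.
  fn();
  var start = Date.now();
  for (var i = 0; i < runs; i++) {
    fn();
  }
  // Elapsed wall-clock milliseconds for all timed runs together.
  return Date.now() - start;
}
```

Each variant would be wrapped in a function and passed in, for example timeIt(function () { /* one of the loops above */ }, 5). Date.now() only has millisecond resolution, so short runs need a larger run count to give meaningful numbers.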
After digging around, it seems the usual good practice is to access mainly local variables, because of the cost of scope lookup. Because the optimized version only reads one local variable, it should be faster than the Hybrid code, which reads multiple variables and objects in addition to the various operations involved.
So why is the "optimized" version slower?
I thought it might be because some JavaScript engines don't optimize the Optimized version because it is not hot enough, but after using --trace-opt in Chrome, it seems all versions are properly compiled by V8.
At this point I am a bit clueless and wonder if somebody knows what is going on.
I also made some more test cases on this page:
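To make the "local variable" idea concrete, here is a sketch of the same nearest-neighbour scaling with the pixel arrays held in local variables and the row offset computed once per output row. It works on plain typed arrays instead of canvas ImageData, and the name scaleNearest is an assumption for illustration. Whether it is actually faster in a given engine has to be measured.

```javascript
// Hypothetical function: nearest-neighbour scaling of RGBA data by an
// integer factor, using only local variables inside the loops.
function scaleNearest(srcData, width, height, scale) {
  var widthScaled = width * scale;
  var heightScaled = height * scale;
  var dstData = new Uint8ClampedArray(widthScaled * heightScaled * 4);
  var dst = 0; // running write position in dstData

  for (var y = 0; y < heightScaled; y++) {
    // Source row offset depends only on y, so compute it once per row.
    var rowStart = Math.floor(y / scale) * width;
    for (var x = 0; x < widthScaled; x++) {
      var src = (rowStart + Math.floor(x / scale)) * 4;
      dstData[dst++] = srcData[src];     // red
      dstData[dst++] = srcData[src + 1]; // green
      dstData[dst++] = srcData[src + 2]; // blue
      dstData[dst++] = srcData[src + 3]; // alpha
    }
  }
  return dstData;
}
```

With canvas data, srcData would be origPixels.data and the result could be copied into scaledPixels.data. Unlike the Optimized version, this keeps the same index formula as the original code, so both read the same source pixels.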
http://www.mx981.com/stuff/resize_bench/index.html