STL find performs bettern than hand-crafter loop
- by dusha
Hello all,
I have some question. Given the following C++ code fragment:
#include <boost/progress.hpp>
#include <vector>
#include <algorithm>
#include <numeric>
#include <iostream>
struct incrementor
{
incrementor() : curr_() {}
unsigned int operator()()
{ return curr_++; }
private:
unsigned int curr_;
};
template<class Vec>
char const* value_found(Vec const& v, typename Vec::const_iterator i)
{
return i==v.end() ? "no" : "yes";
}
template<class Vec>
typename Vec::const_iterator find1(Vec const& v, typename Vec::value_type val)
{
return find(v.begin(), v.end(), val);
}
template<class Vec>
typename Vec::const_iterator find2(Vec const& v, typename Vec::value_type val)
{
for(typename Vec::const_iterator i=v.begin(), end=v.end(); i<end; ++i)
if(*i==val) return i;
return v.end();
}
int main()
{
using namespace std;
typedef vector<unsigned int>::const_iterator iter;
vector<unsigned int> vec;
vec.reserve(10000000);
boost::progress_timer pt;
generate_n(back_inserter(vec), vec.capacity(), incrementor());
//added this line, to avoid any doubts, that compiler is able to
// guess the data is sorted
random_shuffle(vec.begin(), vec.end());
cout << "value generation required: " << pt.elapsed() << endl;
double d;
pt.restart();
iter found=find1(vec, vec.capacity());
d=pt.elapsed();
cout << "first search required: " << d << endl;
cout << "first search found value: " << value_found(vec, found)<< endl;
pt.restart();
found=find2(vec, vec.capacity());
d=pt.elapsed();
cout << "second search required: " << d << endl;
cout << "second search found value: " << value_found(vec, found)<< endl;
return 0;
}
On my machine (Intel i7, Windows Vista) STL find (call via find1) runs about 10 times faster than the hand-crafted loop (call via find2). I first thought that Visual C++ performs some kind of vectorization (may be I am mistaken here), but as far as I can see assembly does not look the way it uses vectorization. Why is STL loop faster? Hand-crafted loop is identical to the loop from the STL-find body.
I was asked to post program's output. Without shuffle:
value generation required: 0.078
first search required: 0.008
first search found value: no
second search required: 0.098
second search found value: no
With shuffle (caching effects):
value generation required: 1.454
first search required: 0.009
first search found value: no
second search required: 0.044
second search found value: no
Many thanks,
dusha.
P.S. I return the iterator and write out the result (found or not), because I would like to prevent compiler optimization, that it thinks the loop is not required at all. The searched value is obviously not in the vector.