I am trying find the number of unique elements in a vector compared against multiple vectors using C++.
Suppose I have,
v1: 5, 8, 13, 16, 20
v2: 2, 4, 6, 8
v3: 20
v4: 1, 2, 3, 4, 5, 6, 7
v5: 1, 3, 5, 7, 11, 13, 15
The number of unique elements in v1 is 1 (i.e. number 16).
I tried two approaches.
Added vectors v2,v3,v4 and v5 into a vector of vector. For each element in v1, checked if the element is present in any of the other vectors.
Combined all the vectors v2,v3,v4 and v5 using merge sort into a single vector and compared it against v1 to find the unique elements.
Note: sample_vector = v1 and all_vectors_merged contains v2,v3,v4,v5
//Method 1
unsigned int compute_unique_elements_1(vector<unsigned int> sample_vector,vector<vector<unsigned int> > all_vectors_merged)
{
unsigned int duplicate = 0;
for (unsigned int i = 0; i < sample_vector.size(); i++)
{
for (unsigned int j = 0; j < all_vectors_merged.size(); j++)
{
if (std::find(all_vectors_merged.at(j).begin(), all_vectors_merged.at(j).end(), sample_vector.at(i)) != all_vectors_merged.at(j).end())
{
duplicate++;
}
}
}
return sample_vector.size()-duplicate;
}
// Method 2
unsigned int compute_unique_elements_2(vector<unsigned int> sample_vector, vector<unsigned int> all_vectors_merged)
{
unsigned int unique = 0;
unsigned int i = 0, j = 0;
while (i < sample_vector.size() && j < all_vectors_merged.size())
{
if (sample_vector.at(i) > all_vectors_merged.at(j))
{
j++;
}
else if (sample_vector.at(i) < all_vectors_merged.at(j))
{
i++;
unique ++;
}
else
{
i++;
j++;
}
}
if (i < sample_vector.size())
{
unique += sample_vector.size() - i;
}
return unique;
}
Of these two techniques, I see that Method 2 gives faster results.
1) Method 1: Is there a more efficient way to find the elements than running std::find on all the vectors for all the elements in v1.
2) Method 2: Extra overhead in comparing vectors v2,v3,v4,v5 and sorting them.
How can I do this in a better way?