Optimizing sparse dot-product in C#
Posted
by Haggai
on Stack Overflow
See other posts from Stack Overflow
or by Haggai
Published on 2010-05-10T12:42:41Z
Indexed on
2010/05/12
16:14 UTC
Read the original article
Hit count: 193
c#
|optimization
Hello.
I'm trying to calculate the dot-product of two very sparse associative arrays. The arrays contain an ID and a value, so the calculation should be done only on those IDs that are common to both arrays, e.g. <(1, 0.5), (3, 0.7), (12, 1.3)> * <(2, 0.4), (3, 2.3), (12, 4.7)> = 0.7*2.3 + 1.3*4.7 .
My implementation (call it dict) currently uses Dictionaries, but it is too slow to my taste.
double dot_product(IDictionary<int, double> arr1, IDictionary<int, double> arr2)
{
double res = 0;
double val2;
foreach (KeyValuePair<int, double> p in arr1)
if (arr2.TryGetValue(p.Key, out val2))
res += p.Value * val2;
return res;
}
The full arrays have about 500,000 entries each, while the sparse ones are only tens to hundreds entries each.
I did some experiments with toy versions of dot products. First I tried to multiply just two double arrays to see the ultimate speed I can get (let's call this "flat").
Then I tried to change the implementation of the associative array multiplication using an int[] ID array and a double[] values array, walking together on both ID arrays and multiplying when they are equal (let's call this "double").
I then tried to run all three versions with debug or release, with F5 or Ctrl-F5. The results are as follows:
debug F5: dict: 5.29s double: 4.18s (79% of dict) flat: 0.99s (19% of dict, 24% of double)
debug ^F5: dict: 5.23s double: 4.19s (80% of dict) flat: 0.98s (19% of dict, 23% of double)
release F5: dict: 5.29s double: 3.08s (58% of dict) flat: 0.81s (15% of dict, 26% of double)
release ^F5: dict: 4.62s double: 1.22s (26% of dict) flat: 0.29s ( 6% of dict, 24% of double)
I don't understand these results.
Why isn't the dictionary version optimized in release F5 as do the double and flat versions?
Why is it only slightly optimized in the release ^F5 version while the other two are heavily optimized?
Also, since converting my code into the "double" scheme would mean lots of work - do you have any suggestions how to optimize the dictionary one?
Thanks!
Haggai
© Stack Overflow or respective owner