Java: ArrayList bottleneck
- by Jack
Hello,
while profiling a java application that calculates hierarchical clustering of thousands of elements I realized that ArrayList.get occupies like half of the CPU needed in the clusterization part of the execution.
The algorithm searches the two more similar elements (so it is O(n*(n+1)/2) ), here's the pseudo code:
int currentMax = 0.0f
for (int i = 0 to n)
for (int j = i to n)
get content i-th and j-th
if their similarity > currentMax
update currentMax
merge the two clusters
So effectively there are a lot of ArrayList.get involved.
Is there a faster way? I though that since ArrayList should be a linear array of references it should be the quickest way and maybe I can't do anything since there are simple too many gets.. but maybe I'm wrong. I don't think using a HashMap could work since I need to get them all on every iteration and map.values() should be backed by an ArrayList anyway..
Otherwise should I try other collection libraries that are more optimized? Like google's one, or apache one..
Thanks