Optimizing processing and management of large Java data arrays
- by mikera
I'm writing some pretty CPU-intensive, concurrent numerical code that will process large amounts of data stored in Java arrays (e.g. lots of double[100000]s). Some of the algorithms might run millions of times over several days so getting maximum steady-state performance is a high priority.
In essence, each algorithm is a Java object that has an method API something like:
public double[] runMyAlgorithm(double[] inputData);
or alternatively a reference could be passed to the array to store the output data:
public runMyAlgorithm(double[] inputData, double[] outputData);
Given this requirement, I'm trying to determine the optimal strategy for allocating / managing array space. Frequently the algorithms will need large amounts of temporary storage space. They will also take large arrays as input and create large arrays as output.
Among the options I am considering are:
Always allocate new arrays as local variables whenever they are needed (e.g. new double[100000]). Probably the simplest approach, but will produce a lot of garbage.
Pre-allocate temporary arrays and store them as final fields in the algorithm object - big downside would be that this would mean that only one thread could run the algorithm at any one time.
Keep pre-allocated temporary arrays in ThreadLocal storage, so that a thread can use a fixed amount of temporary array space whenever it needs it. ThreadLocal would be required since multiple threads will be running the same algorithm simultaneously.
Pass around lots of arrays as parameters (including the temporary arrays for the algorithm to use). Not good since it will make the algorithm API extremely ugly if the caller has to be responsible for providing temporary array space....
Allocate extremely large arrays (e.g. double[10000000]) but also provide the algorithm with offsets into the array so that different threads will use a different area of the array independently. Will obviously require some code to manage the offsets and allocation of the array ranges.
Any thoughts on which approach would be best (and why)?