Java: fastest way to do random reads on huge disk file(s)
Posted by cocotwo on Stack Overflow
Published on 2010-02-27T09:18:49Z
I've got a moderately big data set, about 800 MB or so, that is basically a big precomputed table I need in order to speed up some computation by several orders of magnitude. (Creating that file took several multicore computers days, using an optimized, multi-threaded algorithm, so I really do need that file.)
Now that it has been computed once, that 800 MB of data is read-only.
I cannot hold it in memory.
As of now it is one big 800 MB file, but splitting it into smaller files isn't a problem if that helps.
I need to read about 32 bits of data here and there in that file, a great many times. I don't know beforehand where I'll need to read: the reads are uniformly distributed.
What would be the fastest way in Java to do my random reads in such a file or files? Ideally I should be doing these reads from several unrelated threads (but I could queue the reads in a single thread if needed).
Is Java NIO the way to go?
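To make the NIO option concrete, here is a minimal sketch of a positional read with `FileChannel`. It assumes the table is a flat file of big-endian 32-bit values; the class name `RandomIntReader` and the method `readIntAt` are hypothetical names chosen for illustration. `FileChannel.read(ByteBuffer, long)` does not move the channel's own position, so one channel can be shared by several threads. Whether this is the fastest option would have to be measured.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: reads one 32-bit value at an arbitrary byte offset.
final class RandomIntReader implements AutoCloseable {
    private final FileChannel channel;

    RandomIntReader(Path file) throws IOException {
        // Read-only channel; a single instance can be shared across threads.
        this.channel = FileChannel.open(file, StandardOpenOption.READ);
    }

    int readIntAt(long byteOffset) throws IOException {
        // A small per-call buffer keeps the method free of shared mutable state.
        ByteBuffer buf = ByteBuffer.allocate(4);
        while (buf.hasRemaining()) {
            // Positional read: does not touch the channel's current position.
            int n = channel.read(buf, byteOffset + buf.position());
            if (n < 0) {
                throw new EOFException("offset " + byteOffset + " is past end of file");
            }
        }
        buf.flip();
        return buf.getInt(); // ByteBuffer defaults to big-endian byte order
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

Each call here is still one system call per 4 bytes read, so the operating system's page cache does most of the work for offsets that are hit repeatedly.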
I'm not familiar with memory-mapped files: I don't think I want to map the whole 800 MB into memory.
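For comparison, this is roughly what the memory-mapped variant looks like. Mapping a file does not copy it onto the Java heap: the operating system pages parts of the file in on demand, although the mapping does consume virtual address space (which may matter on a 32-bit JVM). The sketch assumes the same flat big-endian layout and a file smaller than 2 GB, since a single mapping is limited to `Integer.MAX_VALUE` bytes; `MappedTable` and `readIntAt` are hypothetical names.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: read-only view of the whole table via one mapping.
final class MappedTable {
    private final MappedByteBuffer map;

    MappedTable(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            this.map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    int readIntAt(int byteOffset) {
        // Absolute get: does not change the buffer's position or limit.
        // Buffers are not documented as thread-safe, so the cautious option
        // is to give each thread its own view via map.duplicate().
        return map.getInt(byteOffset);
    }
}
```

With this approach a read is a plain memory access once the page is resident, so there is no system call per lookup; the first touch of each page still costs a disk read.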
All I want is the fastest random reads I can get on these 800 MB of disk-based data.
By the way, in case people wonder, this is not at all the same as the question I asked not long ago:
http://stackoverflow.com/questions/2346722/java-fast-disk-based-hash-set
© Stack Overflow or respective owner