Avoid an "out of memory error" in Java(eclipse), when using large data structure?

Posted by gnomed on Stack Overflow, 2010-03-17.

OK, so I am writing a program that unfortunately needs a huge data structure to do its work, and it fails with an OutOfMemoryError during initialization. I understand exactly what the error means and why it happens; my trouble is getting around it, since the program needs this large structure and I don't know any other way to store it.

The program first indexes a large corpus of text files that I provide. This works fine.
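For concreteness, the indexing step boils down to assigning each distinct word a dense integer ID (a minimal sketch, not my actual code; all names are made up):

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Sketch: map each distinct word to a dense ID 0..n-1, so the
    // co-occurrence matrix can later be addressed by those IDs.
    static Map<String, Integer> buildWordIndex(List<String> corpusWords) {
        Map<String, Integer> wordIds = new HashMap<String, Integer>();
        for (String word : corpusWords) {
            if (!wordIds.containsKey(word)) {
                wordIds.put(word, wordIds.size());
            }
        }
        return wordIds; // wordIds.size() is the "n" in the n x n matrix
    }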

Then it uses this index to initialize a large 2D array with n x n entries, where n is the number of unique words in the corpus. For the relatively small chunk I am testing on (about 60 files), that comes to approximately 30,000 x 30,000 entries, and it will be bigger still once I run it on my full intended corpus.
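Some quick arithmetic (assuming 4 bytes per int entry) shows why the initialization blows up:

    public class MatrixSizeCheck {
        public static void main(String[] args) {
            long n = 30000L;            // unique words in the test corpus
            long bytes = n * n * 4L;    // 4 bytes per int entry
            // 3,600,000,000 bytes, roughly 3.35 GiB -- already past a 2 GB
            // heap before counting int[][] per-row object overhead.
            System.out.println(bytes / (1024.0 * 1024 * 1024) + " GiB");
        }
    }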

It fails at the same point every time: after the indexing finishes, while it is initializing the data structure (to be worked on later).

Things I have tried so far:

- revamped my code to use a primitive int[] instead of a TreeMap (sketched below)
- eliminated redundant structures, etc.
- launched Eclipse with "eclipse -vmargs -Xmx2g" to max out my allocated memory
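To show what I mean by the first change above (a hypothetical reconstruction, not my exact code):

    import java.util.Map;
    import java.util.TreeMap;

    public class CountsSketch {
        public static void main(String[] args) {
            // Before: boxed Integer counts in a TreeMap -- every entry pays
            // for a tree node, object headers, and boxing on top of the
            // 4-byte count itself.
            Map<String, Integer> boxedCounts = new TreeMap<String, Integer>();

            // After: one flat primitive array addressed as row * n + col.
            // Per-entry cost drops to exactly 4 bytes, but the whole n x n
            // block still has to fit in the heap in one piece -- this is
            // the allocation that dies.
            int n = 30000;
            int[] flatCounts = new int[n * n];
        }
    }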

I am fairly confident this won't be a one-line fix and will most likely require a completely different approach. What might that approach be? Any ideas?

Thanks, B.
