Avoid an "out of memory" error in Java (Eclipse) when using a large data structure?
        Posted by gnomed on Stack Overflow
        
        Published on 2010-03-17T04:55:41Z
        Indexed on 2010/03/17 5:01 UTC
        
        
        
OK, so I am writing a program that unfortunately needs a huge data structure to complete its work, but it fails with an "out of memory" error during initialization. I understand what that means and why it happens, but I am having trouble overcoming it, since my program needs this large structure and I don't know any other way to store it.
The program first indexes a large corpus of text files that I provide. This works fine.
Then it uses this index to initialize a large 2D array. The array will have n x n entries, where n is the number of unique words in the corpus. For the relatively small chunk I am testing on (about 60 files), that is approximately 30,000 x 30,000 entries. It will probably be bigger once I run it on my full intended corpus.
It fails every time at the same point: after indexing completes, while it is initializing the data structure (to be worked on later).
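For a rough sense of scale, here is a minimal sketch of the arithmetic, assuming each entry is a 4-byte Java "int" and ignoring per-array object overhead. The class and method names are hypothetical and not taken from my program.

```java
public class MatrixMemoryEstimate {

    // Bytes needed for the raw contents of an n x n matrix of 4-byte ints.
    // Array object headers are ignored, so the real footprint is slightly larger.
    static long bytesForIntMatrix(long n) {
        return n * n * Integer.BYTES;
    }

    public static void main(String[] args) {
        long n = 30_000L; // approximate number of unique words in the test corpus
        long bytes = bytesForIntMatrix(n);
        double gib = bytes / (1024.0 * 1024.0 * 1024.0);
        System.out.printf("n = %d needs %d bytes (about %.2f GiB)%n", n, bytes, gib);
    }
}
```

Under these assumptions the array contents alone come to 3,600,000,000 bytes, roughly 3.35 GiB, which is more than a 2 GiB heap can hold, and the figure grows quadratically with n.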
Things I have done include:
revamping my code to use a primitive "int[]" instead of a "TreeMap"
eliminating redundant structures, etc.
Also, I have run Eclipse with "eclipse -vmargs -Xmx2g" to max out my allocated memory.
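One thing worth verifying here: as far as I understand it, arguments after "-vmargs" configure the JVM that runs the Eclipse IDE itself, while a program launched from Eclipse runs in a separate JVM that takes its VM arguments from the run configuration. A minimal sketch for printing the heap limit the program actually sees (the class and method names are hypothetical):

```java
public class HeapCheck {

    // Maximum amount of memory this JVM will attempt to use, in bytes.
    // Runtime.maxMemory() returns Long.MAX_VALUE when there is no inherent limit.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        long maxBytes = maxHeapBytes();
        System.out.printf("Max heap available to this JVM: %d MiB%n",
                maxBytes / (1024L * 1024L));
    }
}
```

If the number printed is well below 2 GiB when this is run from inside Eclipse, the "-Xmx2g" setting is probably not reaching the program, and it would need to go into the VM arguments of the run configuration instead.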
I am fairly confident this will not be a simple one-line fix and will most likely require a fundamentally different approach. I am looking for what that approach is. Any ideas?
Thanks, B.
© Stack Overflow or respective owner