Stanford POS tagger runs out of memory?
Posted by goh on Stack Overflow, 2010-06-11
Tags: python | stanford-nlp
My Stanford tagger ran out of memory. Is it because the text has to be properly formatted? I ask because I use it to tag HTML content with the tags stripped, so the input may contain an excessive number of newlines.
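For reference, here is a minimal sketch of the kind of preprocessing described above (the file names and the regex-based tag stripping are illustrative assumptions, not the asker's actual code):

import re

def clean_html(raw):
    # Drop <script>/<style> blocks wholesale, then strip the remaining tags.
    text = re.sub(r'(?is)<(script|style)\b.*?</\1>', ' ', raw)
    text = re.sub(r'(?s)<[^>]+>', ' ', text)
    # Collapse runs of blank lines and horizontal whitespace so the tagger
    # is not handed one enormous unbroken "sentence".
    text = re.sub(r'\n\s*\n+', '\n', text)
    text = re.sub(r'[ \t]+', ' ', text)
    return text.strip()

with open('page.html', encoding='utf-8') as f:   # hypothetical input file
    cleaned = clean_html(f.read())
with open('clean.txt', 'w', encoding='utf-8') as f:
    f.write(cleaned)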
Here is the error:
    WARNING: Untokenizable: ♥ (char in decimal: 9829)
    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at edu.stanford.nlp.sequences.ExactBestSequenceFinder.bestSequenceNew(ExactBestSequenceFinder.java:175)
        at edu.stanford.nlp.sequences.ExactBestSequenceFinder.bestSequence(ExactBestSequenceFinder.java:98)
        at edu.stanford.nlp.tagger.maxent.TestSentence.runTagInference(TestSentence.java:277)
        at edu.stanford.nlp.tagger.maxent.TestSentence.testTagInference(TestSentence.java:258)
        at edu.stanford.nlp.tagger.maxent.TestSentence.tagSentence(TestSentence.java:110)
        at edu.stanford.nlp.tagger.maxent.MaxentTagger.tagSentence(MaxentTagger.java:825)
        at edu.stanford.nlp.tagger.maxent.MaxentTagger.runTagger(MaxentTagger.java:1319)
        at edu.stanford.nlp.tagger.maxent.MaxentTagger.runTagger(MaxentTagger.java:1225)
        at edu.stanford.nlp.tagger.maxent.MaxentTagger.runTagger(MaxentTagger.java:1183)
        at edu.stanford.nlp.tagger.maxent.MaxentTagger.main(MaxentTagger.java:1358)
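The trace shows the heap giving out inside ExactBestSequenceFinder, which allocates memory in proportion to the length of the token sequence being decoded, so a very long run of text without sentence breaks will exhaust it. Besides cleaning the input, the usual workaround is to give the JVM a larger heap with java's -mx (-Xmx) flag. A hedged sketch of such an invocation from Python follows; the jar, model, and file paths are assumptions to adapt to your install:

import subprocess

subprocess.run(
    ['java', '-mx2g',                   # request a 2 GB heap instead of the default
     '-cp', 'stanford-postagger.jar',   # path to the tagger jar (assumption)
     'edu.stanford.nlp.tagger.maxent.MaxentTagger',
     '-model', 'models/english-left3words-distsim.tagger',  # model path (assumption)
     '-textFile', 'clean.txt'],
    check=True,
)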