Persisting NLP parsed data
Posted by tjb1982 on Programmers
Published on 2012-10-17T20:59:44Z
I've recently started experimenting with NLP using Stanford's CoreNLP, and I'm wondering what some of the standard ways are to store NLP-parsed data for something like a text mining application.
One way I thought might be interesting is to store the parse tree as an adjacency list, with each component pointing to its parent, and make good use of recursive queries (PostgreSQL supports these, and I've found they work really well). Something like this:
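    -- Column types and constraints are assumed; the sketch only fixes the names.
    CREATE TABLE Component (
        id        serial PRIMARY KEY,
        POS       text,                              -- phrase-level tag, e.g. NP or VP
        parent_id integer REFERENCES Component (id)  -- NULL for the root component
    );
    CREATE TABLE Word (
        id    serial PRIMARY KEY,
        raw   text,   -- surface form as it appears in the text
        lemma text,
        POS   text,   -- part-of-speech tag
        NER   text    -- named-entity label
    );
    CREATE TABLE CW_Map (
        component_id integer REFERENCES Component (id),
        word_id      integer REFERENCES Word (id),
        position     int
    );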
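As a rough sketch of how the recursive-query idea could look against this schema, the snippet below collects every word under a given component. It makes several assumptions that are not part of the schema above: SQLite (via Python's `sqlite3`) stands in for PostgreSQL so it runs without a server, the column types are guesses, `position` is taken to be the word's index in the sentence, the sample parse of "The cat sleeps" is made up, and `words_under` is a hypothetical helper name.

```python
import sqlite3

# SQLite stands in for PostgreSQL here so the sketch runs without a server;
# the WITH RECURSIVE query below uses syntax that both databases accept.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Component (
    id INTEGER PRIMARY KEY,
    POS TEXT,
    parent_id INTEGER REFERENCES Component(id)
);
CREATE TABLE Word (id INTEGER PRIMARY KEY, raw TEXT, lemma TEXT, POS TEXT, NER TEXT);
CREATE TABLE CW_Map (component_id INTEGER, word_id INTEGER, position INT);
""")

# Made-up parse: (S (NP (DT The) (NN cat)) (VP (VBZ sleeps)))
conn.executemany("INSERT INTO Component VALUES (?, ?, ?)", [
    (1, "S", None),   # root
    (2, "NP", 1),
    (3, "VP", 1),
])
conn.executemany("INSERT INTO Word VALUES (?, ?, ?, ?, ?)", [
    (1, "The", "the", "DT", "O"),
    (2, "cat", "cat", "NN", "O"),
    (3, "sleeps", "sleep", "VBZ", "O"),
])
# Assumption: position is the word's index within the sentence.
conn.executemany("INSERT INTO CW_Map VALUES (?, ?, ?)", [
    (2, 1, 0), (2, 2, 1), (3, 3, 2),
])


def words_under(conn, component_id):
    """Return the raw words attached to a component or any of its descendants."""
    rows = conn.execute("""
        WITH RECURSIVE subtree(id) AS (
            SELECT id FROM Component WHERE id = ?
            UNION ALL
            SELECT c.id FROM Component c JOIN subtree s ON c.parent_id = s.id
        )
        SELECT w.raw
        FROM subtree s
        JOIN CW_Map m ON m.component_id = s.id
        JOIN Word w ON w.id = m.word_id
        ORDER BY m.position
    """, (component_id,)).fetchall()
    return [row[0] for row in rows]


print(words_under(conn, 1))  # ['The', 'cat', 'sleeps']
print(words_under(conn, 2))  # ['The', 'cat']
```

The recursive part walks `parent_id` links downward from the starting component, so the same query works for a whole sentence or for any single phrase inside it.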
But I assume there are many standard ways to do this, adopted over the years by people working in the field and depending on the kind of analysis being done. So what are the standard persistence strategies for NLP-parsed data, and how are they used?