Persisting NLP parsed data

Posted by tjb1982 on Programmers
Published on 2012-10-17T20:59:44Z Indexed on 2012/10/17 23:19 UTC

Filed under: database | parsing | persistence | nlp

I've recently started experimenting with NLP using Stanford's CoreNLP, and I'm wondering what the standard ways are to store parsed NLP data for something like a text-mining application.

One approach I thought might be interesting is to store the parse tree as an adjacency list, where each component points to its parent, and to make good use of recursive queries (PostgreSQL supports these, and I've found they work really well). Something like this:

Component (
  id,
  POS,
  parent_id
)

Word (
  id,
  raw,
  lemma,
  POS,
  NER
)

CW_Map (
  component_id,
  word_id,
  position int
)
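
As a rough illustration of how this adjacency list could be queried, here is a minimal sketch in Python. It makes several assumptions that are not part of the schema above: SQLite (via the standard-library `sqlite3` module) stands in for PostgreSQL, since both accept `WITH RECURSIVE`; the column types, the sample sentence "Dogs bark" and its tags, and the helper name `words_under` are all hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE component (
    id        INTEGER PRIMARY KEY,
    pos       TEXT NOT NULL,                      -- phrase label, e.g. NP
    parent_id INTEGER REFERENCES component(id)    -- NULL for the root
);
CREATE TABLE word (
    id    INTEGER PRIMARY KEY,
    raw   TEXT NOT NULL,
    lemma TEXT,
    pos   TEXT,
    ner   TEXT
);
CREATE TABLE cw_map (
    component_id INTEGER NOT NULL REFERENCES component(id),
    word_id      INTEGER NOT NULL REFERENCES word(id),
    position     INTEGER NOT NULL                 -- word order in the sentence
);
""")

# Hypothetical parse of "Dogs bark": (S (NP Dogs) (VP bark))
conn.executemany("INSERT INTO component VALUES (?, ?, ?)",
                 [(1, "S", None), (2, "NP", 1), (3, "VP", 1)])
conn.executemany("INSERT INTO word VALUES (?, ?, ?, ?, ?)",
                 [(1, "Dogs", "dog", "NNS", "O"),
                  (2, "bark", "bark", "VBP", "O")])
conn.executemany("INSERT INTO cw_map VALUES (?, ?, ?)",
                 [(2, 1, 0), (3, 2, 1)])


def words_under(conn, component_id):
    """Return the raw words covered by a component and all its descendants."""
    rows = conn.execute("""
        WITH RECURSIVE subtree(id) AS (
            -- anchor: the component we start from
            SELECT id FROM component WHERE id = ?
            UNION ALL
            -- recursive step: children of anything already in the subtree
            SELECT c.id
            FROM component c
            JOIN subtree s ON c.parent_id = s.id
        )
        SELECT w.raw
        FROM subtree s
        JOIN cw_map m ON m.component_id = s.id
        JOIN word w   ON w.id = m.word_id
        ORDER BY m.position
    """, (component_id,)).fetchall()
    return [raw for (raw,) in rows]


print(words_under(conn, 1))  # whole sentence: ['Dogs', 'bark']
print(words_under(conn, 3))  # verb phrase only: ['bark']
```

The recursive query walks down from any component, so the same function returns the text of a whole sentence or of a single phrase. Whether that is convenient enough depends on the kind of analysis being done.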

But I assume that people working in the field have adopted a number of standard approaches over the years, depending on the kind of analysis being done. So what are the standard persistence strategies for parsed NLP data, and how are they used?

