Data mining on a MySQL database

Posted by sliptix on Stack Overflow
Published on 2010-03-31T13:03:46Z Indexed on 2010/03/31 13:13 UTC

Hello,

I am just getting started with text mining. I have two database tables, each with thousands of rows:

a table for "skills" and a table for "skill categories".

  • Every "skill" belongs to a skill category.
  • A "skill" is, physically, a varchar(200) field in the database that holds free text describing the skill.

Here are some skills extracted from the skills table:

"PHP (good level), Java (intermediaite), C++" "PHP5" "project management and quality management" "begining Javascript" "water engineering" "dfsdf zerze rzer" "cibling customers"

What I want to do is extract knowledge from those fields: keep only the actual skill names and ignore the rest of the text. For the example above, I want to get an array containing only:

"PHP" "Java" "C++" "PHP5" "project management" "quality management" "Javascript" "water engineering" "cibling customers"
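To make the target transformation concrete, here is a minimal rule-based sketch in Python. It is not a text-mining algorithm in the k-means sense. It simply drops parenthesised remarks, splits on commas, semicolons and the word "and", strips a few level qualifiers, and then keeps only candidates that recur across rows. The qualifier list, the function names `extract_candidates` and `frequent_skills`, and the assumption that real skills repeat across the table while junk such as "dfsdf zerze rzer" does not are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Hypothetical list of level qualifiers to strip; extend it for real data.
QUALIFIERS = re.compile(
    r"^(?:beginning|begining|basic|intermediate|advanced)\s+", re.IGNORECASE
)
PARENS = re.compile(r"\([^)]*\)")                       # "(good level)" etc.
SEPARATORS = re.compile(r",|;|\band\b", re.IGNORECASE)  # list separators


def extract_candidates(text):
    """Split one free-text skill field into cleaned candidate skill names."""
    text = PARENS.sub("", text)
    candidates = []
    for part in SEPARATORS.split(text):
        part = QUALIFIERS.sub("", part.strip()).strip()
        if part:
            candidates.append(part)
    return candidates


def frequent_skills(rows, min_count=2):
    """Keep candidates that appear in at least min_count rows.

    Assumption: genuine skills recur across the table, random text does not.
    Matching is case-insensitive, so results are returned in lower case.
    """
    counts = Counter()
    for row in rows:
        # A set ensures each candidate is counted once per row.
        counts.update({c.lower() for c in extract_candidates(row)})
    return {skill for skill, n in counts.items() if n >= min_count}
```

The split on "and" would wrongly break skills that legitimately contain the word, and the frequency threshold would discard rare but real skills, so both rules would need tuning against the actual data.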

How should I go about extracting the skills from this much data? Do you know of specific algorithms for this, e.g. k-means?

Thanks in advance.

© Stack Overflow or respective owner
