Automatically Organize Tags in Tax/Folksonomy
- by Rob Wilkerson
I'm working on a process that will perform natural language processing (NLP) on one--and potentially several--of our content rich sites. What I'd like to do once the NLP is complete is to automatically organize the output (generally a set of terms that you might think of as tags given the prevalence of that metaphor) into some kind of standard or generally accepted organizational structure.
In a perfect world, I'd really like this to be crowd sourced under the folksonomy concept (as opposed to a taxonomy) since the ultimate goal is to target/appeal to real people rather than "domain experts", but I'm open to ideas and best practices. For the obvious purpose of scalability, I'd like to automate the population of this tax/folksonomy so that "some guy" in the team/organization isn't responsible for looking at a bunch of words (with or without context) and arbitrarily fleshing out the contextual components of the tree.
I have a few ideas for doing this that require some research to establish viability, but I have exactly zero practical experience with this sort of thing so the ideas really just boil down to stuff I made up that might perform some role in accomplishing the task. Imagining that others have vastly more experience with this sort of thing, I'm hoping that I can stand on your shoulders.
Thanks for your thoughts and insights.