Identify documents from the results of Mahout clustering
Posted by Tejas on Stack Overflow
Published on 2010-10-15T20:18:29Z
Indexed on 2011/01/15 19:53 UTC
mahout
I am using Mahout to cluster text documents indexed with Solr.
I used the "text" field of each document to form the vectors, ran Mahout's k-means driver to cluster them, and then dumped the results with the clusterdumper utility.
I am having difficulty understanding the dumper's output. I can see the clusters that were formed and the term vectors in each cluster, but how do I extract the documents that belong to each cluster? The result I want is a list of the input documents that fall into each cluster.
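To make the desired result concrete, here is a minimal Python sketch of the output shape. It assumes, hypothetically, that each input document's assignment is available as a (cluster_id, document_id) pair. The pairs, the document IDs, and the function name `group_documents_by_cluster` are made up for illustration and are not Mahout output or part of Mahout's API.

```python
from collections import defaultdict


def group_documents_by_cluster(assignments):
    """Group document IDs under the cluster each one was assigned to.

    `assignments` is an iterable of (cluster_id, document_id) pairs.
    This input format is an assumption, not something Mahout emits as-is.
    """
    clusters = defaultdict(list)
    for cluster_id, doc_id in assignments:
        clusters[cluster_id].append(doc_id)
    return dict(clusters)


if __name__ == "__main__":
    # Hypothetical assignments; real IDs would come from the Solr index.
    example = [(0, "doc-1"), (1, "doc-2"), (0, "doc-3")]
    grouped = group_documents_by_cluster(example)
    for cluster_id in sorted(grouped):
        print(cluster_id, grouped[cluster_id])
```

What is still missing is how to get such document-to-cluster pairs out of the clustering results in the first place.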
© Stack Overflow or respective owner