Package to compare LSA, TFIDF, Cosine metrics and Language Models
Posted by gouwsmeister on Stack Overflow
Published on 2009-10-12T21:12:48Z
Indexed on 2010/05/02 17:58 UTC
Hi,
I'm looking for a package (in any language, really) that I can run on a corpus of 50 documents to measure inter-document similarity under various metrics, such as TF-IDF, Okapi, language models, LSA, etc.
As a result, I want a document similarity matrix, i.e. doc1 is x% similar to doc2, and so on for every pair. This is for research purposes, not for production. I specifically want the document similarity matrix because I want to correlate it with human ratings.
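To make the desired output concrete, here is a minimal sketch of the kind of matrix I mean, using only the Python standard library and covering just one of the metrics (TF-IDF weighting with cosine similarity). The function names, the whitespace tokenization, the smoothed idf formula, and the three toy documents are all assumptions made for illustration, not part of any particular package.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Assumption: smoothed idf = log((1 + N) / (1 + df)) + 1, one common variant.
        vectors.append(
            {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(docs):
    """Return the full N x N matrix of pairwise cosine similarities."""
    vecs = tfidf_vectors(docs)
    return [[cosine(u, v) for v in vecs] for u in vecs]


# Toy corpus (hypothetical); the real corpus would have 50 documents.
docs = [
    "the cat sat on the mat",
    "the cat lay on the rug",
    "stock prices fell sharply",
]
matrix = similarity_matrix(docs)
for row in matrix:
    print(" ".join(f"{value:.2f}" for value in row))
```

Each entry `matrix[i][j]` is the similarity between document i and document j on a 0 to 1 scale, so the values above the diagonal could be flattened into a list and correlated with the corresponding human ratings. What I'm after is a package that produces this kind of matrix for the other metrics as well.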
Thank you in advance!
© Stack Overflow or respective owner