Advice on String Similarity Metrics (Java): distance, sounds-like, or a combination?
Posted
by andreas
on Stack Overflow
Published on 2010-04-21T13:01:48Z
Indexed on
2010/04/21
23:03 UTC
Hello,
Part of a process I am working on requires applying string similarity algorithms.
The results will be stored in a dataset; let's call it SS_Dataset.
Further decisions will then be made based on this dataset.
My questions are:
- Should I apply one or more string similarity algorithms to produce SS_Dataset?
- Are there any comparisons between algorithms that calculate 'distance' and those that calculate 'sounds like' similarity?
- Does one family of algorithms produce more accurate results than the other? Does a combination of the two give more accurate similarity results?
- Can you recommend implementations that you have worked with?
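To illustrate what I mean by the two families, here is a minimal plain-Java sketch. It does not use the SimMetrics API: `SimilaritySketch`, `editSimilarity`, `combined` and the `weight` parameter are hypothetical names made up for this example. Levenshtein stands in for the 'distance' family and a basic American Soundex for the 'sounds like' family. The weighted mix is only one assumed way of combining them, not a recommendation.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of SimMetrics.
final class SimilaritySketch {

    // Soundex digit for each letter A..Z ('0' means the letter is not coded).
    private static final String CODES = "01230120022455012623010202";

    private SimilaritySketch() {}

    /** 'Distance' family: Levenshtein edit distance (case-sensitive). */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Edit distance rescaled to [0, 1], where 1.0 means identical strings. */
    static double editSimilarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        return maxLen == 0 ? 1.0 : 1.0 - (double) levenshtein(a, b) / maxLen;
    }

    /** 'Sounds like' family: four-character American Soundex code. */
    static String soundex(String s) {
        String letters = s.toUpperCase(Locale.ROOT).replaceAll("[^A-Z]", "");
        if (letters.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder().append(letters.charAt(0));
        char last = CODES.charAt(letters.charAt(0) - 'A');
        for (int i = 1; i < letters.length() && out.length() < 4; i++) {
            char c = letters.charAt(i);
            if (c == 'H' || c == 'W') {
                continue; // H and W do not separate letters with the same code
            }
            char code = CODES.charAt(c - 'A');
            if (code != '0' && code != last) {
                out.append(code);
            }
            last = code;
        }
        while (out.length() < 4) {
            out.append('0');
        }
        return out.toString();
    }

    /**
     * Assumed combination: weight * editSimilarity plus (1 - weight) if the
     * Soundex codes match. The weighting scheme is an illustrative choice.
     */
    static double combined(String a, String b, double weight) {
        boolean samePhonetic = !soundex(a).isEmpty() && soundex(a).equals(soundex(b));
        return weight * editSimilarity(a, b) + (1.0 - weight) * (samePhonetic ? 1.0 : 0.0);
    }
}
```

For example, `SimilaritySketch.levenshtein("Robert", "Rupert")` is 2 while both names share the Soundex code `R163`, so the two families can disagree on how "similar" a pair is. That disagreement is what I would like to handle sensibly in SS_Dataset.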
My implementation will include packages from the following library:
http://www.dcs.shef.ac.uk/~sam/simmetrics.html
Regards,
© Stack Overflow or respective owner