Detecting syllables in a word
Posted by user50705 on Stack Overflow
Published on 2009-01-01T17:08:41Z
Indexed on 2010/12/21 23:54 UTC
I need to find a fairly efficient way to detect syllables in a word. For example:
invisible -> in-vi-sib-le
There are some syllabification rules that could be used:
V CV VC CVC CCV CCCV CVCC
where V is a vowel and C is a consonant. For example:
pronunciation (5 syllables: pro-nun-ci-a-tion; CCV-CVC-CV-V-CVC)
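To make the V/C notation concrete, here is a minimal sketch that maps a chunk of letters to its pattern and checks it against the shapes listed above. The class and method names (CvPattern, toPattern, isAllowedSyllable) are made up for illustration, and it assumes the vowels are just the letters a, e, i, o, u. It works on spelling, not sound, so by letters "pro" comes out as CCV and "tion" as CVVC, which is not one of the listed shapes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class CvPattern {
    // Assumption: only these letters count as vowels (spelling-based, not phonetic).
    private static final String VOWELS = "aeiou";

    // The syllable shapes listed in the question.
    private static final Set<String> SHAPES = new HashSet<>(Arrays.asList(
            "V", "CV", "VC", "CVC", "CCV", "CCCV", "CVCC"));

    // Maps each letter to 'V' or 'C'; non-letters (such as hyphens) are skipped.
    static String toPattern(String text) {
        StringBuilder sb = new StringBuilder();
        for (char ch : text.toLowerCase(Locale.ROOT).toCharArray()) {
            if (!Character.isLetter(ch)) {
                continue;
            }
            sb.append(VOWELS.indexOf(ch) >= 0 ? 'V' : 'C');
        }
        return sb.toString();
    }

    // True if the candidate syllable's letter pattern is one of the listed shapes.
    static boolean isAllowedSyllable(String syllable) {
        return SHAPES.contains(toPattern(syllable));
    }

    public static void main(String[] args) {
        System.out.println(toPattern("nun"));          // CVC
        System.out.println(isAllowedSyllable("pro"));  // true (CCV)
        System.out.println(isAllowedSyllable("tion")); // false (CVVC by letters)
    }
}
```

A check like this can only validate a candidate split. It does not decide where the boundaries go, which is the hard part of the problem.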
I've tried a few methods: regex (which helps only if you want to count syllables), hard-coded rule definitions (a brute-force approach that proved very inefficient), and finally a finite-state automaton (which did not produce anything useful).
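For reference, the regex counting idea can be sketched as below. This is not necessarily the exact regex I used, just the general approach of counting runs of vowel letters. The names (SyllableCounter, countVowelGroups) are illustrative, and treating y as a vowel is an assumption.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SyllableCounter {
    // Assumption: a run of the letters a, e, i, o, u, y approximates one syllable nucleus.
    private static final Pattern VOWEL_RUN = Pattern.compile("[aeiouy]+");

    // Counts maximal runs of vowel letters in the word.
    static int countVowelGroups(String word) {
        Matcher m = VOWEL_RUN.matcher(word.toLowerCase(Locale.ROOT));
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countVowelGroups("invisible"));     // 4
        System.out.println(countVowelGroups("pronunciation")); // 4, although the word has 5 syllables
    }
}
```

This gives a count only, with no boundaries, and it undercounts words like "pronunciation" where adjacent vowel letters belong to different syllables ("ci-a").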
The purpose of my application is to create a dictionary of all syllables in a given language. This dictionary will later be used for spell checking applications (using Bayesian classifiers) and text to speech synthesis.
I would appreciate it if someone could give me tips on an alternative way to solve this problem, other than the approaches I have already tried.
I work in Java, but any tip in C/C++, C#, Python, Perl... would work for me.
© Stack Overflow or respective owner