Detecting syllables in a word
- by user50705
I need to find a fairly efficient way to detect syllables in a word. E.g.,
invisible - in-vi-sib-le
There are some syllabification rules that could be used:
V
CV
VC
CVC
CCV
CCCV
CVCC
*where V is a vowel and C is a consonant.
For example:
pronunciation (5 syllables: pro-nun-ci-a-tion; CCV-CVC-CV-V-CVC, counting sounds rather than letters)
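To make the C/V notation concrete, here is a minimal sketch that labels each letter of a word (or of an already hyphenated split) as C or V. The class and method names are hypothetical, and it assumes that only a, e, i, o, u count as vowels. It works on letters, not sounds, so it labels "tion" as CVVC even though it is pronounced as CVC.

```java
import java.util.Locale;

final class CvPattern {
    // Assumption: only these five letters are vowels; "y" and digraphs are ignored.
    private static final String VOWELS = "aeiou";

    /** Labels every letter of the word as 'V' or 'C'; non-letters are skipped. */
    static String toPattern(String word) {
        StringBuilder sb = new StringBuilder();
        for (char ch : word.toLowerCase(Locale.ROOT).toCharArray()) {
            if (!Character.isLetter(ch)) {
                continue; // skip hyphens, apostrophes, digits, ...
            }
            sb.append(VOWELS.indexOf(ch) >= 0 ? 'V' : 'C');
        }
        return sb.toString();
    }

    /** Labels each part of an already hyphenated word, keeping the hyphens. */
    static String toSyllablePatterns(String hyphenated) {
        String[] parts = hyphenated.split("-");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append('-');
            }
            sb.append(toPattern(parts[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSyllablePatterns("in-vi-sib-le"));      // VC-CV-CVC-CV
        System.out.println(toSyllablePatterns("pro-nun-ci-a-tion")); // CCV-CVC-CV-V-CVVC
    }
}
```

This only checks a given split against the patterns; it does not decide where the syllable boundaries go, which is the actual problem.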
I've tried a few methods: regular expressions (which help only if you want to count syllables), hard-coded rule definitions (a brute-force approach that proved very inefficient), and finally a finite-state automaton (which did not produce anything useful).
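As an illustration of why a regex only gets as far as counting, here is a minimal sketch that counts runs of consecutive vowels as a rough syllable estimate. The class name is hypothetical, and treating only a, e, i, o, u as vowels is an assumption. It is a heuristic, not a real syllabifier: it says nothing about where the boundaries fall, and it overcounts words with a silent final "e".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SyllableCounter {
    // Assumption: a syllable nucleus is a run of one or more of these letters.
    private static final Pattern VOWEL_GROUP =
            Pattern.compile("[aeiou]+", Pattern.CASE_INSENSITIVE);

    /** Counts runs of consecutive vowels in the word. */
    static int countVowelGroups(String word) {
        Matcher m = VOWEL_GROUP.matcher(word);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countVowelGroups("invisible"));     // 4
        System.out.println(countVowelGroups("pronunciation")); // 5
        System.out.println(countVowelGroups("make"));          // 2, although "make" has one syllable
    }
}
```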
The purpose of my application is to create a dictionary of all syllables in a given language. This dictionary will later be used for spell-checking applications (using Bayesian classifiers) and text-to-speech synthesis.
I would appreciate any tips on an alternative way to solve this problem, other than the approaches I have already tried.
I work in Java, but any tip in C/C++, C#, Python, Perl... would work for me.