phonemes - Developer IT

How to get Phonemes on voice recognition?

- by XBasic3000

I am working on voice recognition to display the phonemes and their waveform, using the built-in voice recognition on Vista and Windows 7 with Delphi 2009.

Detect similar sounding words in Ruby

- by JP

I'm aware of SOUNDEX and (double) Metaphone, but these don't let me test for the similarity of words as a whole. For example, "Hi" sounds very similar to "Bye", but both of these methods will mark them as completely different. Are there any libraries in Ruby, or any methods you know of, that are capable of determining the similarity between two words? (Either a boolean is/isn't similar, or a numerical score such as 40% similar.) Edit: Extra bonus points if there is an easy method to 'drop in' a different dialect or language!
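One way to get a numerical score of the kind asked for is to compare words as phoneme sequences rather than as spellings. The sketch below is not a library API. It assumes you already have a phoneme lookup, and the `PHONEMES` table is a tiny hand-written stand-in using ARPAbet-style symbols. It computes a plain Levenshtein edit distance over the phoneme arrays and normalises it to a 0..1 similarity. The names `phoneme_distance`, `phoneme_similarity` and `PHONEMES` are all made up for illustration.

```ruby
# Levenshtein edit distance between two arrays of phoneme symbols.
def phoneme_distance(a, b)
  prev = (0..b.length).to_a
  a.each_with_index do |pa, i|
    curr = [i + 1]
    b.each_with_index do |pb, j|
      cost = pa == pb ? 0 : 1
      # deletion, insertion, substitution (or match)
      curr << [prev[j + 1] + 1, curr[j] + 1, prev[j] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Similarity in 0.0..1.0: 1.0 means identical phoneme sequences.
def phoneme_similarity(a, b)
  longest = [a.length, b.length].max
  return 1.0 if longest.zero?
  1.0 - phoneme_distance(a, b).to_f / longest
end

# Hypothetical, hand-written lookup table (assumption, not a real dictionary).
PHONEMES = {
  "hi"    => %w[HH AY],
  "bye"   => %w[B AY],
  "hello" => %w[HH AH L OW]
}.freeze

puts phoneme_similarity(PHONEMES["hi"], PHONEMES["bye"])    # 0.5
puts phoneme_similarity(PHONEMES["hi"], PHONEMES["hello"])  # 0.25
```

Swapping in a different dialect or language would then come down to swapping the lookup table, since the distance function only sees symbols. Treating every substitution as cost 1 is a simplification: a more realistic measure would charge less for replacing one phoneme with a similar-sounding one.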


How can I change how OS X's 'say' command pronounces a word?

- by jwhitlock

OS X's say command is useful for some tasks (such as Skype's "notify me when a contact comes online"), but it pronounces some names incorrectly. Is there a way to teach say to pronounce a word differently? For example, try:

    say "Hi, Joel Spolsky"

The 'ol' sounds like 'ball' rather than 'old'. I'd like to add an exception that says "Pronounce Spolsky like this", rather than try to teach new linguistic rules. I bet there is a way, since it can pronounce "iphone" as Apple wants.

Update - After some research, here's what I've learned:

- Text-to-speech is split into two stages: the text is turned into phonemes, and then the phonemes are turned into audio using a voice. Changing the voice doesn't affect the phonemes.
- The Speech Synthesis Manager has some functions for turning text into phonemes, and a method for registering a speech dictionary that will add new text-phoneme maps. However, Apple's speech dictionary must be in a binary form - I didn't find any plist XML.
- Using dtrace while running say, I found some interesting files opened in /System/Library/PrivateFrameworks/SpeechDictionary.framework/Resources. This is probably the speech dictionary, but they are all binary, except for Homophones, which is XML. Adding entries to Homophones does nothing - it is probably used in speech-to-text. They are also code signed by Apple - changing them may prevent some programs from working. The files are: PrefixDictionary, CartNames, CartLite, SymbolDictionary, Homophones.
- There are ways to add text versions of application interface elements so VoiceOver works, a lot of which a developer gets for free, but there are tricky bits. The standard here appears to be to use a phonetic spelling as needed.

My guesses are:

- say is a light layer of code on top of the Speech Synthesis Manager. It would be easy for the Apple devs to add a command line option to take the path to a speech dictionary plist for alternate phoneme mapping, but they didn't. It may be a useful open-source project to write a better say.
- Skype probably uses Speech Synthesis Manager directly, leaving no hooks to change the way my friends' names are pronounced, other than spelling them phonetically, which is silly.

The easiest way to make a command line version of say is how JRobert suggested. Here's my quick implementation, using Doug Harris's spelling suggestion:

    #!/bin/sh
    # Lowercase the arguments, respell the name phonetically, then speak the result.
    echo "$@" | tr 'A-Z' 'a-z' | sed "s/spolsky/spowlsky/g" | /usr/bin/say

Finally, some fun command line stuff:

    # Apple is weird
    sqlite3 /System/Library/PrivateFrameworks/SpeechDictionary.framework/Resources/Tuples .dump

    # Get too much information about what files are being opened
    sudo dtrace -n 'syscall::open*:entry { printf("%s %s",execname,copyinstr(arg0)); }'

    # Just fun
    say -v bad "Joel Spolsky Spolsky Spolsky Spolsky Spolsky, Joel Spolsky Spolsky Spolsky Spolsky Spolsky"
    echo "scale=1000; 4*a(1)" | bc -l | say
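The respelling wrapper could also be kept as a small table of exceptions, so that adding a name does not mean growing a sed pipeline. The following is only a sketch of that idea in Ruby. `RESPELLINGS`, `respell` and `speak` are hypothetical names, the table holds just the one spelling suggested in the post, and `speak` assumes `/usr/bin/say` reads text from standard input, as the pipeline in the post relies on. Unlike the shell version, it replaces only the listed words and leaves the rest of the text in its original case.

```ruby
# Hypothetical table of phonetic respellings, keyed by lowercase word.
RESPELLINGS = { "spolsky" => "spowlsky" }.freeze

# Replace each whole-word, case-insensitive occurrence of a listed word.
def respell(text, table = RESPELLINGS)
  return text if table.empty?
  pattern = Regexp.union(table.keys.map { |w| /\b#{Regexp.escape(w)}\b/i })
  text.gsub(pattern) { |match| table.fetch(match.downcase) }
end

# Pipe the respelled text into say (assumed path; OS X only).
def speak(text, say_path = "/usr/bin/say")
  IO.popen([say_path], "w") { |io| io.write(respell(text)) }
end

puts respell("Hi, Joel Spolsky")  # Hi, Joel spowlsky
# On OS X: speak("Hi, Joel Spolsky")
```

This still works around the pronunciation at the text level. It does not touch the phoneme dictionary that the Speech Synthesis Manager uses.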


Emulate Historical Figures, e.g. Einstein - Is this possible using linguistic logic for my http://www.ustimeline.com Education System?

- by Johnnylight

After hearing about the success of IBM's Watson, I started thinking: perhaps emulating human language is now possible? My goal is to create virtual historical characters, such as Einstein or Crazy Horse, to represent the main characters in my Adventur-Cation program, The Great American Adventure. The goal is to build an intelligent system capable of indexing the internet and storing the data in a schema based on modern linguistic theory (phonemes, morphemes, syntax), so that it can return a semantically sound response very similar to the one the same person would give if still alive today. The same engine/system would be used for all characters. Each character would have its own digital representation and voice, and would organize data differently based on tags/keywords stored about the individual. Imagine a Max Headroom Einstein. Based on the success of Watson, I believe something like this may now be possible. It would be an interesting way to study history and a vehicle of entertainment as well. Can anyone confirm if this has already been attempted? Is anyone interested in exploring this using Cognitive Science, Psychology, Artificial Intelligence, historical data captured on the internet, and Linguistic theory?


Training speech recognition software

- by wyatt

A little left field, but I'm trying to train a speech recognition program and the guidelines suggest that I attempt to speak clearly but naturally. I notice, however, that when one speaks naturally each word tends to drift into the next, resulting in a rather ambiguous boundary between the words. On the one hand, speaking in a more stilted manner would seem to aid the computer in recognising the phonemes, but on the other it would tend to make the program less likely to understand more natural speech. Is there anyone knowledgeable in the field who can suggest which of the two approaches is more effective? Thanks


How to add my own words: small problem in Sphinx configuration

- by Nubkadiya

Hi, I have set up Sphinx and tested Hello World, and it's working fine. But I wanted to change the words, so I changed the extension of the WSJ jar, edited the dictionary, and added the name I wanted along with its phonemes. Then I zipped it, changed the extension back to jar, and checked. But it says "WARNING jsgfGrammar Can't find pronunciation for nugegoda". When I talked to some friends about it, they told me to add the word to the language model as well, but I cannot find such a language model in the library folder. The normal English words are working fine. Can someone please help me with this?
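A quick sanity check for a warning like this is to confirm that the edited dictionary file really contains an entry for every word the grammar uses. The sketch below assumes the dictionary is in the usual CMU-style text layout (one word per line followed by its phonemes, with alternates written as `WORD(2)`), and it compares words case-insensitively. `parse_dictionary`, `missing_words` and the `SAMPLE` text are illustrative names and data, not part of Sphinx. Whether Sphinx itself matches case-insensitively is not something this sketch establishes.

```ruby
# Parse CMU-style dictionary text into { lowercase word => [phoneme arrays] }.
def parse_dictionary(text)
  entries = Hash.new { |hash, key| hash[key] = [] }
  text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?(";;;")  # skip blanks and comments
    word, *phones = line.split
    next if phones.empty?                           # a word with no phonemes is not an entry
    entries[word.sub(/\(\d+\)\z/, "").downcase] << phones
  end
  entries
end

# Words from the grammar that have no pronunciation in the dictionary.
def missing_words(words, entries)
  words.reject { |word| entries.key?(word.downcase) }
end

# Illustrative sample only; a real run would read the edited dictionary file.
SAMPLE = <<~DICT
  HELLO  HH AH L OW
  HELLO(2)  HH EH L OW
  WORLD  W ER L D
DICT

entries = parse_dictionary(SAMPLE)
p missing_words(%w[hello world nugegoda], entries)  # ["nugegoda"]
```

If the word does show up in the file but the warning persists, the running program is probably loading a different copy of the dictionary than the one that was edited, for example the original jar earlier on the classpath.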
