speechsynthesizer - Developer IT

Text to MP3 using System.Speech.Synthesis.SpeechSynthesizer

- by Rob

I am trying to get a text-to-speech to save to an MP3. Currently I have the System.Speech.Synthesis speaking to a WAV file nicely. With New System.Speech.Synthesis.SpeechSynthesizer '.SetOutputToWaveFile(pOutputPath) This works fine .SetOutputToWaveStream(<<Problem bit>>) .Speak(pTextToSpeak) .SetOutputToNull() .Dispose() End With Now the first line commented out produces a WAV file which is nice. Currently I am trying to replace that with an MP3 output stream and not having much success. I have tried the Yeti.MMedia converter but either it isn't going to work or I haven't got it to work successfully. I have to admit here I don't know much about encodings, speeds etc. So the question I have is, does anyone know of a nice way I can say something like the following: .SetOutputToWaveStream(New MP3WriteStream(pOutputPath)) and have the SpeechSynthesizer write to the WAV which then gets converted to the MP3 and ends up on the HDD.

Read the article

SpeechSynthesizer in C# creates wav that has 22kHz... needs to be 16kHz

- by Adrian

My C# application needs to covert text to wav file and inject it into a Skype call. The code that creates the wav file is below. The problem is that the file has 22kHz sample rate and Skype accepts only 16kHz. Is there any way to adjust this setting? using (System.IO.FileStream stream = System.IO.File.Create("message.wav")) { System.Speech.Synthesis.SpeechSynthesizer speechEngine = new System.Speech.Synthesis.SpeechSynthesizer(); speechEngine.SetOutputToWaveStream(stream); speechEngine.Speak(number); stream.Flush(); }

Read the article

Constant Memory Leak in SpeechSynthesizer

- by DudeFX

I have developed a project which I would like to release which uses c#, WPF and the System.Speech.Synthesizer object. The issue preventing the release of this project is that whenever SpeakAsync is called it leaves a memory leak that grows to the point of eventual failure. I believe I have cleaned up properly after using this object, but cannot find a cure. I have run the program through Ants Memory Profiler and it reports that WAVEHDR and WaveHeader is growing with each call. I have created a sample project to try to pinpoint the cause, but am still at a loss. Any help would be appreciated. The project uses VS2008 and is a c# WPF project that targets .NET 3.5 and Any CPU. You need to manually add a reference to System.Speech. Here is the Code: <Window x:Class="SpeechTest.Window1" xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation" xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml" Title="Window1" Height="300" Width="300"> <Grid> <StackPanel Orientation="Vertical"> <Button Content="Start Speaking" Click="Start_Click" Margin="10" /> <Button Content="Stop Speaking" Click="Stop_Click" Margin="10" /> <Button Content="Exit" Click="Exit_Click" Margin="10"/> </StackPanel> </Grid> // Start of code behind using System; using System.Windows; using System.Speech.Synthesis; namespace SpeechTest { public partial class Window1 : Window { // speak setting private bool speakingOn = false; private int curLine = 0; private string [] speakLines = { "I am wondering", "Why whenever Speech is called", "A memory leak occurs", "If you run this long enough", "It will eventually crash", "Any help would be appreciated" }; public Window1() { InitializeComponent(); } private void Start_Click(object sender, RoutedEventArgs e) { speakingOn = true; SpeakLine(); } private void Stop_Click(object sender, RoutedEventArgs e) { speakingOn = false; } private void Exit_Click(object sender, RoutedEventArgs e) { App.Current.Shutdown(); } private void SpeakLine() { if (speakingOn) { // Create our speak object SpeechSynthesizer spk = new SpeechSynthesizer(); spk.SpeakCompleted += new EventHandler(spk_Completed); // Speak the line spk.SpeakAsync(speakLines[curLine]); } } public void spk_Completed(object sender, SpeakCompletedEventArgs e) { if (sender is SpeechSynthesizer) { // get access to our Speech object SpeechSynthesizer spk = (SpeechSynthesizer)sender; // Clean up after speaking (thinking the event handler is causing the memory leak) spk.SpeakCompleted -= new EventHandler(spk_Completed); // Dispose the speech object spk.Dispose(); // bump it curLine++; // check validity if (curLine = speakLines.Length) { // back to the beginning curLine = 0; } // Speak line SpeakLine(); } } } } I run this program on Windows 7 64 bit and it will run and eventually halt when attempting to create a new SpeechSynthesizer object. When run on Windows Vista 64 bit the memory will grow from a starting point of 34k to so far about 400k and growing. Can anyone see anything in the code that might be causing this, or is this an issue with the Speech object itself. Any help would be appreciated.

Read the article

SpeechSynthesizer Exception - Please Help

- by Chris

Hi. I have the following code: private List<VoiceInfo> GetInstalledVoices(SpeechSynthesizer synthesizer) { CultureInfo currentCulture = CultureInfo.CurrentCulture; var listOfVoiceInfo = from voice in synthesizer.GetInstalledVoices(currentCulture) select voice.VoiceInfo; return listOfVoiceInfo.ToList<VoiceInfo>(); } I then call the code from the following code snippet: var synthesizer = new SpeechSynthesizer(); var installedVoices = GetInstalledVoices(synthesizer); VoiceInfo voice = null; if (installedVoices != null && installedVoices.Count > 0) { voice = installedVoices.FirstOrDefault(); } if (voice != null) { synthesizer.SelectVoice(voice.Name); } The line of code that selects the voice throws the following exception: "Cannot set voice. No matching voice is installed or the voice was disabled." This is being done from within an ASP.NET web application - running on Windows Server 2003 R2. When I run this from within Visual Studio 2008 - everything works fine. I created a simple Console app to perform the same action - then ran it from the Windows Server 2003 machine - and it worked fine. I even modified the code in the Console app to loop through each of the installed voices and select the voice. No problems. However, when doing the same from within the web application, I get the same error. I am beating my head against a wall on this one. ANY help on this would be greatly appreciated. Thanks. Chris

Read the article

Question SpeechSynthesizer.SetOutputToAudioStream audio format problem

- by Chris Kugler

Hi, I'm currently working on an application which requires transmission of speech encoded to a specific audio format. System.Speech.AudioFormat.SpeechAudioFormatInfo synthFormat = new System.Speech.AudioFormat.SpeechAudioFormatInfo(System.Speech.AudioFormat.EncodingFormat.Pcm, 8000, 16, 1, 16000, 2, null); This states that the audio is in PCM format, 8000 samples per second, 16 bits per sample, mono, 16000 average bytes per second, block alignment of 2. When I attempt to execute the following code there is nothing written to my MemoryStream instance; however when I change from 8000 samples per second up to 11025 the audio data is written successfully. SpeechSynthesizer synthesizer = new SpeechSynthesizer(); waveStream = new MemoryStream(); PromptBuilder pbuilder = new PromptBuilder(); PromptStyle pStyle = new PromptStyle(); pStyle.Emphasis = PromptEmphasis.None; pStyle.Rate = PromptRate.Fast; pStyle.Volume = PromptVolume.ExtraLoud; pbuilder.StartStyle(pStyle); pbuilder.StartParagraph(); pbuilder.StartVoice(VoiceGender.Male, VoiceAge.Teen, 2); pbuilder.StartSentence(); pbuilder.AppendText("This is some text."); pbuilder.EndSentence(); pbuilder.EndVoice(); pbuilder.EndParagraph(); pbuilder.EndStyle(); synthesizer.SetOutputToAudioStream(waveStream, synthFormat); synthesizer.Speak(pbuilder); synthesizer.SetOutputToNull(); There are no exceptions or errors recorded when using a sample rate of 8000 and I couldn't find anything useful in the documentation regarding SetOutputToAudioStream and why it succeeds at 11025 samples per second and not 8000. I have a workaround involving a wav file that I generated and converted to the correct sample rate using some sound editing tools, but I would like to generate the audio from within the application if I can. One particular point of interest was that the SpeechRecognitionEngine accepts that audio format and successfully recognized the speech in my synthesized wave file... Update: Recently discovered that this audio format succeeds for certain installed voices, but fails for others. It fails specifically for LH Michael and LH Michelle, and failure varies for certain voice settings defined in the PromptBuilder.

Read the article

Speech Recognition Grammar Rules using delphi code

- by XBasic3000

I need help to make ISeechRecoGrammar without using xml format. Like creating it on runtime on delphi. example: procedure TForm1.FormCreate(Sender: TObject); var AfterCmdState: ISpeechGrammarRuleState; temp : OleVariant; Grammar: ISpeechRecoGrammar; PropertiesRule: ISpeechGrammarRule; ItemRule: ISpeechGrammarRule; TopLevelRule: ISpeechGrammarRule; begin SpSharedRecoContext.EventInterests := SREAllEvents; Grammar := SpSharedRecoContext.CreateGrammar(m_GrammarId); TopLevelRule := Grammar.Rules.Add('TopLevelRule', SRATopLevel Or SRADynamic, 1); PropertiesRule := Grammar.Rules.Add('PropertiesRule', SRADynamic, 2); ItemRule := Grammar.Rules.Add('ItemRule', SRADynamic, 3); AfterCmdState := TopLevelRule.AddState; TopLevelRule.InitialState.AddWordTransition(AfterCmdState, 'test', temp, temp, '****', 0, temp, temp); Grammar.Rules.Commit; Grammar.CmdSetRuleState('TopLevelRule', SGDSActive); end; can someone reconstruct or midify this delphi code (above) to be exactly same function below(xml). <GRAMMAR LANGID="409">  <DEFINE> <ID NAME="RID_start" VAL="1"/> <ID NAME="PID_action" VAL="2"/> <ID NAME="PID_actionvalue" VAL="3"/> </DEFINE>  <RULE NAME="start" ID="RID_start" TOPLEVEL="ACTIVE"> <P>i am</P> <RULEREF NAME="action" PROPNAME="action" PROPID="PID_action" /> <O>OK</O> </RULE> <RULE NAME="action"> <L PROPNAME="actionvalue" PROPID="PID_actionvalue"> <P VAL="1">albert</P> <P VAL="2">francis</P> <P VAL="3">alex</P> </L> </RULE> </GRAMMAR> sorry for my english...

Read the article

System.Speech.Synthesis.SpeechSynthesizer - how to customize the voice?

- by LexRema

Hello. SpeechSynthesizer allows peaking different voices by using SelectVoiceByHints(VoiceGender, VoiceAge)function (as I understood). But no customization happens if I change the gender and voice age. Can you explain why? And if I'm doing something wrong, what is correct way to do that? Thank you.

Read the article

WPF Alphabet (Available for download)

- by mbcrump

WPF Alphabet is a application that I created to help my child learn the alphabet. It displays each letter and pronounces it using speech synthesis. It was developed using WPF and c# in about 3 hours (so its kinda rough). I went ahead and uploaded it to codeplex for those in similar situation or just wanting to see a particular WPF feature. I would also recommend Scott Hanselman’s BabySmash!. Specific WPF Features: DispatcherTimer (WPF) SpeechSynthesizer (WPF) URL Navigate (WPF) not PAGE XAML Examples: DockPanel Border TextBlock HyperLink Button's and Events Download full source and binaries here.

Read the article

SpeechRecognition issue

- by Leosa99 _

I'm creating a Speech Recognition Application like Siri in vb.net. I have found a database of words (in a .txt file) and i want to insert them in my application but its not working . Here my code : Dim WithEvents reco As New Recognition.SpeechRecognitionEngine Dim IA_VOICE As New SpeechSynthesizer Dim List_Word As New Recognition.SrgsGrammar.SrgsOneOf("IN database.") Public Sub New() reco.SetInputToDefaultAudioDevice() Dim gram As New Recognition.SrgsGrammar.SrgsDocument Dim WORD_RULE As New Recognition.SrgsGrammar.SrgsRule("MOT") LOAD_DATABSE(Application.StartupPath & "\RECO_WORD\DataBase.txt") WORD_RULE.Add(List_Word) gram.Rules.Add(WORD_RULE) gram.Root = WORD_RULE reco.LoadGrammar(New Recognition.Grammar(gram)) reco.RecognizeAsync() End Sub Private Sub reco_RecognizeCompleted(ByVal sender As Object, ByVal e As System.Speech.Recognition.RecognizeCompletedEventArgs) Handles reco.RecognizeCompleted reco.RecognizeAsync() End Sub Private Sub reco_SpeechRecognized(ByVal sender As Object, ByVal e As System.Speech.Recognition.RecognitionEventArgs) Handles reco.SpeechRecognized If e.Result.Text = "hi" Then MsgBox("HI!") End If End Sub Sub LOAD_DATABSE(Database_PATH As String) Dim lines() As String = File.ReadAllLines(Database_PATH) Dim numberLinesTotal = lignes.Length Dim numberlignedone As Integer = 0 Dim MOT As New StreamReader(BDD_PATH) While numberlignedone <> numberLinesTota numberlignedone += 1 Dim ITEM As New Recognition.SrgsGrammar.SrgsItem(MOT.ReadLine) Word_List.Items.Add(ITEMS) 'I think its here that its not working. End While MsgBox("END LOADING") End Sub</code> If you know why its not working... Thanks.

Read the article

Working through exercises in "Cocoa Programming for Mac OS X" - I'm stumped

- by Zigrivers

I've been working through the exercises in a book recommended here on stackoverflow, however I've run into a problem and after three days of banging my head on the wall, I think I need some help. I'm working through the "Speakline" exercise where we add a TableView to the interface and the table will display the "voices" that you can choose for the text to speech aspect of the program. I am having two problems that I can't seem to get to the bottom of: I get the following error: * Illegal NSTableView data source (). Must implement numberOfRowsInTableView: and tableView:objectValueForTableColumn:row: The tableView that is supposed to display the voices comes up blank I have a feeling that both of these problems are related. I'm including my interface code here: #import <Cocoa/Cocoa.h> @interface AppController : NSObject <NSSpeechSynthesizerDelegate, NSTableViewDelegate> { IBOutlet NSTextField *textField; NSSpeechSynthesizer *speechSynth; IBOutlet NSButton *stopButton; IBOutlet NSButton *startButton; IBOutlet NSTableView *tableView; NSArray *voiceList; } - (IBAction)sayIt:(id)sender; - (IBAction)stopIt:(id)sender; @end And my implementation code here: #import "AppController.h" @implementation AppController - (id)init { [super init]; //Log to help me understand what is happening NSLog(@"init"); speechSynth = [[NSSpeechSynthesizer alloc] initWithVoice:nil]; [speechSynth setDelegate:self]; voiceList = [[NSSpeechSynthesizer availableVoices] retain]; return self; } - (IBAction)sayIt:(id)sender { NSString *string = [[textField stringValue] stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]]; //Is the string zero-length? if([string length] == 0) { NSLog(@"String from %@ is a string with a length of %d.", textField, [string length]); [speechSynth startSpeakingString:@"Please enter a phrase first."]; } [speechSynth startSpeakingString:string]; NSLog(@"Started to say: %@", string); [stopButton setEnabled:YES]; [startButton setEnabled:NO]; } - (IBAction)stopIt:(id)sender { NSLog(@"Stopping..."); [speechSynth stopSpeaking]; } - (void) speechSynthesizer:(NSSpeechSynthesizer *)sender didFinishSpeaking:(BOOL)complete { NSLog(@"Complete = %d", complete); [stopButton setEnabled:NO]; [startButton setEnabled:YES]; } - (NSInteger)numberOfRowsInTableView:(NSTableView *)aTableView { return [voiceList count]; } - (id)tableView: (NSTableView *)tv objecValueForTableColumn: (NSTableColumn *)tableColumn row:(NSInteger)row { NSString *v = [voiceList objectAtIndex:row]; NSLog(@"v = %@",v); NSDictionary *dict = [NSSpeechSynthesizer attributesForVoice:v]; return [dict objectForKey:NSVoiceName]; } /* - (BOOL)respondsToSelector:(SEL)aSelector { NSString *methodName = NSStringFromSelector(aSelector); NSLog(@"respondsToSelector: %@", methodName); return [super respondsToSelector:aSelector]; } */ @end Hopefully, you guys can see something obvious that I've missed. Thank you!

Developer IT