speech recognition

|Ideas for applications using face detection and recognition

- by Omry

Full disclosure: I work at face.com. Face.com just launched a free (up to an hourly limit) face detection and recognition REST API. We got a very handy API sandbox that developers can use to play the API and to see what it can and can't do. Besides the obvious point of letting you guys know about the API, I wanted to hear from you what kind of applications you think can be developed with it. Some pretty obvious ideas: Face based login (not entirely secure but still fun). Automatic face crop for sites that let users upload photos (dating sites etc) Some kind of integration into augmented reality games There is no right or wrong answers here, use your imagination :).

Read the article

Dice face value recognition

- by Jakob Gade

I’m trying to build a simple application that will recognize the values of two 6-sided dice. I’m looking for some general pointers, or maybe even an open source project. The two dice will be black and white, with white and black pips respectively. Their distance to the camera will always be the same, but their position on the playing surface will be random. (not the best example, the surface will be a different color and the shadows will be gone) I have no prior experience with developing this kind of recognition software, but I would assume the trick is to first isolate the faces by searching for the square profile with a dominating white or black color (the rest of the image, i.e. the table/playing surface, will in distinctly different colors), and then isolate the pips for the count. Shadows will be eliminated by top down lighting. I’m hoping the described scenario is so simple (read: common) it may even be used as an “introductory exercise” for developers working on OCR technologies or similar computer vision challenges.

Read the article

Speech recognition with Flash or Silverlight

- by Sebastián Grignoli

I'm developing a web user interface to enter some information that is not very complex but needs to be loaded in real time. I think that the application could make use of speech recognition to facilitate the task. Te core of the interface is being built with Javascript and jQuery, but can easily include a flash or silverlight component. I believe that´s probably the way to go... I don't need to recognize everything that the user says, but only a few prerecorded commands. Also, I don't want the user to click on a button to specify the begining and the end of the spoken command. It should be detected live. Is there anything that does this? I would be grateful if anyone tells me about a complete solution, free or commercial, as well as any advice on capturing a sound stream from the mic and process it with flash or sliverlight. Sebastian.-

Read the article

Suggestion for creating custom sound recognition software to toggle audio

- by Parrot owner

I need to develop a program that toggles a particular audio track on or off when it recognizes a parrot scream or screech. The software would need to recognize a particular range of sounds and allow some variations in the range (as a parrot likely won't replicate its sreeches EXACTLY each time). Example: Bird screeches, no audio. Bird stops screeching for five seconds, audio track praising the bird plays. Regular chattering needs to be ignored completely, as it is not to be discouraged. I've heard of java libraries that have speech recognition with dictionaries built in, but the software would need to be taught the particular sounds that my particular parrot makes - not words or any random bird sound. In addition as I mentioned above, it would need to allow for slight variation in the sound, as the screech will likely never be 100% identical to the recorded version. What would be the best way to go about this/what language should I look into?

Read the article

Sound sample recognition library/code

- by Daniel Mošmondor

I don't want sound-to-text software. What I need is the following: I'll record multiple (say 50+) audio streams (recordings of radio stations) from that recordings, I'll mark interesting audio clips - their length ranges from 2 to 60 seconds - there will be few thousands of such audio clips library should be able to find other instances of same audio clips from recorded sound streams confidence factor should be reported to used and additional input provided so the recognition could perform better next time Do you know of such software library? LGPL would be most valuable to me, but I can go for commercial license as well.

Read the article

Question SpeechSynthesizer.SetOutputToAudioStream audio format problem

- by Chris Kugler

Hi, I'm currently working on an application which requires transmission of speech encoded to a specific audio format. System.Speech.AudioFormat.SpeechAudioFormatInfo synthFormat = new System.Speech.AudioFormat.SpeechAudioFormatInfo(System.Speech.AudioFormat.EncodingFormat.Pcm, 8000, 16, 1, 16000, 2, null); This states that the audio is in PCM format, 8000 samples per second, 16 bits per sample, mono, 16000 average bytes per second, block alignment of 2. When I attempt to execute the following code there is nothing written to my MemoryStream instance; however when I change from 8000 samples per second up to 11025 the audio data is written successfully. SpeechSynthesizer synthesizer = new SpeechSynthesizer(); waveStream = new MemoryStream(); PromptBuilder pbuilder = new PromptBuilder(); PromptStyle pStyle = new PromptStyle(); pStyle.Emphasis = PromptEmphasis.None; pStyle.Rate = PromptRate.Fast; pStyle.Volume = PromptVolume.ExtraLoud; pbuilder.StartStyle(pStyle); pbuilder.StartParagraph(); pbuilder.StartVoice(VoiceGender.Male, VoiceAge.Teen, 2); pbuilder.StartSentence(); pbuilder.AppendText("This is some text."); pbuilder.EndSentence(); pbuilder.EndVoice(); pbuilder.EndParagraph(); pbuilder.EndStyle(); synthesizer.SetOutputToAudioStream(waveStream, synthFormat); synthesizer.Speak(pbuilder); synthesizer.SetOutputToNull(); There are no exceptions or errors recorded when using a sample rate of 8000 and I couldn't find anything useful in the documentation regarding SetOutputToAudioStream and why it succeeds at 11025 samples per second and not 8000. I have a workaround involving a wav file that I generated and converted to the correct sample rate using some sound editing tools, but I would like to generate the audio from within the application if I can. One particular point of interest was that the SpeechRecognitionEngine accepts that audio format and successfully recognized the speech in my synthesized wave file... Update: Recently discovered that this audio format succeeds for certain installed voices, but fails for others. It fails specifically for LH Michael and LH Michelle, and failure varies for certain voice settings defined in the PromptBuilder.

Read the article

PCA extended face recognition

- by cMinor

The state of the art says that we can use PCA to perform face recognition. like this, this or this I am working with a project that involves training a classifier to detect a person who is wearing glasess or hats or even a mustache. The purpose of doing this is to detect when a person that has robbed a bank, store, or have commeted some sort of crime(s) (we have their image in a database), enters a certain place ( historically we know these guys have robbed, so we should take care to avoid problems). We came first to have a distributed database with all images of criminals, then I thought to have a layer of them clasifying these criminals using accesories like hats, mustache or anything that hides their face etc... Then, to apply that knowledge to detect when a particular or a suspect person enters a comercial place. ( In practice when someone is going to rob not all the times they are using an accesorie...) What do you think about this idea of doing PCA to first detect principal components of the face and then the components of an accesory. I was thinking that maybe a probabilistic approach is better so we can compute the probability the criminal is the person that entered a place and call the respective authorities.

Read the article

VC++ 6 and MS Speech SDK 5.1 fatal error C1083: Cannot open source file: 'files\microsoft': No such

- by eg123

Trying to compile an application (flite synthesis sapi) on vc++6. This requires Microsoft Speech SDK 5.1 Have included C:\Program Files\Microsoft Speech SDK 5.1\IDL C:\Program Files\Microsoft Speech SDK 5.1\include using Toolsoptionsdirectories and also on another attempt via ProjectSettings Repeatedly get this error microsoft fatal error C1083: Cannot open source file: 'files\microsoft': No such file or directory speech fatal error C1083: Cannot open source file: 'speech': No such file or directory sdk fatal error C1083: Cannot open source file: 'sdk': No such file or directory idl fatal error C1083: Cannot open source file: '5.1\idl': No such file or directory FliteCMUKalDiphone.idl Thought it may be spaces related so included full path in quotes in relevant .h files. No joy Installed Microsoft Speech SDK 5.1 on another machine in same folder as flite and renamed to mssdk51 (so no spaces in pathname) but same error came up. Tried pasting in contents of each .idl called in file where glitch seems to generate Still same message. I am new to C++ and programming in general. My only guess is that something in the speech sdk is calling the .idl file and I can't find where from. Of course this is probably way wrong!

Read the article

Voice Recognition Google API

- by user2966744

thanks for reading. I'm creating a simple web based drawing app that uses speech recognition. I have created a simple page, the project is on github here: https://github.com/a5hton/speechdraw It has a 16x16 pixel grid. I would like to be able to draw on this grid by using simple words. For example if you say "right", the pixel to the right will be colored black. If you say "down" the pixel below the last one will be colored black. You can say up, down, left or right and the corresponding pixels will be colored. Saying "erase" will switch to erase mode, colouring the pixels back to their original color. Saying "lift" will lift the pen off the page. Saying "draw" will enable the draw mode. Could you please help me work out how to make this happen. Please see the simple page at to get an understanding. Thank you! Cheers, Michael

Read the article

Speech recognition plugin Runtime Error: Unhandled Exception. What could possibly cause it?

- by manuel

I'm writing a plugin (dll file) for speech recognition, and I'm creating a WinForm as its interface/dialog. When I run the plugin and click a button to start the initialization, I get an unhandled exception. Below is the complete details of it. See the end of this message for details on invoking just-in-time (JIT) debugging instead of this dialog box. ***** Exception Text ******* System.ArgumentException: Value does not fall within the expected range. at System.Speech.Internal.SapiInterop.SapiProxy.MTAThread.Invoke2(VoidDelegate pfn) at System.Speech.Internal.SapiInterop.SapiRecognizer.SetInput(Object input, Boolean allowFormatChanges) at System.Speech.Recognition.RecognizerBase.SetInputToDefaultAudioDevice() at System.Speech.Recognition.SpeechRecognitionEngine.SetInputToDefaultAudioDevice() at gen_myplugin.Dialog.init() at gen_myplugin.Dialog.btnSpeak_Click(Object sender, EventArgs e) at System.Windows.Forms.Control.OnClick(EventArgs e) at System.Windows.Forms.Button.OnClick(EventArgs e) at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent) at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks) at System.Windows.Forms.Control.WndProc(Message& m) at System.Windows.Forms.ButtonBase.WndProc(Message& m) at System.Windows.Forms.Button.WndProc(Message& m) at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m) at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m) at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam) ***** Loaded Assemblies ******* mscorlib Assembly Version: 2.0.0.0 Win32 Version: 2.0.50727.3603 (GDR.050727-3600) CodeBase: file:///c:/WINDOWS/Microsoft.NET/Framework/v2.0.50727/mscorlib.dll ---------------------------------------- gen_speechquery Assembly Version: 1.0.3755.878 Win32 Version: CodeBase: file:///C:/Program%20Files/Winamp/Plugins/gen_speechquery.dll ---------------------------------------- msvcm90 Assembly Version: 9.0.30729.1 Win32 Version: 9.00.30729.1 CodeBase: file:///C:/WINDOWS/WinSxS/x86_Microsoft.VC90.CRT_1fc8b3b9a1e18e3b_9.0.30729.1_x-ww_6f74963e/msvcm90.dll ---------------------------------------- System.Windows.Forms Assembly Version: 2.0.0.0 Win32 Version: 2.0.50727.3053 (netfxsp.050727-3000) CodeBase: file:///C:/WINDOWS/assembly/GAC_MSIL/System.Windows.Forms/2.0.0.0__b77a5c561934e089/System.Windows.Forms.dll ---------------------------------------- System Assembly Version: 2.0.0.0 Win32 Version: 2.0.50727.3053 (netfxsp.050727-3000) CodeBase: file:///C:/WINDOWS/assembly/GAC_MSIL/System/2.0.0.0__b77a5c561934e089/System.dll ---------------------------------------- System.Drawing Assembly Version: 2.0.0.0 Win32 Version: 2.0.50727.3053 (netfxsp.050727-3000) CodeBase: file:///C:/WINDOWS/assembly/GAC_MSIL/System.Drawing/2.0.0.0__b03f5f7f11d50a3a/System.Drawing.dll ---------------------------------------- System.Speech Assembly Version: 3.0.0.0 Win32 Version: 3.0.6920.1109 (lh_tools_devdiv_wpf.071009-1109) CodeBase: file:///C:/WINDOWS/assembly/GAC_MSIL/System.Speech/3.0.0.0__31bf3856ad364e35/System.Speech.dll ***** JIT Debugging ******* To enable just-in-time (JIT) debugging, the .config file for this application or computer (machine.config) must have the jitDebugging value set in the system.windows.forms section. The application must also be compiled with debugging enabled. For example: When JIT debugging is enabled, any unhandled exception will be sent to the JIT debugger registered on the computer rather than be handled by this dialog box.

Read the article

Anyone knows good references for Machine Learning Algorithms and Image Recognition?

- by RaymondBelonia

I need it for my thesis and for some reason I am having a hard time finding decent books or websites for it. My thesis topic is "Classification of Modern Art Paintings using Machine Learning Approach". My goal is to classify examples of modern art paintings to its respective modern art movement(expressionism, realism,etc..) using machine learning approach. Also, suggestions and comments about my thesis are greatly appreciated.

Read the article

How to store generated eigen faces for future face recognition?

- by user3237134

My code works in the following manner: 1.First, it obtains several images from the training set 2.After loading these images, we find the normalized faces,mean face and perform several calculation. 3.Next, we ask for the name of an image we want to recognize 4.We then project the input image into the eigenspace, and based on the difference from the eigenfaces we make a decision. 5.Depending on eigen weight vector for each input image we make clusters using kmeans command. Source code i tried: clear all close all clc % number of images on your training set. M=1200; %Chosen std and mean. %It can be any number that it is close to the std and mean of most of the images. um=60; ustd=32; %read and show images(bmp); S=[]; %img matrix for i=1:M str=strcat(int2str(i),'.jpg'); %concatenates two strings that form the name of the image eval('img=imread(str);'); [irow icol d]=size(img); % get the number of rows (N1) and columns (N2) temp=reshape(permute(img,[2,1,3]),[irow*icol,d]); %creates a (N1*N2)x1 matrix S=[S temp]; %X is a N1*N2xM matrix after finishing the sequence %this is our S end %Here we change the mean and std of all images. We normalize all images. %This is done to reduce the error due to lighting conditions. for i=1:size(S,2) temp=double(S(:,i)); m=mean(temp); st=std(temp); S(:,i)=(temp-m)*ustd/st+um; end %show normalized images for i=1:M str=strcat(int2str(i),'.jpg'); img=reshape(S(:,i),icol,irow); img=img'; end %mean image; m=mean(S,2); %obtains the mean of each row instead of each column tmimg=uint8(m); %converts to unsigned 8-bit integer. Values range from 0 to 255 img=reshape(tmimg,icol,irow); %takes the N1*N2x1 vector and creates a N2xN1 matrix img=img'; %creates a N1xN2 matrix by transposing the image. % Change image for manipulation dbx=[]; % A matrix for i=1:M temp=double(S(:,i)); dbx=[dbx temp]; end %Covariance matrix C=A'A, L=AA' A=dbx'; L=A*A'; % vv are the eigenvector for L % dd are the eigenvalue for both L=dbx'*dbx and C=dbx*dbx'; [vv dd]=eig(L); % Sort and eliminate those whose eigenvalue is zero v=[]; d=[]; for i=1:size(vv,2) if(dd(i,i)>1e-4) v=[v vv(:,i)]; d=[d dd(i,i)]; end end %sort, will return an ascending sequence [B index]=sort(d); ind=zeros(size(index)); dtemp=zeros(size(index)); vtemp=zeros(size(v)); len=length(index); for i=1:len dtemp(i)=B(len+1-i); ind(i)=len+1-index(i); vtemp(:,ind(i))=v(:,i); end d=dtemp; v=vtemp; %Normalization of eigenvectors for i=1:size(v,2) %access each column kk=v(:,i); temp=sqrt(sum(kk.^2)); v(:,i)=v(:,i)./temp; end %Eigenvectors of C matrix u=[]; for i=1:size(v,2) temp=sqrt(d(i)); u=[u (dbx*v(:,i))./temp]; end %Normalization of eigenvectors for i=1:size(u,2) kk=u(:,i); temp=sqrt(sum(kk.^2)); u(:,i)=u(:,i)./temp; end % show eigenfaces; for i=1:size(u,2) img=reshape(u(:,i),icol,irow); img=img'; img=histeq(img,255); end % Find the weight of each face in the training set. omega = []; for h=1:size(dbx,2) WW=[]; for i=1:size(u,2) t = u(:,i)'; WeightOfImage = dot(t,dbx(:,h)'); WW = [WW; WeightOfImage]; end omega = [omega WW]; end % Acquire new image % Note: the input image must have a bmp or jpg extension. % It should have the same size as the ones in your training set. % It should be placed on your desktop ed_min=[]; srcFiles = dir('G:\newdatabase\*.jpg'); % the folder in which ur images exists for b = 1 : length(srcFiles) filename = strcat('G:\newdatabase\',srcFiles(b).name); Imgdata = imread(filename); InputImage=Imgdata; InImage=reshape(permute((double(InputImage)),[2,1,3]),[irow*icol,1]); temp=InImage; me=mean(temp); st=std(temp); temp=(temp-me)*ustd/st+um; NormImage = temp; Difference = temp-m; p = []; aa=size(u,2); for i = 1:aa pare = dot(NormImage,u(:,i)); p = [p; pare]; end InImWeight = []; for i=1:size(u,2) t = u(:,i)'; WeightOfInputImage = dot(t,Difference'); InImWeight = [InImWeight; WeightOfInputImage]; end noe=numel(InImWeight); % Find Euclidean distance e=[]; for i=1:size(omega,2) q = omega(:,i); DiffWeight = InImWeight-q; mag = norm(DiffWeight); e = [e mag]; end ed_min=[ed_min MinimumValue]; theta=6.0e+03; %disp(e) z(b,:)=InImWeight; end IDX = kmeans(z,5); clustercount=accumarray(IDX, ones(size(IDX))); disp(clustercount); QUESTIONS: 1.It is working fine for M=50(i.e Training set contains 50 images) but not for M=1200(i.e Training set contains 1200 images).It is not showing any error.There is no output.I waited for 10 min still there is no output. I think it is going infinite loop.What is the problem?Where i was wrong? 2.Instead of running the training set everytime how eigen faces generated are stored so that stored eigen faces are used for future face recoginition for a new input image.So it reduces wastage of time.

Read the article

Face morph and recognition

- by startuper

I have two requirements: members of a social network choose other member's faces and morph an average face of them. The website finds other members' faces that resemble the morphed face and list up in order of resemblance. Is there a script that can do this? I see that http://www.faceresearch.org/demos/average does the item 1 but they don't license their technology. Please help. Thank you in advance.

Read the article

Delphi SAPI Text-To-Speech

- by Andreas Rejbrand

First of all: this is not a duplicate of http://stackoverflow.com/questions/1021490/delphi-and-sapi. I have a specific problem with the "SAPI in Delphi" subject. I have used the excellent Import Type-Library guide in Delphi 2009 to get a TSpVoice component in the component palette. This works great. With var SpVoice: TSpVoice; I can write SpVoice.Speak('This is an example.', 1); to get asynchronous audio output. First question According to the documentation, I would be able to write SpVoice.Speak('This is an example.', 0); to get synchronous audio output, but instead I get an EZeroDivide exception. Why's that? Second question But more importantly, I would like to be able to create the SpVoice object dynamically (I think this is called to "late-bound" the SpVoice object), partly because only a very small fraction of all sessions of my app will use it, and partly because I do not want to assume the existance of the SAPI server on the end-user's system. To this end, I tried procedure TForm1.FormClick(Sender: TObject); var SpVoice: Variant; begin SpVoice := CreateOleObject('SAPI.SpVoice'); SpVoice.Speak('this is a test', 0); end; which apparently does nothing at all! (Replacing the 0 with 1 gives me the EZeroDivide exception.) Disclaimer I am rather new to COM/OLE automation. I am sorry for any ignorance or stupidity shown by me in this post...

Read the article

Any simple shape recognition libraries for Java?

- by Phil

I am working on a on-screen keyboard for Android, and I need to recognize starting points, turning points and end points of lines drawn by the user on the keyboard. A simple straightening function would be nice, as it is difficult to draw a perfectly straight line even with a stylus, not to mention finger-only touchscreens today. What I am trying to write is something like Swype. Any good libraries that I can use or make reference to?

Read the article

How to add words to an already loaded grammar using System.Speech and SAPI 5.3

- by Kim Major

Given the following code, Choices choices = new Choices(); choices.Add(new GrammarBuilder(new SemanticResultValue("product", "<product/>"))); GrammarBuilder builder = new GrammarBuilder(); builder.Append(new SemanticResultKey("options", choices.ToGrammarBuilder())); Grammar grammar = new Grammar(builder) { Name = Constants.GrammarNameLanguage}; grammar.Priority = priority; _recognition.LoadGrammar(grammar); How can I add additional words to the loaded grammar? I know this can be achieved both in native code and using the SpeechLib interop, but I prefer to use the managed library. Update: What I want to achieve, is not having to load an entire grammar repeatedly because of individual changes. For small grammars I got good results by calling _recognition.RequestRecognizerUpdate() and then doing the unload of the old grammar and loading of a rebuilt grammar in the event: void Recognition_RecognizerUpdateReached(object sender, RecognizerUpdateReachedEventArgs e) For large grammars this becomes too expensive.

Read the article

Text and Picture Block Recognition in Picture

- by Murat

Hi, I'm new in Image Proccessing. I have an image and I want to parsing Text blocks and picture blocks. How can I do? I don't use OCR components. I need algorithm or sample or suggestion.

Read the article

High-Quality Text-To-Speech engine for personal use

- by phihag

I'm looking for a high-quality TTS engine that I can afford (let's say less than 1000$). So far, I've tried flite and festival. However, while the results are certainly understandable, technical texts are hard to follow. Commercial TTS solutions from Loquendo and Readspeaker sound way better. However, these companies don't seem to be willing to sell their product to mere mortals - I can't find a price on either's homepage. So, what are good TTS solutions for personal use?

Read the article

.NET Speech recognition plugin Runtime Error: Unhandled Exception. What could possibly cause it?

- by manuel

I'm writing a plugin (dll file) for speech recognition, and I'm creating a WinForm as its interface/dialog. When I run the plugin and click the 'Speak' to start the initialization, I get an unhandled exception. Here is a piece of the code: public ref class Dialog : public System::Windows::Forms::Form { public: SpeechRecognitionEngine^ sre; private: System::Void btnSpeak_Click(System::Object^ sender, System::EventArgs^ e) { Initialize(); } protected: void Initialize() { if (System::Threading::Thread::CurrentThread->GetApartmentState() != System::Threading::ApartmentState::STA) { throw gcnew InvalidOperationException("UI thread required"); } //create the recognition engine sre = gcnew SpeechRecognitionEngine(); //set our recognition engine to use the default audio device sre->SetInputToDefaultAudioDevice(); //create a new GrammarBuilder to specify which commands we want to use GrammarBuilder^ grammarBuilder = gcnew GrammarBuilder(); //append all the choices we want for commands. //we want to be able to move, stop, quit the game, and check for the cake. grammarBuilder->Append(gcnew Choices("play", "stop")); //create the Grammar from th GrammarBuilder Grammar^ customGrammar = gcnew Grammar(grammarBuilder); //unload any grammars from the recognition engine sre->UnloadAllGrammars(); //load our new Grammar sre->LoadGrammar(customGrammar); //add an event handler so we get events whenever the engine recognizes spoken commands sre->SpeechRecognized += gcnew EventHandler<SpeechRecognizedEventArgs^> (this, &Dialog::sre_SpeechRecognized); //set the recognition engine to keep running after recognizing a command. //if we had used RecognizeMode.Single, the engine would quite listening after //the first recognized command. sre->RecognizeAsync(RecognizeMode::Multiple); //this->init(); } void sre_SpeechRecognized(Object^ sender, SpeechRecognizedEventArgs^ e) { //simple check to see what the result of the recognition was if (e->Result->Text == "play") { MessageBox(plugin.hwndParent, L"play", 0, 0); } if (e->Result->Text == "stop") { MessageBox(plugin.hwndParent, L"stop", 0, 0); } } };

Read the article

[Android SDK] Text-To-Speech addSpeech not working properly

- by arcoraven

Hi, I'm trying to get my Android app to play a .wav file recording of the word "Spinach Salad" whenever it sees that phrase being spoken by TTS. Here's the relevant code: spinach_salad.wav is located in /res/raw prodName = "Spinach Salad" mTts.addSpeech(prodName, "com.example.textextractor", R.raw.spinach_salad); ...and later in the code: mTts.speak("blah blah blah " + prodName, TextToSpeech.QUEUE_ADD, null); I've also tried: mTts.speak("blah blah blah Spinach Salad", TextToSpeech.QUEUE_ADD, null); and mTts.speak("blah blah blah", TextToSpeech.QUEUE_ADD, null); mTts.speak(productName_str, TextToSpeech.QUEUE_ADD, null); In both cases, I'm just hearing the TTS synthesized audio, rather than my custom .wav file. (On a related note, the last chunk of code sometimes speaks out of order, saying the second line before the first).

Read the article

Speech recognition grammar using Delphi

- by XBasic3000

I need help to make a SpeechRecoGrammar without using xml format. Like creating it on runtime on delphi. heres the outcome look like. <GRAMMAR LANGID="409"> i am OK roche edwin allan

Read the article

How to make HTML5 speech recognition not ask permission every time

- by user2081044

I have created a script that requires my microphone. It uses the HTML5 speech recognition API. Chrome asks permission every time I want to perform a speech recognition test. Javascript (partial) code that I am using: var recognition = new webkitSpeechRecognition(); recognition.continuous = true; recognition.interimResults = true; recognition.onresult = function(event) { console.log(event.results[0][0].transcript); if(event.results[0][0].transcript === 'print') { console.log(''); } }; recognition.start(); I have tried to add it into the list of exceptions in either Chrome and Flash player, but it still asks for permission. Printscreen: That message pops up everytime I click the button. Is there any way to disable Chrome for asking permission?

Read the article

- by DesigningCode

Today I was asked to write a wee application for someone so that they could turn pages on their ebooks without having to reach for their keyboard or mouse… that way they could do craft or knit or whatever they are doing while they are reading. I vaguely remember that windows has something built in, but have never really played with it before. I have in the past turned on the screen reader and impressed my kids by making the computer saying “amusing” phrases along the lines of “Zac has a smelly bum”. So instead of firing up Visual Studio and getting stuck into the juciy task of writing a speech recognition program…. I typed “speech recognition” into the start menu of my windows 7 computer. And wow! I’ve been playing with it for the last 40 minutes or so and have been most impressed. Dictation wise it certainly misses stuff or gets the wrong words, but I did the training and it certainly improved. But what I’m enjoying is controlling windows. for instance, to start this blog entry I said “Open Writer” and it worked no problem. In fact after I muddled my way through getting going with speech recognition I enjoyed saying “Open notepad” … “close” over and over again. It allows you to click anywhere on the screen, just say “mousegrid” and a 1-9 numbered grid comes up, say a number and it puts a smaller 1-9 numbered grid, and you hone in, till the middle square is on a place you want to click, then you say “click” or “double click”. if you want to enter a key, say “Press Tab” for example. inside programs it understands menu entries. In fact, while writing this I just said “File” “Save” and it happily saved. I think I will play around with this for a while more and try it out in visual studio. Might be quite good for being able to do menu entries instead of grabbing for my mouse…. can keep my hands on the keyboard. ok, wasn’t the first post I wanted to do on geeks with blogs! but hey… will do some techy posts soon.

Read the article

Oracle ERP Cloud Solution Defines Revenue Recognition Software Market

- by Steve Dalton

Normal 0 false false false EN-US X-NONE X-NONE Revenue is a fundamental yardstick of a company's performance, and one of the most important metrics for investors in the capital markets. So it’s no surprise that the accounting standard boards have devoted significant resources to this topic, with a key goal of ensuring that companies use a consistent method of recognizing revenue. Due to the myriad of revenue-generating transactions, and the divergent ways organizations recognize revenue today, the IFRS and FASB have been working for 12 years on a common set of accounting standards that apply to all industries in virtually all countries. Through their joint efforts on May 28, 2014 the FASB and IFRS released the IFRS 15 / ASU 2014-9 (Revenue from Contracts with Customers) converged accounting standard. This standard applies to revenue in all public companies, but heavily impacts organizations in any industry that might have complex sales contracts with multiple distinct deliverables (obligations). For example, an auto dealer who bundles free service with the sale of a car can only recognize the service revenue once the owner of the car brings it in for work. Similarly, high-tech companies that bundle software licenses, consulting, and support services on a sales contract will recognize bundled service revenue once the services are delivered. Now all companies need to review their revenue for hidden bundling and implicit obligations. Numerous time-consuming and judgmental activities must be performed to properly recognize revenue for complex sales contracts. To illustrate, after the contract is identified, organizations must identify and examine the distinct deliverables, determine the estimated selling price (ESP) for each deliverable, then allocate the total contract price to each deliverable based on the ESPs. In terms of accounting, organizations must determine whether the goods or services have been delivered or performed to the customer’s satisfaction, then either book revenue in the current period or record a liability for the obligation if revenue will be recognized in a future accounting period. Oracle Revenue Management Cloud was architected and developed so organizations can simplify and streamline revenue recognition. Among other capabilities, the solution uses business rules to efficiently identify and examine contracts, intelligently calculate and allocate deliverable prices based on prescribed inputs, and accurately recognize revenue for each deliverable based on customer satisfaction. "Oracle works very closely with our customers, the Big 4 accounting firms, and the accounting standard boards to deliver an adaptive, comprehensive, new generation revenue recognition solution,” said Rondy Ng, Senior Vice President, Applications Development. “With the recently announced IFRS 15 / ASU 2014-9, Oracle is ready to support customer adoption of the new standard with our Revenue Management Cloud,” said Rondy. Oracle Revenue Management Cloud, an integral part of Oracle Financials Cloud, helps organizations comply with accounting standards, provides them with confidence that reported revenue is materially accurate, and simplifies the accounting process for revenue recognition. Stay tuned to this blog for regular updates on Oracle Revenue Management Cloud. We also invite you to review our new oracle.com ERP pages @ oracle.com/erp. We will be updating these pages very soon with more information about Oracle Revenue Management Cloud.

Search Results

Search found 916 results on 37 pages for 'speech recognition'.

Page 5/37 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by Omry

- by Jakob Gade

- by Sebastián Grignoli

- by Parrot owner

- by Daniel Mošmondor

- by Chris Kugler

- by cMinor

- by eg123

- by user2966744

- by manuel

- by RaymondBelonia

- by user3237134

- by startuper

- by Andreas Rejbrand

- by Phil

- by Kim Major

- by Murat

- by phihag

- by manuel

- by arcoraven

- by XBasic3000

- by user2081044

- by DesigningCode

- by Steve Dalton

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >