Search Results

Search found 477 results on 20 pages for 'speech synthesis'.


  • onServiceConnected never called after bindService method

    - by Tobia Loschiavo
    Hi, I have a particular situation: a service started by a broadcast receiver starts an activity, and I want that activity to be able to communicate back to the service. I have chosen AIDL to make this possible. Everything seems to work except for the bindService() call made in the activity's onCreate(): I end up with a NullPointerException because onServiceConnected() is never called, even though the service's onBind() method is, and bindService() itself returns true. The service is obviously active, because it is the one that started the activity. I know that calling an activity from a service may sound strange, but unfortunately this is the only way to have speech recognition in a service. Thanks in advance
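
    For reference, a minimal sketch of the usual asynchronous binding pattern, assuming a hypothetical IMyService AIDL interface and MyService class (names are illustrative, not taken from the question). The binder only becomes usable after onServiceConnected() fires, some time after bindService() returns.

        import android.app.Activity;
        import android.content.ComponentName;
        import android.content.Context;
        import android.content.Intent;
        import android.content.ServiceConnection;
        import android.os.Bundle;
        import android.os.IBinder;

        public class ClientActivity extends Activity {

            private IMyService service;   // hypothetical AIDL-generated interface

            private final ServiceConnection connection = new ServiceConnection() {
                @Override
                public void onServiceConnected(ComponentName name, IBinder binder) {
                    // Runs asynchronously on the main thread once the binding is ready.
                    service = IMyService.Stub.asInterface(binder);
                }

                @Override
                public void onServiceDisconnected(ComponentName name) {
                    service = null;
                }
            };

            @Override
            protected void onCreate(Bundle savedInstanceState) {
                super.onCreate(savedInstanceState);
                // bindService() returning true only means the request was dispatched,
                // not that the connection already exists.
                bindService(new Intent(this, MyService.class),
                        connection, Context.BIND_AUTO_CREATE);
                // Calling methods through 'service' here would throw a NullPointerException,
                // because onServiceConnected() has not run yet.
            }
        }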

    Read the article

  • How can I prevent IIS from trying to load a dll?

    - by Abtin Forouzandeh
    My project is a Speech Server application using Windows Workflow. It runs as an app under IIS and supports a plug-in system. Here is what happens: I load a DLL into memory and set its type on an InvokeWorkflow control. When the InvokeWorkflow control runs, it appears to correctly instantiate the workflow from the loaded assembly - it completes the Initialize method - but then everything crashes and burns and the target workflow is never executed. I can resolve this by putting a copy of the DLL in the application's executing directory; the workflow then executes correctly. So it appears that IIS is trying to reload the assembly even though it is already in memory. Is there any way to alter or disable this behavior in IIS? Perhaps a hook I can write that intercepts the request to load the DLL and uses my own logic to do so?

    Read the article

  • CoInitialize fails in dll

    - by Quandary
    Question: I have the following program, which uses COM and the Microsoft Speech API (SAPI) to take a piece of text and output it as sound. It works fine as long as it is in a .exe; when I build it as a .dll, it fails. Why? I used Dependency Walker and saw that the exe doesn't list MSVCR100D and ole32, so I loaded them like this:

        LoadLibraryA("MSVCR100D.DLL");
        LoadLibraryA("ole32.dll");

    but it didn't help. Any idea why?

        #include <windows.h>
        #include <sapi.h>
        #include <cstdlib>

        int main(int argc, char* argv[])
        {
            ISpVoice *pVoice = NULL;
            if (FAILED(::CoInitialize(NULL)))
                return FALSE;
            HRESULT hr = CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL, IID_ISpVoice, (void **)&pVoice);
            if (SUCCEEDED(hr))
            {
                hr = pVoice->Speak(L"Noobie was fragged by GSG9 Googlebot", 0, NULL);
                hr = pVoice->Speak(L"Test Test", 0, NULL);
                hr = pVoice->Speak(L"This sounds normal <pitch middle = '-10'/> but the pitch drops half way through", SPF_IS_XML, NULL);
                pVoice->Release();
                pVoice = NULL;
            }
            ::CoUninitialize();
            return EXIT_SUCCESS;
        }

    Read the article

  • What does "Value does not fall within expected range" mean in runtime error?

    - by manuel
    Hi, I'm writing a plugin (a DLL) using /clr and trying to implement speech recognition with .NET. But when I run it, I get a runtime error saying "Value does not fall within expected range". What does this message mean?

        public ref class Dialog : public System::Windows::Forms::Form
        {
        public:
            SpeechRecognitionEngine^ sre;

        private:
            System::Void btnSpeak_Click(System::Object^ sender, System::EventArgs^ e)
            {
                Initialize();
            }

        protected:
            void Initialize()
            {
                //create the recognition engine
                sre = gcnew SpeechRecognitionEngine();
                //set our recognition engine to use the default audio device
                sre->SetInputToDefaultAudioDevice();
                //create a new GrammarBuilder to specify which commands we want to use
                GrammarBuilder^ grammarBuilder = gcnew GrammarBuilder();
                //append all the choices we want for commands.
                //we want to be able to move, stop, quit the game, and check for the cake.
                grammarBuilder->Append(gcnew Choices("play", "stop"));
                //create the Grammar from the GrammarBuilder
                Grammar^ customGrammar = gcnew Grammar(grammarBuilder);
                //unload any grammars from the recognition engine
                sre->UnloadAllGrammars();
                //load our new Grammar
                sre->LoadGrammar(customGrammar);
                //add an event handler so we get events whenever the engine recognizes spoken commands
                sre->SpeechRecognized += gcnew EventHandler<SpeechRecognizedEventArgs^> (this, &Dialog::sre_SpeechRecognized);
                //set the recognition engine to keep running after recognizing a command.
                //if we had used RecognizeMode.Single, the engine would quit listening after
                //the first recognized command.
                sre->RecognizeAsync(RecognizeMode::Multiple);
                //this->init();
            }

            void sre_SpeechRecognized(Object^ sender, SpeechRecognizedEventArgs^ e)
            {
                //simple check to see what the result of the recognition was
                if (e->Result->Text == "play")
                {
                    MessageBox(plugin.hwndParent, L"play", 0, 0);
                }
                if (e->Result->Text == "stop")
                {
                    MessageBox(plugin.hwndParent, L"stop", 0, 0);
                }
            }
        };

    Read the article

  • How to use SAPI's SetNotifyCallbackFunction() in a CLR project with a Windows Form as the interface window?

    - by manuel
    Hi, I'm trying to write a DLL plugin for Winamp, using Microsoft Visual Studio 2008 and Microsoft SAPI 5.1. I created the interface window using a Windows Form (System::Windows::Forms::Form). I tried SetNotifyWindowMessage(), but the message handler is never called when I speak into the microphone. So I tried SetNotifyCallbackFunction() instead, but I got a compile error saying that I should put '&' in front of the method name in the parameter. When I add the '&', I get another compile error saying that I can't take the address of the method unless I create a delegate instance. What should I do?

    Read the article

  • LoadDictation with SAPI

    - by Naveen
    I am able to create alternate dictation grammars using the Dictation Resource Kit or the directions given here, but I am not able to load the new dictation topic from C++. I am trying to modify the simpledict sample provided with the SAPI 5.1 SDK. The following doesn't work:

        std::wstring stemp = s2ws("grammar:dictation#Genre");
        LPCWSTR mygrammar = stemp.c_str();
        hr = m_cpDictationGrammar->LoadDictation(mygrammar, SPLO_STATIC);

    Read the article

  • Freetts problem in Java

    - by manugupt1
    I am trying to run a program using FreeTTS. I am able to compile the program, but I am not able to use the kevin or MBROLA voices; I get the following output messages at the end:

        System property "mbrola.base" is undefined. Will not use MBROLA voices.
        LINE UNAVAILABLE: Format is pcm_signed 16000.0 Hz 16 bits 1 channel big endian
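
    A minimal sketch of how MBROLA support is usually enabled in FreeTTS, assuming the FreeTTS jars and a voice jar (here kevin16) are on the classpath; the MBROLA installation path below is hypothetical.

        import com.sun.speech.freetts.Voice;
        import com.sun.speech.freetts.VoiceManager;

        public class SpeakDemo {
            public static void main(String[] args) {
                // Point FreeTTS at an MBROLA installation before requesting an MBROLA voice.
                System.setProperty("mbrola.base", "/usr/local/mbrola");   // hypothetical path

                Voice voice = VoiceManager.getInstance().getVoice("kevin16");
                if (voice == null) {
                    System.err.println("Voice not found - check freetts.jar and the voice jars on the classpath");
                    return;
                }
                voice.allocate();
                voice.speak("Hello from FreeTTS");
                voice.deallocate();
            }
        }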

    Read the article

  • Jacob + Microsoft SpeechAPI trouble

    - by guai
    The following Groovy code kills the JVM when the SetInterest method is invoked. It uses Scriptom, the Groovy DSL for Jacob. I spent the whole day studying SAPI's SPFEI macro so I could rewrite it in Groovy; I'm not sure I got it right, or maybe there are other bugs. Help welcome.

        import org.codehaus.groovy.scriptom.*
        import static org.codehaus.groovy.scriptom.tlb.sapi.SpeechVoiceSpeakFlags.*
        import static org.codehaus.groovy.scriptom.tlb.sapi.SpeechRunState.*
        import static org.codehaus.groovy.scriptom.tlb.sapi.SpeechLib.*
        import static org.codehaus.groovy.scriptom.tlb.sapi.SPEVENTENUM.*

        def spfei(Integer[] args) {
            res = 1 << SPEI_RESERVED1 | 1 << SPEI_RESERVED2
            args.each { res |= 1 << it }
            res
        }

        Scriptom.inApartment {
            def voice = new ActiveXObject('SAPI.SpVoice')
            evtsrc = voice.toInterface(ISpEventSource)
            evtsrc.setInterest(spfei(SPEI_WORD_BOUNDARY), spfei(SPEI_WORD_BOUNDARY))
            evtsrc.events.Callbacks = { args -> println 'jjj' }
            voice.speak "Hello, world", SVSFlagsAsync
        }

    Read the article

  • TextToSpeech setOnUtteranceCompletedListener always returns -1 error?

    - by Robert Nekic
    I've been working with Android's TTS functions with general success; however, one piece of it refuses to work for me: I cannot successfully assign an OnUtteranceCompletedListener to my TextToSpeech object. I've tried implementing OnUtteranceCompletedListener in one of my classes, and I've tried creating a new, stand-alone OnUtteranceCompletedListener instance. Both approaches are simple enough to implement and appear to yield proper listeners without exceptions, yet setOnUtteranceCompletedListener(myListener) ALWAYS returns -1 (ERROR). The documentation for this seems straightforward. Has anyone gotten this to work? I'm targeting SDK 4. Are there known issues with this in SDK 4 / Android 1.6?
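
    For comparison, a minimal sketch of the pattern that is generally expected to work, assuming the listener is only registered after the engine reports successful initialization and an utterance ID is supplied with each speak() call; class and string names are illustrative.

        import java.util.HashMap;
        import android.app.Activity;
        import android.os.Bundle;
        import android.speech.tts.TextToSpeech;

        public class SpeakingActivity extends Activity
                implements TextToSpeech.OnInitListener,
                           TextToSpeech.OnUtteranceCompletedListener {

            private TextToSpeech tts;

            @Override
            protected void onCreate(Bundle savedInstanceState) {
                super.onCreate(savedInstanceState);
                tts = new TextToSpeech(this, this);
            }

            @Override
            public void onInit(int status) {
                if (status == TextToSpeech.SUCCESS) {
                    // Registering before the engine has finished initializing is one
                    // commonly reported cause of the -1 (ERROR) return value.
                    int result = tts.setOnUtteranceCompletedListener(this);

                    HashMap<String, String> params = new HashMap<String, String>();
                    params.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, "demo-utterance");
                    tts.speak("Hello", TextToSpeech.QUEUE_FLUSH, params);
                }
            }

            @Override
            public void onUtteranceCompleted(String utteranceId) {
                // Fires only for utterances queued with a KEY_PARAM_UTTERANCE_ID.
            }
        }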

    Read the article

  • Why doesn't SetNotifyWindowMessage() call my WndProc()?

    - by manuel
    I'm using WinForms, and I'm trying to get SetNotifyWindowMessage() to call the WndProc, but it does not do so. The function call:

        HRESULT initSAPI(HWND hWnd)
        {
            ...
            if (FAILED(g_cpRecoCtxt->SetNotifyWindowMessage(hWnd, WM_RECOEVENT, 0, 0)))
                MessageBoxW(hWnd, L"Error sending window message", L"SAPI Initialization Error", 0);
            ...
        }

    The WndProc:

        LRESULT WndProc(HWND hWnd, UINT message, WPARAM wparam, LPARAM lparam)
        {
            case WM_RECOEVENT:
                ProcessRecoEvent(hWnd);
                break;
            default:
                return DefWindowProc(hWnd, message, wParam, lParam);
        }

    Note: initSAPI() is called on a mouse click event.

    Read the article

  • Playback of opus-codec on Android

    - by qdot
    I'm looking for a way to integrate the Opus codec (the decoder part) into my Android application. Do you know of any implementations that have done so? We are currently using Ogg Vorbis for spoken prompts and are considering going with either Speex (deprecated, but with a few documented attempts) or Opus (currently no documented attempts). If we have to go the NDK route, do you think it would give us an application size improvement? Ogg Vorbis is supported by the platform; neither Speex nor Opus is.
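
    If the NDK route is taken, the Java side typically reduces to a thin JNI bridge like the sketch below; everything here (library name, method signatures) is hypothetical, and the actual decoding would happen in C code linked against libopus.

        public final class OpusDecoderBridge {
            static {
                // Hypothetical shared library built with the NDK and linked against libopus.
                System.loadLibrary("opusjni");
            }

            // Hypothetical native methods; their C implementations would wrap
            // opus_decoder_create(), opus_decode() and opus_decoder_destroy().
            public static native long create(int sampleRate, int channels);
            public static native int decode(long decoder, byte[] packet, int packetLen, short[] pcmOut);
            public static native void destroy(long decoder);
        }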

    Read the article

  • Great computer-science speeches

    - by sub
    I've looked into some questions here where the "best" programming books are listed and wondered why there isn't a question about speeches yet. I think speeches or presentations from developers, or even from the creators of programming languages that were or are heavily used, are particularly interesting. One of my favorite speeches was recommended to me by someone here on SO: The Future of C#. I also like Guido van Rossum's speeches, though he sometimes seems pretty nervous. Another good presentation, in my opinion, is the Google tech talk about Go. Which (recorded) programming presentations/speeches are worth watching? edit: Made this a community wiki, as the answer would probably be a pretty long list.

    Read the article

  • Recognizing individual voices

    - by raheel
    I plan to write conversation-analysis software, which will recognize the individual speakers and their pitch and intensity. Pitch and intensity are somewhat straightforward (pitch via autocorrelation). How would I go about recognizing individual speakers? For starters, I can assume that only one person speaks at a time.
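
    Since pitch via autocorrelation is mentioned, here is a minimal sketch of just that part (not a speaker-recognition solution), assuming a mono frame of samples already converted to doubles; the pitch bounds are illustrative.

        public final class PitchEstimator {
            /** Estimates pitch in Hz by picking the lag with the highest autocorrelation. */
            public static double estimate(double[] frame, int sampleRate) {
                int minLag = sampleRate / 400;   // upper pitch bound ~400 Hz
                int maxLag = sampleRate / 60;    // lower pitch bound ~60 Hz
                int bestLag = minLag;
                double best = Double.NEGATIVE_INFINITY;
                for (int lag = minLag; lag <= maxLag; lag++) {
                    double sum = 0.0;
                    for (int i = 0; i + lag < frame.length; i++) {
                        sum += frame[i] * frame[i + lag];
                    }
                    if (sum > best) {
                        best = sum;
                        bestLag = lag;
                    }
                }
                return (double) sampleRate / bestLag;
            }
        }

    Identifying who is speaking is a much larger problem; it is usually approached with spectral features (for example MFCCs) and a classifier rather than pitch alone.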

    Read the article

  • Manipulating a NSTextField via AppleScript

    - by Garry
    A little side project I'm working on is a digital life assistant, much like project JARVIS. What I'm trying to do is speak to my Mac, have my words translated to text, and then have the text interpreted by my program. Currently my app is very simple, consisting of a single window containing a single wrapped NSTextView. Using MacSpeech Dictate, when I say the custom command "Jeeves", MacSpeech ensures that my app is frontmost, highlights any text in the TextField and clears it, then presses the Return key to trigger the textDidEndEditing method of NSTextField. This is done via AppleScript. MacSpeech then switches to dictation mode, and the next sentence I say appears in the NSTextField. What I can't figure out is how to signify that I have finished saying a command to my program. I could simply say another keyword like "execute" or something similar that would send an AppleScript Return keystroke to my app (thereby triggering the textDidEndEditing event), but this is cumbersome. Is there a notification that fires when text is pasted into an NSTextField? Would a timer work that fires after maybe three seconds once my program becomes frontmost (three seconds should be sufficient for me to say a command)? Thanks,

    Read the article

  • Call RecognizerIntent from service

    - by Tobia Loschiavo
    Hi, I am working on an Android service. I need to call RecognizerIntent from the service so that the service can use the recognized text. There is no startActivityForResult() method in the Service class, so I am having trouble understanding how to achieve this. Is it possible? Many thanks
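
    One workaround that is often suggested (sketched below with hypothetical class names) is to have the service launch a small helper activity that owns the startActivityForResult() call and relays the result back, since only an Activity can receive activity results.

        // In the service:
        Intent helper = new Intent(this, SpeechHelperActivity.class);   // hypothetical helper activity
        helper.addFlags(Intent.FLAG_ACTIVITY_NEW_TASK);                 // required when starting from a service
        startActivity(helper);

        // In SpeechHelperActivity:
        private static final int REQUEST_SPEECH = 1;

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            Intent recognize = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
            recognize.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                    RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
            startActivityForResult(recognize, REQUEST_SPEECH);
        }

        @Override
        protected void onActivityResult(int requestCode, int resultCode, Intent data) {
            super.onActivityResult(requestCode, resultCode, data);
            if (requestCode == REQUEST_SPEECH && resultCode == RESULT_OK) {
                ArrayList<String> matches =
                        data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
                // Hand the recognized text back to the service, e.g. via a broadcast or a binder call.
            }
            finish();   // the helper activity goes away immediately
        }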

    Read the article

  • Queries about developing voice-to-text software

    - by harigm
    I am looking for software that converts voice to text. I can find software that easily converts English speech to English text, but my intention is that whatever language the system hears, it should produce the output as English text. Is it possible to get this kind of software? If yes, is there anything open source available that could help me? If not, is it feasible to develop this kind of software, and can anyone guide me on how and where to begin? I am looking for Windows-based software.

    Read the article

  • How can I change how OS X's 'say' command pronounces a word?

    - by jwhitlock
    OS X's say command is useful for some tasks (such as Skype's 'notify me when a contact comes online), but it is pronouncing some names incorrectly. Is there a way to teach say to pronounce a word differently? For example, try: say "Hi, Joel Spolsky" The 'ol' sounds like 'ball' rather than 'old'. I'd like to add an exception that say "Pronounce Spolsky like this", rather than try to teach new linguistic rules. I bet there is a way since it can pronounce "iphone" as Apple wants. Update - After some research, here's what I've learned: Text-to-speech is split between turning the text to phonemes, and then the phonemes are turned into audio using a voice. Changing the voice doesn't effect the phonemes. The Speech Synthesis Manager has some functions for turning text to phonemes, and a method for registering a speech dictionary that will add new text-phoneme maps. However, Apple's speech dictionary must be in a binary form - I didn't find any plist XML. Using dtrace while running say, I found some interesting files opened in /System/Library/PrivateFrameworks/SpeechDictionary.framework/Resources. This is probably the speech dictionary, but they are all binary, except for Homophones, which is XML. Adding entries to Homophones does nothing - it is probably used in speech-to-text. They are also code signed by Apple - changing them may prevent some programs from working. PrefixDictionary CartNames CartLite SymbolDictionary Homophones There are ways to add text versions of application interface elements so VoiceOver works, a lot of which a developer gets for free, but there are tricky bits. The standard here appears to be to use a phonetic spelling as needed. My guesses are: say is a light layer of code on top of the Speech Synthesis Manager. It would be easy for the Apple devs to add a command line option to take the path to a speech dictionary plist for alternate phoneme mapping, but they didn't. It may be a useful open-source project to write a better say. Skype probably uses Speech Synthesis Manager directly, leaving no hooks to change the way my friend's names are pronounced, other than spelling them phonetically, which is silly. The easiest way to make a command line version of say is how JRobert suggested. Here's my quick implementation, using Doug Harris's spelling suggestion: #!/bin/sh echo $@ | tr '[A-Z]' '[a-z]' | sed "s/spolsky/spowlsky/g" | /usr/bin/say Finally, some fun command line stuff: # Apple is weird sqlite3 /System/Library/PrivateFrameworks/SpeechDictionary.framework/Resources/Tuples .dump # Get too much information about what files are being opened sudo dtrace -n 'syscall::open*:entry { printf("%s %s",execname,copyinstr(arg0)); }' # Just fun say -v bad "Joel Spolsky Spolsky Spolsky Spolsky Spolsky, Joel Spolsky Spolsky Spolsky Spolsky Spolsky" echo "scale=1000; 4*a(1)" | bc -l | say

    Read the article

  • Reading part of system file contents with PHP

    - by zx
    Hi, I have a config file in etc/ called 1.conf. Here are the contents:

        [2-main]
        exten => s,1,Macro(speech,"hi {$VAR1} how is your day going?")
        exten => s,4,Macro(dial,2,555555555)
        exten => s,2,Macro(speech,"lkqejqe;j")
        exten => s,3,Macro(speech,"hi there")
        exten => s,5,Macro(speech,"this is a test ")
        exten => s,6,Macro(speech,"testing 2")
        exten => s,7,Macro(speech,"this is a test")
        exten => 7,1,Goto(2-tester2,s,1)
        exten => 1,1,Goto(2-aa,s,1)

        [2-tester]

        [2-aa]
        exten => 1,1,Goto(2-main,s,1)

    How can I read the content inside the speech macro calls? For example, from

        exten => s,6,Macro(speech,"testing 2")

    I just want to get "testing 2". Thank you in advance!

    Read the article

  • Conceal packet loss in PCM stream

    - by ZeroDefect
    I am looking to use 'Packet Loss Concealment' to conceal lost PCM frames in an audio stream. Unfortunately, I cannot find a library that is accessible without all the licensing restrictions and code bloat (...up for some suggestions though). I have located some GPL code written by Steve Underwood for the Asterisk project which implements PLC. There are several limitations; although, as Steve suggests in his code, his algorithm can be applied to different streams with a bit of work. Currently, the code works with 8kHz 16-bit signed mono streams. Variations of the code can be found through a simple search of Google Code Search. My hope is that I can adapt the code to work with other streams. Initially, the goal is to adjust the algorithm for 8+ kHz, 16-bit signed, multichannel audio (all in a C++ environment). Eventually, I'm looking to make the code available under the GPL license in hopes that it could be of benefit to others... Attached is the code below with my efforts. The code includes a main function that will "drop" a number of frames with a given probability. Unfortunately, the code does not quite work as expected. I'm receiving EXC_BAD_ACCESS when running in gdb, but I don't get a trace from gdb when using 'bt' command. Clearly, I'm trampimg on memory some where but not sure exactly where. When I comment out the *amdf_pitch* function, the code runs without crashing... int main (int argc, char *argv[]) { std::ifstream fin("C:\\cc32kHz.pcm"); if(!fin.is_open()) { std::cout << "Failed to open input file" << std::endl; return 1; } std::ofstream fout_repaired("C:\\cc32kHz_repaired.pcm"); if(!fout_repaired.is_open()) { std::cout << "Failed to open output repaired file" << std::endl; return 1; } std::ofstream fout_lossy("C:\\cc32kHz_lossy.pcm"); if(!fout_lossy.is_open()) { std::cout << "Failed to open output repaired file" << std::endl; return 1; } audio::PcmConcealer Concealer; Concealer.Init(1, 16, 32000); //Generate random numbers; srand( time(NULL) ); int value = 0; int probability = 5; while(!fin.eof()) { char arr[2]; fin.read(arr, 2); //Generate's random number; value = rand() % 100 + 1; if(value <= probability) { char blank[2] = {0x00, 0x00}; fout_lossy.write(blank, 2); //Fill in data; Concealer.Fill((int16_t *)blank, 1); fout_repaired.write(blank, 2); } else { //Write data to file; fout_repaired.write(arr, 2); fout_lossy.write(arr, 2); Concealer.Receive((int16_t *)arr, 1); } } fin.close(); fout_repaired.close(); fout_lossy.close(); return 0; } PcmConcealer.hpp /* * Code adapted from Steve Underwood of the Asterisk Project. This code inherits * the same licensing restrictions as the Asterisk Project. */ #ifndef __PCMCONCEALER_HPP__ #define __PCMCONCEALER_HPP__ /** 1. What does it do? The packet loss concealment module provides a suitable synthetic fill-in signal, to minimise the audible effect of lost packets in VoIP applications. It is not tied to any particular codec, and could be used with almost any codec which does not specify its own procedure for packet loss concealment. Where a codec specific concealment procedure exists, the algorithm is usually built around knowledge of the characteristics of the particular codec. It will, therefore, generally give better results for that particular codec than this generic concealer will. 2. How does it work? While good packets are being received, the plc_rx() routine keeps a record of the trailing section of the known speech signal. 
If a packet is missed, plc_fillin() is called to produce a synthetic replacement for the real speech signal. The average mean difference function (AMDF) is applied to the last known good signal, to determine its effective pitch. Based on this, the last pitch period of signal is saved. Essentially, this cycle of speech will be repeated over and over until the real speech resumes. However, several refinements are needed to obtain smooth pleasant sounding results. - The two ends of the stored cycle of speech will not always fit together smoothly. This can cause roughness, or even clicks, at the joins between cycles. To soften this, the 1/4 pitch period of real speech preceeding the cycle to be repeated is blended with the last 1/4 pitch period of the cycle to be repeated, using an overlap-add (OLA) technique (i.e. in total, the last 5/4 pitch periods of real speech are used). - The start of the synthetic speech will not always fit together smoothly with the tail of real speech passed on before the erasure was identified. Ideally, we would like to modify the last 1/4 pitch period of the real speech, to blend it into the synthetic speech. However, it is too late for that. We could have delayed the real speech a little, but that would require more buffer manipulation, and hurt the efficiency of the no-lost-packets case (which we hope is the dominant case). Instead we use a degenerate form of OLA to modify the start of the synthetic data. The last 1/4 pitch period of real speech is time reversed, and OLA is used to blend it with the first 1/4 pitch period of synthetic speech. The result seems quite acceptable. - As we progress into the erasure, the chances of the synthetic signal being anything like correct steadily fall. Therefore, the volume of the synthesized signal is made to decay linearly, such that after 50ms of missing audio it is reduced to silence. - When real speech resumes, an extra 1/4 pitch period of sythetic speech is blended with the start of the real speech. If the erasure is small, this smoothes the transition. If the erasure is long, and the synthetic signal has faded to zero, the blending softens the start up of the real signal, avoiding a kind of "click" or "pop" effect that might occur with a sudden onset. 3. How do I use it? Before audio is processed, call plc_init() to create an instance of the packet loss concealer. For each received audio packet that is acceptable (i.e. not including those being dropped for being too late) call plc_rx() to record the content of the packet. Note this may modify the packet a little after a period of packet loss, to blend real synthetic data smoothly. When a real packet is not available in time, call plc_fillin() to create a sythetic substitute. That's it! */ /*! Minimum allowed pitch (66 Hz) */ #define PLC_PITCH_MIN(SAMPLE_RATE) ((double)(SAMPLE_RATE) / 66.6) /*! Maximum allowed pitch (200 Hz) */ #define PLC_PITCH_MAX(SAMPLE_RATE) ((SAMPLE_RATE) / 200) /*! Maximum pitch OLA window */ //#define PLC_PITCH_OVERLAP_MAX(SAMPLE_RATE) ((PLC_PITCH_MIN(SAMPLE_RATE)) >> 2) /*! The length over which the AMDF function looks for similarity (20 ms) */ #define CORRELATION_SPAN(SAMPLE_RATE) ((20 * (SAMPLE_RATE)) / 1000) /*! History buffer length. The buffer must also be at leat 1.25 times PLC_PITCH_MIN, but that is much smaller than the buffer needs to be for the pitch assessment. */ //#define PLC_HISTORY_LEN(SAMPLE_RATE) ((CORRELATION_SPAN(SAMPLE_RATE)) + (PLC_PITCH_MIN(SAMPLE_RATE))) namespace audio { typedef struct { /*! 
Consecutive erased samples */ int missing_samples; /*! Current offset into pitch period */ int pitch_offset; /*! Pitch estimate */ int pitch; /*! Buffer for a cycle of speech */ float *pitchbuf;//[PLC_PITCH_MIN]; /*! History buffer */ short *history;//[PLC_HISTORY_LEN]; /*! Current pointer into the history buffer */ int buf_ptr; } plc_state_t; class PcmConcealer { public: PcmConcealer(); ~PcmConcealer(); void Init(int channels, int bit_depth, int sample_rate); //Process a block of received audio samples. int Receive(short amp[], int frames); //Fill-in a block of missing audio samples. int Fill(short amp[], int frames); void Destroy(); private: int amdf_pitch(int min_pitch, int max_pitch, short amp[], int channel_index, int frames); void save_history(plc_state_t *s, short *buf, int channel_index, int frames); void normalise_history(plc_state_t *s); /** Holds the states of each of the channels **/ std::vector< plc_state_t * > ChannelStates; int plc_pitch_min; int plc_pitch_max; int plc_pitch_overlap_max; int correlation_span; int plc_history_len; int channel_count; int sample_rate; bool Initialized; }; } #endif PcmConcealer.cpp /* * Code adapted from Steve Underwood of the Asterisk Project. This code inherits * the same licensing restrictions as the Asterisk Project. */ #include "audio/PcmConcealer.hpp" /* We do a straight line fade to zero volume in 50ms when we are filling in for missing data. */ #define ATTENUATION_INCREMENT 0.0025 /* Attenuation per sample */ #if !defined(INT16_MAX) #define INT16_MAX (32767) #define INT16_MIN (-32767-1) #endif #ifdef WIN32 inline double rint(double x) { return floor(x + 0.5); } #endif inline short fsaturate(double damp) { if (damp > 32767.0) return INT16_MAX; if (damp < -32768.0) return INT16_MIN; return (short)rint(damp); } namespace audio { PcmConcealer::PcmConcealer() : Initialized(false) { } PcmConcealer::~PcmConcealer() { Destroy(); } void PcmConcealer::Init(int channels, int bit_depth, int sample_rate) { if(Initialized) return; if(channels <= 0 || bit_depth != 16) return; Initialized = true; channel_count = channels; this->sample_rate = sample_rate; ////////////// double min = PLC_PITCH_MIN(sample_rate); int imin = (int)min; double max = PLC_PITCH_MAX(sample_rate); int imax = (int)max; plc_pitch_min = imin; plc_pitch_max = imax; plc_pitch_overlap_max = (plc_pitch_min >> 2); correlation_span = CORRELATION_SPAN(sample_rate); plc_history_len = correlation_span + plc_pitch_min; ////////////// for(int i = 0; i < channel_count; i ++) { plc_state_t *t = new plc_state_t; memset(t, 0, sizeof(plc_state_t)); t->pitchbuf = new float[plc_pitch_min]; t->history = new short[plc_history_len]; ChannelStates.push_back(t); } } void PcmConcealer::Destroy() { if(!Initialized) return; while(ChannelStates.size()) { plc_state_t *s = ChannelStates.at(0); if(s) { if(s->history) delete s->history; if(s->pitchbuf) delete s->pitchbuf; memset(s, 0, sizeof(plc_state_t)); delete s; } ChannelStates.erase(ChannelStates.begin()); } ChannelStates.clear(); Initialized = false; } //Process a block of received audio samples. 
int PcmConcealer::Receive(short amp[], int frames) { if(!Initialized) return 0; int j = 0; for(int k = 0; k < ChannelStates.size(); k++) { int i; int overlap_len; int pitch_overlap; float old_step; float new_step; float old_weight; float new_weight; float gain; plc_state_t *s = ChannelStates.at(k); if (s->missing_samples) { /* Although we have a real signal, we need to smooth it to fit well with the synthetic signal we used for the previous block */ /* The start of the real data is overlapped with the next 1/4 cycle of the synthetic data. */ pitch_overlap = s->pitch >> 2; if (pitch_overlap > frames) pitch_overlap = frames; gain = 1.0 - s->missing_samples * ATTENUATION_INCREMENT; if (gain < 0.0) gain = 0.0; new_step = 1.0/pitch_overlap; old_step = new_step*gain; new_weight = new_step; old_weight = (1.0 - new_step)*gain; for (i = 0; i < pitch_overlap; i++) { int index = (i * channel_count) + j; amp[index] = fsaturate(old_weight * s->pitchbuf[s->pitch_offset] + new_weight * amp[index]); if (++s->pitch_offset >= s->pitch) s->pitch_offset = 0; new_weight += new_step; old_weight -= old_step; if (old_weight < 0.0) old_weight = 0.0; } s->missing_samples = 0; } save_history(s, amp, j, frames); j++; } return frames; } //Fill-in a block of missing audio samples. int PcmConcealer::Fill(short amp[], int frames) { if(!Initialized) return 0; int j =0; for(int k = 0; k < ChannelStates.size(); k++) { short *tmp = new short[plc_pitch_overlap_max]; int i; int pitch_overlap; float old_step; float new_step; float old_weight; float new_weight; float gain; short *orig_amp; int orig_len; orig_amp = amp; orig_len = frames; plc_state_t *s = ChannelStates.at(k); if (s->missing_samples == 0) { // As the gap in real speech starts we need to assess the last known pitch, //and prepare the synthetic data we will use for fill-in normalise_history(s); s->pitch = amdf_pitch(plc_pitch_min, plc_pitch_max, s->history + plc_history_len - correlation_span - plc_pitch_min, j, correlation_span); // We overlap a 1/4 wavelength pitch_overlap = s->pitch >> 2; // Cook up a single cycle of pitch, using a single of the real signal with 1/4 //cycle OLA'ed to make the ends join up nicely // The first 3/4 of the cycle is a simple copy for (i = 0; i < s->pitch - pitch_overlap; i++) s->pitchbuf[i] = s->history[plc_history_len - s->pitch + i]; // The last 1/4 of the cycle is overlapped with the end of the previous cycle new_step = 1.0/pitch_overlap; new_weight = new_step; for ( ; i < s->pitch; i++) { s->pitchbuf[i] = s->history[plc_history_len - s->pitch + i]*(1.0 - new_weight) + s->history[plc_history_len - 2*s->pitch + i]*new_weight; new_weight += new_step; } // We should now be ready to fill in the gap with repeated, decaying cycles // of what is in pitchbuf // We need to OLA the first 1/4 wavelength of the synthetic data, to smooth // it into the previous real data. To avoid the need to introduce a delay // in the stream, reverse the last 1/4 wavelength, and OLA with that. 
gain = 1.0; new_step = 1.0/pitch_overlap; old_step = new_step; new_weight = new_step; old_weight = 1.0 - new_step; for (i = 0; i < pitch_overlap; i++) { int index = (i * channel_count) + j; amp[index] = fsaturate(old_weight * s->history[plc_history_len - 1 - i] + new_weight * s->pitchbuf[i]); new_weight += new_step; old_weight -= old_step; if (old_weight < 0.0) old_weight = 0.0; } s->pitch_offset = i; } else { gain = 1.0 - s->missing_samples*ATTENUATION_INCREMENT; i = 0; } for ( ; gain > 0.0 && i < frames; i++) { int index = (i * channel_count) + j; amp[index] = s->pitchbuf[s->pitch_offset]*gain; gain -= ATTENUATION_INCREMENT; if (++s->pitch_offset >= s->pitch) s->pitch_offset = 0; } for ( ; i < frames; i++) { int index = (i * channel_count) + j; amp[i] = 0; } s->missing_samples += orig_len; save_history(s, amp, j, frames); delete [] tmp; j++; } return frames; } void PcmConcealer::save_history(plc_state_t *s, short *buf, int channel_index, int frames) { if (frames >= plc_history_len) { /* Just keep the last part of the new data, starting at the beginning of the buffer */ //memcpy(s->history, buf + len - plc_history_len, sizeof(short)*plc_history_len); int frames_to_copy = plc_history_len; for(int i = 0; i < frames_to_copy; i ++) { int index = (channel_count * (i + frames - plc_history_len)) + channel_index; s->history[i] = buf[index]; } s->buf_ptr = 0; return; } if (s->buf_ptr + frames > plc_history_len) { /* Wraps around - must break into two sections */ //memcpy(s->history + s->buf_ptr, buf, sizeof(short)*(plc_history_len - s->buf_ptr)); short *hist_ptr = s->history + s->buf_ptr; int frames_to_copy = plc_history_len - s->buf_ptr; for(int i = 0; i < frames_to_copy; i ++) { int index = (channel_count * i) + channel_index; hist_ptr[i] = buf[index]; } frames -= (plc_history_len - s->buf_ptr); //memcpy(s->history, buf + (plc_history_len - s->buf_ptr), sizeof(short)*len); frames_to_copy = frames; for(int i = 0; i < frames_to_copy; i ++) { int index = (channel_count * (i + (plc_history_len - s->buf_ptr))) + channel_index; s->history[i] = buf[index]; } s->buf_ptr = frames; return; } /* Can use just one section */ //memcpy(s->history + s->buf_ptr, buf, sizeof(short)*len); short *hist_ptr = s->history + s->buf_ptr; int frames_to_copy = frames; for(int i = 0; i < frames_to_copy; i ++) { int index = (channel_count * i) + channel_index; hist_ptr[i] = buf[index]; } s->buf_ptr += frames; } void PcmConcealer::normalise_history(plc_state_t *s) { short *tmp = new short[plc_history_len]; if (s->buf_ptr == 0) return; memcpy(tmp, s->history, sizeof(short)*s->buf_ptr); memcpy(s->history, s->history + s->buf_ptr, sizeof(short)*(plc_history_len - s->buf_ptr)); memcpy(s->history + plc_history_len - s->buf_ptr, tmp, sizeof(short)*s->buf_ptr); s->buf_ptr = 0; delete [] tmp; } int PcmConcealer::amdf_pitch(int min_pitch, int max_pitch, short amp[], int channel_index, int frames) { int i; int j; int acc; int min_acc; int pitch; pitch = min_pitch; min_acc = INT_MAX; for (i = max_pitch; i <= min_pitch; i++) { acc = 0; for (j = 0; j < frames; j++) { int index1 = (channel_count * (i+j)) + channel_index; int index2 = (channel_count * j) + channel_index; //std::cout << "Index 1: " << index1 << ", Index 2: " << index2 << std::endl; acc += abs(amp[index1] - amp[index2]); } if (acc < min_acc) { min_acc = acc; pitch = i; } } std::cout << "Pitch: " << pitch << std::endl; return pitch; } } P.S. - I must confess that digital audio is not my forte...

    Read the article

  • How does text-to-speech shortcut read any highlighted text?

    - by TP
    Hi, I am trying to develop a Cocoa application that needs to read highlighted text from any application. So far I cannot find a decent solution, because the Accessibility API doesn't always work (e.g. with Firefox). Does anyone know how it is implemented in the text-to-speech feature included in Leopard? I have found this SO question, but it isn't solved.

    Read the article

  • Attempting to calculate width of Map Overlays on the fly

    - by Bloudermilk
    Hey all- I am working on an Android app that utilizes the Google Maps API MapView, MapController, MapActivity, and ItemizedOverlay. I am basically trying to recreate certain functionalities of the Maps app (damn Google for not providing speech bubbles—for lack of a better name—for items!), particularly those speech bubbles. I have an invisible XML structure for the speech bubble in the XML layout file containing my MapView. The first time I show a speech bubble I grab that XML and remove it from it's current parent, applying some ItemizedOverlay.LayoutParams to it, and add it to the MapView as an Overlay. I position it above the item that was selected, fill it with the proper text, then set it to visible. This all works great. The goal here, though, is to also automatically animate the map to reveal any parts of a speech bubble that may be off-screen when it opens. So I'm trying popup.getWidth() (popup is the instance of my LinearLayout that is the speech bubble) after I do all the manipulation to the bubble, even after I display it to the user. Problem is, popup.getWidth() is returning me the width of the previously displayed popup, not the currently displayed one. I can't figure out why this would be happening if I'm fetching the width after I set it to visible with its new dimensions (which, by the way, are relative when I'm setting them with LayoutParams: fill_content for both width and height).. I have even tried forcing both the MapView and the "popup" to invalidate() before trying to fetch the width. Any ideas why this may be happening? How can I force the View to settle into its new dimensions before trying to fetch them? Thanks! Nick
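
    One thing that may help (a sketch, assuming popup is the bubble LinearLayout described in the question): a View's width is only updated during the next layout pass, so reading it immediately after changing its content or visibility returns the old value. Deferring the read with View.post() runs it after that pass.

        popup.setVisibility(View.VISIBLE);
        popup.requestLayout();
        popup.post(new Runnable() {
            @Override
            public void run() {
                // By the time this runs, a layout pass has happened and the width is current.
                int bubbleWidth = popup.getWidth();
                // ...compare against the MapView bounds here and animate the map if part
                // of the bubble is off-screen...
            }
        });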

    Read the article
