speech - Page 7 - Developer IT

TextToSpeech setOnUtteranceCompletedListener always returns -1 error?

- by Robert Nekic

I've been working with Android's TTS functions with general success however, one piece of it refuses to work for me; I can not successfully assign an OnUtteranceCompletedListener to my TextToSpeech object. I've tried implementing OnUtteranceCompletedListener in one of my classes and I've tried creating a new, stand-alone OnUtteranceCompletedListener instance. Both approaches are simple enough to implement and appear to yield proper listeners without exceptions...yet setOnUtteranceCompleteListener(myListener) ALWAYS returns -1 (ERROR). The documentation for this seems straight forward. Has anyone gotten this to work? I'm targeting SDK 4. Are there known issues with this with SDK4/v1.6?

Read the article

New Dragon NaturallySpeaking Command

- by Danni

Can you make silence a command for Dragon NaturallySpeaking? For example, if I wanted a new line after every two second pause, is that possible?

Read the article

How to get Phonemes on voice recognition?

- by XBasic3000

i am working on Voice Recognition to Display the Phonemes and its wave form using the built-in voice recognition on vista and windows 7 using Delphi2009.

Read the article

Playback of opus-codec on Android

- by qdot

I'm looking for a way to integrate opus-codec (the decoder part) with my Android application. Do you know of any implementations that have done so? We are currently using ogg-vorbis for spoken prompts, considering going with either speex (deprecated, but with few documented attempts) or opus (currently no documented attempts). If we would have to go the NDK route, do you think it should provide us with a application size improvement? OggVorbis is supported by the platform, neither speex nor opus are.

Read the article

Why doesn't SetNotifyWindowMessage() call my WndProc()?

- by manuel

I'm using WinForms, and I'm trying to get SetNotifyWindowMessage() to call the WndProc, but it does not do so. The function call: HRESULT initSAPI(HWND hWnd) { ... if(FAILED( g_cpRecoCtxt->SetNotifyWindowMessage( hWnd, WM_RECOEVENT, 0, 0 ))) MessageBoxW(hWnd, L"Error sending window message", L"SAPI Initialization Error", 0); ... } The WndProc: LRESULT WndProc (HWND hWnd, UINT message, WPARAM wparam, LPARAM lparam) { case WM_RECOEVENT: ProcessRecoEvent(hWnd); break; default: return DefWindowProc(hWnd, message, wParam, lParam); } Note: initSAPI() is called on a mouse click event.

Read the article

Great computer-science speeches

- by sub

I've looked into some questions here where the "best" programming books are listed and then thought why there isn't a question concerning speeches yet. I think that speeches or presentations from developers or even creators of programming languages which were or are heavily used at some point are particulary interesting. One of my favorite speeches was recommended to me by someone here on SO: The future of C# I also like Guido van Rossum's speeches but he sometimes seems pretty nervous. Another in my opinion good presentation would be the Google tech talk about Go. Which (recorded) programming presentations/speeches are worth watching? edit: Made this a community wiki as the answer would probably be a pretty long list.

Read the article

Recognizing individual voices

- by raheel

I plan to write a conversation analysis software, which will recognize the individual speakers, their pitch and intensity. Pitch and intensity are somewhat straightforward (pitch via autocorrelation). How would I go about recognizing individual speakers? For starters I can assume that only one person speaks at a time.

Read the article

Manipulating a NSTextField via AppleScript

- by Garry

A little side project I'm working on is a digital life assistant, much like project JARVIS. What I'm trying to do is speak to my mac, have my words translated to text and then have the text interpreted by my program. Currently, my app is very simple, consisting of a single window containing a single wrapped NSTextView. Using MacSpeech Dictate, When I say the custom command "Jeeves", MacSpeech ensures that my app is frontmost, highlights any text in the TextField and clears it, then presses the Return key to trigger the textDidEndEditing method of NSTextField. This is done via Applescript. MacSpeech then switches to dictation mode and the next sentence I say will appear in the NSTextField. What I can't figure out is how to signify that I have finished saying a command to my program. I could simply say another keyword like "execute" or something similar that would send an AppleScript return keystroke to my app (thereby triggering the textDidEndEditing event) but this is cumbersome. Is there a notification that happens when text is pasted into a NSTextField? Would a timer work that would fire after maybe three seconds once my program becomes frontmost (three seconds should be sufficient for me to say a command)? Thanks,

Read the article

Call RecognizerIntent from service

- by Tobia Loschiavo

Hi, I am working on an Android service. I need to call RecognizerIntent from a service in order to use in the service the recognized text. I have no startActivityForResult() method in Service class so I have problem understanding how to achieve this task. Is it possible? Many thanks

Read the article

Queries with developing a Voice to Text Based Software

- by harigm

I am looking for any software which converts the voice to the text. I can get some software which can easily convert the english launguage voice to English text. But my intention is be it any language, whatever the system gets voice that should give the output in the text format in English. Is it possible to get this kind of software? If yes any open source available to help me to use this? If not, Is this feasible to develop this kind of software, Can any one guide how to and where to begin with? I am looking for windows based software

Read the article

How can I change how OS X's 'say' command pronounces a word?

- by jwhitlock

OS X's say command is useful for some tasks (such as Skype's 'notify me when a contact comes online), but it is pronouncing some names incorrectly. Is there a way to teach say to pronounce a word differently? For example, try: say "Hi, Joel Spolsky" The 'ol' sounds like 'ball' rather than 'old'. I'd like to add an exception that say "Pronounce Spolsky like this", rather than try to teach new linguistic rules. I bet there is a way since it can pronounce "iphone" as Apple wants. Update - After some research, here's what I've learned: Text-to-speech is split between turning the text to phonemes, and then the phonemes are turned into audio using a voice. Changing the voice doesn't effect the phonemes. The Speech Synthesis Manager has some functions for turning text to phonemes, and a method for registering a speech dictionary that will add new text-phoneme maps. However, Apple's speech dictionary must be in a binary form - I didn't find any plist XML. Using dtrace while running say, I found some interesting files opened in /System/Library/PrivateFrameworks/SpeechDictionary.framework/Resources. This is probably the speech dictionary, but they are all binary, except for Homophones, which is XML. Adding entries to Homophones does nothing - it is probably used in speech-to-text. They are also code signed by Apple - changing them may prevent some programs from working. PrefixDictionary CartNames CartLite SymbolDictionary Homophones There are ways to add text versions of application interface elements so VoiceOver works, a lot of which a developer gets for free, but there are tricky bits. The standard here appears to be to use a phonetic spelling as needed. My guesses are: say is a light layer of code on top of the Speech Synthesis Manager. It would be easy for the Apple devs to add a command line option to take the path to a speech dictionary plist for alternate phoneme mapping, but they didn't. It may be a useful open-source project to write a better say. Skype probably uses Speech Synthesis Manager directly, leaving no hooks to change the way my friend's names are pronounced, other than spelling them phonetically, which is silly. The easiest way to make a command line version of say is how JRobert suggested. Here's my quick implementation, using Doug Harris's spelling suggestion: #!/bin/sh echo $@ | tr '[A-Z]' '[a-z]' | sed "s/spolsky/spowlsky/g" | /usr/bin/say Finally, some fun command line stuff: # Apple is weird sqlite3 /System/Library/PrivateFrameworks/SpeechDictionary.framework/Resources/Tuples .dump # Get too much information about what files are being opened sudo dtrace -n 'syscall::open*:entry { printf("%s %s",execname,copyinstr(arg0)); }' # Just fun say -v bad "Joel Spolsky Spolsky Spolsky Spolsky Spolsky, Joel Spolsky Spolsky Spolsky Spolsky Spolsky" echo "scale=1000; 4*a(1)" | bc -l | say

Read the article

Webinar: Speech & Mobility: The "New Touch"?

Online event to provide insight into the shift from touch to text input.

Read the article

Reading part of system file contents with PHP

- by zx

Hi, I have a config file, in etc/ called 1.conf Here is the contents.. [2-main] exten => s,1,Macro(speech,"hi {$VAR1} how is your day going?") exten => s,4,Macro(dial,2,555555555) exten => s,2,Macro(speech,"lkqejqe;j") exten => s,3,Macro(speech,"hi there") exten => s,5,Macro(speech,"this is a test ") exten => s,6,Macro(speech,"testing 2") exten => s,7,Macro(speech,"this is a test") exten => 7,1,Goto(2-tester2,s,1) exten => 1,1,Goto(2-aa,s,1) [2-tester] [2-aa] exten => 1,1,Goto(2-main,s,1) How can I read the content in between speech for example.. exten => s,6,Macro(speech,"testing 2") Just get "testing 2" from that. Thank you in advance!

Read the article

Conceal packet loss in PCM stream

- by ZeroDefect

I am looking to use 'Packet Loss Concealment' to conceal lost PCM frames in an audio stream. Unfortunately, I cannot find a library that is accessible without all the licensing restrictions and code bloat (...up for some suggestions though). I have located some GPL code written by Steve Underwood for the Asterisk project which implements PLC. There are several limitations; although, as Steve suggests in his code, his algorithm can be applied to different streams with a bit of work. Currently, the code works with 8kHz 16-bit signed mono streams. Variations of the code can be found through a simple search of Google Code Search. My hope is that I can adapt the code to work with other streams. Initially, the goal is to adjust the algorithm for 8+ kHz, 16-bit signed, multichannel audio (all in a C++ environment). Eventually, I'm looking to make the code available under the GPL license in hopes that it could be of benefit to others... Attached is the code below with my efforts. The code includes a main function that will "drop" a number of frames with a given probability. Unfortunately, the code does not quite work as expected. I'm receiving EXC_BAD_ACCESS when running in gdb, but I don't get a trace from gdb when using 'bt' command. Clearly, I'm trampimg on memory some where but not sure exactly where. When I comment out the *amdf_pitch* function, the code runs without crashing... int main (int argc, char *argv[]) { std::ifstream fin("C:\\cc32kHz.pcm"); if(!fin.is_open()) { std::cout << "Failed to open input file" << std::endl; return 1; } std::ofstream fout_repaired("C:\\cc32kHz_repaired.pcm"); if(!fout_repaired.is_open()) { std::cout << "Failed to open output repaired file" << std::endl; return 1; } std::ofstream fout_lossy("C:\\cc32kHz_lossy.pcm"); if(!fout_lossy.is_open()) { std::cout << "Failed to open output repaired file" << std::endl; return 1; } audio::PcmConcealer Concealer; Concealer.Init(1, 16, 32000); //Generate random numbers; srand( time(NULL) ); int value = 0; int probability = 5; while(!fin.eof()) { char arr[2]; fin.read(arr, 2); //Generate's random number; value = rand() % 100 + 1; if(value <= probability) { char blank[2] = {0x00, 0x00}; fout_lossy.write(blank, 2); //Fill in data; Concealer.Fill((int16_t *)blank, 1); fout_repaired.write(blank, 2); } else { //Write data to file; fout_repaired.write(arr, 2); fout_lossy.write(arr, 2); Concealer.Receive((int16_t *)arr, 1); } } fin.close(); fout_repaired.close(); fout_lossy.close(); return 0; } PcmConcealer.hpp /* * Code adapted from Steve Underwood of the Asterisk Project. This code inherits * the same licensing restrictions as the Asterisk Project. */ #ifndef __PCMCONCEALER_HPP__ #define __PCMCONCEALER_HPP__ /** 1. What does it do? The packet loss concealment module provides a suitable synthetic fill-in signal, to minimise the audible effect of lost packets in VoIP applications. It is not tied to any particular codec, and could be used with almost any codec which does not specify its own procedure for packet loss concealment. Where a codec specific concealment procedure exists, the algorithm is usually built around knowledge of the characteristics of the particular codec. It will, therefore, generally give better results for that particular codec than this generic concealer will. 2. How does it work? While good packets are being received, the plc_rx() routine keeps a record of the trailing section of the known speech signal. If a packet is missed, plc_fillin() is called to produce a synthetic replacement for the real speech signal. The average mean difference function (AMDF) is applied to the last known good signal, to determine its effective pitch. Based on this, the last pitch period of signal is saved. Essentially, this cycle of speech will be repeated over and over until the real speech resumes. However, several refinements are needed to obtain smooth pleasant sounding results. - The two ends of the stored cycle of speech will not always fit together smoothly. This can cause roughness, or even clicks, at the joins between cycles. To soften this, the 1/4 pitch period of real speech preceeding the cycle to be repeated is blended with the last 1/4 pitch period of the cycle to be repeated, using an overlap-add (OLA) technique (i.e. in total, the last 5/4 pitch periods of real speech are used). - The start of the synthetic speech will not always fit together smoothly with the tail of real speech passed on before the erasure was identified. Ideally, we would like to modify the last 1/4 pitch period of the real speech, to blend it into the synthetic speech. However, it is too late for that. We could have delayed the real speech a little, but that would require more buffer manipulation, and hurt the efficiency of the no-lost-packets case (which we hope is the dominant case). Instead we use a degenerate form of OLA to modify the start of the synthetic data. The last 1/4 pitch period of real speech is time reversed, and OLA is used to blend it with the first 1/4 pitch period of synthetic speech. The result seems quite acceptable. - As we progress into the erasure, the chances of the synthetic signal being anything like correct steadily fall. Therefore, the volume of the synthesized signal is made to decay linearly, such that after 50ms of missing audio it is reduced to silence. - When real speech resumes, an extra 1/4 pitch period of sythetic speech is blended with the start of the real speech. If the erasure is small, this smoothes the transition. If the erasure is long, and the synthetic signal has faded to zero, the blending softens the start up of the real signal, avoiding a kind of "click" or "pop" effect that might occur with a sudden onset. 3. How do I use it? Before audio is processed, call plc_init() to create an instance of the packet loss concealer. For each received audio packet that is acceptable (i.e. not including those being dropped for being too late) call plc_rx() to record the content of the packet. Note this may modify the packet a little after a period of packet loss, to blend real synthetic data smoothly. When a real packet is not available in time, call plc_fillin() to create a sythetic substitute. That's it! */ /*! Minimum allowed pitch (66 Hz) */ #define PLC_PITCH_MIN(SAMPLE_RATE) ((double)(SAMPLE_RATE) / 66.6) /*! Maximum allowed pitch (200 Hz) */ #define PLC_PITCH_MAX(SAMPLE_RATE) ((SAMPLE_RATE) / 200) /*! Maximum pitch OLA window */ //#define PLC_PITCH_OVERLAP_MAX(SAMPLE_RATE) ((PLC_PITCH_MIN(SAMPLE_RATE)) >> 2) /*! The length over which the AMDF function looks for similarity (20 ms) */ #define CORRELATION_SPAN(SAMPLE_RATE) ((20 * (SAMPLE_RATE)) / 1000) /*! History buffer length. The buffer must also be at leat 1.25 times PLC_PITCH_MIN, but that is much smaller than the buffer needs to be for the pitch assessment. */ //#define PLC_HISTORY_LEN(SAMPLE_RATE) ((CORRELATION_SPAN(SAMPLE_RATE)) + (PLC_PITCH_MIN(SAMPLE_RATE))) namespace audio { typedef struct { /*! Consecutive erased samples */ int missing_samples; /*! Current offset into pitch period */ int pitch_offset; /*! Pitch estimate */ int pitch; /*! Buffer for a cycle of speech */ float *pitchbuf;//[PLC_PITCH_MIN]; /*! History buffer */ short *history;//[PLC_HISTORY_LEN]; /*! Current pointer into the history buffer */ int buf_ptr; } plc_state_t; class PcmConcealer { public: PcmConcealer(); ~PcmConcealer(); void Init(int channels, int bit_depth, int sample_rate); //Process a block of received audio samples. int Receive(short amp[], int frames); //Fill-in a block of missing audio samples. int Fill(short amp[], int frames); void Destroy(); private: int amdf_pitch(int min_pitch, int max_pitch, short amp[], int channel_index, int frames); void save_history(plc_state_t *s, short *buf, int channel_index, int frames); void normalise_history(plc_state_t *s); /** Holds the states of each of the channels **/ std::vector< plc_state_t * > ChannelStates; int plc_pitch_min; int plc_pitch_max; int plc_pitch_overlap_max; int correlation_span; int plc_history_len; int channel_count; int sample_rate; bool Initialized; }; } #endif PcmConcealer.cpp /* * Code adapted from Steve Underwood of the Asterisk Project. This code inherits * the same licensing restrictions as the Asterisk Project. */ #include "audio/PcmConcealer.hpp" /* We do a straight line fade to zero volume in 50ms when we are filling in for missing data. */ #define ATTENUATION_INCREMENT 0.0025 /* Attenuation per sample */ #if !defined(INT16_MAX) #define INT16_MAX (32767) #define INT16_MIN (-32767-1) #endif #ifdef WIN32 inline double rint(double x) { return floor(x + 0.5); } #endif inline short fsaturate(double damp) { if (damp > 32767.0) return INT16_MAX; if (damp < -32768.0) return INT16_MIN; return (short)rint(damp); } namespace audio { PcmConcealer::PcmConcealer() : Initialized(false) { } PcmConcealer::~PcmConcealer() { Destroy(); } void PcmConcealer::Init(int channels, int bit_depth, int sample_rate) { if(Initialized) return; if(channels <= 0 || bit_depth != 16) return; Initialized = true; channel_count = channels; this->sample_rate = sample_rate; ////////////// double min = PLC_PITCH_MIN(sample_rate); int imin = (int)min; double max = PLC_PITCH_MAX(sample_rate); int imax = (int)max; plc_pitch_min = imin; plc_pitch_max = imax; plc_pitch_overlap_max = (plc_pitch_min >> 2); correlation_span = CORRELATION_SPAN(sample_rate); plc_history_len = correlation_span + plc_pitch_min; ////////////// for(int i = 0; i < channel_count; i ++) { plc_state_t *t = new plc_state_t; memset(t, 0, sizeof(plc_state_t)); t->pitchbuf = new float[plc_pitch_min]; t->history = new short[plc_history_len]; ChannelStates.push_back(t); } } void PcmConcealer::Destroy() { if(!Initialized) return; while(ChannelStates.size()) { plc_state_t *s = ChannelStates.at(0); if(s) { if(s->history) delete s->history; if(s->pitchbuf) delete s->pitchbuf; memset(s, 0, sizeof(plc_state_t)); delete s; } ChannelStates.erase(ChannelStates.begin()); } ChannelStates.clear(); Initialized = false; } //Process a block of received audio samples. int PcmConcealer::Receive(short amp[], int frames) { if(!Initialized) return 0; int j = 0; for(int k = 0; k < ChannelStates.size(); k++) { int i; int overlap_len; int pitch_overlap; float old_step; float new_step; float old_weight; float new_weight; float gain; plc_state_t *s = ChannelStates.at(k); if (s->missing_samples) { /* Although we have a real signal, we need to smooth it to fit well with the synthetic signal we used for the previous block */ /* The start of the real data is overlapped with the next 1/4 cycle of the synthetic data. */ pitch_overlap = s->pitch >> 2; if (pitch_overlap > frames) pitch_overlap = frames; gain = 1.0 - s->missing_samples * ATTENUATION_INCREMENT; if (gain < 0.0) gain = 0.0; new_step = 1.0/pitch_overlap; old_step = new_step*gain; new_weight = new_step; old_weight = (1.0 - new_step)*gain; for (i = 0; i < pitch_overlap; i++) { int index = (i * channel_count) + j; amp[index] = fsaturate(old_weight * s->pitchbuf[s->pitch_offset] + new_weight * amp[index]); if (++s->pitch_offset >= s->pitch) s->pitch_offset = 0; new_weight += new_step; old_weight -= old_step; if (old_weight < 0.0) old_weight = 0.0; } s->missing_samples = 0; } save_history(s, amp, j, frames); j++; } return frames; } //Fill-in a block of missing audio samples. int PcmConcealer::Fill(short amp[], int frames) { if(!Initialized) return 0; int j =0; for(int k = 0; k < ChannelStates.size(); k++) { short *tmp = new short[plc_pitch_overlap_max]; int i; int pitch_overlap; float old_step; float new_step; float old_weight; float new_weight; float gain; short *orig_amp; int orig_len; orig_amp = amp; orig_len = frames; plc_state_t *s = ChannelStates.at(k); if (s->missing_samples == 0) { // As the gap in real speech starts we need to assess the last known pitch, //and prepare the synthetic data we will use for fill-in normalise_history(s); s->pitch = amdf_pitch(plc_pitch_min, plc_pitch_max, s->history + plc_history_len - correlation_span - plc_pitch_min, j, correlation_span); // We overlap a 1/4 wavelength pitch_overlap = s->pitch >> 2; // Cook up a single cycle of pitch, using a single of the real signal with 1/4 //cycle OLA'ed to make the ends join up nicely // The first 3/4 of the cycle is a simple copy for (i = 0; i < s->pitch - pitch_overlap; i++) s->pitchbuf[i] = s->history[plc_history_len - s->pitch + i]; // The last 1/4 of the cycle is overlapped with the end of the previous cycle new_step = 1.0/pitch_overlap; new_weight = new_step; for ( ; i < s->pitch; i++) { s->pitchbuf[i] = s->history[plc_history_len - s->pitch + i]*(1.0 - new_weight) + s->history[plc_history_len - 2*s->pitch + i]*new_weight; new_weight += new_step; } // We should now be ready to fill in the gap with repeated, decaying cycles // of what is in pitchbuf // We need to OLA the first 1/4 wavelength of the synthetic data, to smooth // it into the previous real data. To avoid the need to introduce a delay // in the stream, reverse the last 1/4 wavelength, and OLA with that. gain = 1.0; new_step = 1.0/pitch_overlap; old_step = new_step; new_weight = new_step; old_weight = 1.0 - new_step; for (i = 0; i < pitch_overlap; i++) { int index = (i * channel_count) + j; amp[index] = fsaturate(old_weight * s->history[plc_history_len - 1 - i] + new_weight * s->pitchbuf[i]); new_weight += new_step; old_weight -= old_step; if (old_weight < 0.0) old_weight = 0.0; } s->pitch_offset = i; } else { gain = 1.0 - s->missing_samples*ATTENUATION_INCREMENT; i = 0; } for ( ; gain > 0.0 && i < frames; i++) { int index = (i * channel_count) + j; amp[index] = s->pitchbuf[s->pitch_offset]*gain; gain -= ATTENUATION_INCREMENT; if (++s->pitch_offset >= s->pitch) s->pitch_offset = 0; } for ( ; i < frames; i++) { int index = (i * channel_count) + j; amp[i] = 0; } s->missing_samples += orig_len; save_history(s, amp, j, frames); delete [] tmp; j++; } return frames; } void PcmConcealer::save_history(plc_state_t *s, short *buf, int channel_index, int frames) { if (frames >= plc_history_len) { /* Just keep the last part of the new data, starting at the beginning of the buffer */ //memcpy(s->history, buf + len - plc_history_len, sizeof(short)*plc_history_len); int frames_to_copy = plc_history_len; for(int i = 0; i < frames_to_copy; i ++) { int index = (channel_count * (i + frames - plc_history_len)) + channel_index; s->history[i] = buf[index]; } s->buf_ptr = 0; return; } if (s->buf_ptr + frames > plc_history_len) { /* Wraps around - must break into two sections */ //memcpy(s->history + s->buf_ptr, buf, sizeof(short)*(plc_history_len - s->buf_ptr)); short *hist_ptr = s->history + s->buf_ptr; int frames_to_copy = plc_history_len - s->buf_ptr; for(int i = 0; i < frames_to_copy; i ++) { int index = (channel_count * i) + channel_index; hist_ptr[i] = buf[index]; } frames -= (plc_history_len - s->buf_ptr); //memcpy(s->history, buf + (plc_history_len - s->buf_ptr), sizeof(short)*len); frames_to_copy = frames; for(int i = 0; i < frames_to_copy; i ++) { int index = (channel_count * (i + (plc_history_len - s->buf_ptr))) + channel_index; s->history[i] = buf[index]; } s->buf_ptr = frames; return; } /* Can use just one section */ //memcpy(s->history + s->buf_ptr, buf, sizeof(short)*len); short *hist_ptr = s->history + s->buf_ptr; int frames_to_copy = frames; for(int i = 0; i < frames_to_copy; i ++) { int index = (channel_count * i) + channel_index; hist_ptr[i] = buf[index]; } s->buf_ptr += frames; } void PcmConcealer::normalise_history(plc_state_t *s) { short *tmp = new short[plc_history_len]; if (s->buf_ptr == 0) return; memcpy(tmp, s->history, sizeof(short)*s->buf_ptr); memcpy(s->history, s->history + s->buf_ptr, sizeof(short)*(plc_history_len - s->buf_ptr)); memcpy(s->history + plc_history_len - s->buf_ptr, tmp, sizeof(short)*s->buf_ptr); s->buf_ptr = 0; delete [] tmp; } int PcmConcealer::amdf_pitch(int min_pitch, int max_pitch, short amp[], int channel_index, int frames) { int i; int j; int acc; int min_acc; int pitch; pitch = min_pitch; min_acc = INT_MAX; for (i = max_pitch; i <= min_pitch; i++) { acc = 0; for (j = 0; j < frames; j++) { int index1 = (channel_count * (i+j)) + channel_index; int index2 = (channel_count * j) + channel_index; //std::cout << "Index 1: " << index1 << ", Index 2: " << index2 << std::endl; acc += abs(amp[index1] - amp[index2]); } if (acc < min_acc) { min_acc = acc; pitch = i; } } std::cout << "Pitch: " << pitch << std::endl; return pitch; } } P.S. - I must confess that digital audio is not my forte...

Read the article

How does text-to-speech shortcut read any highlighted text?

- by TP

Hi, I am trying to develop a cocoa application that requires to read highlighted text from any application. But so far I cannot find any decent solution because the accessibility API isn't always work. (e.g. Firefox) Does anyone know it is implemented in text-to-speech included in Leopard? I have found this SO question but it isn't solved.

Read the article

Attempting to calculate width of Map Overlays on the fly

- by Bloudermilk

Hey all- I am working on an Android app that utilizes the Google Maps API MapView, MapController, MapActivity, and ItemizedOverlay. I am basically trying to recreate certain functionalities of the Maps app (damn Google for not providing speech bubbles—for lack of a better name—for items!), particularly those speech bubbles. I have an invisible XML structure for the speech bubble in the XML layout file containing my MapView. The first time I show a speech bubble I grab that XML and remove it from it's current parent, applying some ItemizedOverlay.LayoutParams to it, and add it to the MapView as an Overlay. I position it above the item that was selected, fill it with the proper text, then set it to visible. This all works great. The goal here, though, is to also automatically animate the map to reveal any parts of a speech bubble that may be off-screen when it opens. So I'm trying popup.getWidth() (popup is the instance of my LinearLayout that is the speech bubble) after I do all the manipulation to the bubble, even after I display it to the user. Problem is, popup.getWidth() is returning me the width of the previously displayed popup, not the currently displayed one. I can't figure out why this would be happening if I'm fetching the width after I set it to visible with its new dimensions (which, by the way, are relative when I'm setting them with LayoutParams: fill_content for both width and height).. I have even tried forcing both the MapView and the "popup" to invalidate() before trying to fetch the width. Any ideas why this may be happening? How can I force the View to settle into its new dimensions before trying to fetch them? Thanks! Nick

Read the article

How to use the Repeat After Me application in Mac OS developer tools

- by overboming

under /Developer/Applications/Utilities/Speech any idea?

Read the article

What program to use to create subtitles for a video?

- by user4124

I have a video that I want to create subtitles for. Is there a program that can perform rudimentary speech-to-text in order to set the correct start/stop of each individual subtitle create rudimentary text subtitles (using some sort of speech-to-text) I know about gnome-subtitles. However, it requires extensive effort to create those subtitles manually. You need to select yourself the start and stop for each sentence. Youtube has the above features (creates rudimentary text subtitles at the correct timings, using speech-to-text). However I would rather not upload the videos to Youtube just to get my subtitles. Is it possible to do the subtitles efficiently in Ubuntu? Update: I plan to use the .srt subtitles only, and do not need to hard code them on the videos. My biggest requirement is to have the program automatically find the start/stop for each sentence, so that I write the text in it.

Read the article

using a PHP print_r array result in javascript/jquery

- by Phil Jackson

Hello all.I have a simple jquery/ajax request to the server which returns the structure and data of an array. I was wondering if there was a quick way in which I can use this array structure and data using jquery; A simple request; var token = $("#token").val(); $.ajax({ type: 'POST', url: './', data: 'token=' + token + '&re=8', cache: false, timeout: 5000, success: function(html){ // do something here with the html var } }); the result ( actual result from PHP's print_r(); ); Array ( [0] => Array ( [username] => Emmalene [contents] => <ul><li class="name">ACTwebDesigns</li><li class="speech">helllllllo</li></ul> <ul><li class="name">ACTwebDesigns</li><li class="speech">sds</li></ul> <ul><li class="name">ACTwebDesigns</li><li class="speech">Sponge</li><li class="speech">dick</li></ul> <ul><li class="name">ACTwebDesigns</li><li class="speech">arghh</li></ul> ) ) I was thinking along the lines of var demo = Array(html); // and then do something with the demo var Not sure if that would work it just sprang to mind. Any help is much appreciated.

Read the article

IE adding a attribute 'done[number]' ??

- by Phil Jackson

Hi all im struggling to find an answer to my problem here. I've made a IM application the same as facebooks but it is having problems in IE. The problem started as I kept seeing rn at the beginnning of every post made via IE. That was due to stripslashes function. But as I was investigating I noticed my tag was being added an attribut 'done'; <li><UL done67="7">rn<LI class=name>ACTwebDesigns</LI>rn<LI class=speech>hello</LI></UL></li> <li><UL done1="4">rn<LI class=name>ACTwebDesigns</LI>rn<LI class=speech>foo</LI></UL></li> <li><UL done84="10">rn<LI class=name>ACTwebDesigns</LI>rn<LI class=speech>barr</LI>rn<LI class=speech ?>foobar</LI></UL></li> <li><UL done88="14">rn<LI class=name>ACTwebDesigns</LI>rn<LI class=speech>this is a test</LI></UL></li> does anyone know of a reason why IE would add this attribute? EDIT: function checkForm() { $(".chat_input").keydown(function(e){ if ( e.keyCode == 13 ) { var data = strip_tags($(this).val()); var username = $("#users_username").val(); var box = $(this).parents('div:eq(0)'); $(this).val(""); if( box.find('.conversation_box li.' + session_number ).length == 0 ) { var conversation_list = box.find('.conversation_box').html(); var insert_data = '<li class="' + session_number + '"><ul><li class="name">' + username + '</li><li class="speech">' + data + '</li></ul></li>'; box.find('.conversation_box').html(conversation_list + insert_data); bottom(); }else{ var conversation_list = box.find('.conversation_box li.' + session_number + ' ul').html(); var insert_data = '<li class="speech"">' + data + '</li>'; box.find('.conversation_box li.' + session_number + ' ul').html(conversation_list + insert_data); bottom(); } return false; } }); } function store_chat(){ try{ var token = $("#token").val(); var openedBoxes = $("li.conversation_list"); openedBoxes.each(function(){ var boxContainer = $(this).parents('div:eq(0)'); var amount = boxContainer.find('.conversation_box li').length; var p = boxContainer.find('.open_trigger').html(); var u = $("#users_username").val(); if( amount != 0 ){ if( $(this).parents('div:eq(0)').find('.conversation_box li.' + session_number ).length != 0 ) { var session_contents = $(this).parents('div:eq(0)').find('.conversation_box li.' + session_number ).html(); alert( session_contents ); $.ajax({ type: 'POST', url: './', data: 'token=' + token + '&re=7&s=' + amount + '&sd=' + session_contents + '&u=' + u + '&p=' + p, cache: false, timeout: 5000, success: function(html){ auth(html); boxContainer.find('.conversation_box').html(html); bottom(); } }); } } }); }catch(er){} }

Read the article

looping through JSON array

- by Phil Jackson

Hi all. I have recently posted another question which straight away users pointed me in the right direction. $.ajax({ type: 'POST', url: './', data: 'token=' + token + '&re=8', cache: false, timeout: 5000, success: function(html){ auth(html); var JSON_array = eval(html); alert(JSON_array[0].username); } }); this returns the data correctly but I want to perform a kind of 'foreach'. the array contains data about multiple incoming and outgoing Instant Messages. So if a user is talking to more than one person at a time i need to loop through. the array's structure is as follows. Array ( [0] => Array ( [username] => Emmalene [contents] => <ul><li class="name">ACTwebDesigns</li><li class="speech">helllllllo</li></ul> <ul><li class="name">ACTwebDesigns</li><li class="speech">sds</li></ul> <ul><li class="name">ACTwebDesigns</li><li class="speech">Sponge</li><li class="speech">dick</li></ul> <ul><li class="name">ACTwebDesigns</li><li class="speech">arghh</li></ul> ) ) any help very much appreciated.

Read the article

Jquery CPU usage

- by nharry

I am using Jquery to make an image scroll across my page horizontally. The only problem is that it uses a serious amount of cpu usage. Up to 100% on a single core laptop in firefox. What could cause this??? Jquery <script> jQuery(document).ready(function() { $(".speech").animate({backgroundPosition: "-6000px 0px"}, 400000, null); }); </script> CSS .speech { /*position:fixed;*/ top:0; left:0px; height:400px; width:100%; z-index:-1; background:url(/images/speech.png) -300px -500px repeat-x; margin-right: auto; margin-left: auto; position: fixed; } HTML <div class="speech"></div>

Read the article

How can I use the voice recognition used by Android on Ubuntu?

- by aking1012

If I'm developing an Android app that uses TTS and Voice recognition, which libraries are used for the same voice recognition and speech on Ubuntu? I'm assuming espeak for text to speech, but I'm unsure which voice recognition library and dictionary/learning/calibration system is used for voice recognition. I'ld like to make the app available on Ubuntu Desktop. as well as test it outside an emulator

Read the article

Fix elements within objects at their position? (Prevent movement when resizing)

- by Skadier

I would like to know if it is possible to fix elements at their absolute position within custom elements in InDesign CS5? I created a kind of speech bubble and I would like to place a stripline within this bubble to separate two content areas. Just a little scheme to show the desired layout as Pseudo-Markup :D <speech-bubble> <textbox>HEADER SECTION</textbox> <stripline> <textbox>Some other text</textbox> </speech-bubble> I created something like this but with two separate elements which aren't connected. So I have to select both of them in order to move the whole bubble. Then I tried to connect them using Object->Paths->Create linked path but then the stripline moves and the HEADER SECTION moves too. All in all I would like to have a speech bubble which can be resized in order to hold more text but it shouldn't make the HEADER_SECTION larger or move the stripline. Hope you understand what I mean :D Thanks in advance!

Read the article

Use a custom domain and point to Tumblr blog

- by jskye

My domain mydomain.com is registered with GoDaddy. I wish to host my Tumblr blog on this domain with Nearly Free Speech hosting. My active nameservers at GoDaddy already point to my authoritative ones at Nearly Free Speech which is working. However I'm baffled as to how to get my correct configuration to point to my Tumblr. Preferably I'd like (A) my domain http://mydomain.com to host the blog and have http://www.mydomain.com redirect also to http://mydomain.com. If this is too difficult my next preference is (B) to have http://www.mydomain.com host the blog whilst http://mydomain.com redirects to http://www.mydomain.com My third preference is to have (C) a sub-domain like http://tumblr.mydomain.com or http://tumblr.mydomain.com to host the blog and I guess have http://mydomain.com and http://www.mydomain.com both redirect to it. I've tried having two aliases mydomain.com and www.mydomain.com pointing to my permanent Nearly Free Speech IP at mydomain.nfshost.com and when I try to add: (1) an A record pointing mydomain.com to the IP 66.6.44.4 as per Tumblr's instructions it tells me I already have the bare domain as an alias so I cant do that. (2) the A record on the www.mydomain.com alias. I can do this with either www.mydomain.com set as an alias or not. But when I tried this with mydomain.com set as the canonical name the result when visiting either mydomain.com or www.mydomain.com was both of them continually redirecting to each other until an error was thrown. So I was wondering if there is a ninja that could save me some hair-pulling and tell me the correct way to config A, or else B, or else C.

Search Results

Search found 436 results on 18 pages for 'speech'.

Page 7/18 | < Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14 | Next Page >

- by Robert Nekic

- by Danni

- by XBasic3000

- by qdot

- by manuel

- by sub

- by raheel

- by Garry

- by Tobia Loschiavo

- by harigm

- by jwhitlock

- by zx

- by ZeroDefect

- by TP

- by Bloudermilk

- by overboming

- by user4124

- by Phil Jackson

- by Phil Jackson

- by Phil Jackson

- by nharry

- by aking1012

- by Skadier

- by jskye

< Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14 | Next Page >