Legal claims defining the scope of protection, as filed with the USPTO.
1. A method comprising: performing, by a processor, operations of:
   recording input speech where a pace of the input speech varies over time;
   automatically monitoring the pace of the input speech;
   presenting the input speech to at least one of an automated speech recognizer or external transcribers at a presentation pace;
   at least one of performing speech recognition on the input speech which produces a text transcription of the input speech or receiving text transcription from the external transcribers;
   performing text-to-speech synthesis on the text transcription of the input speech which produces synthesized speech containing spoken versions of the words in the text transcription of the input speech, where the pace of the synthesized speech varies over time;
   automatically monitoring the pace of the synthesized speech;
   dynamically adjusting the presentation pace to match the pace of the synthesized speech based on an acoustic match between the synthesized speech and the input speech;
   continuously re-adjusting the presentation pace based on the acoustic match to compensate for changes in the input speech including changes in the input speech pace, vocabulary use, repetitions, and non-speech sounds.
2. The method of claim 1 wherein the input speech is a live conversation.
3. The method of claim 1 wherein the input speech is previously recorded conversations in non-real time.
4. The method of claim 1 wherein the input speech is broken into a multitude of small snippets and is transcribed by multiple transcribers in at least one of sequentially or in parallel.
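The pace-adjustment steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claims do not say how pace or the acoustic match is measured, so everything below is an assumption. Pace is taken as words per second, the acoustic match is reduced to a score in [0, 1] supplied by the caller, and the names `words_per_second`, `PaceController`, `rate`, `min_rate`, and `max_rate` are hypothetical.

```python
from dataclasses import dataclass


def words_per_second(word_count: int, duration_s: float) -> float:
    """Pace of a speech window, measured here as words per second (an assumption)."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return word_count / duration_s


@dataclass
class PaceController:
    """Hypothetical controller holding a playback-rate multiplier for the input speech."""
    rate: float = 1.0      # 1.0 presents the input at its recorded speed
    min_rate: float = 0.5  # assumed lower bound on the multiplier
    max_rate: float = 2.0  # assumed upper bound on the multiplier

    def update(self, input_pace: float, synth_pace: float, match: float) -> float:
        """Move the rate toward synth_pace / input_pace, weighted by the match score."""
        if input_pace <= 0:
            # Silence or non-speech sounds: keep the current rate.
            return self.rate
        match = min(max(match, 0.0), 1.0)
        # Presenting at rate * input_pace equals synth_pace when rate hits this target.
        target = synth_pace / input_pace
        self.rate += match * (target - self.rate)
        self.rate = min(max(self.rate, self.min_rate), self.max_rate)
        return self.rate
```

Calling `update` once per monitoring window gives the continuous re-adjustment described in the last step of claim 1: a low match score barely moves the rate, and a high one pulls it quickly toward the pace of the synthesized speech.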
December 11, 2012