US-9065914

System and method of providing generated speech via a network

Published: June 23, 2015

Assignee: not available in the USPTO data we have

Inventors: not available in the USPTO data we have

Technical Abstract

A system and method of operating an automatic speech recognition (ASR) application over an Internet Protocol (IP) network is disclosed. The ASR application communicates over a packet network such as an Internet Protocol network or a wireless network. A grammar for recognizing received speech from a user over the IP network is selected from a plurality of grammars according to a user-selected application. A server receives information representing speech over the IP network, performs speech recognition using the selected grammar, and returns information based upon the recognized speech. Sub-grammars may be included within the grammar to recognize speech from sub-portions of a dialog with the user.
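As a rough illustration of the grammar selection described in the abstract, the Python sketch below keeps a registry of grammars keyed by application, picks one for the user-selected application, and optionally switches to a sub-grammar for a sub-portion of the dialog. Everything named here (`Grammar`, `GRAMMARS`, `select_grammar`, `recognize`, and the sample applications and phrases) is hypothetical and not taken from the patent. Plain text stands in for audio, and exact phrase matching stands in for a real recognizer.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Grammar:
    """A toy grammar: the set of phrases it accepts, plus named sub-grammars."""
    phrases: Set[str]
    sub_grammars: Dict[str, "Grammar"] = field(default_factory=dict)


# Hypothetical registry: one grammar per user-selectable application.
GRAMMARS: Dict[str, Grammar] = {
    "flight_booking": Grammar(
        phrases={"book a flight", "cancel my flight"},
        sub_grammars={"city": Grammar(phrases={"boston", "denver"})},
    ),
    "banking": Grammar(phrases={"check my balance", "transfer funds"}),
}


def select_grammar(application_id: str) -> Grammar:
    """Pick the grammar that belongs to the user-selected application."""
    return GRAMMARS[application_id]


def recognize(text: str, grammar: Grammar,
              sub_dialog: Optional[str] = None) -> Optional[str]:
    """Return the normalized phrase if the active grammar accepts it, else None.

    Assumption: text stands in for decoded audio. When sub_dialog is given,
    the matching sub-grammar is used instead of the top-level grammar.
    """
    active = grammar.sub_grammars[sub_dialog] if sub_dialog else grammar
    normalized = text.strip().lower()
    return normalized if normalized in active.phrases else None
```

For example, `recognize("Denver", select_grammar("flight_booking"), sub_dialog="city")` matches against the city sub-grammar only, while the same call without `sub_dialog` matches against the top-level flight-booking phrases.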

Patent Claims

20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method comprising: selecting a spoken dialog application from a plurality of spoken dialog applications; transmitting, over a network, an identification of the selected spoken dialog application, the spoken dialog application having a grammar identifier; selecting a grammar from a plurality of grammars based on the grammar identifier, wherein the grammar is provided by the selected spoken dialog application and chosen from a predetermined group of grammars based upon information provided by the selected spoken dialog application; transmitting digitized user speech over the network while receiving user speech which is digitized into the digitized user speech; receiving partially synthesized speech in response to the digitized user speech, wherein the selected spoken dialog application recognizes the digitized user speech using the grammar; and receiving final synthesized speech in response to the digitized user speech, wherein the receiving of the final synthesized speech occurs after receiving the partially synthesized speech.

2. The method of claim 1, wherein the digitized user speech is recognized using a sub-grammar based on a sub-component of the user speech.

3. The method of claim 2, wherein the sub-grammar is associated with a task.

4. The method of claim 1, wherein the network is an internet protocol network.

5. The method of claim 1, wherein the spoken dialog application carries on a dialog with a user communicating with a client device.

6. The method of claim 1, further comprising: receiving information associated with the final synthesized speech over the network from a client device.

7. The method of claim 1, further comprising: modifying the grammar based on the digitized user speech.

8. A system comprising: a processor; and a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising: selecting a spoken dialog application from a plurality of spoken dialog applications; transmitting, over a network, an identification of the selected spoken dialog application, the spoken dialog application having a grammar identifier; selecting a grammar from a plurality of grammars based on the grammar identifier, wherein the grammar is provided by the selected spoken dialog application and chosen from a predetermined group of grammars based upon information provided by the selected spoken dialog application; transmitting digitized user speech over the network while receiving user speech which is digitized into the digitized user speech; receiving partially synthesized speech in response to the digitized user speech, wherein the selected spoken dialog application recognizes the digitized user speech using the grammar; and receiving final synthesized speech in response to the digitized user speech, wherein the receiving of the final synthesized speech occurs after receiving the partially synthesized speech.

9. The system of claim 8, wherein the digitized user speech is recognized using a sub-grammar based on a sub-component of the digitized user speech.

10. The system of claim 9, wherein the sub-grammar is associated with a task.

11. The system of claim 8, wherein the network is an internet protocol network.

12. The system of claim 8, wherein the spoken dialog application carries on a dialog with a user communicating with a client device.

13. The system of claim 8, the computer-readable storage medium having additional instructions stored which, when executed by the processor, result in operations comprising: receiving information associated with the final synthesized speech over the network from a client device.

14. The system of claim 8, the computer-readable storage medium having additional instructions stored which, when executed by the processor, result in operations comprising: modifying the grammar based on the digitized user speech.

15. A computer-readable storage device having instructions stored which, when executed by a processor, cause the processor to perform operations comprising: selecting a spoken dialog application from a plurality of spoken dialog applications; transmitting, over a network, an identification of the selected spoken dialog application, the spoken dialog application having a grammar identifier; selecting a grammar from a plurality of grammars based on the grammar identifier, wherein the grammar is provided by the selected spoken dialog application and chosen from a predetermined group of grammars based upon information provided by the selected spoken dialog application; transmitting digitized user speech over the network while receiving user speech which is digitized into the digitized user speech; receiving partially synthesized speech in response to the digitized user speech, wherein the selected spoken dialog application recognizes the digitized user speech using the grammar; and receiving final synthesized speech in response to the digitized user speech, wherein the receiving of the final synthesized speech occurs after receiving the partially synthesized speech.

16. The computer-readable storage device of claim 15, wherein the user speech is recognized using a sub-grammar based on a sub-component of the digitized user speech.

17. The computer-readable storage device of claim 16, wherein the sub-grammar is associated with a task.

18. The computer-readable storage device of claim 15, wherein the network is an internet protocol network.

19. The computer-readable storage device of claim 15, wherein the spoken dialog application carries on a dialog with a user communicating with a client device.

20. The computer-readable storage device of claim 15, having additional instructions stored which, when executed by the computing device, result in operations comprising: receiving information associated with the final synthesized speech over the network from a client device.
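The ordering of steps recited in independent claims 1, 8 and 15 (identify the application, stream digitized speech as it is captured, then receive partially synthesized speech before final synthesized speech) can be sketched in Python as follows. This is only an illustrative reading, not the patented implementation: `FakeServer`, `digitize`, and `run_dialog` are hypothetical names, the "network" is an in-memory object, and the returned audio is placeholder bytes.

```python
import struct
from typing import Iterable, Iterator, List, Optional, Tuple


class FakeServer:
    """In-memory stand-in for the remote spoken dialog server (hypothetical)."""

    def __init__(self) -> None:
        self.application_id: Optional[str] = None
        self.chunks: List[bytes] = []

    def select_application(self, application_id: str) -> None:
        # Receives the identification of the selected spoken dialog application.
        self.application_id = application_id

    def send_chunk(self, chunk: bytes) -> None:
        # Receives one chunk of digitized user speech.
        self.chunks.append(chunk)

    def responses(self) -> Iterator[Tuple[str, bytes]]:
        # Placeholder audio: a partial result is yielded before the final one.
        yield ("partial", b"partial-audio")
        yield ("final", b"final-audio")


def digitize(samples: Iterable[float]) -> Iterator[bytes]:
    """Convert each sample in [-1.0, 1.0] to 16-bit little-endian PCM."""
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))
        yield struct.pack("<h", int(clipped * 32767))


def run_dialog(server: FakeServer, application_id: str,
               samples: Iterable[float]) -> List[Tuple[str, bytes]]:
    """Client-side flow: identify the application, stream speech, collect replies."""
    server.select_application(application_id)
    # Each chunk is sent as soon as it is digitized, so transmission
    # overlaps with capture instead of waiting for the full utterance.
    for chunk in digitize(samples):
        server.send_chunk(chunk)
    return list(server.responses())
```

Because `digitize` is a generator, `run_dialog` sends each chunk before the next sample is read, which is the sketch's stand-in for transmitting speech while it is still being received from the user.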

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G10L H04M

Patent Metadata

Filing Date

June 19, 2012

Publication Date

June 23, 2015
