Automated Software Execution Using Intelligent Speech Recognition

Published: June 5, 2018

Assignee: not available in the USPTO data we have

Technical Abstract

Patent Claims

19 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method for automated execution of computer software using intelligent speech recognition techniques, the method comprising:
    capturing, by a server computing device, a bitstream containing a digitized voice segment from a remote device as a speech file, the first digitized voice segment corresponding to speech submitted by a user of the remote device during a voice call;
    parsing, by the server computing device, the bitstream to locate the digitized voice segment;
    adjusting, by the server computing device, compression of the bitstream containing the digitized voice segment to enhance audio quality of the bitstream;
    analyzing, by the server computing device, the speech file to convert the speech file into text and extract a set of keywords from the converted text;
    displaying, by a client computing device coupled to the server computing device, the extracted keywords in a user interface of a display device;
    determining, by the server computing device, one or more computer software applications accessible to the client computing device;
    selecting, by the server computing device, at least one of the computer software applications that include functionality responsive to the keywords, comprising:
        generating an input vector comprising a sequence of numeric values, each value associated with a keyword and weighted according to a relative position of the keyword in the set of keywords,
        matching the input vector against a predefined set of vectors to determine one or more vectors that are similar to the input vector,
        identifying a label corresponding to each matched vector, wherein the label is associated with computer software functionality, and
        selecting one or more computer software applications that are associated with a most common label of the identified labels; and
    executing, by the client computing device, the functionality of the selected computer software applications that are responsive to the keywords.

2. The method of claim 1, wherein matching the input vector comprises determining, by the server computing device, a distance between the input vector and each vector in the predefined set of vectors; and choosing, by the server computing device, one or more of vectors in the predefined set of vectors where the distance is within a predetermined threshold.

3. The method of claim 1, wherein the label is an identifier that corresponds to a computer software application.

4. The method of claim 1, further comprising establishing, by the server computing device, a voice connection between the remote device and the client computing device before capturing the digitized voice segment.

5. The method of claim 1, further comprising establishing, by the server computing device, a voice connection between the remote device and an interactive voice response system before capturing the digitized voice segment.

6. The method of claim 1, further comprising displaying, by the client computing device, one or more user interface elements in the user interface that correspond to the executed functionality of the selected software applications.

7. The method of claim 1, wherein extracting a set of keywords from the converted text comprises filtering, by the server computing device, the converted text to remove stopwords.

8. The method of claim 1, wherein converting the digitized voice segment into text comprises executing, by the server computing device, a speech recognition engine on a digital file containing the digitized voice segment to generate the text.

9. The method of claim 8, further comprising analyzing, by the server computing device, the text using a grammar recognition engine to validate the generated text.

10. A system for automated execution of computer software using intelligent speech recognition techniques, the system comprising:
    a server computing device configured to
        capture a bitstream containing a digitized voice segment from a remote device as a speech file, the digitized voice segment corresponding to speech submitted by a user of the remote device during a voice call;
        parse the bitstream to locate the digitized voice segment;
        adjust compression of the bitstream containing the digitized voice segment to enhance audio quality of the bitstream;
        analyze the speech file to convert the speech file into text and extract a set of keywords from the converted text;
        determine one or more computer software applications accessible to the client computing device; and
        select at least one of the computer software applications that include functionality responsive to the keywords, comprising:
            generating, using a sequenced bag-of-words processing model, an input vector comprising a sequence of numeric values, each value associated with a keyword and weighted according to a relative position of the keyword in the set of keywords,
            matching, using a K-Nearest Neighbor processing model, the input vector against a predefined set of vectors to determine one or more vectors that are similar to the input vector,
            identifying a label corresponding to each matched vector, wherein the label is associated with computer software functionality, and
            selecting one or more computer software applications that are associated with a most common label of the identified labels; and
    a client computing device coupled to the server computing device, the client computing device configured to
        display the extracted keywords in a user interface of a display device; and
        execute the functionality of the selected computer software applications that is responsive to the keywords.

11. The system of claim 10, wherein when matching the input vector, the server computing device is configured to determine a distance between the input vector and each vector in the predefined set of vectors; and choose one or more of vectors in the predefined set of vectors where the distance is within a predetermined threshold.

12. The system of claim 10, wherein the label is an identifier that corresponds to a computer software application.

13. The system of claim 10, wherein the server computing device is configured to establish a voice connection between the remote device and the client computing device before capturing the digitized voice segment.

14. The system of claim 10, wherein the server computing device is configured to establish a voice connection between the remote device and an interactive voice response system before capturing the digitized voice segment.

15. The system of claim 10, wherein the server computing device is configured to display one or more user interface elements in the user interface that correspond to the executed functionality of the selected software applications.

16. The system of claim 10, wherein extracting a set of keywords from the converted text comprises filtering the converted text to remove stopwords.

17. The system of claim 10, wherein converting the digitized voice segment into text comprises executing a speech recognition engine on a digital file containing the digitized voice segment to generate the text.

18. The system of claim 17, wherein the server computing device is configured to analyze the text using a grammar recognition engine to validate the generated text.

19. A computer program product, tangibly embodied in a non-transitory computer readable storage device, for automated execution of computer software using intelligent speech recognition techniques, the computer program product including instructions operable to cause a server computing device to
    capture a bitstream containing a digitized voice segment from a remote device as a speech file, the digitized voice segment corresponding to speech submitted by a user of the remote device during a voice call;
    parse the bitstream to locate the digitized voice segment;
    adjust compression of the bitstream containing the digitized voice segment to enhance audio quality of the bitstream;
    analyze the speech file to convert the speech file into text and extract a set of keywords from the converted text;
    determine one or more computer software applications accessible to the client computing device; and
    select at least one of the computer software applications that include functionality responsive to the keywords, comprising:
        generating, using a sequenced bag-of-words processing model, an input vector comprising a sequence of numeric values, each value associated with a keyword and weighted according to a relative position of the keyword in the set of keywords,
        matching, using a K-Nearest Neighbor processing model, the input vector against a predefined set of vectors to determine one or more vectors that are similar to the input vector,
        identifying a label corresponding to each matched vector, wherein the label is associated with computer software functionality, and
        selecting one or more computer software applications that are associated with a most common label of the identified labels; and
the computer program product instructions operable to cause a client computing device coupled to the server computing device to
    display the extracted keywords in a user interface of a display device; and
    execute the functionality of the selected computer software applications that is responsive to the keywords.
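
Illustrative sketch (not part of the patent text). The selection step recited in claims 1, 2, and 7 (stopword filtering, a position-weighted keyword vector, distance-threshold matching, and a majority vote over labels) can be sketched in Python roughly as follows. The claims do not specify a stopword list, a vocabulary, a weighting formula, or a distance metric, so everything named here (STOPWORDS, VOCABULARY, the (n - i) / n weighting, Euclidean distance, the label strings, and the 0.5 threshold) is a hypothetical choice made only for illustration.

```python
import math
from collections import Counter

# Hypothetical stopword list and vocabulary; the claims specify neither.
STOPWORDS = {"i", "to", "my", "the", "a", "an", "please", "want"}
VOCABULARY = ["reset", "password", "transfer", "funds", "account", "balance"]


def extract_keywords(text):
    """Lowercase the text, split on whitespace, and drop stopwords (claim 7)."""
    return [word for word in text.lower().split() if word not in STOPWORDS]


def build_input_vector(keywords):
    """Build one numeric value per vocabulary term.

    Assumed weighting: the first occurrence of a keyword at index i in a
    list of n keywords gets weight (n - i) / n, so earlier keywords weigh more.
    """
    n = len(keywords)
    vector = [0.0] * len(VOCABULARY)
    for i, keyword in enumerate(keywords):
        if keyword in VOCABULARY:
            slot = VOCABULARY.index(keyword)
            if vector[slot] == 0.0:  # keep the weight of the first occurrence
                vector[slot] = (n - i) / n
    return vector


def euclidean(a, b):
    """Euclidean distance; the claims only say 'a distance'."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_label(input_vector, labeled_vectors, threshold):
    """Return the most common label among stored vectors within the threshold.

    Returns None when no stored vector is close enough.
    """
    matched = [
        label
        for vector, label in labeled_vectors
        if euclidean(input_vector, vector) <= threshold
    ]
    if not matched:
        return None
    return Counter(matched).most_common(1)[0][0]


# Hypothetical predefined vectors, each paired with an application label.
LABELED_VECTORS = [
    ([1.0, 0.5, 0.0, 0.0, 0.0, 0.0], "password_reset_app"),
    ([1.0, 0.5, 0.0, 0.0, 0.33, 0.0], "password_reset_app"),
    ([0.0, 0.0, 1.0, 0.5, 0.0, 0.0], "funds_transfer_app"),
]

keywords = extract_keywords("I want to reset my password")
input_vector = build_input_vector(keywords)
selected = select_label(input_vector, LABELED_VECTORS, threshold=0.5)
print(keywords, input_vector, selected)
```

This sketch follows the threshold-based matching of claims 2 and 11 rather than a fixed number of neighbors; a conventional K-Nearest Neighbor variant, as named in claims 10 and 19, would instead sort the stored vectors by distance and vote over the k closest.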

Patent Metadata

Filing Date

Unknown

Publication Date

June 5, 2018

Inventors

Pu Li

Yu Zhang

Jianhua Sun
