A method of providing a platform for configuring device-specific speech recognition is provided. The method includes providing a user interface for developers to select a set of at least two acoustic models appropriate for a specific type of a device, receiving, from a developer, a selection of the set of the at least two acoustic models, and configuring a speech recognition system to perform device-specific speech recognition by using one acoustic model selected from the at least two acoustic models of the set.
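The flow summarized above (a developer selects a set of at least two acoustic models for a device type, and the recognizer then uses one model from that set) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `Platform` and `DeviceConfig` classes, the metadata keys `acoustic_model` and `condition`, and the model identifiers. The two lookup paths loosely mirror the claim language in which metadata either identifies a model directly or reports a device condition from which the system picks a model.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class DeviceConfig:
    """Hypothetical record of one developer's selection for a device type."""
    device_type: str
    model_ids: List[str]
    condition_map: Dict[str, str] = field(default_factory=dict)


class Platform:
    """Illustrative stand-in for the configuration platform (assumed design)."""

    def __init__(self, available_models: Iterable[str]) -> None:
        self.available = set(available_models)
        self.configs: Dict[str, DeviceConfig] = {}

    def receive_selection(
        self,
        device_type: str,
        model_ids: List[str],
        condition_map: Optional[Dict[str, str]] = None,
    ) -> DeviceConfig:
        # Drop duplicates while keeping the developer's ordering.
        unique = list(dict.fromkeys(model_ids))
        if len(unique) < 2:
            raise ValueError("a set must contain at least two acoustic models")
        unknown = [m for m in unique if m not in self.available]
        if unknown:
            raise ValueError(f"unknown acoustic models: {unknown}")
        condition_map = dict(condition_map or {})
        for condition, model in condition_map.items():
            if model not in unique:
                raise ValueError(f"condition {condition!r} maps outside the set")
        config = DeviceConfig(device_type, unique, condition_map)
        self.configs[device_type] = config
        return config

    def select_model(self, device_type: str, metadata: Dict[str, str]) -> str:
        config = self.configs[device_type]
        # Path 1 (assumed key): metadata names a model of the set directly.
        requested = metadata.get("acoustic_model")
        if requested in config.model_ids:
            return requested
        # Path 2 (assumed key): metadata reports a device condition.
        condition = metadata.get("condition")
        if condition in config.condition_map:
            return config.condition_map[condition]
        # Fallback chosen for this sketch only: first model in the set.
        return config.model_ids[0]


platform = Platform(["quiet_room", "car_idle", "car_highway"])
platform.receive_selection(
    "car", ["car_idle", "car_highway"], {"highway": "car_highway"}
)
print(platform.select_model("car", {"condition": "highway"}))
```

The fallback to the first model in the set is a design choice made for the sketch; the source text does not say what happens when metadata is missing or unrecognized.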
Legal claims defining the scope of protection, as filed with the USPTO.
1. A method of providing a platform for configuring device-specific speech recognition, the method comprising: providing a user interface for developers to select a set of at least two acoustic models appropriate for a specific type of a device; receiving, from a developer, a selection of the set of the at least two acoustic models; and configuring a speech recognition system to perform device-specific speech recognition by using one acoustic model selected from the at least two acoustic models of the set.
2. The method of claim 1, wherein the device-specific speech recognition includes: receiving, from the device of the specific type, speech audio including natural language utterances and metadata associated with the received speech audio; selecting one acoustic model of the at least two acoustic models of the set in dependence upon the received metadata; and using the selected acoustic model to recognize speech from the natural language utterances included in the received speech audio.
3. The method of claim 2, wherein the metadata identifies an acoustic model of the set according to the specific type of the device.
4. The method of claim 2, wherein the metadata identifies a specific device condition of the device and the speech recognition system selects the one acoustic model from the set in dependence upon the specific device condition.
5. The method of claim 1, further comprising: receiving a custom acoustic model appropriate for the specific type of the device; and providing the custom acoustic model within the user interface to be selected as one of the acoustic models of the set.
6. The method of claim 5, further comprising: receiving custom noise data from the developer; training the custom acoustic model using the custom noise data and clean speech data; and providing the trained custom acoustic model to be selected to be used in the device-specific speech recognition.
7. The method of claim 1, further comprising: receiving training data appropriate for the specific type of the device; training an acoustic model using the received training data; and providing the trained acoustic model within the user interface to be selected as one of the acoustic models of the set.
8. A method of using a platform for configuring device-specific speech recognition, the method comprising: selecting, through a developer interface provided by a computer system, a set of at least two acoustic models appropriate for a specific type of a device; providing custom noise data through the developer interface; receiving, through the developer interface, a trained custom acoustic model that has been trained using (i) the custom noise data provided through the developer interface and (ii) clean speech data; and providing speech audio with metadata to a speech recognition system associated with the platform.
9. The method of claim 8, further comprising: providing, to the developer interface, a custom acoustic model appropriate for the specific type of the device, wherein the set of selected acoustic models includes the provided custom acoustic model.
10. The method of claim 8, further comprising: providing training data for training an acoustic model appropriate to the specific type of the device; and selecting, from within the developer interface, an acoustic model trained on the provided training data.
11. The method of claim 8, wherein the metadata identifies an acoustic model of the set according to the specific type of the device.
12. The method of claim 8, wherein the metadata identifies a specific device condition and the computer system selects one acoustic model from the set in dependence upon the specific device condition.
13. The method of claim 8, further comprising using the trained custom acoustic model on a local device of the specific type.
14. The method of claim 8, further comprising selecting the trained custom acoustic model to be used in speech recognition in the speech recognition system.
15. A non-transitory computer-readable recording medium having computer instructions recorded thereon, the computer instructions, when executed by one or more processors, causing the one or more processors to perform operations comprising: providing a user interface for developers to select a set of at least two acoustic models appropriate for a specific type of a device; receiving, from a developer, a selection of the set of the at least two acoustic models; and configuring a speech recognition system to perform device-specific speech recognition by using one acoustic model selected from the at least two acoustic models of the set.
16. A computer system comprising one or more processors coupled to memory, the memory storing computer instructions thereon, the computer instructions, when executed by the one or more processors, causing the one or more processors to perform operations comprising: providing a user interface for developers to select a set of at least two acoustic models appropriate for a specific type of a device; receiving, from a developer, a selection of the set of the at least two acoustic models; and configuring a speech recognition system to perform device-specific speech recognition by using one acoustic model selected from the at least two acoustic models of the set.
17. The computer system of claim 16, wherein the device-specific speech recognition includes: receiving, from the device of the specific type, speech audio including natural language utterances and metadata associated with the received speech audio; selecting one acoustic model of the at least two acoustic models of the set in dependence upon the received metadata; and using the selected acoustic model to recognize speech from the natural language utterances included in the received speech audio.
18. The computer system of claim 17, wherein the metadata identifies an acoustic model of the set according to the specific type of the device.
19. The computer system of claim 17, wherein the metadata identifies a specific device condition of the device and the speech recognition system selects the one acoustic model from the set in dependence upon the specific device condition.
20. The computer system of claim 16, wherein the operations further comprise: receiving a custom acoustic model appropriate for the specific type of the device; and providing the custom acoustic model within the user interface to be selected as one of the acoustic models of the set.
21. The computer system of claim 20, wherein the operations further comprise: receiving custom noise data from the developer; training the custom acoustic model using the custom noise data and clean speech data; and providing the trained custom acoustic model to be selected to be used in the device-specific speech recognition.
22. The computer system of claim 16, wherein the operations further comprise: receiving training data appropriate for the specific type of the device; training an acoustic model using the received training data; and providing the trained acoustic model within the user interface to be selected as one of the acoustic models of the set.
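Claims 6, 8, and 21 recite training a custom acoustic model from custom noise data and clean speech data, without saying how the two are combined. One common way to prepare such training audio, offered here only as an assumption and not as the patented method, is to add the noise to the clean speech at a chosen signal-to-noise ratio. The helper names `rms` and `mix_at_snr` and the `snr_db` parameter are hypothetical, and samples are treated as plain lists of floats to keep the sketch dependency-free.

```python
import math
from typing import List, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a non-empty sample sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(
    clean: Sequence[float], noise: Sequence[float], snr_db: float
) -> List[float]:
    """Add noise to clean speech so the result has the requested SNR in dB.

    The noise clip is repeated (looped) to cover the clean signal's length.
    """
    if not clean or not noise:
        raise ValueError("clean and noise must both be non-empty")
    looped = [noise[i % len(noise)] for i in range(len(clean))]
    noise_level = rms(looped)
    if noise_level == 0.0:
        # Silent noise clip: nothing to add.
        return list(clean)
    # SNR(dB) = 20 * log10(rms(clean) / rms(scaled noise))
    target_noise_level = rms(clean) / (10.0 ** (snr_db / 20.0))
    gain = target_noise_level / noise_level
    return [c + gain * n for c, n in zip(clean, looped)]


# Synthetic stand-ins for real recordings (illustrative values only).
clean_speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
custom_noise = [((i * 7919) % 101) / 50.0 - 1.0 for i in range(500)]
noisy_example = mix_at_snr(clean_speech, custom_noise, snr_db=10.0)
```

Mixing the same clean utterances at several SNR values is a typical way to widen the range of conditions a model sees, though the source does not state which levels, if any, the platform uses.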
April 21, 2021
June 21, 2022