Legal claims defining the scope of protection, as filed with the USPTO.
1. A method of speech recognition comprising the steps of: (a) receiving speech in a vehicle; (b) extracting acoustic data from the received speech; and (c) applying a vehicle-specific inverse impulse response function to the extracted acoustic data to produce normalized acoustic data.
2. The method of claim 1, further comprising the steps of: (d) pre-processing the normalized acoustic data to extract normalized acoustic feature vectors; and (e) decoding the normalized acoustic feature vectors using as input at least one of a plurality of global acoustic models, wherein each model is distinguished from the other models based on a Lombard level.
3. The method of claim 2, further comprising the steps of: (f) calculating the Lombard level of vehicle noise; and (g) selecting the global acoustic model of the plurality of global acoustic models that corresponds to the calculated Lombard level for application during the decoding step (e).
4. The method of claim 2, wherein the global acoustic models are based on a Lombard speech corpus developed in a sound-controlled environment using a plurality of speakers, a plurality of different levels of noises, and a plurality of different utterances.
5. The method of claim 4, wherein the global acoustic models are built after extracting acoustic features of speech from the Lombard speech corpus without noise reduction.
6. The method of claim 4, wherein the global acoustic models are built by selecting a Lombard level, gathering speech data corresponding to the selected Lombard level from the Lombard speech corpus, selecting one or more vehicles that exhibit noise at the selected Lombard level, mixing the gathered speech data with acoustic data from the selected one or more vehicles, and extracting acoustic features of speech from the Lombard speech corpus with noise reduction.
7. The method of claim 2, wherein the decoding step is performed in-vehicle.
8. The method of claim 2, wherein the decoding step is performed in a remote server.
9. The method of claim 1, wherein the inverse impulse response function is determined by first determining an impulse response function for the vehicle and then mathematically calculating the inverse of the impulse response function.
10. The method of claim 1, wherein the inverse impulse response function is determined by determining correlation and/or covariance between an audio signal received from an integrated vehicle microphone (IVM) on a first channel and an audio signal received from a mouth reference position (MRP) microphone at a second channel, wherein the IVM is designated as an input and the MRP microphone is designated as an output.
11. A method of speech recognition for a plurality of vehicles, comprising the steps of: (a) developing a corpus of Lombard speech data; (b) building a plurality of global acoustic models based on the corpus of Lombard speech data; (c) receiving speech in a vehicle using an integrated vehicle microphone; (d) generating an inverse impulse response function for each of the plurality of vehicles; (e) extracting acoustic data from the received speech; and (f) applying the vehicle-specific inverse impulse response function to the extracted acoustic data to produce normalized acoustic data.
12. The method of claim 11, wherein the corpus of Lombard speech data includes a plurality of Lombard levels, and wherein each model of the plurality of global acoustic models is distinguished from the other models based on a Lombard level of the plurality of Lombard levels.
13. The method of claim 11, wherein the inverse impulse response function is determined by first determining an impulse response function for the vehicle and then mathematically calculating the inverse of the impulse response function.
14. The method of claim 11, wherein the inverse impulse response function is determined by determining correlation and/or covariance between an audio signal received from an integrated vehicle microphone (IVM) on a first channel and an audio signal received from a mouth reference position (MRP) microphone at a second channel, wherein the IVM is designated as an input and the MRP microphone is designated as an output.
15. The method of claim 11, wherein the Lombard speech corpus is developed in a sound-controlled environment using a plurality of speakers, a plurality of different levels of noises, and a plurality of different utterances.
16. The method of claim 15, wherein the global acoustic models are built after extracting acoustic features of speech from the Lombard speech corpus without noise reduction.
17. The method of claim 16, wherein the global acoustic models are built by selecting a Lombard level, gathering speech data corresponding to the selected Lombard level from the Lombard speech corpus, selecting one or more vehicles that exhibit noise at the selected Lombard level, mixing the gathered speech data with acoustic data from the selected one or more vehicles, and extracting acoustic features of speech from the Lombard speech corpus with noise reduction.
18. The method of claim 11, further comprising the steps of: (g) pre-processing the normalized acoustic data to extract acoustic feature vectors; (h) decoding the normalized acoustic feature vectors using as input at least one of a plurality of global acoustic models, wherein each model is distinguished from the other models based on a Lombard level of a Lombard speech corpus covering a plurality of vehicles; (i) calculating the Lombard level of vehicle noise; and (j) selecting the at least one of the plurality of global acoustic models that corresponds to the calculated Lombard level for application during the decoding step (h).
19. The method of claim 18 wherein the decoding step is performed in a remote server.
20. A method of speech recognition for a plurality of vehicles, comprising the steps of: (a) developing a corpus of Lombard speech including a plurality of Lombard levels; (b) building a plurality of global acoustic models based on the corpus of Lombard speech data, wherein each model of the plurality of global acoustic models is distinguished from the other models based on a Lombard level of the plurality of Lombard levels; (c) receiving speech in a vehicle using an integrated vehicle microphone; (d) generating an inverse impulse response function for each of the plurality of vehicles, wherein the inverse impulse response function is determined by at least one of first determining an impulse response function for the vehicle and then mathematically calculating the inverse of the impulse response function, or determining correlation and/or covariance between an audio signal received from an integrated vehicle microphone (IVM) on a first channel and an audio signal received from a mouth reference position (MRP) microphone at a second channel wherein the IVM is designated as an input and the MRP microphone is designated as an output; (e) extracting acoustic data from the received speech; (f) applying the vehicle-specific inverse impulse response function to the extracted acoustic data to produce normalized acoustic data; (g) pre-processing the normalized acoustic data to extract acoustic feature vectors; (h) decoding the normalized acoustic feature vectors using as input at least one of a plurality of global acoustic models, wherein each model is distinguished from the other models based on a Lombard level of a Lombard speech corpus covering a plurality of vehicles; (i) calculating the Lombard level of vehicle noise; and (j) selecting the at least one of the plurality of global acoustic models that corresponds to the calculated Lombard level for application during the decoding step (h).
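For illustration only, the normalization and model-selection steps recited in the claims can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the claimed or patented implementation: the claims do not specify how the inverse is computed, so the sketch assumes a causal, finite impulse response whose first tap is nonzero and whose inverse is stable, and it inverts that response by a simple time-domain recursion. All function names, the sample values, and the mapping from Lombard levels to models are hypothetical.

```python
from typing import Dict, List, Sequence


def convolve(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def inverse_impulse_response(h: Sequence[float], length: int) -> List[float]:
    """Truncated inverse g of a causal impulse response h.

    Assumption: h[0] != 0. The result satisfies (h * g)[n] = 1 for n = 0
    and 0 for 0 < n < length.
    """
    if not h or h[0] == 0:
        raise ValueError("impulse response needs a nonzero first tap")
    g: List[float] = []
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, min(n, len(h) - 1) + 1):
            acc -= h[k] * g[n - k]
        g.append(acc / h[0])
    return g


def normalize(acoustic: Sequence[float], g: Sequence[float]) -> List[float]:
    """Apply the inverse response to extracted acoustic data."""
    return convolve(acoustic, g)[: len(acoustic)]


def select_model(lombard_level: float, models: Dict[int, str]) -> str:
    """Pick the model whose Lombard level is nearest the calculated one."""
    nearest = min(models, key=lambda level: abs(level - lombard_level))
    return models[nearest]


# Hypothetical vehicle impulse response and a short "clean" test signal.
vehicle_h = [1.0, 0.5, 0.25]
clean = [1.0, -2.0, 3.0, 0.5]
received = convolve(clean, vehicle_h)[: len(clean)]

inverse_g = inverse_impulse_response(vehicle_h, len(clean))
normalized = normalize(received, inverse_g)

# Hypothetical model table keyed by Lombard level.
models = {0: "model_level_0", 1: "model_level_1", 2: "model_level_2"}
chosen = select_model(1.2, models)
```

In this toy case the normalized samples match the clean signal up to floating-point rounding, because the truncated inverse exactly cancels the assumed response over the first few samples. A measured cabin response would generally call for a regularized or frequency-domain inverse instead.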
March 9, 2010