Globally Optimized Least-Squares Post-Filtering for Speech Enhancement

Published: August 1, 2017

Assignee: not available in USPTO data we have

Inventors: Yiteng HUANG, Alejandro LUEBS, Jan SKOGLUND, Willem Bastiaan KLEIJN

Technical Abstract

Patent Claims

17 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A computer-implemented method, comprising: receiving audio signals via a microphone array from sound sources in an environment; hypothesizing multiple sound field scenarios to generate multiple output signals, including hypothesizing a point interferer, diffuse noise, and white noise, based on the received audio signals; calculating fixed beamformer coefficients based on the received audio signals; determining covariance matrix models based on the multiple output signals; calculating a covariance matrix based on the received audio signals; estimating power of the sound sources to find a solution that minimizes the difference between the determined covariance matrix models and the calculated covariance matrix; calculating and applying post-filter coefficients based on the estimated power; and generating an output audio signal based on the received audio signals and the post-filter coefficients.

2. The method of claim 1, wherein the multiple generated output signals are compared and the output signal with the highest signal-to-noise ratio among the multiple output generated signals is selected as the final output signal.

3. The method of claim 1, wherein the estimating of the power is based on a Frobenius norm.

4. The method of claim 3, wherein the Frobenius norm is computed using the Hermitian symmetry of the covariance matrices.

5. The method of claim 1, further comprising: determining the location of at least one of the sound sources using sound-source location methods to hypothesize the sound field scenarios, determine the covariance matrix models, and calculate the covariance matrix.

6. The method of claim 1, wherein the covariance matrix models are generated based on the plurality of hypothesized sound field scenarios.

7. The method of claim 6, wherein a covariance matrix model is selected to maximize an objective function that reduces noise.

8. The method of claim 7, wherein an objective function is the sample variance of the final output audio signal.

9. An apparatus, comprising: one or more processing devices and one or more storage devices storing instructions that, when executed by the one or more processing devices, cause the one or more processing devices to: receive audio signals via a microphone array from sound sources in an environment; hypothesize sound field scenarios to generate multiple output signals, including hypothesizing a point interferer, diffuse noise, and white noise, based on the received audio signals; calculate fixed beamformer coefficients based on the received audio signals; determine covariance matrix models based on the multiple output signals; calculate a covariance matrix based on the received audio signals; estimate power of the sound sources to find a solution that minimizes the difference between the determined covariance matrix models and the calculated covariance matrix; calculate and apply post-filter coefficients based on the estimated power; and generate an output audio signal based on the received audio signals and the post-filter coefficients.

10. An apparatus of claim 9, wherein the multiple generated output signals are compared and the output signal with the highest signal-to-noise ratio among the multiple output generated signals is selected as the final output signal.

11. An apparatus of claim 9, wherein the estimating of the power is based on a Frobenius norm.

12. An apparatus of claim 11, wherein the Frobenius norm is computed using a Hermitian symmetry of the covariance matrices.

13. An apparatus of claim 9, further comprising: determining the location of at least one of the sound sources using sound-source location methods to hypothesize the sound field scenarios, determine the covariance matrix models, and calculate the covariance matrix.

14. A non-transitory computer-readable medium, comprising sets of instructions for: receiving audio signals via a microphone array from sound sources in an environment; hypothesizing sound field scenarios to generate multiple output signals, including hypothesizing a point interferer, diffuse noise, and white noise, based on the received audio signals; calculating fixed beamformer coefficients based on the received audio signals; determining covariance matrix models based on the multiple output signals; calculating a covariance matrix based on the received audio signals; estimating power of the sound sources to find a solution that minimizes the difference between the determined covariance matrix models and the calculated covariance matrix; calculating and applying post-filter coefficients based on the estimated power; and generating an output audio signal based on the received audio signals and the post-filter coefficients.

15. A non-transitory computer-readable medium of claim 14, wherein the multiple generated output signals are compared and the output signal with the highest signal-to-noise ratio among the multiple output generated signals is selected as the final output signal.

16. A non-transitory computer-readable medium of claim 14, wherein the estimating of the power is based on a Frobenius norm.

17. A non-transitory computer-readable medium of claim 16, wherein the Frobenius norm is computed using a Hermitian symmetry of the covariance matrices.
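
The power-estimation step recited in claims 1, 3, and 4 can be illustrated with a small sketch. It is not the patented implementation; it only shows one way to pick non-negative powers p_k so that the weighted sum of covariance matrix models M_k is closest, in Frobenius norm, to a measured covariance matrix R, visiting only the diagonal and upper triangle because the matrices are Hermitian. All function names, the three-microphone steering vectors, the diffuse coherence values, and the clipping of negative solutions to zero are assumptions made for illustration.

```python
# Illustrative sketch: every name and every example matrix here is an assumption.

def outer(d):
    """Rank-one covariance model d d^H for a steering vector d."""
    return [[complex(x) * complex(y).conjugate() for y in d] for x in d]

def herm_inner(a, b):
    """Real Frobenius inner product Re tr(A^H B) of two Hermitian matrices,
    computed from the diagonal and the upper triangle only."""
    n = len(a)
    total = 0.0
    for i in range(n):
        total += (a[i][i].conjugate() * b[i][i]).real
        for j in range(i + 1, n):
            # Entry (j, i) is the conjugate of (i, j), so the pair adds 2 * Re.
            total += 2.0 * (a[i][j].conjugate() * b[i][j]).real
    return total

def solve(g, b):
    """Solve the small real system G p = b by Gauss-Jordan elimination with
    partial pivoting. Assumes G is non-singular (models linearly independent)."""
    n = len(b)
    aug = [list(map(float, g[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def estimate_powers(sample_cov, models):
    """Powers p minimizing || R - sum_k p_k M_k ||_F^2 via the normal equations."""
    gram = [[herm_inner(mk, ml) for ml in models] for mk in models]
    rhs = [herm_inner(mk, sample_cov) for mk in models]
    # Clipping negative solutions to zero is an assumption, not from the claims.
    return [max(p, 0.0) for p in solve(gram, rhs)]

def combine(powers, models):
    """Build sum_k p_k M_k; used here only to synthesize a test covariance."""
    n = len(models[0])
    return [[sum(p * m[i][j] for p, m in zip(powers, models))
             for j in range(n)] for i in range(n)]

# Hypothetical 3-microphone models for one frequency bin.
models = [
    outer([1, 1, 1]),                                       # desired point source
    outer([1, 1j, -1]),                                     # point interferer
    [[1.0, 0.5, 0.2], [0.5, 1.0, 0.5], [0.2, 0.5, 1.0]],    # diffuse noise (made-up coherence)
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],    # white noise
]
true_powers = [2.0, 0.5, 0.3, 0.1]
sample_cov = combine(true_powers, models)
estimated = estimate_powers(sample_cov, models)
```

Because the synthetic covariance is built exactly from the four models, the fit should return the original powers up to floating-point rounding. With a covariance estimated from real audio frames the fit would only be approximate, and the models must be linearly independent for the system to be solvable (with only two microphones, for example, a broadside source, diffuse, and white model of this form would not be).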

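The claims state that post-filter coefficients are calculated from the estimated powers and applied to form the output, but they do not give a formula. The sketch below assumes a Wiener-style gain for one frequency bin: the desired-source power at the output of the fixed beamformer divided by the total modeled power at that output. The function names, the gain floor, and the two-microphone example values are all hypothetical.

```python
# Illustrative sketch: the Wiener-style gain and all names are assumptions.

def quad_form(h, m):
    """Real value of h^H M h for beamformer weights h and Hermitian matrix M."""
    n = len(h)
    return sum(complex(h[i]).conjugate() * m[i][j] * h[j]
               for i in range(n) for j in range(n)).real

def post_filter_gain(h, powers, models, desired=0, floor=0.1):
    """Desired power over total power at the beamformer output, with a floor."""
    out = [p * quad_form(h, m) for p, m in zip(powers, models)]
    total = sum(out)
    if total <= 0.0:
        return floor
    return max(out[desired] / total, floor)

def enhance_bin(x, h, gain):
    """Apply the fixed beamformer y = h^H x, then scale y by the post-filter gain."""
    y = sum(complex(hi).conjugate() * xi for hi, xi in zip(h, x))
    return gain * y

# Hypothetical two-microphone example for one frequency bin.
h = [0.5, 0.5]                                   # delay-and-sum weights toward broadside
models = [
    [[1.0, 1.0], [1.0, 1.0]],                    # desired source at broadside
    [[1.0, 0.0], [0.0, 1.0]],                    # white noise
]
powers = [2.0, 0.5]                              # e.g. from a least-squares fit
gain = post_filter_gain(h, powers, models)
y = enhance_bin([1 + 0j, 1 + 0j], h, gain)
```

In this example the beamformer passes the desired source with unit power gain and halves the white-noise power, so the gain is 2.0 / 2.25. The floor limits how hard a bin is attenuated, which is a common way to reduce musical-noise artifacts; its value here is arbitrary.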
Patent Metadata

Filing Date

Unknown
