US-8346545

Model-based distortion compensating noise reduction apparatus and method for speech recognition

Published: January 1, 2013

Assignee: not available in the USPTO data

Inventors: not available in the USPTO data

Technical Abstract

A model-based distortion compensating noise reduction apparatus for speech recognition, includes: a speech absence probability calculator for calculating the probability distribution for absence and existence of a speech using the sound absence and existence information for the frames; a noise estimation updater for estimating a more accurate noise component by updating the variance of the clean speech and noise for each frame; and a speech absence probability-based noise filter for outputting a first clean speech through the speech absence probability transmitted from the speech absence probability calculator and a first noise filter. Further, the model-based distortion compensating noise reduction apparatus includes a post probability calculator for calculating post probabilities for mixtures using a GMM containing a clean speech in the first clean speech; and a final filter designer for forming a second noise filter and outputting an improved final clean speech signal using the second noise filter.
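As an illustration of the first stage described in the abstract, the following is a minimal Python sketch, not the patented implementation. It assumes per-bin power spectra, a Gaussian likelihood-ratio form for the speech absence probability, a decision-directed prior signal-to-noise ratio, and a Wiener gain. The function name `denoise_frame` and the parameters `beta`, `alpha_min`, and `q_absent` are hypothetical and do not appear in the patent.

```python
import math


def denoise_frame(noisy_power, noise_var, prev_clean_power,
                  beta=0.98, alpha_min=0.9, q_absent=0.5):
    """Process one frame of per-bin power values (all names are illustrative).

    Returns (clean_power, updated_noise_var, speech_absence_probabilities).
    """
    clean_power, new_noise_var, saps = [], [], []
    for y, n, s_prev in zip(noisy_power, noise_var, prev_clean_power):
        post_snr = y / n
        # Decision-directed prior SNR: previous-frame clean speech blended
        # with the current instantaneous estimate via smoothing parameter beta.
        prior_snr = beta * s_prev / n + (1.0 - beta) * max(post_snr - 1.0, 0.0)

        # Likelihood ratio of "speech present" to "speech absent" under an
        # assumed complex Gaussian model; the exponent is capped for safety.
        exponent = min(post_snr * prior_snr / (1.0 + prior_snr), 50.0)
        ratio = math.exp(exponent) / (1.0 + prior_snr)
        sap = 1.0 / (1.0 + (1.0 - q_absent) / q_absent * ratio)

        # Noise variance update: the smoothing factor approaches 1 (no update)
        # when speech is likely present, and alpha_min when it is absent.
        alpha = alpha_min + (1.0 - alpha_min) * (1.0 - sap)
        n_new = alpha * n + (1.0 - alpha) * y

        # Wiener gain scaled by the speech presence probability (1 - sap).
        gain = (1.0 - sap) * prior_snr / (1.0 + prior_snr)

        clean_power.append(gain * gain * y)
        new_noise_var.append(n_new)
        saps.append(sap)
    return clean_power, new_noise_var, saps


# Two bins: one that looks like pure noise, one dominated by speech.
clean, noise, sap = denoise_frame([1.0, 100.0], [1.0, 1.0], [0.0, 100.0])
```

In this sketch the noise-like bin is fully suppressed while the speech-dominated bin passes almost unchanged, and its noise estimate is left nearly untouched because the speech absence probability there is close to zero.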

Patent Claims

20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A model-based distortion compensating noise reduction apparatus for speech recognition, the apparatus comprising: a speech absence probability calculator for calculating the probability distribution for absence and existence of a speech by using the sound absence and existence information for frames; a noise estimation updater for estimating a more accurate noise component by updating the variance of the clean speech and noise for each frame; a speech absence probability-based noise filter for outputting a first clean speech through the speech absence probability transmitted from the speech absence probability calculator and a first noise filter; a post probability calculator for calculating post probabilities for mixtures using a Gaussian mixture model (GMM) containing a clean speech in the first clean speech; and a final filter designer for forming a second noise filter and outputting an improved final clean speech signal using the second noise filter.

2. The apparatus of claim 1, further comprising a frame divider for converting the input speech signal into a digital signal and dividing the converted digital signal into the frames of a predetermined length.

3. The apparatus of claim 1, further comprising a noise estimator for estimating noise for the frames.

4. The apparatus of claim 1, wherein the first and second noise filters are based on a Wiener filter.

5. The apparatus of claim 1, wherein the first noise filter uses a clean speech obtained from a previous frame and a first prior signal-to-noise ratio calculated using a preset smoothing parameter value.

6. The apparatus of claim 1, wherein the second noise filter uses a clean speech calculated through a previous frame, a variance ratio of the clean speech to noise, and a second prior signal-to-noise ratio calculated using a preset smoothing parameter value.

7. The apparatus of claim 1, wherein the speech absence probability calculator calculates the probability distribution of absence and existence of a speech, and calculates the speech absence probability of the speech for the current frame based on the probability distribution.

8. The apparatus of claim 1, wherein the noise estimation updater outputs a final estimation value of noise by updating the variance of a clean speech and noise for the frames using the smoothing parameters for the temporal frames determined based on the speech absence probabilities.

9. The apparatus of claim 1, further comprising a clean speech estimator for moving the first clean speech to a clean speech distribution region to compensate for distortion by using an average value of mixtures larger than a preset value in the calculated post probability value.

10. The apparatus of claim 9, wherein the clean speech estimator for obtaining a clean speech estimation value from which distortion is removed, by moving the first clean speech to a speech distribution region having no distortion using the average value of the mixtures close to the first clean speech.

11. A model-based distortion compensating noise reduction method for speech recognition, the method comprising: calculating the probability distribution for absence and existence of a speech by using the sound absence and existence information for the frames; estimating a more accurate noise component by updating the variance of the clean speech and noise for each frame; outputting a first clean speech through the speech absence probability transmitted from the speech absence probability calculator and a first noise filter; calculating post probabilities for mixtures using a Gaussian mixture model (GMM) containing a clean speech in the first clean speech; and forming a second noise filter and outputting an improved second clean speech signal using the second noise filter using a clean speech estimation value obtained through the post probabilities.

12. The method of claim 11, further comprising converting the input speech signal into a digital signal, and dividing the converted digital signal into frames of a predetermined length.

13. The method of claim 11, wherein said calculating a speech absence probability includes estimating noise by calculating the probability distribution of absence and existence of a speech for the frames.

14. The method of claim 11, wherein the first and second noise filters are based on a Wiener filter.

15. The method of claim 11, wherein the first noise filter uses a clean speech obtained from a previous frame and a first prior signal-to-noise ratio calculated using a preset smoothing parameter value.

16. The method of claim 11, wherein the second noise filter uses a clean speech obtained from a previous frame, a variance ratio of the clean speech to noise, and a second prior signal-to-noise ratio calculated using a preset smoothing parameter value.

17. The method of claim 11, wherein said outputting a second clean speech signal further comprising moving the first clean speech to a clean speech distribution region to compensate for distortion by using an average value of mixtures larger than a preset value in the calculated post probability value.

18. The method of claim 17, wherein, by adding the average value of the mixtures to a preset weight, the clean speech estimation value from which distortion is removed is obtained by compensating for the first clean speech.

19. The method of claim 11, wherein said calculating a speech absence probability calculates the probability distribution of absence and existence of a speech, and calculates the speech absence probability of the speech for the current frame based on the probability distribution.

20. The method of claim 11, wherein said estimating a more accurate noise component outputs a final estimation value of noise by updating a variance of a clean speech and noise for the frames using the smoothing parameters for the temporal frames determined based on the speech absence probabilities.
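The post probability calculation and the distortion compensation recited in claims 9, 10, 17, and 18 can be sketched roughly as follows. This is one possible reading under stated assumptions, not the claimed method itself: it assumes a diagonal-covariance GMM over a feature vector of the first clean speech, a posterior-weighted average of the means of mixtures whose posterior exceeds a threshold, and a linear blend toward that average. The names `gmm_posteriors`, `compensate`, `threshold`, and `blend` are hypothetical.

```python
import math


def gmm_posteriors(x, weights, means, variances):
    """Posterior probability of each mixture given feature vector x.

    Assumes diagonal covariances; uses the log-sum-exp trick for stability.
    """
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, m, v in zip(x, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        log_terms.append(ll)
    peak = max(log_terms)
    unnorm = [math.exp(t - peak) for t in log_terms]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def compensate(x, posteriors, means, threshold=0.1, blend=0.5):
    """Move x toward the posterior-weighted mean of the dominant mixtures.

    Only mixtures with posterior above `threshold` contribute; `blend` is an
    illustrative preset weight (0 keeps x unchanged, 1 replaces it).
    """
    kept = [(p, mu) for p, mu in zip(posteriors, means) if p > threshold]
    if not kept:
        return list(x)  # no mixture is confident enough; leave x as is
    norm = sum(p for p, _ in kept)
    target = [sum(p * mu[d] for p, mu in kept) / norm for d in range(len(x))]
    return [(1.0 - blend) * xi + blend * t for xi, t in zip(x, target)]


# Toy two-mixture clean-speech model and a slightly distorted observation.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [10.0, 10.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
first_clean = [1.0, 1.0]

post = gmm_posteriors(first_clean, weights, means, variances)
estimate = compensate(first_clean, post, means)
```

In a pipeline like the one claimed, the compensated estimate would then feed the prior signal-to-noise ratio of the second noise filter; that step is omitted here to keep the sketch short.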

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G10L

Patent Metadata

Filing Date

November 25, 2009

Publication Date

January 1, 2013
