US-11312164

Frequency band extension in an audio signal decoder

Published: April 26, 2022

Assignee: not available in USPTO data we have

Inventors: not available in USPTO data we have

Technical Abstract

A method is provided for extending the frequency band of an audio signal during a decoding or improvement process. The method includes obtaining the decoded signal in a first frequency band, referred to as the low band. Tonal components and a surround signal are extracted from the low-band signal, and the tonal components and the surround signal are combined by adaptive mixing using energy-level control factors to obtain an audio signal, referred to as a combined signal. Either the low-band decoded signal before the extraction step or the combined signal after the combination step is extended over at least one second frequency band that is higher than the first frequency band. Also provided are a frequency-band extension device that implements the described method and a decoder including such a device.

Patent Claims

22 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method, comprising: obtaining a decoded audio signal, wherein the decoded audio signal is decoded in a first frequency band; extending frequencies of the decoded audio signal into a second frequency band, wherein the extension of frequencies is arranged to produce a frequency-extended decoded audio signal, wherein the second frequency band is higher than the first frequency band; obtaining an ambience signal by computing a mean value of a frequency spectrum of the frequency-extended decoded audio signal; obtaining dominant tonal components from the frequency-extended decoded audio signal, wherein the dominant tonal components are tonal components, wherein the tonal components comprise magnitudes, wherein the magnitudes exceed a threshold, wherein obtaining the dominant tonal components comprises subtracting the obtained ambience signal from the frequency-extended decoded audio signal; and combining the dominant tonal components and the ambience signal using adaptive mixing and energy level control factors to obtain a combined signal.
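The split-and-remix steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim only requires "a mean value" of the spectrum, a threshold, and "energy level control factors", so the sliding-window mean, the parameter names (`half_window`, `threshold`, `tonal_gain`, `ambience_gain`), their defaults, and the energy-matching rule below are all assumptions chosen for illustration. The sketch also works on magnitudes only and ignores sign and phase. In claim 1 the input would be the spectrum of the frequency-extended decoded signal.

```python
import math


def split_and_mix(spectrum, half_window=3, threshold=0.0,
                  tonal_gain=1.0, ambience_gain=0.5):
    """Split a spectrum into ambience and dominant tonal parts, then remix.

    All parameter names and default values are hypothetical.
    """
    mags = [abs(x) for x in spectrum]
    n = len(mags)

    # Ambience: mean of the magnitudes in a window around each bin
    # (assumption: a sliding local mean; the claim only says "a mean value").
    ambience = []
    for k in range(n):
        lo, hi = max(0, k - half_window), min(n, k + half_window + 1)
        ambience.append(sum(mags[lo:hi]) / (hi - lo))

    # Dominant tonal components: subtract the ambience and keep only the
    # residuals that exceed the threshold.
    tonal = []
    for m, a in zip(mags, ambience):
        residual = m - a
        tonal.append(residual if residual > threshold else 0.0)

    # Adaptive mixing: weight the two parts, then apply an energy level
    # control factor so the result has the same total energy as the input
    # (assumption: the claim does not fix this rule).
    raw = [tonal_gain * t + ambience_gain * a for t, a in zip(tonal, ambience)]
    energy_in = sum(m * m for m in mags)
    energy_raw = sum(r * r for r in raw)
    control = math.sqrt(energy_in / energy_raw) if energy_raw > 0.0 else 0.0
    combined = [control * r for r in raw]
    return tonal, ambience, combined
```

For example, `split_and_mix([1, 1, 1, 1, 10, 1, 1, 1, 1], half_window=1)` isolates the single peak at index 4 as the only tonal component, and the combined signal keeps the input's total energy.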

2. The method of claim 1, wherein the decoded audio signal is a decoded audio excitation signal.

3. The method of claim 1, wherein an energy level control factor is computed as a function of the total energy of the frequency-extended decoded audio signal and of the dominant tonal components, wherein the adaptive mixing uses the energy level factor.

4. The method of claim 1, further comprising transforming or filter bank-based sub-band decomposing the decoded audio signal, wherein obtaining the dominant tonal components uses the frequency domain or a sub-band domain, wherein the ambience signal is created in the frequency domain or a sub-band domain, wherein the combining is created in the frequency domain or a sub-band domain.

5. The method of claim 1, wherein extending the frequencies of the decoded audio signal into the second frequency band employs the following equation:

$$U_{HB1}(k) = \begin{cases} 0, & k = 0, \ldots, 199 \\ U(k), & k = 200, \ldots, 239 \\ U(k + \mathrm{start\_band} - 240), & k = 240, \ldots, 319 \end{cases}$$

wherein k is the index of the sample, wherein U(k) is the spectrum of the decoded audio signal obtained after a frequency domain transform of the decoded audio signal, wherein U_HB1(k) is the spectrum of the frequency-extended decoded audio signal, wherein start_band is a predefined variable.
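The piecewise rule in claim 5 maps directly onto array indexing. The sketch below is one way to express it, assuming the input spectrum `U` holds at least 240 coefficients. The function name `extend_band` is hypothetical, and because the claim describes start_band only as "a predefined variable", the value 160 used in the usage note is an illustrative assumption.

```python
def extend_band(U, start_band):
    """Build the 320-coefficient extended spectrum U_HB1 from spectrum U.

    U          : sequence of at least 240 spectral coefficients
    start_band : offset of the low-band region copied into bins 240..319
                 (a "predefined variable" in the claim; any value used
                 here is an assumption)
    """
    if len(U) < 240:
        raise ValueError("U must hold at least 240 coefficients")
    if start_band < 0 or start_band + 79 >= len(U):
        raise ValueError("start_band would read outside U")

    U_hb1 = [0.0] * 320                     # bins 0..199 stay at zero
    for k in range(200, 240):               # bins 200..239: copied unchanged
        U_hb1[k] = U[k]
    for k in range(240, 320):               # bins 240..319: shifted copy
        U_hb1[k] = U[k + start_band - 240]
    return U_hb1
```

With `start_band = 160`, for instance, bins 240 to 319 of the output are filled from bins 160 to 239 of the input.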

6. The method of claim 1, wherein obtaining the dominant tonal components comprises detecting the dominant tonal components of the frequency-extended decoded audio signal in the frequency domain, wherein the ambience signal is created in the frequency domain.

7. A computer program stored on a non-transitory medium, wherein the computer program when executed on a processor performs the method as claimed in claim 1.

8. A computer program stored on a non-transitory medium, wherein the computer program when executed on a processor performs the method as claimed in claim 2.

9. A computer program stored on a non-transitory medium, wherein the computer program when executed on a processor performs the method as claimed in claim 3.

10. A computer program stored on a non-transitory medium, wherein the computer program when executed on a processor performs the method as claimed in claim 4.

11. A computer program stored on a non-transitory medium, wherein the computer program when executed on a processor performs the method as claimed in claim 5.

12. A computer program stored on a non-transitory medium, wherein the computer program when executed on a processor performs the method as claimed in claim 6.

13. A method, comprising: obtaining a decoded audio signal, wherein the decoded audio signal has been decoded in a first frequency band; obtaining an ambience signal by computing a mean value of a frequency spectrum of the decoded audio signal; obtaining dominant tonal components from the decoded audio signal, wherein the dominant tonal components are tonal components, wherein the tonal components comprise magnitudes, wherein the magnitudes exceed a threshold, wherein obtaining the dominant tonal components comprises subtracting the ambience signal from the decoded audio signal; combining the dominant tonal components and the ambience signal by adaptive mixing using energy level control factors to obtain a combined signal; and extending frequencies of the combined signal into a second frequency band to produce a frequency-extended combined signal, wherein the second frequency band is higher than the first frequency band.

14. The method of claim 13, wherein obtaining the dominant tonal components comprises detecting the dominant tonal components of the frequency-extended decoded audio signal in the frequency domain, wherein the ambience signal is created in the frequency domain.

15. The method of claim 13, wherein extending the frequencies of the combined audio signal into the second frequency band employs the following equation:

$$U_{HB1}(k) = \begin{cases} 0, & k = 0, \ldots, 199 \\ U(k), & k = 200, \ldots, 239 \\ U(k + \mathrm{start\_band} - 240), & k = 240, \ldots, 319 \end{cases}$$

wherein k is the index of the sample, wherein U(k) is the spectrum of the combined audio signal obtained after a frequency domain transform of the combined audio signal, wherein U_HB1(k) is the spectrum of the frequency-extended combined audio signal, wherein start_band is a predefined variable.

16. A computer program stored on a non-transitory medium, wherein the computer program when executed on a processor performs the method as claimed in claim 13.

17. A computer program stored on a non-transitory medium, wherein the computer program when executed on a processor performs the method as claimed in claim 14.

18. A computer program stored on a non-transitory medium, wherein the computer program when executed on a processor performs the method as claimed in claim 15.

19. A method, comprising: obtaining a decoded audio signal, wherein the decoded audio signal is decoded in a first frequency band; extending frequencies of the decoded audio signal into a second frequency band, wherein the extension of frequencies is arranged to produce a frequency-extended decoded audio signal, wherein the second frequency band is higher than the first frequency band; obtaining dominant tonal components from the frequency-extended decoded audio signal, wherein the dominant tonal components are tonal components, wherein the tonal components comprise magnitudes, wherein the magnitudes exceed a threshold; removing the dominant tonal components from the frequency-extended decoded audio signal to obtain an ambience signal; and combining the dominant tonal components and the ambience signal by adaptive mixing using energy level control factors to obtain a frequency-extended combined signal.
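Claim 19 differs from claim 1 in how the ambience signal is obtained: the dominant tonal components are detected first, and the ambience is what remains once they are removed. A minimal sketch of that ordering follows. The threshold rule (a multiple of the mean magnitude, controlled by the hypothetical parameter `ratio`) and the choice to zero the removed bins are assumptions; the claim only states that the tonal magnitudes exceed a threshold.

```python
def detect_then_remove(mags, ratio=2.0):
    """Detect dominant tonal bins, then remove them to leave the ambience.

    mags  : non-empty sequence of spectral magnitudes
    ratio : hypothetical factor; a bin counts as tonal when its magnitude
            exceeds ratio times the mean magnitude (an assumption)
    """
    mean_mag = sum(mags) / len(mags)
    threshold = ratio * mean_mag

    # Dominant tonal components: bins whose magnitude exceeds the threshold.
    tonal = [m if m > threshold else 0.0 for m in mags]

    # Ambience: the spectrum with the dominant tonal bins removed
    # (zeroed here; other removal strategies are possible).
    ambience = [0.0 if m > threshold else m for m in mags]
    return tonal, ambience
```

With this removal rule the two outputs sum back to the input magnitudes bin by bin, which makes the split easy to check before any adaptive mixing is applied.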

20. The method of claim 19, wherein an energy level control factor is computed as a function of the total energy of the frequency-extended decoded audio signal and of the dominant tonal components, wherein the adaptive mixing uses the energy level factor.

21. The method of claim 19, further comprising transforming or filter bank-based sub-band decomposing the decoded audio signal, wherein obtaining the dominant tonal components uses the frequency domain or a sub-band domain, wherein the ambience signal is created in the frequency domain or a sub-band domain, wherein the combining is created in the frequency domain or a sub-band domain.

22. The method of claim 19, wherein extending the frequencies of the decoded audio signal into the second frequency band employs the following equation:

$$U_{HB1}(k) = \begin{cases} 0, & k = 0, \ldots, 199 \\ U(k), & k = 200, \ldots, 239 \\ U(k + \mathrm{start\_band} - 240), & k = 240, \ldots, 319 \end{cases}$$

wherein k is the index of the sample, wherein U(k) is the spectrum of the decoded audio signal obtained after a frequency domain transform of the decoded audio signal, wherein U_HB1(k) is the spectrum of the frequency-extended decoded audio signal, wherein start_band is a predefined variable.

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G10L

Patent Metadata

Filing Date

July 13, 2020

Publication Date

April 26, 2022
