US-9734837

Method, medium, and system scalably encoding/decoding audio/speech

Published: August 15, 2017

Assignee: not available in USPTO data we have

Inventors: not available in USPTO data we have

Technical Abstract

A method, medium, and system scalably encoding/decoding audio/speech. The method includes splitting an input signal into a low frequency band signal that is lower than a predetermined frequency and a high frequency band signal that is higher than the predetermined frequency, scalably encoding the split low frequency band signal into a core layer and one or more extension layers and then decoding the encoded core layer and the encoded extension layers, generating an error signal by using the split low frequency band signal and a decoded signal of the encoded core layer and the encoded extension layers, and encoding the error signal and the high frequency band signal into a signal-to-noise ratio (SNR) enhancement layer and a bandwidth extension layer.
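The encoding flow in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the two-tap average/difference band split (standing in for a real filter bank with a chosen cutoff), the uniform scalar quantizers, the step sizes, and all function and key names.

```python
from typing import Dict, List, Tuple


def split_bands(x: List[float]) -> Tuple[List[float], List[float]]:
    """Crude complementary two-band split (hypothetical stand-in for a filter bank)."""
    low, high = [], []
    prev = 0.0
    for s in x:
        low.append((s + prev) / 2.0)   # two-tap average acts as a low-pass
        high.append((s - prev) / 2.0)  # two-tap difference acts as a high-pass
        prev = s
    return low, high


def quantize(x: List[float], step: float) -> List[int]:
    """Uniform scalar quantizer: returns integer indices."""
    return [round(s / step) for s in x]


def dequantize(q: List[int], step: float) -> List[float]:
    """Inverse of quantize, up to the quantization error."""
    return [i * step for i in q]


def encode(x: List[float], core_step: float = 0.5, ext_step: float = 0.125,
           snr_step: float = 0.03125, bwe_step: float = 0.25) -> Dict[str, list]:
    low, high = split_bands(x)

    # Core layer: coarse coding of the low band, then local decoding.
    core = quantize(low, core_step)
    core_dec = dequantize(core, core_step)

    # One extension layer: codes what the core layer missed.
    ext = quantize([l - c for l, c in zip(low, core_dec)], ext_step)
    ext_dec = dequantize(ext, ext_step)

    # Error signal: low band minus the decoded core + extension layers.
    decoded = [c + e for c, e in zip(core_dec, ext_dec)]
    error = [l - d for l, d in zip(low, decoded)]

    return {
        "core": core,
        "extension": ext,
        "snr_enhancement": quantize(error, snr_step),     # codes the error signal
        "bandwidth_extension": quantize(high, bwe_step),  # codes the high band
        "error": error,  # kept only for inspection, not part of a bitstream
    }


layers = encode([0.0, 1.0, 0.3, -0.7, 0.25, 0.9])
```

In this sketch each layer refines the one before it, so a decoder that receives only the core layer still produces a usable low-band signal, and every further layer it receives reduces the remaining error or adds bandwidth.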

Patent Claims

12 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method for scalably encoding an input audio/speech signal, the method comprising: encoding, performed by using at least one processing device, a core layer signal associated with a core bandwidth, from the input audio/speech signal; encoding, performed by using at least one processing device, one or more enhancement layer signal associated with one or more extended bandwidth, respectively, from the input audio/speech signal; generating a bitstream by multiplexing the encoded core layer signal and the one or more encoded enhancement layer signal; and transmitting the bitstream to a decoding side, wherein the encoding one or more enhancement layer signal comprises: obtaining one or more extension signal from the core bandwidth and the one or more extended bandwidth; transforming the one or more extension signal from a time domain into a frequency domain; and generating the one or more encoded enhancement layer signal by encoding the one or more transformed extension signal.

2. The method of claim 1 further comprising: decoding the encoded core layer signal and the one or more encoded enhancement layer signal; generating an error signal by using the decoded core layer signal and the one or more decoded enhancement signal; and encoding the error signal into one or more signal-to-noise ratio (SNR) enhancement layer signal.

3. The method of claim 2, wherein the generating of the error signal comprises generating the error signal by subtracting the decoded core layer signal and the one or more decoded enhancement layer signal from the one or more enhancement layer signal.

4. The method of claim 3, further comprising transforming the error signal from the time domain to the frequency domain, wherein the encoding of the error signal comprises encoding the transformed error signal into the one or more SNR enhancement layer signal.

5. A method for scalably decoding an audio/speech signal, the method comprising: receiving a bitstream transmitted from an encoding side, the bitstream including an encoded core layer signal and one or more encoded enhancement layer signal; decoding, performed by using at least one processing device, the encoded core layer signal associated with a core bandwidth; decoding, performed by using at least one processing device, the one or more encoded enhancement layer signal associated with one or more extended bandwidth, respectively; and reconstructing a bandwidth extended signal for reproduction, based on the decoded core layer signal and the one or more decoded enhancement layer signal, wherein the decoding the one or more encoded enhancement layer comprises: decoding one or more encoded extension signal from the core bandwidth and the one or more extended bandwidth, included in the bitstream; transforming the one or more decoded extension signal from a frequency domain into a time domain; and generating the one or more transformed extension signal as the one or more decoded enhancement layer signal.

6. The method of claim 5 further comprising: decoding one or more encoded SNR enhancement layer signal, included in the bitstream; and adding the one or more decoded SNR enhancement signal to the decoded core layer signal and the one or more decoded enhancement layer signal.

7. A non-transitory computer readable recording medium having recorded thereon a computer program for executing the method of claim 5.

8. The non-transitory computer readable recording medium of claim 7, further comprising: decoding one or more encoded SNR enhancement layer signal, included in the bitstream; and adding the one or more decoded SNR enhancement signal to the decoded core layer signal and the one or more decoded enhancement layer signal.

9. A system for scalably encoding an input audio/speech signal, the system comprising: at least one processing device configured to: encode a core layer signal associated with a core bandwidth, from the input audio/speech signal; encode one or more enhancement layer signal associated with one or more extended bandwidth, respectively, from the input audio/speech signal; generate a bitstream by multiplexing the encoded core layer signal and the one or more encoded enhancement layer signal; and transmit the bitstream to a decoding side, wherein the processing device is configured to: obtain one or more extension signal from the core bandwidth and the one or more extended bandwidth; transform the one or more extension signal from a time domain into a frequency domain; and generate the one or more encoded enhancement layer signal by encoding the one or more transformed extension signal.

10. The system of claim 9, wherein the processing device is further configured to: decode the encoded core layer signal and the one or more encoded enhancement layer signal; generate an error signal by using the decoded core layer signal and the one or more decoded enhancement signal; and encode the error signal into one or more signal-to-noise ratio (SNR) enhancement layer signal.

11. A system for scalably decoding an audio/speech signal, the system comprising: at least one processing device configured to: receive a bitstream transmitted from an encoding side, the bitstream including an encoded core layer signal and one or more encoded enhancement layer signal; decode the encoded core layer signal associated with a core bandwidth; decode the one or more encoded enhancement layer signal associated with one or more extended bandwidth, respectively; and reconstruct a bandwidth extended signal for reproduction, based on the decoded core layer signal and the one or more decoded enhancement layer signal, wherein the processing device is configured to: decode one or more encoded extension signal from the core bandwidth and the one or more extended bandwidth, included in the bitstream; transform the one or more decoded extension signal from a frequency domain into a time domain; and generate the one or more transformed extension signal as the one or more decoded enhancement layer signal.

12. The system of claim 11, wherein the processing device is further configured to: decode one or more encoded SNR enhancement layer, included in the bitstream; add the one or more decoded SNR enhancement signal to the decoded core layer and one or more decoded enhancement layer signal.
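The enhancement-layer path recited in the claims (transform to the frequency domain and encode on the encoding side; decode, transform back to the time domain, and add the SNR enhancement signal on the decoding side) can be sketched as a round trip. This is an illustration under stated assumptions, not the patented implementation: the claims do not name a transform, so an orthonormal DCT is swapped in here, and the quantizer, step sizes, and all function names are hypothetical.

```python
import math
from typing import List, Optional


def _scale(k: int, n: int) -> float:
    # Orthonormal DCT scaling factor for coefficient k of an n-point transform.
    return math.sqrt((1.0 if k == 0 else 2.0) / n)


def dct(x: List[float]) -> List[float]:
    """Orthonormal DCT-II: time domain -> frequency domain."""
    n = len(x)
    return [_scale(k, n) * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n))
            for k in range(n)]


def idct(c: List[float]) -> List[float]:
    """Orthonormal DCT-III (inverse of dct): frequency domain -> time domain."""
    n = len(c)
    return [sum(_scale(k, n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                for k in range(n))
            for i in range(n)]


def encode_layer(signal: List[float], step: float) -> List[int]:
    """Transform to the frequency domain, then quantize the coefficients."""
    return [round(c / step) for c in dct(signal)]


def decode_layer(indices: List[int], step: float) -> List[float]:
    """Dequantize the coefficients, then transform back to the time domain."""
    return idct([i * step for i in indices])


def reconstruct(core: List[float], enhancement: List[float],
                snr: Optional[List[float]] = None) -> List[float]:
    """Sum the decoded layers; the SNR enhancement layer is optional."""
    out = [c + e for c, e in zip(core, enhancement)]
    if snr is not None:
        out = [o + s for o, s in zip(out, snr)]
    return out


# Encoder side: code an extension signal, then code what is left as an SNR layer.
extension = [0.9, -0.4, 0.33, 0.1, -0.75, 0.6, 0.05, -0.2]
enh_bits = encode_layer(extension, step=0.5)
enh_dec = decode_layer(enh_bits, step=0.5)
error = [x - d for x, d in zip(extension, enh_dec)]
snr_bits = encode_layer(error, step=0.05)

# Decoder side: the core layer is taken as silent here to isolate the enhancement path.
core_dec = [0.0] * len(extension)
coarse = reconstruct(core_dec, decode_layer(enh_bits, 0.5))
refined = reconstruct(core_dec, decode_layer(enh_bits, 0.5), decode_layer(snr_bits, 0.05))
```

Because the transform used here is orthonormal, the time-domain error energy equals the coefficient-domain quantization error energy, so the finer step of the SNR layer directly bounds the error left in `refined`.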

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G10L

Patent Metadata

Filing Date

October 5, 2012

Publication Date

August 15, 2017
