Legal claims defining the scope of protection, as filed with the USPTO.
1. An audio processing system configured to accept an audio bitstream, the audio processing system comprising: a decoder adapted to receive the bitstream and to output quantized spectral coefficients; a front-end component, which includes: a dequantization stage adapted to receive the quantized spectral coefficients and to output a first frequency-domain representation of an intermediate signal; and an inverse transform stage for receiving the first frequency-domain representation of the intermediate signal and synthesizing, based thereon, a time-domain representation of the intermediate signal; a processing stage, which includes: an analysis filterbank for receiving the time-domain representation of the intermediate signal and outputting a second frequency-domain representation of the intermediate signal; at least one processing component for receiving said second frequency-domain representation of the intermediate signal and outputting a frequency-domain representation of a processed audio signal; and a synthesis filterbank for receiving the frequency-domain representation of the processed audio signal and outputting a time-domain representation of the processed audio signal; and a sample rate converter for receiving said time-domain representation of the processed audio signal and outputting a reconstructed audio signal sampled at a target sampling frequency, wherein the respective internal sampling rates of the time-domain representation of the intermediate audio signal and of the time-domain representation of the processed audio signal are equal, and wherein said at least one processing component includes: a parametric upmix stage for receiving a downmix signal with M channels and outputting, based thereon, a signal with N channels, wherein the parametric upmix stage is operable at least in a mode where 1≦M<N, associated with a delay, and a mode where 1≦M=N; and a first delay stage configured to incur a delay, when the parametric upmix stage is in the mode where 1≦M=N, to compensate for the delay associated with the mode where 1≦M<N in order for the processing stage to have a constant total delay independently of a current operating mode of the parametric upmix stage.
2. The audio processing system of claim 1, wherein the front-end component is operable in an audio mode and a voice-specific mode, and wherein a mode change from the audio mode into the voice-specific mode of the front-end component includes reducing a maximal frame length of the inverse transform stage.
3. The audio processing system of claim 2, wherein the sample rate converter is operable to provide a reconstructed audio signal sampled at the target sampling frequency differing by up to 5% from the internal sampling rate of said time-domain representation of the processed audio signal.
4. The audio processing system of claim 1, further comprising a bypass line arranged parallel to the processing stage and comprising a second delay stage configured to incur a delay equal to the constant total delay of the processing stage.
5. The audio processing system of claim 1, wherein the parametric upmix stage is further operable at least in a mode where M=3 and N=5.
6. The audio processing system of claim 5, wherein the front-end component is configured, in that mode of the parametric upmix stage where M=3 and N=5, to provide an intermediate signal comprising a downmix signal where the front-end component derives two channels out of the M=3 channels from jointly coded channels in the audio bitstream.
7. The audio processing system of claim 1, wherein said at least one processing component further includes a spectral band replication module arranged upstream of the parametric upmix stage and operable to reconstruct high-frequency content, wherein the spectral band replication module is configured to be active at least in those modes of the parametric upmix stage where M<N; and is operable independently of the current mode of the parametric upmix stage when the parametric upmix stage is in any of the modes where M=N.
8. The audio processing system of claim 7, wherein said at least one processing component further includes a waveform coding stage arranged parallel to or downstream of the parametric upmix stage and operable to augment each of the N channels with waveform-coded low-frequency content, wherein the waveform coding stage is activatable and deactivatable independently of the current mode of the parametric upmix stage and the spectral band replication module.
9. The audio processing system of claim 8, operable at least in a decoding mode where the parametric upmix stage is in a M=N mode with M>2.
10. The audio processing system of claim 9, operable at least in the following decoding modes: i) parametric upmix stage in M=N=1 mode; ii) parametric upmix stage in M=N=1 mode and spectral band replication module active; iii) parametric upmix stage in M=1, N=2 mode and spectral band replication module active; iv) parametric upmix stage in M=1, N=2 mode, spectral band replication module active and waveform coding stage active; v) parametric upmix stage in M=2, N=5 mode and spectral band replication module active; vi) parametric upmix stage in M=2, N=5 mode, spectral band replication module active and waveform coding stage active; vii) parametric upmix stage in M=3, N=5 mode and spectral band replication module active; viii) parametric upmix stage in M=N=2 mode; ix) parametric upmix stage in M=N=2 mode and spectral band replication module active; x) parametric upmix stage in M=N=7 mode; xi) parametric upmix stage in M=N=7 mode and spectral band replication module active.
11. The audio processing system of claim 1, further comprising the following components arranged downstream of the processing stage: a phase shifting component configured to receive the time-domain representation of the processed audio signal, in which at least one channel represents a surround channel, and to perform a 90-degree phase shift on said at least one surround channel; and a downmix component configured to receive the processed audio signal from the phase shifting component and to output, based thereon, a downmix signal with two channels.
12. The audio processing system of claim 1, further comprising an Lfe decoder configured to prepare at least one additional channel based on the audio bitstream and include said additional channel(s) in the reconstructed audio signal.
13. A method of processing an audio bitstream, the method comprising: providing quantized spectral coefficients based on the bitstream; receiving the quantized spectral coefficients and performing inverse quantization followed by a frequency-to-time transformation, whereby a time-domain representation of an intermediate audio signal is obtained; providing a frequency-domain representation of the intermediate audio signal based on the time-domain representation of the intermediate audio signal; providing a frequency-domain representation of a processed audio signal by performing at least one processing step on the frequency-domain representation of the intermediate audio signal; providing a time-domain representation of the processed audio signal based on the frequency-domain representation of the processed audio signal; and changing the sampling rate of the time-domain representation of the processed audio signal into a target sampling frequency, whereby a reconstructed audio signal is obtained, wherein the respective internal sampling rates of the time-domain representation of the intermediate audio signal and of the time-domain representation of the processed audio signal are equal, wherein the method further comprises: determining a current mode among at least a mode where 1≦M<N, associated with a delay, and a mode where 1≦M=N, wherein the at least one processing step includes: receiving a downmix signal with M channels and outputting, based thereon, a signal with N channels; in response to the current mode being the mode where 1≦M=N, incurring a delay to compensate for the delay associated with the mode where 1≦M<N in order for a total delay of the processing step to be constant independently of the current mode.
14. The method of claim 13, wherein said inverse quantization and/or frequency-to-time transformation are performed in a hardware component operable at least in an audio mode and a voice-specific mode, a current mode being selected in accordance with metadata associated with the quantized spectral coefficients, and wherein a mode change from the audio mode into the voice-specific mode includes reducing a maximal frame length of the frequency-to-time transformation.
15. A non-transitory computer program product comprising a non-transitory computer-readable medium with instructions for performing the method of claim 13.
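The delay compensation recited in claims 1 and 13 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class names, the mono-only input, the duplicated output channel, and the delay of 3 samples are all hypothetical choices made for illustration. The sketch only shows that routing the M=N mode through a compensating delay of the same length as the delay of the M<N mode keeps the total latency constant across modes.

```python
from collections import deque

UPMIX_DELAY = 3  # hypothetical delay, in samples, associated with the M<N mode


class DelayLine:
    """Fixed-length FIFO; the output lags the input by `length` samples."""

    def __init__(self, length):
        self.buf = deque([0.0] * length)

    def push(self, x):
        self.buf.append(x)
        return self.buf.popleft()


class UpmixStage:
    """Toy stage with a mono input: 1 -> 2 (M<N) or 1 -> 1 (M=N)."""

    def __init__(self):
        # Stands in for the inherent delay of the M<N upmix path.
        self.upmix_path = DelayLine(UPMIX_DELAY)
        # Plays the role of the "first delay stage", used only when M=N.
        self.compensation = DelayLine(UPMIX_DELAY)

    def process(self, x, n_out):
        if n_out == 2:
            y = self.upmix_path.push(x)
            return (y, y)  # placeholder upmix: both channels carry the same sample
        return (self.compensation.push(x),)


def impulse_latency(n_out, length=8):
    """Feed a unit impulse and return the index at which it appears at the output."""
    stage = UpmixStage()
    frames = [stage.process(1.0 if t == 0 else 0.0, n_out) for t in range(length)]
    return next(t for t, frame in enumerate(frames) if frame[0] != 0.0)
```

Under these assumptions, `impulse_latency(1)` and `impulse_latency(2)` return the same value, so a downstream component sees the same delay regardless of the current mode. A bypass line as in claim 4 could, in the same spirit, be modeled as one more `DelayLine` of that total length.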
October 25, 2016