Device and Method for Manipulating an Audio Signal

Published: July 15, 2025

Assignee: not available in the USPTO data we have

Inventors: Sascha DISCH, Frederik NAGEL, Max NEUENDORF, Christian HELMRICH, Dominik ZORN

Patent Claims

28 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. An apparatus for manipulating an audio signal, comprising: a windower configured for generating a plurality of consecutive blocks of audio samples, the plurality of the consecutive blocks comprising at least one padded block of audio samples, the padded block comprising padded values and audio signal values; a first converter configured for converting the padded block into a spectral representation comprising spectral values; a phase modifier configured for modifying phases of the spectral values to achieve a modified spectral representation; a second converter configured for converting the modified spectral representation into a modified time domain audio signal; and a transient detector configured for detecting a transient event in the audio signal, wherein the first converter is configured for converting the padded block, when the transient detector detects the transient event in a block of the audio signal corresponding to the padded block, wherein the first converter is configured for converting a non-padded block comprising audio signal values only, the non-padded block corresponding to the non-padded block of the audio signal, when the transient detector does not detect the transient event in the non-padded block of the audio signal, and wherein at least one of the windower, the phase modifier, the second converter, and the transient detector comprises a hardware implementation.
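As a rough illustration of the signal path recited in claim 1, the following Python sketch pads a block only when a transient is flagged, converts it with a DFT, scales the phases, and converts back. The energy-ratio detector, the zero-valued padding, the naive DFT, the default factor of 2, and all function names are assumptions made for illustration; the claim does not prescribe any of them.

```python
import cmath
import math

def dft(x):
    """Naive forward DFT (stands in for the 'first converter')."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT (stands in for the 'second converter')."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def has_transient(block, ratio=4.0):
    """Hypothetical detector: energy of the second half jumps versus the first half."""
    half = len(block) // 2
    e_first = sum(v * v for v in block[:half]) + 1e-12
    e_second = sum(v * v for v in block[half:])
    return e_second / e_first > ratio

def pad_symmetric(block, pad):
    """Assumed padding scheme: zeros before and after the audio signal values."""
    return [0.0] * pad + list(block) + [0.0] * pad

def modify_phases(spectrum, factor):
    """Keep each magnitude and multiply each phase by `factor`."""
    return [abs(c) * cmath.exp(1j * factor * cmath.phase(c)) for c in spectrum]

def process_block(block, factor=2, pad=None):
    """Return (modified time-domain samples, whether the padded path was used)."""
    pad = len(block) // 2 if pad is None else pad
    padded = has_transient(block)
    frame = pad_symmetric(block, pad) if padded else list(block)
    modified = modify_phases(dft(frame), factor)
    return [v.real for v in idft(modified)], padded
```

A block without a detected transient keeps its original length, while a block with a detected transient is converted at the longer, padded length.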

2. The apparatus according to claim 1, further comprising: a decimator configured for decimating the modified time domain audio signal or overlap-added blocks of modified time domain audio samples to acquire a decimated time domain signal, wherein a decimation characteristic depends on a phase modification characteristic applied by the phase modifier.
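One way to read the dependency in claim 2 is that the decimation factor equals the integer factor used for the phase modification. The sketch below assumes exactly that and keeps every `factor`-th sample; the function name is hypothetical, and no anti-aliasing filter is applied, which a practical decimator would normally need.

```python
import math

def decimate(samples, factor):
    """Keep every `factor`-th sample (assumed: factor equals the phase-scaling factor)."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return list(samples[::factor])

# A tone with 16 samples per period becomes a tone with 8 samples per period.
tone = [math.sin(2 * math.pi * t / 16) for t in range(64)]
decimated = decimate(tone, 2)
```

Played back at the original sampling rate, the decimated tone has twice the frequency of the input tone.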

3. The apparatus in accordance with claim 2, which is adapted for performing a bandwidth extension using the audio signal, further comprising: a band pass filter configured for extracting a bandpass signal from the spectral representation or from the audio signal, wherein a bandpass characteristic of the bandpass filter is selected depending on a phase modification characteristic applied by the phase modifier, so that the bandpass signal is transformed by subsequent processing in a bandwidth extension scheme to a target frequency range, the target frequency range comprising a frequency range not included in a frequency range of the audio signal.

4. The apparatus in accordance with claim 2, further comprising: an overlap adder configured for adding overlapping blocks of decimated audio samples or modified time domain audio samples of the modified time domain audio signal to acquire a signal in a target frequency range of a bandwidth extension algorithm.
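The overlap adder of claim 4 can be sketched as a plain overlap-add routine. The function name and the single fixed hop size are assumptions; the claim itself leaves the time distance between blocks open.

```python
def overlap_add(blocks, hop):
    """Sum blocks into one signal, placing block i at offset i * hop."""
    if not blocks:
        return []
    total = max(i * hop + len(b) for i, b in enumerate(blocks))
    out = [0.0] * total
    for i, block in enumerate(blocks):
        for j, value in enumerate(block):
            out[i * hop + j] += value
    return out

# Two 4-sample blocks placed 2 samples apart overlap in the middle.
result = overlap_add([[1.0] * 4, [1.0] * 4], hop=2)
```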

5. The apparatus according to claim 2, further comprising: a synthesis windower configured for windowing the decimated time domain signal or the modified time domain audio signal comprising a synthesis window function matched to an analysis function applied by the windower.
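Claim 5 does not define what makes a synthesis window "matched" to the analysis function. A common convention, assumed here purely for illustration, is to use the same square-root Hann window for both stages so that their product sums to a constant under 50% overlap. The function names are hypothetical.

```python
import math

def sqrt_hann(n):
    """Square root of a periodic Hann window of length n."""
    return [math.sqrt(0.5 - 0.5 * math.cos(2 * math.pi * i / n)) for i in range(n)]

def overlapped_gain(n, hop):
    """Sum of analysis*synthesis window products across overlapping blocks."""
    product = [w * w for w in sqrt_hann(n)]  # same window used twice
    return [sum(product[i + m * hop] for m in range(n // hop)) for i in range(hop)]

gain = overlapped_gain(8, 4)  # 50% overlap
```

With this assumed pairing, every entry of `gain` is 1 up to rounding error, so an unmodified signal would pass through the analysis and synthesis windows unchanged.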

6. The apparatus according to claim 2, the apparatus being configured for performing a bandwidth extension algorithm, the bandwidth extension algorithm comprising a bandwidth extension factor, the bandwidth extension factor controlling a frequency shift between a band of the audio signal and a target frequency band, wherein the first converter, the phase modifier, the second converter and the decimator are configured to operate using different bandwidth extension factors, so that different modified time audio signals comprising different target frequency bands are achieved, wherein the apparatus comprises an overlap adder configured for performing an overlap add based on the different bandwidth extension factors, and a combiner configured for combining overlap add results to acquire a combined signal comprising the different target frequency bands.

7. The apparatus according to claim 4, further comprising: a scaler configured for scaling the spectral values by a factor, wherein the factor depends on an overlap add characteristic in that a relation of a first time distance for an overlap-add applied by the windower and a different time distance applied by the overlap adder and a window characteristics is accounted for.

8. The apparatus according to claim 4, further comprising: an envelope adjuster configured for adjusting an envelope of the signal in the target frequency range of the bandwidth extension algorithm or a combined signal based on transmitted parameters to acquire a corrected signal; and a further combiner configured for combining the audio signal and the corrected signal to acquire a manipulated signal which is extended in bandwidth.

9. The apparatus according to claim 1, wherein the windower comprises: an analysis window processor configured for generating a plurality of consecutive blocks having identical sizes; and a padder configured for padding a block of the plurality of the consecutive blocks of audio samples to achieve the padded block by inserting the padded values at specified time positions before a first sample of a consecutive block of audio samples or after a last sample of the consecutive block of audio samples.

10. The apparatus according to claim 1, in which the windower is configured for inserting the padded values at specified time positions before a first sample of a consecutive block of audio samples or after a last sample of the consecutive block of audio samples, the apparatus further comprising: a padding remover configured for removing samples at time positions of the modified time domain audio signal, the time positions corresponding to the specified time positions applied by the windower.

11. The apparatus according to claim 10, in which the windower is configured for symmetrically inserting the padded values before the first sample of the consecutive block of audio samples and after the last sample of the consecutive block of audio samples, so that the padded block is adapted to a conversion by the first converter and the second converter.
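The padding and padding-removal steps of claims 9 to 11 can be sketched as a pair of helper functions. The function names and the zero padding value are assumptions; the claims only require that values are inserted at specified time positions and that the samples at those positions are removed again later.

```python
def pad_block(block, before, after, value=0.0):
    """Insert `before` values ahead of the first sample and `after` behind the last."""
    return [value] * before + list(block) + [value] * after

def remove_padding(frame, before, after):
    """Drop the samples at the time positions where padding was inserted."""
    return list(frame[before:len(frame) - after])

block = [0.1, 0.2, 0.3, 0.4]
padded = pad_block(block, before=2, after=2)   # symmetric padding as in claim 11
restored = remove_padding(padded, before=2, after=2)
```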

12. The apparatus according to claim 1, in which the windower is configured for inserting the padded values at specified time positions before a first sample of a consecutive block of audio samples or after a last sample of the consecutive block of audio samples, wherein a sum of a number of the padded values and a number of values in the consecutive block of audio samples is at least 1.4 times the number of values in the consecutive block of audio samples.
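The length condition in claim 12 reduces to simple arithmetic: for a block of N samples, the padded block must hold at least 1.4 N values, so at least 0.4 N padded values are needed. The helper below is a hypothetical illustration that uses integer arithmetic to avoid floating-point rounding.

```python
def min_padding(block_len):
    """Smallest padding count p with (p + block_len) >= 1.4 * block_len."""
    # 0.4 * N rounded up, computed as ceil(2N / 5) in integers.
    return (2 * block_len + 4) // 5

def satisfies_claim_ratio(block_len, pad_len):
    """Check (pad_len + block_len) >= 1.4 * block_len without floats."""
    return 5 * (block_len + pad_len) >= 7 * block_len

pad = min_padding(1024)  # 410 padded values for a 1024-sample block
```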

13. The apparatus according to claim 1, wherein the windower is configured for applying a window function comprising at least one guard zone at a start position of the window function or at an end position of the window function.
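A window with guard zones, as in claim 13, can be sketched as a conventional window framed by zero-valued regions. The Hann shape for the non-zero part, the zero-valued guard zones on both sides, and the function names are assumptions; the claim only requires at least one guard zone at the start or end.

```python
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def guarded_window(core_len, guard_len):
    """Hann window with `guard_len` zeros at the start and at the end."""
    guard = [0.0] * guard_len
    return guard + hann(core_len) + guard

window = guarded_window(core_len=8, guard_len=4)
```

Multiplying a 16-sample stretch of audio by this window leaves only the middle 8 samples non-zero, which has the same effect as padding an 8-sample block with 4 zeros on each side.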

14. The apparatus according to claim 1, the apparatus being configured for performing a bandwidth extension algorithm, the bandwidth extension algorithm comprising a bandwidth extension factor, the bandwidth extension factor controlling a frequency shift between a band of the audio signal and a target frequency band, wherein the phase modifier is configured to scale phases of spectral values of the band of the audio signal by the bandwidth extension factor, so that at least one sample of a consecutive block of audio samples is cyclically convolved into a block.
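The cyclic convolution effect named in claim 14 can be reproduced with a single impulse. In the sketch below, scaling all phases by 2 moves an impulse from index 6 to index 12; in an 8-sample block that position wraps around to index 4, while a block with 8 zeros appended has room for it. The naive DFT, the time reference at index 0, the padding placed only after the block, and the function names are assumptions for illustration.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def scale_phases(block, factor):
    """Multiply the phase of every spectral value by an integer `factor`."""
    modified = [abs(c) * cmath.exp(1j * factor * cmath.phase(c)) for c in dft(block)]
    return [v.real for v in idft(modified)]

def peak_index(samples):
    """Index of the sample with the largest magnitude."""
    return max(range(len(samples)), key=lambda i: abs(samples[i]))

n = 8
impulse = [0.0] * n
impulse[6] = 1.0

wrapped = peak_index(scale_phases(impulse, 2))              # 12 mod 8 = 4
unwrapped = peak_index(scale_phases(impulse + [0.0] * n, 2))  # 12 fits in 16
```

In the unpadded case the impulse reappears earlier in the block than where it started, which is the kind of wrap-around artifact the padded block is meant to absorb.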

15. The apparatus according to claim 1, wherein the windower comprises: a padder configured for inserting the padded values at specified time positions before a first sample of a consecutive block of audio samples or after a last sample of the consecutive block of audio samples, the apparatus further comprising: a switch which is controlled by the transient detector, wherein the switch is configured to control the padder so that the padded block is generated when a transient event is detected by the transient detector, the padded block comprising the padded values and the audio signal values, and to control the padder, so that a non-padded block is generated when the transient event is not detected by the transient detector, the non-padded block comprising audio signal values only, wherein the first converter comprises a first sub-converter and a second sub-converter, wherein the switch is furthermore configured to feed the padded block to the first sub-converter to perform a conversion comprising a first conversion length when the transient event is detected by the transient detector and to feed the non-padded block to the second sub-converter to perform a conversion comprising a second length shorter than the first length when the transient event is not detected by the transient detector.
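The switching behaviour of claim 15 can be sketched as a routing function: a detected transient selects the padder and the longer conversion, and otherwise the block goes unpadded to the shorter conversion. The function name, the returned labels, the symmetric zero padding, and the externally supplied detector decision are assumptions for illustration.

```python
def route_block(block, transient_detected, pad):
    """Return (sub-converter label, frame to convert, conversion length)."""
    if transient_detected:
        frame = [0.0] * pad + list(block) + [0.0] * pad  # padded block
        return "first_sub_converter", frame, len(frame)
    frame = list(block)                                   # audio signal values only
    return "second_sub_converter", frame, len(frame)

block = [0.5] * 8
with_transient = route_block(block, transient_detected=True, pad=4)
without_transient = route_block(block, transient_detected=False, pad=4)
```

The second conversion length (8 here) is shorter than the first (16 here), matching the relation between the two sub-converters stated in the claim.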

16. The apparatus according to claim 1, wherein the windower comprises an analysis window processor configured for applying an analysis window function to a consecutive block of audio samples, the analysis window processor being controllable so that the analysis window function comprises a guard zone at a start position of the analysis window function or an end position of the analysis window function, the apparatus further comprising: a guard window switch which is controlled by the transient detector, wherein the guard window switch is configured to control the analysis window processor, so that a padded block is generated from a consecutive block of audio samples by use of the analysis window function comprising the guard zone, the padded block comprising the padded values and the audio signal values when a transient event is detected by the transient detector, and to control the analysis window processor, so that a non-padded block is generated, the non-padded block comprising the audio signal values only, when the transient event is not detected by the transient detector, wherein the first converter comprises a first sub-converter and a second sub-converter, wherein the guard window switch is furthermore configured to feed the padded block to the first sub-converter to perform a conversion comprising a first conversion length when a transient event is detected by the transient detector and to feed the non-padded block to the second sub-converter to perform a conversion comprising a second length shorter than the first length when the transient event is not detected by the transient detector.

17. The apparatus according to claim 1, wherein the windower is configured for generating the plurality of the consecutive blocks of the audio samples, the plurality of the consecutive blocks comprising at least a first pair of a non-padded block and a consecutive padded block and a second pair of a padded block and a consecutive non-padded block, the apparatus further comprising: a decimator configured for decimating modified time domain audio samples or overlap-added blocks of the modified time domain audio samples of the first pair to acquire decimated audio samples of the first pair or for decimating the modified time domain audio samples or overlap-added blocks of the modified time domain audio samples of the second pair to acquire decimated audio samples of the second pair, and an overlap adder, wherein the overlap adder is configured for adding overlapping blocks of the decimated audio samples or the modified time domain audio samples of the first pair or the second pair, wherein for the first pair a time distance between a first sample of the non-padded block and a first sample of audio signal values of the padded block is supplied by the overlap adder, or wherein for the second pair a time distance between a first sample of the audio signal values of the padded block and a first sample of the non-padded block is supplied by the overlap adder, to acquire a signal in a target frequency range of a bandwidth extension algorithm.

18. A method for manipulating an audio signal, comprising: generating, by a windower, a plurality of consecutive blocks of audio samples, the plurality of the consecutive blocks of the audio samples comprising at least one padded block of audio samples, the padded block comprising padded values and audio signal values; converting, by a first converter, the padded block into a spectral representation comprising spectral values; modifying, by a phase modifier, phases of the spectral values to achieve a modified spectral representation; and converting, by a second converter, the modified spectral representation into a modified time domain audio signal, determining, by a transient detector, a transient event in the audio signal, wherein the padded block is converted into the spectral representation, when the transient event is detected in a block of the audio signal corresponding to the padded block, and wherein a non-padded block comprising audio signal values only is converted into the spectral representation, the non-padded block corresponding to the block of the audio signal, when the transient event is not detected in the block of the audio signal, and wherein at least one of the windower, the phase modifier, the second converter, and the transient detector comprises a hardware implementation.

19. A non-transitory storage medium having stored thereon a computer program comprising a program code for performing a method for manipulating an audio signal when the computer program is executed on a computer, said method comprising: generating a plurality of consecutive blocks of audio samples, the plurality of the consecutive blocks of the audio samples comprising at least one padded block of audio samples, the padded block comprising padded values and audio signal values; converting the padded block into a spectral representation comprising spectral values; modifying phases of the spectral values to achieve a modified spectral representation; converting the modified spectral representation into a modified time domain audio signal; and determining a transient event in the audio signal, wherein the padded block is converted into the spectral representation, when the transient event is detected in a block of the audio signal corresponding to the padded block, and wherein a non-padded block comprising audio signal values only is converted into the spectral representation, the non-padded block corresponding to the block of the audio signal, when the transient event is not detected in the block of the audio signal.

20. A method for manipulating an audio signal, comprising: generating, by a windower, a plurality of consecutive blocks of audio samples, the plurality of the consecutive blocks of the audio samples comprising at least one padded block of audio samples, the padded block comprising padded values and audio signal values; converting, by a first converter, the padded block into a spectral representation comprising spectral values; modifying, by a phase modifier, phases of the spectral values to achieve a modified spectral representation; converting, by a second converter, the modified spectral representation into a modified time domain audio signal; windowing, by a synthesis windower, the modified time domain audio signal comprising a synthesis window function matched to an analysis function applied by the windower; and determining, by a transient detector, a transient event in the audio signal, wherein the padded block is converted into the spectral representation, when the transient event is detected in a block of the audio signal corresponding to the padded block, wherein a non-padded block comprising audio signal values only is converted into the spectral representation, the non-padded block corresponding to the block of the audio signal, when the transient event is not detected in the block of the audio signal, and wherein at least one of the windower, the phase modifier, the second converter, the synthesis windower, and the transient detector comprises a hardware implementation.

21. A non-transitory storage medium having stored thereon a computer program comprising a program code for performing a method for manipulating an audio signal when the computer program is executed on a computer, said method comprising: generating a plurality of consecutive blocks of audio samples, the plurality of the consecutive blocks of the audio samples comprising at least one padded block of audio samples, the padded block comprising padded values and audio signal values; converting the padded block into a spectral representation comprising spectral values; modifying phases of the spectral values to achieve a modified spectral representation; converting the modified spectral representation into a modified time domain audio signal; synthesis windowing the modified time domain audio signal comprising a synthesis window function matched to an analysis function applied by the generating; and determining a transient event in the audio signal, wherein the padded block is converted into the spectral representation, when the transient event is detected in a block of the audio signal corresponding to the padded block, and wherein a non-padded block comprising audio signal values only is converted into the spectral representation, the non-padded block corresponding to the block of the audio signal, when the transient event is not detected in the block of the audio signal.

22. The method in accordance with claim 20, which is adapted for performing a bandwidth extension using the audio signal, further comprising: using a band pass filter configured for extracting a bandpass signal from the spectral representation or from the audio signal, wherein a bandpass characteristic of the bandpass filter is selected depending on a phase modification characteristic applied by the phase modifier, so that the bandpass signal is transformed by a subsequent processing in a bandwidth extension scheme to a target frequency range, the target frequency range comprising a frequency range not included in a frequency range of the audio signal.

23. The method according to claim 20, further comprising: scaling the spectral values by a factor, wherein the factor depends on an overlap add characteristic in that a relation of a first time distance for an overlap-add applied by the windower and a different time distance applied by the overlap adder and a window characteristics is accounted for.

24. The method according to claim 20, in which the windower is configured for inserting the padded values at specified time positions before a first sample of a consecutive block of audio samples or after a last sample of the consecutive block of audio samples, the method further comprising: removing samples at time positions of the modified time domain audio signal, the time positions corresponding to the specified time positions applied by the windower.

25. The method according to claim 20, in which the windower is configured for symmetrically inserting the padded values before the first sample of the consecutive block of audio samples and after the last sample of the consecutive block of audio samples, so that the padded block is adapted to a conversion by the first converter and the second converter.

26. The method according to claim 20, the method being configured for performing a bandwidth extension algorithm, the bandwidth extension algorithm comprising a bandwidth extension factor, the bandwidth extension factor controlling a frequency shift between a band of the audio signal and a target frequency band, wherein the phase modifier is configured to scale phases of spectral values of the band of the audio signal by the bandwidth extension factor, so that at least one sample of a consecutive block of audio samples is cyclically convolved into a block.

27. The method according to claim 20, wherein the windower comprises: using a padder configured for inserting the padded values at specified time positions before a first sample of a consecutive block of audio samples or after a last sample of the consecutive block of audio samples, the method further comprising: using a switch which is controlled by the transient detector, wherein the switch is configured to control the padder so that the padded block is generated when a transient event is detected by the transient detector, the padded block comprising the padded values and the audio signal values, and to control the padder, so that a non-padded block is generated when the transient event is not detected by the transient detector, the non-padded block comprising audio signal values only, wherein the first converter comprises a first sub-converter and a second sub-converter, and wherein the switch is furthermore configured to feed the padded block to the first sub-converter to perform a conversion comprising a first conversion length when the transient event is detected by the transient detector and to feed the non-padded block to the second sub-converter to perform a conversion comprising a second conversion length being shorter than the first conversion length when the transient event is not detected by the transient detector.

28. The method according to claim 20, wherein the windower comprises an analysis window processor configured for applying an analysis window function to a consecutive block of audio samples, the analysis window processor being controllable so that the analysis window function comprises a guard zone at a start position of the analysis window function or an end position of the analysis window function, the method further comprising: using a guard window switch which is controlled by the transient detector, wherein the guard window switch is configured to control the analysis window processor, so that the padded block is generated from a consecutive block of audio samples by use of the analysis window function comprising the guard zone, the padded block comprising the padded values and the audio signal values when a transient event is detected by the transient detector, and to control the analysis window processor, so that a non-padded block is generated, the non-padded block comprising the audio signal values only, when the transient event is not detected by the transient detector, wherein the first converter comprises a first sub-converter and a second sub-converter, and wherein the guard window switch is furthermore configured to feed the padded block to the first sub-converter to perform a conversion comprising a first conversion length when a transient event is detected by the transient detector and to feed the non-padded block to the second sub-converter to perform a conversion comprising a second conversion length being shorter than the first conversion length when the transient event is not detected by the transient detector.

Patent Metadata

Filing Date: Unknown

Publication Date: July 15, 2025

Inventors: Sascha DISCH, Frederik NAGEL, Max NEUENDORF, Christian HELMRICH, Dominik ZORN
