US-10891960

Temporal offset estimation

Published: January 12, 2021
Assignee: not available in the USPTO data we have
Inventors: not available in the USPTO data we have
Technical Abstract

A method of coding for multi-channel audio signals includes estimating comparison values at an encoder indicative of an amount of temporal mismatch between a reference channel and a corresponding target channel. The method includes smoothing the comparison values to generate short-term and first long-term smoothed comparison values. The method includes calculating a cross-correlation value between the comparison values and the short-term smoothed comparison values. The method also includes adjusting the first long-term smoothed comparison values in response to comparing the cross-correlation value with a threshold. The method further includes estimating a tentative shift value and non-causally shifting the target channel by a non-causal shift value to generate an adjusted target channel. The non-causal shift value is based on the tentative shift value. The method further includes generating, based on reference channel and the adjusted target channel, at least one of a mid-band channel or a side-band channel.

Patent Claims
52 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method for coding of multi-channel audio signals at an encoder of an electronic device, the method comprising: estimating comparison values, at the encoder, each comparison value indicative of an amount of temporal mismatch between a first reference frame of a reference channel and a corresponding first target frame of a target channel; smoothing, at the encoder, the comparison values to generate short-term smoothed comparison values; smoothing, at the encoder, the comparison values to generate first long-term smoothed comparison values based on a smoothing parameter; calculating, at the encoder, a cross-correlation value between the comparison values and the short-term smoothed comparison values; comparing, at the encoder, the cross-correlation value with a threshold; adjusting, at the encoder, the first long-term smoothed comparison values to generate second long-term smoothed comparison values, in response to determination that the cross-correlation value exceeds the threshold; estimating, at the encoder, a tentative shift value based on the second long-term smoothed comparison values; determining, at the encoder, a non-causal shift value based on the tentative shift value; non-causally shifting, at the encoder, a particular target channel by the non-causal shift value to generate an adjusted particular target channel that is temporally aligned with a particular reference channel; and generating, at the encoder, at least one of a mid-band channel or a side-band channel based on the particular reference channel and the adjusted particular target channel.
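
As an illustration of the last two steps of claim 1, the following minimal sketch advances a delayed target channel and forms a mid/side pair. It assumes a non-negative shift, zero-fills samples that fall past the end of the buffer, and uses a plain sum/difference downmix; the function names and these conventions are assumptions for illustration and are not taken from the claims.

```python
def non_causal_shift(target, shift):
    """Advance `target` by `shift` samples: out[n] = target[n + shift].

    Samples past the end of the buffer are zero-filled here (assumption);
    an encoder with look-ahead would use real samples instead.
    """
    if shift < 0:
        raise ValueError("this sketch assumes a non-negative shift value")
    n = len(target)
    return [target[i + shift] if i + shift < n else 0.0 for i in range(n)]


def mid_side(reference, adjusted_target):
    """Plain sum/difference downmix (one common convention, assumed here)."""
    mid = [(r + t) / 2 for r, t in zip(reference, adjusted_target)]
    side = [(r - t) / 2 for r, t in zip(reference, adjusted_target)]
    return mid, side


reference = [0.0, 1.0, 2.0, 3.0, 0.0, 0.0]
target = [0.0, 0.0, 0.0, 1.0, 2.0, 3.0]  # same signal, delayed by 2 samples

adjusted = non_causal_shift(target, 2)
mid, side = mid_side(reference, adjusted)
```

In this toy case the adjusted target equals the reference, so the side channel is all zeros, which is the situation temporal alignment aims to approach.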

2. The method of claim 1, wherein adjusting the first long-term smoothed comparison values comprises increasing values of a subset of the first long-term smoothed comparison values.

3. The method of claim 2, wherein increasing the values of the subset of the first long-term smoothed comparison values comprises increasing at least a value of a first index, wherein the first index corresponds to a non-causal shift value of a second target frame, the second target frame immediately precedes the first target frame.

4. The method of claim 3, wherein the subset of the first long-term smoothed comparison values includes a second index and a third index, wherein the second index is smaller than the first index by one and the third index is bigger than the first index by one.
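
Claims 2 to 4 can be pictured with the sketch below, which raises the long-term smoothed values at the previous frame's shift index and at its two neighbours before picking the peak. The multiplicative `factor`, the function name, and the use of a simple argmax for the tentative shift are hypothetical choices for illustration; the claims only require that the values be increased.

```python
def boost_around_previous_shift(long_term, prev_index, factor=1.2):
    """Return a copy with indices prev_index-1, prev_index, prev_index+1 scaled up.

    `factor` is a hypothetical boost; scaling assumes non-negative values.
    """
    adjusted = list(long_term)
    for i in (prev_index - 1, prev_index, prev_index + 1):
        if 0 <= i < len(adjusted):  # skip neighbours outside the search range
            adjusted[i] *= factor
    return adjusted


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


first_long_term = [0.10, 0.50, 0.48, 0.52, 0.20]
second_long_term = boost_around_previous_shift(first_long_term, prev_index=1)

peak_before = argmax(first_long_term)
peak_after = argmax(second_long_term)
```

Here the unadjusted peak sits at index 3, while the boosted values peak at index 1, the previous frame's shift, so the estimate is biased toward continuity between frames.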

5. The method of claim 1, wherein the short-term smoothed comparison values are further based on short-term smoothed comparison values of at least one previous frame.

6. The method of claim 5, wherein smoothing the comparison values to generate the short-term smoothed comparison values comprises finite impulse response (FIR) filtering the comparison values.

7. The method of claim 1, wherein the first long-term smoothed comparison values are further based on a weighted mixture of the comparison values and second long-term smoothed comparison values of at least one previous frame.

8. The method of claim 7, wherein smoothing the comparison values to generate the first long-term smoothed comparison values comprises infinite impulse response (IIR) filtering the comparison values.
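
The two smoothers named in claims 6 and 8 might look like the following sketch. It assumes the FIR filter is an equal-weight average over the most recent frames and the IIR filter is a first-order recursion with smoothing parameter `alpha`; both forms, and all names, are assumptions chosen for illustration rather than the patent's exact filters.

```python
def short_term_smooth(recent_frames):
    """FIR smoothing: per-shift average over the comparison values of recent frames."""
    n_frames = len(recent_frames)
    n_shifts = len(recent_frames[0])
    return [sum(frame[k] for frame in recent_frames) / n_frames
            for k in range(n_shifts)]


def long_term_smooth(current, previous_long_term, alpha):
    """IIR smoothing: weighted mixture of the current comparison values and
    the previous frame's long-term smoothed values."""
    return [(1 - alpha) * c + alpha * p
            for c, p in zip(current, previous_long_term)]


# One list of comparison values per frame, oldest first (made-up numbers).
frames = [
    [0.0, 1.0, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]

short_term = short_term_smooth(frames)
long_term = long_term_smooth(frames[-1], previous_long_term=[0.0, 1.0, 0.0], alpha=0.75)
```

A larger `alpha` makes the long-term values follow the previous frame more closely, which is the knob the smoothing parameter of claim 1 plays in this sketch.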

9. The method of claim 1, wherein calculating the cross-correlation value comprises multiplying each value of the comparison values with each value of the short-term smoothed comparison values.
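
One way to read claim 9 together with the threshold test of claim 1 is sketched below: multiply the two sets of values element by element, sum the products, and compare the result with a threshold. The normalization to the range of a cosine similarity and the `THRESHOLD` value are assumptions added so the example has a meaningful scale; neither is specified in the claims.

```python
import math

THRESHOLD = 0.8  # hypothetical value, not taken from the patent


def cross_correlation(comparison, short_term):
    """Sum of element-wise products, normalized by the two vector norms (assumed)."""
    dot = sum(c * s for c, s in zip(comparison, short_term))
    norm = math.sqrt(sum(c * c for c in comparison) * sum(s * s for s in short_term))
    return dot / norm if norm > 0 else 0.0


comparison = [0.1, 0.9, 0.2]       # current frame, peak at the middle shift
stable_history = [0.1, 0.8, 0.3]   # short-term values with the same peak
drifting_history = [0.9, 0.1, 0.2] # short-term values with a different peak

stable_score = cross_correlation(comparison, stable_history)
drifting_score = cross_correlation(comparison, drifting_history)
adjust_long_term = stable_score > THRESHOLD
```

In this sketch a high score means the current frame agrees with its recent history, which is the condition under which claim 1 adjusts the long-term smoothed values.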

10. The method of claim 1, wherein the comparison values correspond to cross-correlation values of down-sampled reference channels and corresponding down-sampled target channels.
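
The comparison values of claim 10 can be sketched as a cross-correlation evaluated at each candidate shift of the down-sampled channels. The naive decimation without an anti-alias filter, the symmetric search range, and the sign convention (a positive shift means the target lags the reference) are all assumptions made to keep the example short.

```python
def downsample(channel, factor):
    """Naive decimation: keep every `factor`-th sample (no anti-alias filter)."""
    return channel[::factor]


def comparison_values(reference, target, max_shift):
    """Cross-correlation of `reference` with `target` for each candidate shift
    in [-max_shift, max_shift]; entry k corresponds to shift k - max_shift."""
    values = []
    for shift in range(-max_shift, max_shift + 1):
        acc = 0.0
        for i, r in enumerate(reference):
            j = i + shift
            if 0 <= j < len(target):  # ignore samples outside the frame
                acc += r * target[j]
        values.append(acc)
    return values


reference = [0, 0, 1, 1, 3, 3, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
target = [0, 0, 0, 0, 0, 0, 1, 1, 3, 3, 1, 1, 0, 0, 0, 0]  # delayed by 4 samples

FACTOR, MAX_SHIFT = 2, 3
values = comparison_values(downsample(reference, FACTOR),
                           downsample(target, FACTOR), MAX_SHIFT)
best_shift = max(range(len(values)), key=values.__getitem__) - MAX_SHIFT
```

The peak lands at a shift of 2 in the down-sampled domain, which corresponds to the 4-sample delay at the original rate.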

11. The method of claim 1, further comprising adapting, at the encoder, the smoothing parameter based on variation in the short-term smoothed comparison values relative to the second long-term smoothed comparison values.

12. The method of claim 1, wherein a value of the smoothing parameter is adjusted based on short-term energy indicator of input channels and long-term energy indicator of the input channels.

13. The method of claim 1, wherein the electronic device comprises a mobile device.

14. The method of claim 1, wherein the electronic device comprises a base station.

15. An apparatus for coding of multi-channel audio signals, comprising: a first microphone configured to capture a first reference frame of a reference channel; a second microphone configured to capture a corresponding first target frame of a target channel; and an encoder configured to: estimate comparison values each comparison value indicative of an amount of temporal mismatch between the first reference frame of the reference channel and the first target frame of the target channel; smooth the comparison values to generate short-term smoothed comparison values; smooth the comparison values to generate first long-term smoothed comparison values based on a smoothing parameter; calculate a cross-correlation value between the comparison values and the short-term smoothed comparison values; compare the cross-correlation value with a threshold; adjust the first long-term smoothed comparison values to generate second long-term smoothed comparison values, in response to determination that the cross-correlation value exceeds the threshold; estimate a tentative shift value based on the second long-term smoothed comparison values; determine a non-causal shift value based on the tentative shift value; non-causally shift a particular target channel by the non-causal shift value to generate an adjusted particular target channel that is temporally aligned with a particular reference channel; and generate at least one of a mid-band channel or a side-band channel based on the particular reference channel and the adjusted particular target channel.

16. The apparatus of claim 15, wherein the encoder is configured to adjust the first long-term smoothed comparison values by increasing values of a subset of the first long-term smoothed comparison values.

17. The apparatus of claim 16, wherein the encoder is configured to adjust the first long-term smoothed comparison values by increasing at least a value of a first index, wherein the first index corresponds to a non-causal shift value of a second target frame, the second target frame immediately precedes the first target frame.

18. The apparatus of claim 17, wherein the subset of the first long-term smoothed comparison values includes a second index and a third index, wherein the second index is smaller than the first index by one and the third index is bigger than the first index by one.

19. The apparatus of claim 15, wherein the encoder is configured to smooth the comparison values to generate short-term smoothed comparison values by finite impulse response (FIR) filtering the comparison values.

20. The apparatus of claim 15, wherein the first long-term smoothed comparison values are further based on a weighted mixture of the comparison values and second long-term smoothed comparison values of at least one previous frame.

21. The apparatus of claim 20, wherein the encoder is configured to smooth the comparison values to generate long-term smoothed comparison values by infinite impulse response (IIR) filtering the comparison values.

22. The apparatus of claim 15, wherein the comparison values are cross-correlation values of down-sampled reference channels and corresponding down-sampled target channels.

23. The apparatus of claim 15, wherein the encoder is integrated into a mobile device.

24. The apparatus of claim 15, wherein the encoder is integrated into a base station.

25. A non-transitory computer-readable medium comprising instructions that, when executed by an encoder, cause the encoder to perform operations comprising: estimating comparison values, each comparison value indicative of an amount of temporal mismatch between a first reference frame of a reference channel and a corresponding first target frame of a target channel; smoothing the comparison values to generate short-term smoothed comparison values; smoothing the comparison values to generate first long-term smoothed comparison values based on a smoothing parameter; calculating a cross-correlation value between the comparison values and the short-term smoothed comparison values; comparing the cross-correlation value with a threshold; adjusting the first long-term smoothed comparison values to generate second long-term smoothed comparison values, in response to determination that the cross-correlation value exceeds the threshold; estimating a tentative shift value based on the second long-term smoothed comparison values; determining a non-causal shift value based on the tentative shift value; non-causally shifting a particular target channel by the non-causal shift value to generate an adjusted particular target channel that is temporally aligned with a particular reference channel; and generating at least one of a mid-band channel or a side-band channel based on the particular reference channel and the adjusted particular target channel.

26. The non-transitory computer-readable medium of claim 25, wherein the operations further comprise adjusting the first long-term smoothed comparison values comprises increasing values of a subset of the first long-term smoothed comparison values.

27. The non-transitory computer-readable medium of claim 25, wherein increasing the values of the subset of the first long-term smoothed comparison values comprises increasing at least a value of a first index, wherein the first index corresponds to a non-causal shift value of a second target frame, the second target frame immediately precedes the first target frame.

28. The non-transitory computer-readable medium of claim 25, wherein calculating the cross-correlation value comprises multiplying each value of the comparison values with each value of the short-term smoothed comparison values.

29. An apparatus for coding of multi-channel audio signals, comprising: means for estimating comparison values each comparison value indicative of an amount of temporal mismatch between a first reference frame of a reference channel and a corresponding first target frame of a target channel; means for smoothing the comparison values to generate short-term smoothed comparison values; means for smoothing the comparison values to generate first long-term smoothed comparison values based on a smoothing parameter; means for calculating a cross-correlation value between the comparison values and the short-term smoothed comparison values; means for comparing the cross-correlation value with a threshold; means for adjusting the first long-term smoothed comparison values to generate second long-term smoothed comparison values, in response to determination that the cross-correlation value exceeds the threshold; means for estimating a tentative shift value based on the second long-term smoothed comparison values; means for determining a non-causal shift value based on the tentative shift value; means for non-causally shifting a particular target channel by the non-causal shift value to generate an adjusted particular target channel that is temporally aligned with a particular reference channel; and means for generating at least one of a mid-band channel or a side-band channel based on the particular reference channel and the adjusted particular target channel.

30. The apparatus of claim 29, wherein the means for adjusting the first long-term smoothed comparison values comprises means for increasing values of a subset of the first long-term smoothed comparison values.

31. The apparatus of claim 29, wherein the means for increasing the values of the subset of the first long-term smoothed comparison values comprises means for increasing at least a value of a first index, wherein the first index corresponds to a non-causal shift value of a second target frame, the second target frame immediately precedes the first target frame.

32. The apparatus of claim 29, wherein the means for calculating the cross-correlation value comprises means for multiplying each value of the comparison values with each value of the short-term smoothed comparison values.

33. A method for coding of multi-channel audio signals at an encoder of an electronic device, the method comprising: estimating comparison values, at the encoder, each comparison value indicative of an amount of temporal mismatch between a first reference frame of a reference channel and a corresponding first target frame of a target channel; smoothing, at the encoder, the comparison values to generate first long-term smoothed comparison values based on a smoothing parameter; calculating, at the encoder, a gain parameter between a second reference frame of the reference channel and a corresponding second target frame of the target channel, the gain parameter based on an energy of the second reference frame and an energy of the second target frame, wherein the second reference frame precedes the first reference frame and the second target frame precedes the first target frame; comparing, at the encoder, the gain parameter with a first threshold; in response to the comparison, adjusting, at the encoder, a first subset of the first long-term smoothed comparison values to generate second long-term smoothed comparison values; estimating, at the encoder, a tentative shift value based on the second long-term smoothed comparison values; determining, at the encoder, a non-causal shift value based on the tentative shift value; non-causally shifting, at the encoder, a particular target channel by the non-causal shift value to generate an adjusted particular target channel that is temporally aligned with a particular reference channel; and generating, at the encoder, at least one of a mid-band channel or a side-band channel based on the particular reference channel and the adjusted particular target channel.

34. The method of claim 33, wherein adjusting the first subset of the first long-term smoothed comparison values comprise emphasizing a positive shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is greater than the first threshold.

35. The method of claim 33, wherein adjusting the first subset of the first long-term smoothed comparison values comprise deemphasizing a negative shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is greater than the first threshold.

36. The method of claim 33, wherein adjusting the first subset of the first long-term smoothed comparison values comprise emphasizing a negative shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is less than the first threshold.

37. The method of claim 33, wherein adjusting the first subset of the first long-term smoothed comparison values comprise deemphasizing a positive shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is greater than the first threshold.
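
The gain-driven adjustment of claims 33, 34 and 36 could be sketched as follows. The gain is taken here to be the ratio of the previous reference frame's energy to the previous target frame's energy, and emphasis is a multiplication by a fixed `factor`; the ratio form, the threshold of 1.0, the factor, and all names are assumptions for illustration, since the claims only state that the gain is based on the two energies.

```python
def frame_energy(frame):
    """Sum of squared samples."""
    return sum(x * x for x in frame)


def gain_parameter(prev_reference, prev_target):
    """Energy ratio of the previous reference frame to the previous target frame
    (assumed form of the gain parameter)."""
    target_energy = frame_energy(prev_target)
    if target_energy == 0:
        return float("inf")
    return frame_energy(prev_reference) / target_energy


def emphasize_side(long_term, shifts, gain, threshold=1.0, factor=1.2):
    """Scale up positive-shift entries when gain > threshold and
    negative-shift entries when gain < threshold (hypothetical factor)."""
    adjusted = []
    for value, shift in zip(long_term, shifts):
        if (gain > threshold and shift > 0) or (gain < threshold and shift < 0):
            value *= factor
        adjusted.append(value)
    return adjusted


shifts = [-2, -1, 0, 1, 2]
first_long_term = [0.50, 0.20, 0.10, 0.20, 0.45]

gain = gain_parameter(prev_reference=[2.0, 2.0], prev_target=[1.0, 1.0])
second_long_term = emphasize_side(first_long_term, shifts, gain)
tentative_shift = shifts[max(range(len(shifts)), key=second_long_term.__getitem__)]
```

With a gain above the threshold, the positive-shift side is emphasized and the tentative shift moves from -2 to +2 in this toy example.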

38. An apparatus for coding of multi-channel audio signals, comprising: a first microphone configured to capture a first reference frame of a reference channel; a second microphone configured to capture a first target frame of a target channel; and an encoder configured to: estimate comparison values, each comparison value indicative of an amount of temporal mismatch between the first reference frame of the reference channel and the corresponding first target frame of the target channel; smooth the comparison values to generate first long-term smoothed comparison values based on a smoothing parameter; calculate a gain parameter between a second reference frame of the reference channel and a corresponding second target frame of the target channel, the gain parameter based on an energy of the second reference frame and an energy of the second target frame, wherein the second reference frame precedes the first reference frame and the second target frame precedes the first target frame; compare the gain parameter with a first threshold; in response to the comparison, adjust a first subset of the first long-term smoothed comparison values to generate second long-term smoothed comparison values; estimate a tentative shift value based on the second long-term smoothed comparison values; determine a non-causal shift value based on the tentative shift value; non-causally shift a particular target channel by the non-causal shift value to generate an adjusted particular target channel that is temporally aligned with a particular reference channel; and generate at least one of a mid-band channel or a side-band channel based on the particular reference channel and the adjusted particular target channel.

39. The apparatus of claim 38, wherein the encoder is configured to adjust the first subset of the first long-term smoothed comparison values by emphasizing a positive shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is greater than the first threshold.

40. The apparatus of claim 38, wherein the encoder is configured to adjust the first subset of the first long-term smoothed comparison values by deemphasizing a negative shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is greater than the first threshold.

41. The apparatus of claim 38, wherein the encoder is configured to adjust the first subset of the first long-term smoothed comparison values by emphasizing a negative shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is less than the first threshold.

42. The apparatus of claim 38, wherein the encoder is configured to adjust the first subset of the first long-term smoothed comparison values by deemphasizing a positive shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is greater than the first threshold.

43. A non-transitory computer-readable medium comprising instructions that, when executed by an encoder, cause the encoder to perform operations comprising: estimating comparison values each comparison value indicative of an amount of temporal mismatch between a first reference frame of a reference channel and a corresponding first target frame of a target channel; smoothing the comparison values to generate first long-term smoothed comparison values based on a smoothing parameter; calculating a gain parameter between a second reference frame of the reference channel and a corresponding second target frame of the target channel, the gain parameter based on an energy of the second reference frame and an energy of the second target frame, wherein the second reference frame precedes the first reference frame and the second target frame precedes the first target frame; comparing the gain parameter with a first threshold; in response to the comparison, adjusting, at the encoder, a first subset of the first long-term smoothed comparison values to generate second long-term smoothed comparison values; estimating a tentative shift value based on the second long-term smoothed comparison values; determining a non-causal shift value based on the tentative shift value; non-causally shifting a particular target channel by the non-causal shift value to generate an adjusted particular target channel that is temporally aligned with a particular reference channel; and generating at least one of a mid-band channel or a side-band channel based on the particular reference channel and the adjusted particular target channel.

44. The non-transitory computer-readable medium of claim 43, wherein adjusting the first subset of the first long-term smoothed comparison values comprise emphasizing a positive shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is greater than the first threshold.

45. The non-transitory computer-readable medium of claim 43, wherein adjusting the first subset of the first long-term smoothed comparison values comprise deemphasizing a negative shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is greater than the first threshold.

46. The non-transitory computer-readable medium of claim 43, wherein adjusting the first subset of the first long-term smoothed comparison values comprise emphasizing a negative shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is less than the first threshold.

47. The non-transitory computer-readable medium of claim 43, wherein adjusting the first subset of the first long-term smoothed comparison values comprise deemphasizing a positive shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is greater than the first threshold.

48. An apparatus for coding of multi-channel audio signals at an encoder of an electronic device, the method comprising: means for estimating comparison values, at the encoder, each comparison value indicative of an amount of temporal mismatch between a first reference frame of a reference channel and a corresponding first target frame of a target channel; means for smoothing, at the encoder, the comparison values to generate first long-term smoothed comparison values based on a smoothing parameter; means for calculating, at the encoder, a gain parameter between a second reference frame of the reference channel and a corresponding second target frame of the target channel, the gain parameter based on an energy of the second reference frame and an energy of the second target frame, wherein the second reference frame precedes the first reference frame and the second target frame precedes the first target frame; means for comparing the gain parameter with a first threshold; in response to the comparison, means for adjusting, at the encoder, a first subset of the first long-term smoothed comparison values to generate second long-term smoothed comparison values; means for estimating, at the encoder, a tentative shift value based on the second long-term smoothed comparison values; means for determining, at the encoder, a non-causal shift value based on the tentative shift value; means for non-causally shifting, at the encoder, a particular target channel by the non-causal shift value to generate an adjusted particular target channel that is temporally aligned with a particular reference channel; and means for generating, at the encoder, at least one of a mid-band channel or a side-band channel based on the particular reference channel and the adjusted particular target channel.

49. The apparatus of claim 48, wherein means for adjusting the first subset of the first long-term smoothed comparison values comprises means for emphasizing a positive shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is greater than the first threshold.

50. The apparatus of claim 48, wherein means for adjusting the first subset of the first long-term smoothed comparison values comprises means for deemphasizing a negative shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is greater than the first threshold.

51. The apparatus of claim 48, wherein means for adjusting the first subset of the first long-term smoothed comparison values comprises means for emphasizing a negative shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is less than the first threshold.

52. The apparatus of claim 48, wherein means for adjusting the first subset of the first long-term smoothed comparison values comprises means for deemphasizing a positive shift side of the first long-term smoothed comparison values in response to the comparison that the gain parameter is greater than the first threshold.

Patent Metadata

Filing Date

August 28, 2018

Publication Date

January 12, 2021

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Temporal offset estimation” (US-10891960). https://patentable.app/patents/US-10891960