US-8184180

Spatially synchronized audio and video capture

Published: May 22, 2012

Assignee: not available in USPTO data we have

Inventors: not available in USPTO data we have

Technical Abstract

An audio/video (A/V) capture device and method that capture audio and video in a spatially synchronized manner. In one implementation, the device and method automatically adjust the shape of a spatial directivity pattern of a microphone array used for acquiring audio so that the pattern is spatially synchronized with an amount of video zoom being applied by a video acquisition section to acquire video. For example, a wider spatial directivity pattern may automatically be used during wide-angle shots and a narrower spatial directivity pattern may automatically be used during close-ups. This beneficially allows for the consistent attenuation of audio signals received from audio sources that lie outside the field of view of the video acquisition section while passing or even enhancing audio signals received from audio sources that lie within the field of view even though the width of the field of view has changed.
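The abstract does not say how an amount of zoom translates into a field-of-view width. As a rough illustration only, the sketch below assumes an ideal rectilinear lens, for which zooming by a factor z multiplies the focal length by z. The function names and the 60-degree wide-angle field of view are hypothetical and do not come from the patent.

```python
import math


def field_of_view_deg(zoom: float, wide_fov_deg: float = 60.0) -> float:
    """Horizontal field of view in degrees at a given zoom factor.

    Assumption: ideal rectilinear lens, so the tangent of the half-angle
    shrinks in proportion to the zoom factor. wide_fov_deg is a made-up
    field of view at 1x zoom.
    """
    if zoom < 1.0:
        raise ValueError("zoom must be >= 1.0")
    half_wide = math.radians(wide_fov_deg) / 2.0
    return math.degrees(2.0 * math.atan(math.tan(half_wide) / zoom))


def source_in_view(source_angle_deg: float, zoom: float,
                   wide_fov_deg: float = 60.0) -> bool:
    """True if a source at the given angle off the optical axis is in frame."""
    return abs(source_angle_deg) <= field_of_view_deg(zoom, wide_fov_deg) / 2.0
```

Under these assumptions, a source 20 degrees off axis is in frame at 1x zoom but falls outside the frame at 2x, which is the situation in which the abstract calls for a narrower directivity pattern.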

Patent Claims

23 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. An audio/video (A/V) capture device, comprising: a video acquisition section configured to generate a video signal; video zoom control logic configured to control an amount of video zoom applied by the video acquisition section in generating the video signal; and an audio acquisition section comprising a microphone array configured to generate a plurality of microphone signals and a beamformer configured to modify a shape of a spatial directivity pattern of the microphone array responsive to a change in the amount of video zoom applied by the video acquisition section and to process the microphone signals in accordance with the spatial directivity pattern to generate an audio signal, the beamformer comprising: a first beamformer configured to output a first vector of sensor weights corresponding to a first spatial directivity pattern; a second beamformer configured to output a second vector of sensor weights corresponding to a second spatial directivity pattern, wherein a main lobe of the second spatial directivity pattern is narrower than a main lobe of the first spatial directivity pattern; first logic that determines a first weight and a second weight based on the amount of video zoom currently being applied by the video acquisition section; a first multiplier that applies the first weight to the first vector to produce a weighted version of the first vector; a second multiplier that applies the second weight to the second vector to produce a weighted version of the second vector; a combiner configured to combine the weighted version of the first vector and the weighted version of the second vector to produce a third vector of sensor weights corresponding to a third spatial directivity pattern; and second logic configured to apply the third vector of sensor weights to the microphone signals to generate the audio signal.

2. The A/V capture device of claim 1, further comprising: A/V processing logic configured to encode the video signal and the audio signal for subsequent storage and/or transmission.

3. The A/V capture device of claim 1, wherein the first beamformer comprises a delay-sum beamformer and the second beamformer comprises a superdirective beamformer.

4. The A/V capture device of claim 3, wherein the superdirective beamformer comprises a Minimum Variance Distortionless Response (MVDR) beamformer.

5. The A/V capture device of claim 1, wherein the video acquisition section comprises a zoom lens and the video zoom control logic is configured to control an amount of optical video zoom applied by the zoom lens in generating the video signal, and wherein the beamformer is configured to modify the shape of the spatial directivity pattern of the microphone array responsive to at least a change in the amount of optical video zoom applied by the zoom lens.

6. The A/V capture device of claim 1, wherein the video acquisition section comprises a digital zoom processor and the video zoom control logic is configured to control an amount of digital video zoom applied by the digital zoom processor in generating the video signal, and wherein the beamformer is configured to modify the shape of the spatial directivity pattern of the microphone array responsive to at least a change in the amount of digital video zoom applied by the digital zoom processor.
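The weighted combination recited in claim 1 (two sensor-weight vectors, two zoom-dependent scalar weights, a combiner, and application of the result to the microphone signals) can be sketched as follows. The claim does not say how the two scalar weights are derived from the zoom amount, so the linear crossfade, the zoom range of 1x to 4x, and all function names here are assumptions made for illustration. The sketch treats one frequency bin at a time and uses the convention y = w^H x.

```python
def zoom_to_weights(zoom, min_zoom=1.0, max_zoom=4.0):
    """Map a zoom amount to (first_weight, second_weight).

    Assumption: a linear crossfade over a hypothetical zoom range. The
    weights are real and sum to 1.
    """
    t = (zoom - min_zoom) / (max_zoom - min_zoom)
    t = min(1.0, max(0.0, t))
    return 1.0 - t, t


def blend_sensor_weights(first_vector, second_vector, zoom):
    """Combine the wide (first) and narrow (second) sensor-weight vectors."""
    first_weight, second_weight = zoom_to_weights(zoom)
    weighted_first = [first_weight * w for w in first_vector]
    weighted_second = [second_weight * w for w in second_vector]
    # Combiner: element-wise sum gives the third vector of sensor weights.
    return [a + b for a, b in zip(weighted_first, weighted_second)]


def apply_sensor_weights(third_vector, mic_frame):
    """Apply the third vector to one frame of microphone samples (w^H x)."""
    return sum(complex(w).conjugate() * x
               for w, x in zip(third_vector, mic_frame))
```

One property of this particular choice: if both input vectors pass the look direction with unit gain and the two scalar weights are real and sum to 1, the blended vector also passes the look direction with unit gain, so the on-axis level stays constant while the pattern shape changes.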

7. A method, comprising: monitoring an amount of video zoom applied by a video acquisition section of an A/V capture device in generating a video signal; modifying a shape of a spatial directivity pattern of a microphone array of the A/V capture device responsive to a change in the amount of video zoom applied by the video acquisition section; receiving a plurality of microphone signals from the microphone array; and processing the microphone signals in accordance with the modified spatial directivity pattern to generate an audio signal; wherein modifying the shape of the spatial directivity pattern of the microphone array comprises: receiving a first vector of sensor weights corresponding to a first spatial directivity pattern from a first beamformer; receiving a second vector of sensor weights corresponding to a second spatial directivity pattern from a second beamformer, wherein a main lobe of the second spatial directivity pattern is narrower than a main lobe of the first spatial directivity pattern; determining a first weight and a second weight based on the amount of video zoom currently being applied by the video acquisition section; applying the first weight to the first vector to produce a weighted version of the first vector; applying the second weight to the second vector to produce a weighted version of the second vector; combining the weighted version of the first vector and the weighted version of the second vector to produce a third vector of sensor weights corresponding to a third spatial directivity pattern.

8. The method of claim 7, further comprising: encoding the video signal and the audio signal for subsequent storage and/or transmission.

9. The method of claim 7, wherein processing the microphone signals in accordance with the modified spatial directivity pattern to generate an audio signal comprises applying the third vector of sensor weights to the microphone signals to generate the audio signal.

10. The method of claim 7, wherein the first beamformer comprises a delay-sum beamformer and the second beamformer comprises a superdirective beamformer.

11. The method of claim 10, wherein the superdirective beamformer comprises a Minimum Variance Distortionless Response (MVDR) beamformer.

12. The method of claim 7, wherein monitoring the amount of video zoom applied by the video acquisition section comprises monitoring an amount of optical video zoom applied by a zoom lens, and wherein modifying the shape of the spatial directivity pattern of the microphone array responsive to a change in the amount of video zoom applied by the video acquisition section comprises modifying the shape of the spatial directivity pattern of the microphone array responsive to at least a change in the amount of optical video zoom applied by the zoom lens.

13. The method of claim 7, wherein monitoring the amount of video zoom applied by the video acquisition section comprises monitoring an amount of digital video zoom applied by a digital zoom processor, and wherein modifying the shape of the spatial directivity pattern of the microphone array responsive to a change in the amount of video zoom applied by the video acquisition section comprises modifying the shape of the spatial directivity pattern of the microphone array responsive to at least a change in the amount of digital video zoom applied by the digital zoom processor.

14. The method of claim 7, further comprising monitoring an amount of optical video zoom applied by an optical zoom lens within the video acquisition section, and wherein modifying the shape of the spatial directivity pattern of the microphone array responsive to at least a change in the amount of digital video zoom applied by the digital zoom processor comprises modifying the shape of the spatial directivity pattern of the microphone array responsive to both a change in the amount of optical video zoom applied by the zoom lens and a change in the amount of digital video zoom applied by the digital zoom processor.
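Claims 3, 4, 10 and 11 name a delay-sum beamformer and an MVDR-based superdirective beamformer as the two sources of sensor weights, without giving formulas. The sketch below uses the textbook forms for a single frequency bin: delay-sum weights d/N, and MVDR weights R^-1 d / (d^H R^-1 d), where d is the steering vector and R is a noise covariance matrix. The uniform linear array geometry, the diffuse-noise model with diagonal loading, the speed of sound, and every function name are assumptions chosen for illustration, not details from the patent.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def steering_vector(positions_m, freq_hz, angle_deg):
    """Plane-wave phases at each microphone; 0 degrees is broadside."""
    s = math.sin(math.radians(angle_deg))
    return [cmath.exp(-2j * math.pi * freq_hz * p * s / SPEED_OF_SOUND)
            for p in positions_m]


def delay_sum_weights(positions_m, freq_hz, look_deg=0.0):
    d = steering_vector(positions_m, freq_hz, look_deg)
    return [x / len(d) for x in d]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n + 1):
                a[r][k] -= factor * a[col][k]
    x = [0j] * n
    for i in reversed(range(n)):
        acc = a[i][n] - sum(a[i][k] * x[k] for k in range(i + 1, n))
        x[i] = acc / a[i][i]
    return x


def mvdr_weights(noise_cov, d):
    """w = R^-1 d / (d^H R^-1 d): unit gain toward d, minimum noise power."""
    r_inv_d = solve(noise_cov, d)
    denom = sum(di.conjugate() * x for di, x in zip(d, r_inv_d))
    return [x / denom for x in r_inv_d]


def diffuse_noise_cov(positions_m, freq_hz, loading=1e-2):
    """Spherically diffuse noise coherence, sin(kr)/(kr), plus diagonal loading."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND

    def sinc(x):
        return 1.0 if x == 0.0 else math.sin(x) / x

    return [[sinc(k * abs(pi - pj)) + (loading if i == j else 0.0)
             for j, pj in enumerate(positions_m)]
            for i, pi in enumerate(positions_m)]


def superdirective_weights(positions_m, freq_hz, look_deg=0.0, loading=1e-2):
    d = steering_vector(positions_m, freq_hz, look_deg)
    return mvdr_weights(diffuse_noise_cov(positions_m, freq_hz, loading), d)


def response(weights, positions_m, freq_hz, angle_deg):
    """Complex array response w^H a(angle)."""
    a = steering_vector(positions_m, freq_hz, angle_deg)
    return sum(w.conjugate() * x for w, x in zip(weights, a))
```

Both weight vectors pass the look direction with unit gain. The diagonal loading term trades directivity for robustness to microphone mismatch; its value here is arbitrary.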

15. An audio/video (A/V) capture device, comprising: a video acquisition section configured to generate a video signal, the video acquisition section comprising a digital zoom processor; video zoom control logic configured to control an amount of digital video zoom applied by the digital zoom processor in generating the video signal; and an audio acquisition section comprising a microphone array configured to generate a plurality of microphone signals and a beamformer configured to modify a shape of a spatial directivity pattern of the microphone array responsive to at least a change in the amount of digital video zoom applied by the digital zoom processor and to process the microphone signals in accordance with the spatial directivity pattern to generate an audio signal.

16. The A/V capture device of claim 15, further comprising: A/V processing logic configured to encode the video signal and the audio signal for subsequent storage and/or transmission.

17. The A/V capture device of claim 15, wherein the beamformer is configured to reduce a width of a main lobe of the spatial directivity pattern of the microphone array responsive to an increase in the amount of digital video zoom applied by the digital zoom processor and to increase the width of the main lobe of the spatial directivity pattern of the microphone array responsive to a reduction in the amount of digital video zoom applied by the digital zoom processor.

18. The A/V capture device of claim 15, wherein the beamformer is configured to selectively place one or more nulls in and/or selectively remove one or more nulls from the spatial directivity pattern of the microphone array based on the amount of digital video zoom applied by the digital zoom processor.

19. The A/V capture device of claim 15, wherein the video acquisition section further comprises a zoom lens and the video zoom control logic is further configured to control an amount of optical video zoom applied by the zoom lens in generating the video signal, and wherein the beamformer is configured to modify the shape of the spatial directivity pattern of the microphone array responsive to both a change in the amount of optical video zoom applied by the zoom lens and a change in the amount of digital video zoom applied by the digital zoom processor.
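Claim 17 ties the width of the main lobe to the zoom amount. One way to check whether a candidate set of sensor weights behaves that way is to measure the half-power (-3 dB) width of the pattern it produces. The sketch below does this by scanning angles for a linear array at a single frequency. The broadside look direction, array geometry, scan step, and function names are assumptions. The delay-sum weights for arrays of different sizes appear only as stand-in inputs with different lobe widths; they are not the mechanism the patent claims.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def steering_vector(positions_m, freq_hz, angle_deg):
    """Plane-wave phases at each microphone; 0 degrees is broadside."""
    s = math.sin(math.radians(angle_deg))
    return [cmath.exp(-2j * math.pi * freq_hz * p * s / SPEED_OF_SOUND)
            for p in positions_m]


def response_magnitude(weights, positions_m, freq_hz, angle_deg):
    """Magnitude of the array response |w^H a(angle)|."""
    a = steering_vector(positions_m, freq_hz, angle_deg)
    return abs(sum(w.conjugate() * x for w, x in zip(weights, a)))


def half_power_beamwidth_deg(weights, positions_m, freq_hz, step_deg=0.1):
    """Main-lobe width around broadside between the -3 dB points.

    Returns 180.0 if the response never drops 3 dB below the on-axis
    level anywhere in the scanned range of -90 to +90 degrees.
    """
    on_axis = response_magnitude(weights, positions_m, freq_hz, 0.0)
    threshold = on_axis / math.sqrt(2.0)
    steps = int(round(90.0 / step_deg))

    def edge(sign):
        # Walk outward from broadside until the response falls below -3 dB.
        for i in range(1, steps + 1):
            angle = sign * i * step_deg
            if response_magnitude(weights, positions_m, freq_hz, angle) < threshold:
                return angle
        return sign * 90.0

    return edge(+1) - edge(-1)


def delay_sum_broadside(num_mics):
    """Stand-in weights: uniform delay-sum steered to broadside."""
    return [complex(1.0 / num_mics)] * num_mics
```

A test harness built on this could sweep the zoom amount, ask the beamformer for its current weights at each step, and assert that the measured width does not increase as zoom increases.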

20. A method, comprising: monitoring an amount of digital video zoom applied by a digital zoom processor within a video acquisition section of an A/V capture device in generating a video signal; modifying a shape of a spatial directivity pattern of a microphone array of the A/V capture device responsive to at least a change in the amount of digital video zoom applied by the digital zoom processor; receiving a plurality of microphone signals from the microphone array; and processing the microphone signals in accordance with the modified spatial directivity pattern to generate an audio signal.

21. The method of claim 20, further comprising: encoding the video signal and the audio signal for subsequent storage and/or transmission.

22. The method of claim 20, wherein modifying the shape of the spatial directivity pattern of the microphone array responsive to at least the change in the amount of digital video zoom applied by the digital zoom processor comprises: reducing a width of a main lobe of the spatial directivity pattern of the microphone array responsive to an increase in the amount of digital video zoom applied by the digital zoom processor and increasing the width of the main lobe of the spatial directivity pattern of the microphone array responsive to a reduction in the amount of digital video zoom applied by the digital zoom processor.

23. The method of claim 20, wherein modifying the shape of the spatial directivity pattern of the microphone array responsive to the change in the amount of digital video zoom applied by the digital zoom processor comprises: selectively placing one or more nulls in and/or removing one or more nulls from the spatial directivity pattern of the microphone array based on the amount of digital video zoom applied by the digital zoom processor.
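Claims 18 and 23 recite placing and removing nulls based on the zoom amount but leave the mechanism open. A minimal sketch of one possible approach follows: nulls are enabled toward known off-axis directions once the zoom has narrowed the field of view enough to exclude them, and the nulls themselves are formed by projecting the weight vector onto the subspace orthogonal to the steering vectors of those directions. The rectilinear field-of-view model, the 60-degree wide-angle value, the list of candidate directions, and all function names are assumptions, not taken from the patent.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def steering_vector(positions_m, freq_hz, angle_deg):
    """Plane-wave phases at each microphone; 0 degrees is broadside."""
    s = math.sin(math.radians(angle_deg))
    return [cmath.exp(-2j * math.pi * freq_hz * p * s / SPEED_OF_SOUND)
            for p in positions_m]


def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def nulls_for_zoom(zoom, candidate_angles_deg, wide_fov_deg=60.0):
    """Candidate directions that lie outside the frame at this zoom."""
    half_wide = math.radians(wide_fov_deg) / 2.0
    half_fov = math.degrees(math.atan(math.tan(half_wide) / zoom))
    return [a for a in candidate_angles_deg if abs(a) > half_fov]


def place_nulls(weights, positions_m, freq_hz, null_angles_deg):
    """Return weights whose response w^H a is zero at each null angle."""
    basis = []
    for angle in null_angles_deg:
        # Gram-Schmidt: orthonormalize the steering vectors of the nulls.
        v = steering_vector(positions_m, freq_hz, angle)
        for q in basis:
            c = inner(q, v)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(inner(v, v).real)
        if norm > 1e-9:
            basis.append([vi / norm for vi in v])
    w = [complex(x) for x in weights]
    for q in basis:
        # Remove the component of w along each orthonormal null direction.
        c = inner(q, w)
        w = [wi - c * qi for wi, qi in zip(w, q)]
    return w


def response_magnitude(weights, positions_m, freq_hz, angle_deg):
    """Magnitude of the array response |w^H a(angle)|."""
    return abs(inner(weights, steering_vector(positions_m, freq_hz, angle_deg)))
```

The projection changes the gain in the look direction, so a practical version would rescale the result or solve a constrained design instead. An array of N microphones can hold at most N - 1 independent nulls this way.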

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

H04N

Patent Metadata

Filing Date

March 25, 2009

Publication Date

May 22, 2012
