Smart Audio with User Input

Published: December 14, 2021

Assignee: not available in USPTO data we have

Inventors: Pratik Mukesh Kamdar, Subhash Chandra Bose Naripeddy, Vamsi Mynampati, Sridhar Pilli

Patent Claims

20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method comprising: establishing, by a communication device, a communication session with another communication device via a network; capturing, via one or more cameras of the communication device, a local area within a field of view of the one or more cameras, the local area including one or more audio sources; receiving, from the other communication device via the network, a selection of a respective audio source from the one or more audio sources in the local area, wherein the selection indicates that audio originating from the respective audio source is to be prioritized over other audio sources in the local area; and tuning one or more microphones of the communication device based on a depth of the respective audio source relative to the communication device to optimize reception of audio originating at the respective audio source by the one or more microphones.
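As an illustration of the selection step in claim 1, the following minimal sketch shows how a device might map a selection received from the other device onto one of the audio sources it has detected, and read off the depth used for tuning. The names `AudioSource`, `Selection`, `source_id`, `depth_m`, and `prioritized_depth` are hypothetical and do not come from the patent; how sources are detected and how depth is estimated are outside the scope of this sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class AudioSource:
    """Hypothetical record for an audio source seen in the camera's field of view."""
    source_id: str
    depth_m: float  # assumed: estimated distance from the device, in metres


@dataclass(frozen=True)
class Selection:
    """Hypothetical message received from the other device over the network."""
    source_id: str  # the source whose audio should be prioritized


def resolve_selection(sources: Iterable[AudioSource],
                      selection: Selection) -> Optional[AudioSource]:
    """Return the selected source, or None if it is not in the local area."""
    for source in sources:
        if source.source_id == selection.source_id:
            return source
    return None


def prioritized_depth(sources: Iterable[AudioSource],
                      selection: Selection) -> float:
    """Return the depth of the selected source, the input to microphone tuning."""
    source = resolve_selection(sources, selection)
    if source is None:
        raise LookupError(f"unknown audio source: {selection.source_id}")
    return source.depth_m


local_area = [AudioSource("speaker-a", 0.8), AudioSource("speaker-b", 3.5)]
depth = prioritized_depth(local_area, Selection("speaker-b"))
```

Here the remote participant picks `speaker-b`, so the device would tune its microphones using a depth of 3.5 m rather than the depth of the nearer source.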

2. The method of claim 1, wherein tuning the one or more microphones based on the depth comprises: determining whether the depth of the respective audio source relative to the communication device is outside a distance threshold; and tuning the one or more microphones in accordance with a far-field mode in response to determining that the depth of the respective audio source relative to the communication device is outside the distance threshold.

3. The method of claim 2, wherein tuning the one or more of the microphones based on the depth comprises: tuning the one or more microphones in accordance with a near-field mode in response to determining that the depth of the respective audio source relative to the communication device is within the distance threshold.
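The threshold test in claims 2 and 3 can be sketched as a single comparison. This is only an illustration: the claims do not give a threshold value or say which mode applies exactly at the boundary, so the 1.5 m default and the choice to treat a depth equal to the threshold as "within" are assumptions, as are the names `MicMode` and `select_mode`.

```python
from enum import Enum


class MicMode(Enum):
    NEAR_FIELD = "near_field"
    FAR_FIELD = "far_field"


def select_mode(depth_m: float, threshold_m: float = 1.5) -> MicMode:
    """Pick a microphone mode from the depth of the selected audio source.

    Assumptions (not stated in the claims): depths are in metres, the
    default threshold is 1.5 m, and a depth equal to the threshold counts
    as within it.
    """
    if depth_m < 0:
        raise ValueError("depth must be non-negative")
    if depth_m > threshold_m:
        return MicMode.FAR_FIELD  # outside the distance threshold
    return MicMode.NEAR_FIELD     # within the distance threshold
```

For example, `select_mode(3.0)` yields the far-field mode and `select_mode(0.5)` yields the near-field mode under these assumptions.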

4. The method of claim 1, wherein the tuning the one or more microphones based on the depth of the respective audio source relative to the communication device comprises adjusting one or more tuning parameters of the one or more microphones, and further wherein the one or more tuning parameters comprise an automatic gain control parameter, a noise suppression parameter, and an echo cancellation parameter.

5. The method of claim 1, wherein tuning the one or more microphones comprises: configuring an automatic gain control parameter based on the depth of the respective audio source relative to the communication device, wherein the automatic gain control parameter is set higher for a far-field mode than a near-field mode.

6. The method of claim 1, wherein tuning the one or more microphones comprises: configuring a noise suppression parameter based on the depth of the respective audio source relative to the communication device, wherein the noise suppression parameter is set higher for a near-field mode than a far-field mode.
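The parameter relationships in claims 4 to 6 can be illustrated with two presets. The numeric levels below are placeholders chosen only to satisfy the stated orderings (automatic gain control higher in far-field mode, noise suppression higher in near-field mode); the claims give no values or units. The claims also do not say how the echo cancellation parameter varies with depth, so it is held constant here. `TuningParams`, `NEAR_FIELD`, `FAR_FIELD`, and `params_for_depth` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TuningParams:
    """Hypothetical bundle of the three tuning parameters named in claim 4."""
    agc_level: float                # automatic gain control
    noise_suppression_level: float  # noise suppression
    echo_cancellation_level: float  # echo cancellation


# Placeholder values on an arbitrary 0..1 scale; only the ordering matters.
NEAR_FIELD = TuningParams(agc_level=0.3,
                          noise_suppression_level=0.8,
                          echo_cancellation_level=0.5)
FAR_FIELD = TuningParams(agc_level=0.8,
                         noise_suppression_level=0.4,
                         echo_cancellation_level=0.5)


def params_for_depth(depth_m: float, threshold_m: float = 1.5) -> TuningParams:
    """Return the preset for a source depth (the threshold value is an assumption)."""
    return FAR_FIELD if depth_m > threshold_m else NEAR_FIELD
```

A distant source therefore gets more gain and less aggressive noise suppression, while a nearby source gets the reverse.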

7. The method of claim 1, wherein the respective audio source is a focal point in the local area.

8. The method of claim 1, wherein the selection is a voice command captured by the other communication device.

9. The method of claim 1, wherein tuning the one or more microphones is further based on an application running on the communication device, wherein the method further comprises: receiving another selection indicating an application running on the communication device; and responsive to determining that the application includes a virtual assistant, tuning the one or more microphones in accordance with a virtual assistant mode, wherein the tuning comprises setting an automatic gain control parameter at or below a setting of the automatic gain control parameter for a near-field mode and setting a noise suppression parameter at or below a setting of the noise suppression parameter for a far-field mode.
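The two upper bounds in claim 9 can be expressed as a small check plus a helper that derives a compliant setting. This is a sketch under assumptions: the `Settings` type, the 0..1 scale, the example values, and the optional back-off arguments are all hypothetical; the claim only requires that the virtual assistant mode's automatic gain control be at or below the near-field setting and its noise suppression be at or below the far-field setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical pair of the two parameters constrained by claim 9."""
    agc_level: float
    noise_suppression_level: float


def virtual_assistant_settings(near: Settings, far: Settings,
                               agc_backoff: float = 0.0,
                               ns_backoff: float = 0.0) -> Settings:
    """Derive virtual assistant settings that respect both upper bounds.

    The back-off amounts are illustrative knobs; with the defaults the
    result sits exactly at the bounds, which "at or below" permits.
    """
    if agc_backoff < 0 or ns_backoff < 0:
        raise ValueError("back-off amounts must be non-negative")
    return Settings(
        agc_level=max(0.0, near.agc_level - agc_backoff),
        noise_suppression_level=max(0.0, far.noise_suppression_level - ns_backoff),
    )


def satisfies_claim_bounds(va: Settings, near: Settings, far: Settings) -> bool:
    """Check AGC <= near-field AGC and noise suppression <= far-field level."""
    return (va.agc_level <= near.agc_level
            and va.noise_suppression_level <= far.noise_suppression_level)


near_field = Settings(agc_level=0.3, noise_suppression_level=0.8)
far_field = Settings(agc_level=0.8, noise_suppression_level=0.4)
assistant = virtual_assistant_settings(near_field, far_field, agc_backoff=0.1)
```

With the placeholder presets above, the assistant mode ends up with gain no higher than the near-field preset and noise suppression no higher than the far-field preset.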

10. A non-transitory computer-readable storage medium storing executable instructions that, when executed by one or more processors, cause the one or more processors to perform steps comprising: establishing, by a communication device, a communication session with another communication device via a network; capturing, via one or more cameras of the communication device, a local area within a field of view of the one or more cameras, the local area including one or more audio sources; receiving, from the other communication device via the network, a selection of a respective audio source from the one or more audio sources in the local area, wherein the selection indicates that audio originating from the respective audio source is to be prioritized over other audio sources in the local area; and tuning the one or more microphones of the communication device based on a depth of the respective audio source relative to the communication device to optimize reception of audio originating at the respective audio source by the one or more microphones.

11. The non-transitory computer-readable storage medium of claim 10, wherein tuning the one or more microphones based on the depth comprises: determining whether the depth of the respective audio source relative to the communication device is outside a distance threshold; and tuning the one or more microphones in accordance with a far-field mode in response to determining that the depth of the respective audio source relative to the communication device is outside the distance threshold.

12. The non-transitory computer-readable storage medium of claim 11, wherein tuning the one or more of the microphones based on the depth comprises: tuning the one or more microphones in accordance with a near-field mode in response to determining that the depth of the respective audio source relative to the communication device is within the distance threshold.

13. The non-transitory computer-readable storage medium of claim 10, wherein tuning the one or more microphones based on the depth of the respective audio source relative to the communication device comprises adjusting one or more tuning parameters of the one or more microphones, and further wherein the one or more tuning parameters comprise an automatic gain control parameter, a noise suppression parameter, and an echo cancellation parameter.

14. The non-transitory computer-readable storage medium of claim 10, wherein tuning the one or more microphones comprises: configuring an automatic gain control parameter based on the depth of the respective audio source relative to the communication device, wherein the automatic gain control parameter is set higher for a far-field mode than a near-field mode.

15. The non-transitory computer-readable storage medium of claim 10, wherein tuning the one or more microphones comprises: configuring a noise suppression parameter based on the depth of the respective audio source relative to the communication device, wherein the noise suppression parameter is set higher for a near-field mode than a far-field mode.

16. The non-transitory computer-readable storage medium of claim 10, wherein the respective audio source is a focal point in the local area.

17. The non-transitory computer-readable storage medium of claim 10, wherein the selection is a voice command captured by the other communication device.

18. The non-transitory computer-readable storage medium of claim 10, wherein tuning the one or more microphones is further based on an application running on the communication device, wherein the instructions further cause the processor to perform steps comprising: receiving another selection indicating an application running on the communication device; and responsive to determining that the application includes a virtual assistant, tuning the one or more microphones in accordance with a virtual assistant mode, wherein the tuning comprises setting an automatic gain control parameter at or below a setting of the automatic gain control parameter for a near-field mode and setting a noise suppression parameter at or below a setting of the noise suppression parameter for a far-field mode.

19. A communication device comprising: one or more microphones; one or more cameras; one or more processors; and memory storing one or more programs for execution by the one or more processors, the one or more programs including instructions for: establishing a communication session with another communication device via a network; capturing, via the one or more cameras, a local area within a field of view of the one or more cameras, the local area including one or more audio sources; receiving, from the other communication device via the network, a selection of a respective audio source from the one or more audio sources in the local area, wherein the selection indicates that audio originating from the respective audio source is to be prioritized over other audio sources in the local area; and tuning the one or more microphones based on a depth of the respective audio source relative to the communication device to optimize reception of audio originating at the respective audio source by the one or more microphones.

20. The computer system of claim 19, wherein tuning the one or more microphones based on the depth comprises: determining whether the depth of the respective audio source relative to the communication device is outside a distance threshold; and tuning the one or more microphones in accordance with a far-field mode in response to determining that the depth of the respective audio source relative to the communication device is outside the distance threshold.

Patent Metadata

Filing Date: Unknown

Publication Date: December 14, 2021

Inventors: Pratik Mukesh Kamdar, Subhash Chandra Bose Naripeddy, Vamsi Mynampati, Sridhar Pilli
