11074292

Voice Tagging of Video While Recording

Published: July 27, 2021
Assignee: not available in USPTO data we have
Patent Claims
18 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method of tagging a video using voice input as the video is being recorded comprising:
   recording a video of a scene through a head-mounted display device;
   receiving a first audio signal at the head-mounted display device, the audio signal capturing a voice of a user of the head-mounted display device;
   performing audio analysis on the first audio signal to recognize a voice tag solicitation command;
   in response to the tag solicitation command, building a list of tags that are relevant to the scene;
   outputting the list of tags for display through the head-mounted display device;
   receiving a second audio signal at the head-mounted display device while the video is at a particular duration point, the audio signal capturing the voice of the user of the head-mounted display device;
   performing audio analysis on the second audio signal to recognize a tag from the list of tags; and
   storing an association of the tag with the particular duration point of the video in a computer storage.
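The sequence of steps in claim 1 can be pictured with a small sketch. It is only an illustration under stated assumptions, not the patented implementation: speech recognition is skipped and each utterance arrives as already-transcribed text, the solicitation phrase "show tags" is invented for the example (the claim does not name one), and `TagSession`, `handle_utterance`, and the sample tags are hypothetical names.

```python
from dataclasses import dataclass, field

# Hypothetical solicitation phrase; the claim does not specify the wording.
SOLICIT_COMMAND = "show tags"


@dataclass
class TagSession:
    """Tracks tags attached to one recording (illustrative only)."""
    scene_tags: list[str]
    offered: list[str] = field(default_factory=list)
    associations: list[tuple[str, float]] = field(default_factory=list)

    def handle_utterance(self, text: str, video_seconds: float) -> str:
        """Process one transcribed utterance heard at `video_seconds`."""
        phrase = text.strip().lower()
        if phrase == SOLICIT_COMMAND:
            # Build the list of scene-relevant tags that would be displayed.
            self.offered = list(self.scene_tags)
            return "listed"
        if phrase in self.offered:
            # Store the tag against the current duration point of the video.
            self.associations.append((phrase, video_seconds))
            return "tagged"
        return "ignored"


session = TagSession(scene_tags=["valve", "pump", "leak"])
session.handle_utterance("Show tags", 12.0)
session.handle_utterance("leak", 15.5)
print(session.associations)  # [('leak', 15.5)]
```

In this sketch a tag spoken before the list has been requested is ignored, which mirrors the claim's requirement that the recognized tag come from the displayed list.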

2. The method of claim 1, wherein the scene is associated with a project and wherein building the list of tags comprises retrieving tags from a curated list of tags associated with the project.

3. The method of claim 2, further comprising determining a location characteristic of the head-mounted display device and wherein said retrieving tags from the curated list further comprises retrieving tags associated with the location characteristic.

4. The method of claim 3, wherein the location characteristic is a directional orientation of a camera on the head-mounted display.
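Claims 2 through 4 narrow the tag list to a project's curated tags and then to those matching the camera's orientation. One way this could look is sketched below. Everything in it is an assumption made for illustration: the `CURATED` table, the project name, the four compass sectors, and the functions `heading_to_sector` and `tags_for` do not come from the patent.

```python
# Hypothetical curated list: project -> (tag, compass sector or None).
# A sector of None means the tag applies regardless of camera orientation.
CURATED = {
    "boiler-room": [
        ("pump", None),
        ("north wall crack", "N"),
        ("exit door", "S"),
    ],
}


def heading_to_sector(degrees: float) -> str:
    """Map a compass heading in degrees to one of four 90-degree sectors."""
    sectors = ["N", "E", "S", "W"]
    return sectors[int(((degrees % 360) + 45) // 90) % 4]


def tags_for(project: str, heading_degrees: float) -> list[str]:
    """Return curated tags for the project that fit the camera heading."""
    sector = heading_to_sector(heading_degrees)
    return [
        tag
        for tag, where in CURATED.get(project, [])
        if where is None or where == sector
    ]


print(tags_for("boiler-room", 10))   # ['pump', 'north wall crack']
print(tags_for("boiler-room", 180))  # ['pump', 'exit door']
```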

5. The method of claim 2, wherein the tagging solicitation command includes identification information for the project.

6. The method of claim 1, wherein building the list of tags comprises ranking available tags by likelihood of usage given a present context of the head-mounted display device.

7. The method of claim 6, wherein the likelihood is calculated using a machine learning process that identifies patterns of tag usage given a context.
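Claims 6 and 7 rank tags by likelihood of use in the current context but do not specify a model. As a deliberately simple stand-in, the sketch below estimates likelihood from smoothed usage counts per context. The class `TagRanker`, its methods, and the use of a plain string as the "context" are assumptions for illustration; a real system might use a far richer learned model.

```python
from collections import Counter, defaultdict


class TagRanker:
    """Ranks tags by smoothed past usage frequency within a context."""

    def __init__(self, available):
        self.available = list(available)
        self.counts = defaultdict(Counter)  # context -> tag usage counts

    def observe(self, context, tag):
        """Record that `tag` was used while in `context`."""
        self.counts[context][tag] += 1

    def rank(self, context):
        """Return available tags, most likely first (add-one smoothing)."""
        seen = self.counts.get(context, Counter())
        total = sum(seen.values()) + len(self.available)

        def likelihood(tag):
            return (seen[tag] + 1) / total

        # sorted() is stable, so ties keep the order of `available`.
        return sorted(self.available, key=likelihood, reverse=True)


ranker = TagRanker(["pump", "valve", "leak"])
ranker.observe("boiler-room", "leak")
ranker.observe("boiler-room", "leak")
ranker.observe("boiler-room", "valve")
print(ranker.rank("boiler-room"))  # ['leak', 'valve', 'pump']
```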

8. A method of tagging a video using voice input as the video is being recorded comprising:
   receiving an audio signal at a head-mounted display device while the head-mounted display device is recording a video of a scene, the audio signal capturing a voice of a user of the head-mounted display device;
   performing audio analysis on the audio signal to identify a tag initiation command issued by a user of the head-mounted display device, wherein the tag initiation command comprises a tag activation word and a tag description;
   receiving an additional audio signal;
   performing audio analysis on the audio signal to identify a display tag command to show tags;
   building a list of tags that are relevant to the scene;
   outputting the list of tags for display through the head-mounted display device; and
   storing an association of the tag description with a particular duration point of the video in a computer storage, wherein the particular duration point coincides with a point in time when the audio signal was received.
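Claim 8 describes a tag initiation command made up of a tag activation word followed by a tag description. A minimal sketch of splitting such a command is shown below, assuming the audio has already been transcribed to text. The activation word "tag" and the function name `parse_tag_command` are hypothetical; the claim does not name a specific word.

```python
from typing import Optional

# Hypothetical activation word; the claim does not specify one.
ACTIVATION_WORD = "tag"


def parse_tag_command(transcript: str) -> Optional[str]:
    """Return the tag description if the transcript is a tag command.

    A command is the activation word followed by at least one word of
    description. Anything else yields None.
    """
    words = transcript.strip().split()
    if len(words) < 2 or words[0].lower() != ACTIVATION_WORD:
        return None
    return " ".join(words[1:])


print(parse_tag_command("Tag corroded pipe"))  # corroded pipe
print(parse_tag_command("show tags"))          # None
```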

9. The method of claim 8, wherein the tag initiation command also comprises a tagging method.

10. The method of claim 9, wherein the tagging method is a single point tag.

11. The method of claim 9, wherein the tagging method is a duration tag starting at a first progress point in the video and terminating a second progress point in the video.
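Claims 10 and 11 distinguish a single point tag from a duration tag spanning two progress points. One possible data model for the two tagging methods is sketched below. The `VideoTag` class, its field names, and the choice to represent a point tag as a missing end time are illustrative assumptions, not details from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VideoTag:
    """A tag tied to one point, or to a span, of a video timeline."""
    description: str
    start_seconds: float
    end_seconds: Optional[float] = None  # None means a single point tag

    def __post_init__(self):
        if self.end_seconds is not None and self.end_seconds < self.start_seconds:
            raise ValueError("a duration tag cannot end before it starts")

    @property
    def is_duration(self) -> bool:
        return self.end_seconds is not None

    def covers(self, seconds: float) -> bool:
        """True if the given progress point falls on or within the tag."""
        if self.end_seconds is None:
            return seconds == self.start_seconds
        return self.start_seconds <= seconds <= self.end_seconds


point = VideoTag("leak", 15.5)
span = VideoTag("inspection", 20.0, 45.0)
print(point.is_duration, span.is_duration)  # False True
print(span.covers(30.0))                    # True
```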

12. The method of claim 8, wherein building the list of tags comprises ranking available tags by likelihood of usage given a present context of the head-mounted display device.

13. The method of claim 12, wherein the likelihood is calculated using a machine learning process that identifies patterns of tag usage given a context.

14. A computer-storage media having computer-executable instructions embodied thereon that when executed by a computer processor causes a mobile computing device to perform a method of method of tagging a video using voice input as the video is being recorded, the method comprising:
   receiving a first audio signal at a computing device, the audio signal capturing a voice of a user of the computing device;
   performing audio analysis on the first audio signal to identify a tagging initiation command issued by a user of the computing device;
   building a list of tags that are relevant to a scene captured by a camera associated with the computing device;
   outputting the list of tags for display through the computing device;
   receiving a second audio signal at the computing device, the audio signal capturing the voice of the user of the computing device;
   performing audio analysis on the second audio signal to identify a tag from the list of tags; and
   storing an association of the tag with a particular duration point of a video subsequently recorded by the computing device.

15. The media of claim 14, wherein building the list of tags comprises ranking available tags by likelihood of usage given a context of the head-mounted display device.

16. The media of claim 15, wherein the likelihood is calculated using a machine learning process that identifies patterns of tag usage for the given a context.

17. The media of claim 15, wherein the given context is a previous tag added to the video through an audible command.

18. The media of claim 15, wherein the method comprises updating the list of tags dynamically as the given context changes, wherein updating the list comprises removing tags that have become less relevant and adding tags with increased relevance.
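The dynamic update in claim 18 can be illustrated as recomputing a short displayed list whenever the context changes, for example after a new tag is spoken (claim 17). The sketch below assumes relevance scores are supplied from elsewhere as a plain dictionary. The function `update_displayed_tags`, the `limit` parameter, and the sample scores are hypothetical.

```python
def update_displayed_tags(displayed, relevance, limit=3):
    """Recompute the displayed tag list from fresh relevance scores.

    Returns (new_list, removed, added). Ties are broken alphabetically
    so the result is deterministic.
    """
    ranked = sorted(relevance, key=lambda tag: (-relevance[tag], tag))
    new_list = ranked[:limit]
    removed = [tag for tag in displayed if tag not in new_list]
    added = [tag for tag in new_list if tag not in displayed]
    return new_list, removed, added


displayed = ["pump", "valve", "leak"]
# Hypothetical scores after the user has just tagged "leak".
scores = {"pump": 0.1, "valve": 0.6, "leak": 0.2, "gasket": 0.9}
new_list, removed, added = update_displayed_tags(displayed, scores)
print(new_list)  # ['gasket', 'valve', 'leak']
print(removed)   # ['pump']
print(added)     # ['gasket']
```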

Patent Metadata

Filing Date

Unknown

Publication Date

July 27, 2021

Inventors

Sanjay Subir Jhawar
Christopher Iain Parkinson
Tom Dollente

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “VOICE TAGGING OF VIDEO WHILE RECORDING” (11074292). https://patentable.app/patents/11074292
