Deep learning system for determining audio recommendations based on video content

Published: September 3, 2024

Assignee: not available in the USPTO data we have

Inventors: not available in the USPTO data we have

Technical Abstract

Embodiments are disclosed for determining an answer to a query associated with a graphical representation of data. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving an input including an unprocessed audio sequence and a request to perform an audio signal processing effect on the unprocessed audio sequence. The one or more embodiments further include analyzing, by a deep encoder, the unprocessed audio sequence to determine parameters for processing the unprocessed audio sequence. The one or more embodiments further include sending the unprocessed audio sequence and the parameters to one or more audio signal processing effects plugins to perform the requested audio signal processing effect using the parameters and outputting a processed audio sequence after processing of the unprocessed audio sequence using the parameters of the one or more audio signal processing effects plugins.
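The data flow in the abstract (an encoder derives parameters from the unprocessed audio, then effects plugins apply them) can be sketched as follows. This is a minimal illustration only: `encode_parameters`, `gain_plugin`, and `process` are hypothetical names, and the RMS heuristic merely stands in for the deep encoder, which the abstract does not specify.

```python
import math


def encode_parameters(samples, target_rms=0.1):
    """Stand-in for the deep encoder: map raw audio to effect parameters.

    Assumption: a hand-written RMS heuristic replaces the learned model.
    """
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    gain = target_rms / rms if rms > 0 else 1.0
    return {"gain": gain}


def gain_plugin(samples, params):
    """Hypothetical effects plugin: apply the gain, then clip to [-1, 1]."""
    return [max(-1.0, min(1.0, s * params["gain"])) for s in samples]


def process(samples, plugins):
    """Analyze the unprocessed audio, then run it through each plugin."""
    params = encode_parameters(samples)
    out = samples
    for plugin in plugins:
        out = plugin(out, params)
    return params, out


params, processed = process([0.5, -0.5, 0.5, -0.5], [gain_plugin])
```

Keeping the parameter estimation separate from the plugins mirrors the abstract's split between analysis and the effect processing itself.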

Patent Claims

2 claims

Legal claims defining the scope of protection, as filed with the USPTO.

7. The computer-implemented method of claim 6, wherein the first neural network is a spatial branch that uses the frame level video features to generate the spatial features of the video sequence, wherein the second neural network is a temporal branch that uses a transposed form of the frame level video features to generate the temporal features of the video sequence, and wherein the third neural network is a global branch that uses the video level video features to generate the global features of the video sequence.

14. The non-transitory computer-readable storage medium of claim 13, wherein the first neural network is a spatial branch that uses the frame level video features to generate the spatial features of the video sequence, wherein the second neural network is a temporal branch that uses a transposed form of the frame level video features to generate the temporal features of the video sequence, and wherein the third neural network is a global branch that uses the video level video features to generate the global features of the video sequence.
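The three-branch arrangement recited in claims 7 and 14 can be illustrated with a small sketch. Everything below is an assumption made for illustration: the function names are hypothetical, each branch is reduced to a single untrained ReLU layer with random weights followed by mean pooling, and the claims do not state the layer types, the sizes, or how the video level features are obtained.

```python
import random


def transpose(matrix):
    """Swap rows and columns: a T x D matrix becomes D x T."""
    return [list(col) for col in zip(*matrix)]


def make_weights(n_in, n_out, rng):
    """Random n_in x n_out weight matrix (untrained, illustration only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]


def linear_relu(rows, weights):
    """Apply ReLU(row @ weights) to every row."""
    return [
        [max(0.0, sum(x * w for x, w in zip(row, col))) for col in zip(*weights)]
        for row in rows
    ]


def mean_pool(rows):
    """Average the rows into a single vector."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def three_branch_features(frame_feats, video_feats, out_dim=4, seed=0):
    rng = random.Random(seed)
    num_frames, feat_dim = len(frame_feats), len(frame_feats[0])

    # Spatial branch: consumes the frame level features as given (T x D).
    w_spatial = make_weights(feat_dim, out_dim, rng)
    spatial = mean_pool(linear_relu(frame_feats, w_spatial))

    # Temporal branch: consumes the transposed frame level features (D x T).
    w_temporal = make_weights(num_frames, out_dim, rng)
    temporal = mean_pool(linear_relu(transpose(frame_feats), w_temporal))

    # Global branch: consumes the video level features (one vector per video).
    w_global = make_weights(len(video_feats), out_dim, rng)
    global_feats = linear_relu([video_feats], w_global)[0]

    return spatial, temporal, global_feats
```

For example, with three frames of five-dimensional features, and a video level vector formed here by averaging the frames (again an assumption), each branch returns a four-dimensional feature vector:

```python
frames = [[0.1 * (i + j) for j in range(5)] for i in range(3)]
video = [sum(col) / len(frames) for col in zip(*frames)]
spatial, temporal, global_feats = three_branch_features(frames, video)
```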

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06V H04N G06F G06N G10L

Patent Metadata

Filing Date

June 28, 2021

Publication Date

September 3, 2024
