Methods and audio processing units for generating an object based audio program including conditional rendering metadata corresponding to at least one object channel of the program, where the conditional rendering metadata is indicative of at least one rendering constraint, based on playback speaker array configuration, which applies to each corresponding object channel, and methods for rendering audio content determined by such a program, including by rendering content of at least one audio channel of the program in a manner compliant with each applicable rendering constraint in response to at least some of the conditional rendering metadata. Rendering of a selected mix of content of the program may provide an immersive experience.
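As an informal illustration of the abstract, and not part of the patent text, conditional rendering metadata can be pictured as a per-object record of constraints tied to properties of the playback speaker array. The Python sketch below is a hypothetical data model only: the names `RenderingConstraint`, `ConditionalRenderingMetadata`, `Frame`, `build_program`, and the field `min_elevated_speakers` are assumptions made for illustration and do not come from the patent or from any real bitstream format. For simplicity it attaches the metadata to every frame, whereas the claims only require that at least some frames carry it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class RenderingConstraint:
    # Hypothetical constraint: the object should only be rendered when the
    # playback array has at least this many speakers above ear level.
    min_elevated_speakers: int = 0


@dataclass(frozen=True)
class ConditionalRenderingMetadata:
    object_channel_id: int
    constraint: RenderingConstraint


@dataclass
class Frame:
    # One frame acts as a container holding a slice of object-channel audio
    # together with conditional rendering metadata.
    object_samples: Dict[int, List[float]]
    metadata: List[ConditionalRenderingMetadata] = field(default_factory=list)


def build_program(object_audio, metadata, frame_len):
    """Split object-channel audio into frames and attach the metadata to each."""
    total = max((len(s) for s in object_audio.values()), default=0)
    frames = []
    for start in range(0, total, frame_len):
        chunk = {cid: s[start:start + frame_len] for cid, s in object_audio.items()}
        frames.append(Frame(object_samples=chunk, metadata=list(metadata)))
    return frames
```

In this sketch a real encoder's bit packing, synchronization, and speaker-channel (bed) payload are deliberately left out; only the pairing of object content with its constraint inside each frame is shown.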
Legal claims defining the scope of protection, as filed with the USPTO.
1. A method for generating audio content for an object based audio program, said method comprising: generating conditional rendering metadata corresponding to at least one object channel, wherein the conditional rendering metadata is indicative of at least one rendering constraint for the at least one object channel, wherein the rendering constraint is related to a playback speaker array configuration; determining a set of audio channels including the at least one object channel based on the conditional rendering metadata; and generating the object based audio program such that said object based audio program is indicative of the set of audio channels and the conditional rendering metadata; wherein the rendering constraint includes a constraint relating to an elevation of at least a speaker in the playback speaker array configuration, wherein the at least a speaker in the playback speaker array configuration has an assumed location based on the rendering constraint of the conditional rendering metadata, and wherein the object based audio program is an encoded bitstream comprising frames, and each of at least some of the frames is indicative of at least one data structure which is a container including some content of the at least one object channel and some of the conditional rendering metadata.
2. The method of claim 1, wherein the set of audio channels includes at least one speaker channel, and audio content of at least one speaker channel of the set of audio channels is indicative of sound captured at a spectator event, and audio content indicated by at least one object channel of the set of audio channels is indicative of commentary for the spectator event.
3. The method of claim 1, wherein the object based audio program is a Dolby E bitstream comprising a sequence of bursts and guard bands between pairs of the bursts.
4. A method of rendering audio content for an object based audio program, said method comprising: receiving conditional rendering metadata corresponding to at least one object channel, wherein the conditional rendering metadata is indicative of at least one rendering constraint for the at least one object channel, wherein the rendering constraint is related to a playback speaker array configuration, and wherein the object based audio program is an encoded bitstream comprising frames, and each of at least some of the frames is indicative of at least one data structure which is a container including some content of the at least one object channel and some of the conditional rendering metadata; rendering content of a set of audio channels including the at least one object channel based on the rendering constraint of the conditional rendering metadata, wherein the rendering constraint includes a constraint relating to an elevation of at least a speaker in the playback speaker array configuration, wherein the at least a speaker in the playback speaker array configuration has an assumed location based on the rendering constraint of the conditional rendering metadata.
5. The method of claim 4, wherein the set of audio channels includes at least one speaker channel, and rendering the content includes selecting at least one object channel of the set of audio channels, thereby determining a selected object channel subset, and mixing each object channel of the selected object channel subset with at least one speaker channel of the set to render a down-mix of content of the selected object channel subset and said at least one speaker channel of the set.
6. The method of claim 5, wherein, in response to at least some of the conditional rendering metadata, and based on a specific playback speaker array configuration of the audio processing unit, providing a menu of rendering options which are available for selection; and selecting said at least one object channel of the set of audio channels, thereby determining the selected object channel subset, by selecting one of the rendering options indicated by the menu.
7. A system for rendering audio content for an object based audio program, said system comprising: a receiver for receiving conditional rendering metadata corresponding to at least one object channel, wherein the conditional rendering metadata is indicative of at least one rendering constraint for the at least one object channel, wherein the rendering constraint is related to a playback speaker array configuration, and wherein the object based audio program is an encoded bitstream comprising frames, and each of at least some of the frames is indicative of at least one data structure which is a container including some content of the at least one object channel and some of the conditional rendering metadata; a rendering subsystem for rendering content of a set of audio channels including the at least one object channel based on the rendering constraint of the conditional rendering metadata, wherein the rendering constraint includes a constraint relating to an elevation of at least a speaker in the playback speaker array configuration.
8. The system of claim 7, wherein the set of audio channels includes at least one speaker channel, and also including: a controller coupled to the receiver, wherein the controller is configured to provide, in response to at least some of the conditional rendering metadata and based on a specific playback speaker array configuration of the system, a menu of rendering options which are available for selection, and wherein the controller is configured to determine the selected subset of the set of audio channels in response to user selection of one of the rendering options indicated by the menu, and wherein the rendering subsystem is configured to mix each object channel of the selected subset of the set of audio channels with at least one speaker channel of said selected subset of the set of audio channels to render the content.
9. The system of claim 7, wherein the system is configured to select at least one object channel of the set of audio channels, thereby determining a selected object channel subset, and the rendering subsystem is configured to mix each object channel of the selected object channel subset with at least one speaker channel of the set to generate a downmix of content of the selected object channel subset and said at least one speaker channel of the set.
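The rendering side described in claims 4 through 9 can be sketched informally as two steps: filter the object channels down to those whose elevation-related constraint the playback speaker array satisfies (the "menu of rendering options"), then mix the selected object channels with the speaker channels. The Python below is a minimal sketch under stated assumptions, not the patented implementation: `SpeakerArray`, `rendering_menu`, `downmix`, the encoding of a constraint as a minimum count of elevated speakers, and the fixed `gain` are all hypothetical. It also mixes each object equally into every speaker channel, whereas a real renderer would pan objects by position.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SpeakerArray:
    # Hypothetical description of the playback array: elevation in degrees
    # of each speaker (0 = ear level, > 0 = above the listener).
    elevations_deg: Tuple[float, ...]

    def elevated_count(self) -> int:
        return sum(1 for e in self.elevations_deg if e > 0)


def rendering_menu(constraints: Dict[str, int], array: SpeakerArray) -> List[str]:
    """Return, sorted by name, the object channels the array can render.

    `constraints` maps an object-channel name to the minimum number of
    elevated speakers it needs (an assumed encoding of the metadata).
    """
    have = array.elevated_count()
    return sorted(name for name, need in constraints.items() if need <= have)


def downmix(bed: Dict[str, List[float]], objects: Dict[str, List[float]],
            selected: List[str], gain: float = 0.5) -> Dict[str, List[float]]:
    """Add each selected object channel, scaled by `gain`, to every speaker channel."""
    out = {}
    for speaker, samples in bed.items():
        mixed = list(samples)  # copy so the input bed is not modified
        for name in selected:
            for i, x in enumerate(objects[name][:len(mixed)]):
                mixed[i] += gain * x
        out[speaker] = mixed
    return out


# Usage: a stereo array has no elevated speakers, so only the object
# without an elevation requirement appears in the menu.
constraints = {"commentary": 0, "overhead_fx": 2}
stereo = SpeakerArray((0.0, 0.0))
menu = rendering_menu(constraints, stereo)
mix = downmix({"L": [1.0, 1.0], "R": [0.0, 0.0]},
              {"commentary": [0.5, 0.5], "overhead_fx": [1.0, 1.0]},
              menu)
```

With an array that includes two height speakers, for example `SpeakerArray((0.0, 0.0, 30.0, 30.0))`, both objects would be offered in the menu.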
January 29, 2018
August 20, 2019