Region-of-interest (ROI)-based image enhancement using a residual network, including: generating, based on an input image and a residual path of a residual network, a first output corresponding to a region-of-interest of the input image; generating, based on the input image and a skip path of the residual network, a second output; and generating an output image based on the first output and the second output.
Legal claims defining the scope of protection, as filed with the USPTO.
. A method of region-of-interest (ROI)-based image enhancement using a residual network, the method comprising:
. The method of, wherein the region-of-interest of the input image comprises a segmentation mask, and wherein generating the output image includes mapping pixels from the first output onto the second output at the ROI of the output image.
. The method of, further comprising performing a backpropagation of the residual path based on the region-of-interest, including identifying, for each layer of the residual path, a corresponding input feature map subspace based on the region-of-interest.
. The method of, wherein performing the backpropagation further comprises storing, for each layer of the residual path, subspace metadata describing the corresponding input feature map subspace.
. The method of, wherein generating the first output comprises processing, for each layer of the residual path, the corresponding input feature map subspace described in the subspace metadata.
. The method of, wherein the input image comprises a frame of video data, and wherein the region-of-interest comprises a dynamic region-of-interest that is variable across a plurality of frames of the video data.
. An apparatus for region-of-interest (ROI)-based image enhancement using a residual network, the apparatus comprising:
. The apparatus of, wherein the region-of-interest comprises a segmentation mask.
. The apparatus of, wherein the residual network comprises an image enhancement neural network.
. The apparatus of, wherein image enhancement circuit is further configured to perform a backpropagation of the residual path based on the region-of-interest, including identifying, for each layer of the residual path, a corresponding input feature map subspace based on the region-of-interest.
. The apparatus of, wherein performing the backpropagation further comprises storing, for each layer of the residual path, subspace metadata describing the corresponding input feature map subspace.
. The apparatus of, wherein generating the first output comprises processing, for each layer of the residual path, the corresponding input feature map subspace described in the subspace metadata.
. The apparatus of, wherein the input image comprises a frame of video data, and wherein the region-of-interest comprises a dynamic region-of-interest that is variable across a plurality of frames of the video data.
. A computer program product comprising a non-transitory computer readable storage medium, the computer program product comprising computer program instructions for region-of-interest (ROI)-based image enhancement using a residual network that, when executed, cause a computer system to perform steps comprising:
. The computer program product of, wherein the region-of-interest comprises a segmentation mask.
. The computer program product of, wherein the residual network comprises an image enhancement neural network.
. The computer program product of, wherein the steps further comprise performing a backpropagation of the residual path based on the region-of-interest, including identifying, for each layer of the residual path, a corresponding input feature map subspace based on the region-of-interest.
. The computer program product of, wherein performing the backpropagation further comprises storing, for each layer of the residual path, subspace metadata describing the corresponding input feature map subspace.
. The computer program product of, wherein generating the first output comprises processing, for each layer of the residual path, the corresponding input feature map subspace described in the subspace metadata.
Complete technical specification and implementation details from the patent document.
Neural networks are used to perform image enhancements on input images. Neural networks that produce higher quality outputs generally use more computational and power resources compared to other networks. While other neural networks require fewer computational and power resources, their resulting output is of reduced quality. Moreover, while an input image may include objects or regions of varying importance, these neural networks are not usable on arbitrarily selected portions of an image.
To that end, the present specification sets forth various implementations for region-of-interest (ROI)-based image enhancement using a residual network. In some implementations, a method of region-of-interest (ROI)-based image enhancement using a residual network includes generating, based on an input image and a residual path of a residual network, a first output corresponding to a region-of-interest of the input image; generating, based on the input image and a skip path of the residual network, a second output; and generating an output image based on the first output and the second output.
In some implementations, the region-of-interest includes a segmentation mask. In some implementations, the residual network includes an image enhancement neural network. In some implementations, the method further includes performing a backpropagation of the residual path based on the region-of-interest, including identifying, for each layer of the residual path, a corresponding input feature map subspace based on the region-of-interest. In some implementations, performing the backpropagation further includes storing, for each layer of the residual path, subspace metadata describing the corresponding input feature map subspace. In some implementations, generating the first output includes processing, for each layer of the residual path, the corresponding input feature map subspace described in the subspace metadata. In some implementations, the input image includes a frame of video data, and the region-of-interest includes a dynamic region-of-interest that is variable across a plurality of frames of the video data.
The present specification also describes various implementations of an apparatus for region-of-interest (ROI)-based image enhancement using a residual network. Such an apparatus includes a computer processor, a computer memory operatively coupled to the computer processor, the computer memory having disposed within it computer program instructions that, when executed by the computer processor, cause the apparatus to carry out steps including: generating, based on an input image and a residual path of a residual network, a first output corresponding to a region-of-interest of the input image; generating, based on the input image and a skip path of the residual network, a second output; and generating an output image based on the first output and the second output.
In some implementations, the region-of-interest includes a segmentation mask. In some implementations, the residual network includes an image enhancement neural network. In some implementations, the steps further include performing a backpropagation of the residual path based on the region-of-interest, including identifying, for each layer of the residual path, a corresponding input feature map subspace based on the region-of-interest. In some implementations, performing the backpropagation further includes storing, for each layer of the residual path, subspace metadata describing the corresponding input feature map subspace. In some implementations, generating the first output includes processing, for each layer of the residual path, the corresponding input feature map subspace described in the subspace metadata. In some implementations, the input image includes a frame of video data, and the region-of-interest includes a dynamic region-of-interest that is variable across a plurality of frames of the video data.
Also described in this specification are various implementations of a computer program product for region-of-interest (ROI)-based image enhancement using a residual network. Such a computer program product is disposed upon a non-transitory computer readable medium and includes computer program instructions that, when executed, cause a computer system to perform steps including: generating, based on an input image and a residual path of a residual network, a first output corresponding to a region-of-interest of the input image; generating, based on the input image and a skip path of the residual network, a second output; and generating an output image based on the first output and the second output.
In some implementations, the region-of-interest includes a segmentation mask. In some implementations, the residual network includes an image enhancement neural network. In some implementations, the steps further include performing a backpropagation of the residual path based on the region-of-interest, including identifying, for each layer of the residual path, a corresponding input feature map subspace based on the region-of-interest. In some implementations, performing the backpropagation further includes storing, for each layer of the residual path, subspace metadata describing the corresponding input feature map subspace. In some implementations, generating the first output includes processing, for each layer of the residual path, the corresponding input feature map subspace described in the subspace metadata.
The following disclosure provides many different implementations, or examples, for implementing different features of the provided subject matter. Specific examples of components and arrangements are described below to simplify the present disclosure. These are, of course, merely examples and are not intended to be limiting. For example, the formation of a first feature over or on a second feature in the description that follows includes implementations in which the first and second features are formed in direct contact, and also includes implementations in which additional features may be formed between the first and second features, such that the first and second features are not in direct contact. Further, spatially relative terms, such as “beneath,” “below,” “lower,” “above,” “upper,” “back,” “front,” “top,” “bottom,” and the like, are used herein for ease of description to describe one element or feature's relationship to another element(s) or feature(s) as illustrated in the figures. Similarly, terms such as “front surface” and “back surface” or “top surface” and “back surface” are used herein to more easily identify various components, and identify that those components are, for example, on opposing sides of another component. The spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures.
The figures include a diagram of an example image enhancement circuit for region-of-interest (ROI)-based image enhancement using a residual network according to some implementations of the present disclosure. The example image enhancement circuit can be implemented in a variety of computing devices, including mobile devices, personal computers, peripheral hardware components, gaming devices, set-top boxes, and the like. The image enhancement circuit implements a residual network. The residual network accepts, as input, an input image. The input image includes any digitally encoded image as would be appreciated by one skilled in the art. For example, in some implementations, the input image is one of multiple frames of video data.
The residual network is a neural network such as a convolutional neural network or other type of neural network as can be appreciated. Accordingly, the residual network includes multiple layers that each accept some input and provide some output. For example, a first layer or “input layer” accepts the input image as input. The residual network is considered a “residual” neural network in that there are multiple usable paths for processing input and producing output. For example, the residual network includes a residual path whereby an input is processed by some number of layers in the residual network. The residual network also includes a skip path whereby the input is processed by a subset of the layers included in the residual path. In other words, input processed by the skip path skips some number of layers of the residual path, thereby incurring a lower computational cost and power usage when compared to the residual path.
A non-limiting example structure for the residual network is shown in the figures. Here, the residual network includes multiple layers, from a first layer through an n-th layer. The first layer accepts an input and the n-th layer produces an output. In this example residual network, a residual path uses each layer, from the first through the n-th, to produce the final output. A skip path of the example residual network uses the first and n-th layers to produce the final output, skipping the layers between them. One skilled in the art will appreciate that this layout of the residual network is merely exemplary and illustrative, and that other configurations of layers, as well as configurations for residual paths and skip paths, are also contemplated within the scope of the present disclosure.
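The layer arrangement described above can be sketched in a few lines. This is only an illustration, assuming each layer is a plain Python callable and that the skip path uses just the first and last layers; the names `make_scale_layer`, `run_residual_path`, and `run_skip_path` are hypothetical and do not come from the disclosure.

```python
from typing import Callable, List

Layer = Callable[[List[float]], List[float]]


def make_scale_layer(factor: float) -> Layer:
    """Toy stand-in for a network layer: scales every value by `factor`."""
    def layer(values: List[float]) -> List[float]:
        return [v * factor for v in values]
    return layer


def run_residual_path(layers: List[Layer], x: List[float]) -> List[float]:
    """Residual path: the input passes through every layer in order."""
    for layer in layers:
        x = layer(x)
    return x


def run_skip_path(layers: List[Layer], x: List[float]) -> List[float]:
    """Skip path (assumed layout): only the first and last layers run."""
    return layers[-1](layers[0](x))


# Four toy layers; the two middle layers are the ones the skip path bypasses.
layers = [make_scale_layer(f) for f in (2.0, 3.0, 5.0, 7.0)]
residual_out = run_residual_path(layers, [1.0])  # uses all four layers
skip_out = run_skip_path(layers, [1.0])          # uses layers 0 and 3 only
```

The skip path evaluates two layers instead of four, which is the source of its lower computational cost in this toy setting.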
The figures also depict the example inputs and outputs for a given layer of the residual network. For example, the layer corresponds to any layer of a residual network, such as the layers of the example structure described above. The layer accepts, as input, an input feature map. An input feature map includes multiple data points provided as input to the layer. For example, different data points in the input feature map will be provided as input to different neurons or other subcomponents of the layer. In some implementations, the input feature map is encoded as an array or matrix with each cell corresponding to a particular pixel (e.g., of an input image, of an output image, or some intermediary image or collection of data during use of the residual network). As an example, each cell may include one or multiple values describing particular attributes or encoding particular data values for a particular pixel or other subset of an image.
The layer provides, as output, an output feature map. The output feature map is encoded similarly to the input feature map but is used to describe the output from a particular layer. As is set forth above, the residual network includes multiple layers. Each layer accepts input from a preceding layer, provides output to a successive layer, or both. Thus, the input feature map for some layers is the output feature map of a preceding layer, and the output feature map for some layers serves as the input feature map for a successive layer. A first layer in the residual network is considered an “input” layer. The input feature map for the input layer is the input to the residual network, or some derivation of that input after preprocessing and the like. A last layer in the residual network is considered an “output” layer. The output feature map for the output layer is the output of the residual network.
Turning back to the image enhancement circuit, both the residual path and skip path perform some image processing or enhancement process on the input image, such as sharpening, denoising, upscaling, and other image processing functions as can be appreciated. While, in some implementations, the residual path produces an output of a perceptibly higher quality than the skip path, use of the residual path incurs a higher computational cost compared to the skip path.
Accordingly, the image enhancement circuit selectively uses the residual path on a portion of the input image whose resulting output (encoded in an output image, described in further detail below) corresponds to a region-of-interest (ROI). The ROI is a sub-area of the input image identified for a particular purpose. For example, in some implementations the ROI corresponds to a portion of the image indicated as having particular importance, such as an identified area of a medical image, a face or human subject of a frame of video, and the like. In some implementations, the ROI is embodied or encoded as a segmentation mask for the input image.
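As a rough illustration of an ROI encoded as a segmentation mask, the sketch below treats the mask as a grid of 0/1 flags with the same dimensions as the image. Reducing the mask to a bounding box is an assumption made here for convenience, not something the disclosure requires, and `mask_bounding_box` is a hypothetical helper name.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]            # 1 marks a pixel inside the ROI, 0 outside
Box = Tuple[int, int, int, int]   # (top, left, bottom, right), all inclusive


def mask_bounding_box(mask: Mask) -> Optional[Box]:
    """Return the smallest box that covers every ROI pixel, or None if empty."""
    if not mask:
        return None
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return (rows[0], cols[0], rows[-1], cols[-1])


roi_mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
roi_box = mask_bounding_box(roi_mask)
```

A box is a coarser description than the mask itself, so it may include a few pixels outside the ROI; it is used here only because rectangular regions are simple to reason about.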
In order to selectively use the residual path, in some implementations, the image enhancement circuit performs a backpropagation of the residual path. To begin, the image enhancement circuit identifies which portions of the output of the residual path (e.g., the output feature map of the output layer) correspond to the ROI. In other words, the image enhancement circuit determines which values of the output feature map are encoded as or otherwise affect pixels in the ROI. A subset of values from an output feature map is hereinafter referred to as an output feature map subspace. Thus, the image enhancement circuit identifies the output feature map subspace corresponding to the ROI.
The image enhancement circuit then identifies, for the output layer, a subset of the input feature map (e.g., an input feature map subspace) corresponding to the identified output feature map subspace. The input feature map subspace corresponding to the output feature map subspace includes those values in the input feature map that are factors in calculating the values in the output feature map subspace. In some implementations, after determining the input feature map subspace for the output layer, the image enhancement circuit stores subspace metadata identifying the input feature map subspace for the output layer.
The backpropagation then moves to the preceding layer (e.g., some intermediary layer), and repeats the process described above for each layer. That is, for a given layer, the image enhancement circuit identifies an input feature map subspace for the identified output feature map subspace for that layer (e.g., the identified input feature map subspace for the successive layer). In other words, for a given layer, the image enhancement circuit identifies the input feature map subspace whose values are factors in calculating an output feature map subspace matching the identified input feature map subspace for the next layer. As an example, identifying the input feature map subspace for a given layer includes accessing subspace metadata describing the input feature map subspace of the successive layer. Accordingly, in some implementations, for each layer, the identified input feature map subspace is stored as subspace metadata. Once the backpropagation has processed the input layer, the input feature map subspace for the input layer corresponds to a subset of the input image. Thus, for each layer, an input feature map subspace is identified whose values affect the ROI of the residual path output.
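The backward traversal described above can be sketched under simplifying assumptions that are not stated in the disclosure: every layer is a stride-1 convolution with an odd square kernel and zero ("same") padding, so feature maps keep the image size, and each subspace is a rectangle. Under those assumptions, the input subspace of a layer is its output subspace grown by half the kernel size and clipped to the feature map. The function names and the dictionary used as subspace metadata are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), all inclusive


def input_subspace(out_box: Box, kernel: int, height: int, width: int) -> Box:
    """Input region that feeds `out_box` for a stride-1, same-padded conv."""
    radius = kernel // 2
    top, left, bottom, right = out_box
    return (
        max(top - radius, 0),
        max(left - radius, 0),
        min(bottom + radius, height - 1),
        min(right + radius, width - 1),
    )


def backpropagate_roi(
    roi_box: Box, kernels: List[int], height: int, width: int
) -> Dict[int, Box]:
    """Walk from the output layer to the input layer, recording per-layer
    input subspaces as metadata keyed by layer index."""
    metadata: Dict[int, Box] = {}
    out_box = roi_box  # output subspace of the last layer is the ROI itself
    for index in range(len(kernels) - 1, -1, -1):
        in_box = input_subspace(out_box, kernels[index], height, width)
        metadata[index] = in_box
        out_box = in_box  # becomes the output subspace of the preceding layer
    return metadata


# Three layers with kernel sizes 3, 3, and 5 on a 10x10 feature map.
subspace_metadata = backpropagate_roi((4, 4, 5, 5), [3, 3, 5], height=10, width=10)
```

In this toy configuration the subspace grows at each step toward the input: a 2x2 ROI needs a 6x6 region at the last layer's input and the full 10x10 image at the first layer's input. Real networks with strides, dilation, or resampling would need a different mapping per layer.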
The image enhancement circuit then performs a forward inference of the residual path to generate a first output. At each layer of the residual path, an input feature map subspace is selected from the input feature map provided to the layer. As an example, for each layer, subspace metadata for that layer is accessed. The input feature map subspace described in the subspace metadata is then selected from the input feature map. The selected input feature map subspace is then provided as input to the layer for processing. As an example, the input feature map subspace is subdivided into multiple tensors or other subunits of data and provided to a hardware accelerator (e.g., a graphics processing unit (GPU), a machine learning accelerator, and the like) for processing by the layer. Because each layer processes only the identified input feature map subspace for that layer, in some implementations, the resulting output feature map for that layer will include null values, zero values, default values, and the like due to the unprocessed portions of the input feature map.
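To make the restricted forward inference concrete, the sketch below uses a toy layer, a 3x3 sum filter with zero padding, that computes outputs only inside a given box and leaves every other position at a default of 0.0. The filter, the box values, and the name `box_filter` are illustrative assumptions; the point is only that the ROI values match those of a full computation.

```python
from typing import List, Tuple

Grid = List[List[float]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), all inclusive


def box_filter(fmap: Grid, out_box: Box) -> Grid:
    """Toy layer: 3x3 sum with zero padding, evaluated only inside `out_box`.
    Positions outside `out_box` keep the default value 0.0."""
    height, width = len(fmap), len(fmap[0])
    top, left, bottom, right = out_box
    out = [[0.0] * width for _ in range(height)]
    for r in range(top, bottom + 1):
        for c in range(left, right + 1):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < height and 0 <= cc < width:
                        total += fmap[rr][cc]
            out[r][c] = total
    return out


image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
roi = (3, 3, 4, 4)          # output subspace of the last layer
layer1_box = (2, 2, 5, 5)   # ROI grown by one pixel: what layer 2 reads
full_box = (0, 0, 7, 7)

# Restricted inference: each layer computes only what the ROI depends on.
partial = box_filter(box_filter(image, layer1_box), roi)
# Reference: both layers computed over the whole feature map.
full = box_filter(box_filter(image, full_box), full_box)
```

Inside the ROI, `partial` and `full` agree, while the restricted run evaluates 20 output positions (16 in the first layer, 4 in the second) instead of 128.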
The forward inference of the residual path described above limits processing by each layer to those values that will ultimately affect the ROI. The resulting output corresponds to the ROI (e.g., encoding the identified particular area after processing by the residual path). This approach allows for an ROI to be processed using the computationally intensive residual path, without expending resources on processing portions of an input image that are neither included in nor affect the ROI.
The residual network also performs a forward inference on the skip path using the input image. In some implementations, the forward inference on the skip path is performed sequentially before or after the forward inference on the residual path. In some implementations, the forward inference on the skip path is performed at least partially in parallel to the forward inference on the residual path. Thus, the full input image is provided to the skip path, with each layer in the skip path processing a full input feature map. The output of the skip path thus includes a version of the complete input image processed via the skip path.
The output of the residual path and the output of the skip path are combined to create an output image. For example, in some implementations, the pixels from the residual path output (e.g., corresponding to the ROI) are mapped onto the skip path output at their corresponding locations to produce an output image. The resulting output image then includes an area corresponding to the ROI that has been processed using the higher-quality, computationally expensive residual path and a remaining area processed by the lesser-quality, computationally cheaper skip path. Thus, the ROI benefits from the enhanced quality of the residual path while saving resources on the remainder of the output image. The output image is eventually rendered on a display. In some variations, the output image is provided to a graphics processing unit that renders the output image on a display. In some variations, the ROI-based image enhancement is carried out by a GPU or a circuit of a GPU, such as the image enhancement circuit described above implemented as a component of a GPU.
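The mapping step can be illustrated with a per-pixel selection driven by the segmentation mask. This is a minimal sketch, assuming both outputs and the mask share the same dimensions and that pixels are single integers; `composite` is a hypothetical name.

```python
from typing import List

Image = List[List[int]]


def composite(first_output: Image, second_output: Image, roi_mask: Image) -> Image:
    """Take residual-path pixels where the mask is set, skip-path pixels elsewhere."""
    return [
        [f if m else s for f, s, m in zip(f_row, s_row, m_row)]
        for f_row, s_row, m_row in zip(first_output, second_output, roi_mask)
    ]


first_output = [[0, 9, 0], [0, 9, 9]]   # residual path: only ROI pixels are valid
second_output = [[1, 1, 1], [1, 1, 1]]  # skip path: whole image processed
roi_mask = [[0, 1, 0], [0, 1, 1]]
output_image = composite(first_output, second_output, roi_mask)
```

Because the selection is driven by the mask, default or null values that the residual path leaves outside the ROI never reach the output image.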
Although the approaches set forth above describe using the residual network to enhance a single input image, it is understood that, in some implementations, the approaches set forth above are applicable to a sequence of input images, such as multiple frames of video data. In some implementations, the ROI will change across input images (e.g., across frames). Accordingly, in some implementations, the ROI is recalculated for each input image to be processed (e.g., using particular object identifying algorithms or models, and the like). In some implementations, the ROI is recalculated less frequently than once per input image. For example, in some implementations, the ROI is recalculated every N frames (e.g., a frame interval). As another example, in some implementations, the ROI is not recalculated for a particular frame or input image in response to an estimated computation time or amount of resources required for recalculating the ROI exceeding a threshold. In other implementations, the ROI does not change across frames or input images.
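The frame-interval variant can be sketched as a generator that reuses the most recent ROI between recalculations. The detector is passed in as a callable because the disclosure does not name a specific algorithm; `rois_for_frames`, `detect_roi`, and the string frames below are hypothetical placeholders.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple, TypeVar

Frame = TypeVar("Frame")
Roi = TypeVar("Roi")


def rois_for_frames(
    frames: Iterable[Frame],
    detect_roi: Callable[[Frame], Roi],
    interval: int,
) -> Iterator[Tuple[Frame, Optional[Roi]]]:
    """Yield (frame, roi) pairs, recalculating the ROI every `interval` frames."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    current: Optional[Roi] = None
    for index, frame in enumerate(frames):
        if index % interval == 0:
            current = detect_roi(frame)  # recalculate on frames 0, N, 2N, ...
        yield frame, current             # otherwise reuse the previous ROI


frames = ["f0", "f1", "f2", "f3", "f4"]
schedule = list(rois_for_frames(frames, lambda f: "roi@" + f, interval=2))
```

With `interval=1` this reduces to recalculating the ROI for every frame; a larger interval trades ROI freshness for less detection work.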
The figures also include pictorial diagrams showing an example process flow for region-of-interest (ROI)-based image enhancement using a residual network according to some implementations of the present disclosure. One diagram shows an example input image of a person. Another shows an example ROI for that input image. Here, the ROI is the person depicted in the input image, excluding the background of the input image. As an example, the ROI is encoded as or corresponds to a segmentation mask defining the pixels of the input image capturing the person.
A further diagram shows an output resulting from providing the input image to a skip path of a residual network. Here, the output is shown as shaded compared to the input image to represent some amount of processing, enhancement, or transformation by the skip path of the residual network. Another diagram shows an output resulting from applying the residual path of the residual network to the input feature map subspace corresponding to the ROI. Here, the ROI is shown using a different shade to represent the different degree of processing, enhancement, or transformation by the residual path.
A final diagram shows an example output image. Here, the output of the residual path has been mapped onto the output of the skip path. This results in an output image showing a person that has been enhanced using the higher quality, computationally expensive processing of the residual path and a background that has been enhanced using the lower quality, less expensive processing of the skip path.
In some implementations, the approaches set forth herein for region-of-interest (ROI)-based image enhancement using a residual network are implemented using one or more general purpose computing devices, such as the exemplary computer shown in the figures. The computer includes at least one processor. In addition to the at least one processor, the computer includes random access memory (RAM), which is connected through a high speed memory bus and bus adapter to the processor and to other components of the computer. Stored in RAM is an operating system. The operating system in this example is shown in RAM, but many components of such software typically are stored in non-volatile memory also, such as, for example, on data storage, such as a disk drive. Also stored in RAM is an optional image enhancement module, which comprises computer program instructions that, when executed, cause the computer to carry out the ROI-based image enhancement described above in accordance with variations of the present disclosure. The module is described here as ‘optional’ because in some implementations, the ROI-based image enhancement is carried out by one or more standalone hardware components such as the image enhancement circuit. That is, the ROI-based image enhancement can optionally be carried out through execution of the image enhancement module or through operation of the image enhancement circuit. In this example, the optional image enhancement circuit is coupled to a video adapter (or graphics processing unit) to provide ROI-based image enhancement in accordance with variations of the present disclosure. Such a circuit, in some implementations, is a component of the processor, of the video adapter, or of some other GPU or accelerator.
The computer includes a disk drive adapter coupled through an expansion bus and bus adapter to the processor and other components of the computer. The disk drive adapter connects non-volatile data storage to the computer in the form of data storage. Such disk drive adapters include Integrated Drive Electronics (‘IDE’) adapters, Small Computer System Interface (‘SCSI’) adapters, and others as will occur to those of skill in the art. In some implementations, non-volatile computer memory is implemented as an optical disk drive, electrically erasable programmable read-only memory (so-called ‘EEPROM’ or ‘Flash’ memory), RAM drives, and so on, as will occur to those of skill in the art.
The example computer includes one or more input/output (‘I/O’) adapters. I/O adapters implement user-oriented input/output through, for example, software drivers and computer hardware for controlling output to display devices such as computer display screens, as well as user input from user input devices such as keyboards and mice. The example computer includes a video adapter, which is an example of an I/O adapter specially designed for graphic output to a display device such as a display screen or computer monitor. The video adapter is connected to the processor through a high speed video bus, bus adapter, and the front side bus, which is also a high speed bus.
The exemplary computer includes a communications adapter for data communications with other computers and for data communications with a data communications network. Such data communications are carried out serially through RS-232 connections, through external buses such as a Universal Serial Bus (‘USB’), through data communications networks such as IP data communications networks, and/or in other ways as will occur to those of skill in the art. Communications adapters implement the hardware level of data communications through which one computer sends data communications to another computer, directly or through a data communications network. Such communication adapters include modems for wired dial-up communications, Ethernet (IEEE 802.3) adapters for wired data communications, and 802.11 adapters for wireless data communications.
The approaches described above for region-of-interest (ROI)-based image enhancement using a residual network are also described as methods in the accompanying flowcharts. Accordingly, for further explanation, one of the flowcharts illustrates an example method for region-of-interest (ROI)-based image enhancement using a residual network according to some implementations of the present disclosure. The method is performed, for example, by an image enhancement circuit as described above. The method includes generating, based on an input image and a residual path of a residual network, a first output corresponding to a region of interest (ROI) of the input image. The input image includes any digitally encoded image as can be appreciated by one skilled in the art. The ROI defines an area of an image (e.g., the input image) that is indicated as serving a particular purpose or having a particular importance. Accordingly, in some implementations, generating the first output includes identifying the ROI by applying an object identification algorithm, an image segmentation algorithm, or other algorithm as can be appreciated.
The residual network is a neural network (e.g., a convolutional neural network and the like) that performs image enhancement, processing, or transformations based on an input image. As an example, the residual network performs sharpening, denoising, upscaling, or other transformations and enhancements as can be appreciated. The residual network includes a residual path and a skip path. Processing an input using the residual path includes performing a forward inference through each layer of the residual network. Processing an input using the skip path includes performing a forward inference that skips or bypasses one or more layers of the residual network. Accordingly, generating the first output corresponding to the ROI includes performing a forward inference on the residual path, with each layer on the residual path processing a subset of its received input feature map (e.g., an input feature map subspace) that is a factor in or otherwise affects pixels or data in the ROI. Thus, the residual path processes only the data necessary to generate an output corresponding to the ROI.
The method also includes generating, based on the input image and a skip path of the residual network, a second output. As is set forth above, the skip path includes a subset of the layers of the residual path such that processing an input using the skip path uses less power and fewer computational resources compared to the residual path. Though the skip path uses less power and fewer computational resources compared to the residual path, in some implementations the skip path produces an image output having a visually perceptible reduction in quality compared to output from the residual path. Accordingly, generating the second output includes performing a forward inference on the skip path using the input image as the initial input to the skip path. The resulting output includes a version of the whole input image that has been processed using the skip path.
The method also includes generating, based on the first output and the second output, an output image. In some implementations, the output image is generated by combining the first output and the second output. For example, as the first output corresponds to a subset of the input image defined by the ROI, in some implementations generating the output image includes mapping the pixels or data from the first output onto the second output.
For further explanation, a further figure sets forth a flow chart illustrating an example method for region-of-interest (ROI)-based image enhancement using a residual network according to some implementations of the present disclosure. This method is similar to the method described above in that it includes: generating, based on an input image and a residual path of a residual network, a first output corresponding to a region of interest (ROI) of the input image; generating, based on the input image and a skip path of the residual network, a second output; and generating, based on the first output and the second output, an output image.
This method differs from the method described above in that it includes performing a backpropagation of the residual path based on the ROI. Performing the backpropagation includes traversing the residual path from the output layer to the input layer and performing, at each layer, a particular operation. Accordingly, in some implementations, performing the backpropagation includes identifying, for each layer of the residual path, a corresponding input feature map subspace based on the ROI.
For a current given layer in the backpropagation, an output feature map subspace has been determined by virtue of having previously propagated through a next layer in the residual path. In the case of the output layer, the output feature map subspace is the portion of the output feature map for the output layer corresponding to the ROI. For every other layer, the output feature map subspace for the given layer is the selected input feature map subspace for the successive layer previously traversed during the backpropagation.
Accordingly, for the given layer, and having a determined output feature map subspace, the input feature map subspace for the given layer includes those values or data in the input feature map that are factors in or otherwise affect some portion of the determined output feature map subspace. Thus, values or data in the input feature map that do not affect the output feature map subspace are not included in the input feature map subspace. Once the input feature map subspace for a given layer has been identified, it serves as the output feature map subspace for the next layer in the backpropagation (e.g., the preceding layer in the residual path). The identified feature map subspaces are then used in generating the first output using the residual path, as described in further detail below.
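The backward traversal described above can be sketched for the common case of convolution-like layers. This is an illustration under stated assumptions, not the disclosed implementation: the `ConvLayer` description (kernel, stride, padding, input size), the inclusive `(start, end)` span along a single axis, and the function names are all hypothetical. For a two-dimensional feature map the same calculation would be applied independently to rows and columns.

```python
from dataclasses import dataclass
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) index range along one axis


@dataclass(frozen=True)
class ConvLayer:
    """Hypothetical one-axis layer description; real layers may differ."""
    kernel: int
    stride: int
    padding: int
    input_size: int


def input_span(layer: ConvLayer, out_span: Span) -> Span:
    """Input indices that are factors in the given output indices."""
    start = out_span[0] * layer.stride - layer.padding
    end = out_span[1] * layer.stride - layer.padding + layer.kernel - 1
    # Positions that land in the zero padding are not real input data.
    return max(start, 0), min(end, layer.input_size - 1)


def backpropagate_roi(layers: List[ConvLayer], roi: Span) -> List[Span]:
    """Traverse from the output layer to the input layer, one subspace per layer."""
    spans: List[Span] = [roi] * len(layers)
    out_span = roi  # the output layer's output subspace is the ROI itself
    for index in range(len(layers) - 1, -1, -1):
        spans[index] = input_span(layers[index], out_span)
        out_span = spans[index]  # becomes the output subspace of the preceding layer
    return spans


# Three stride-1, 3-tap layers over a 16-sample axis; ROI covers samples 6..9.
layers = [ConvLayer(kernel=3, stride=1, padding=1, input_size=16) for _ in range(3)]
subspaces = backpropagate_roi(layers, roi=(6, 9))
```

In this example each layer closer to the input needs one extra sample on either side, so the subspace grows from the ROI outward as the traversal moves toward the input layer.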
For further explanation, a further figure sets forth a flow chart illustrating an example method for region-of-interest (ROI)-based image enhancement using a residual network according to some implementations of the present disclosure. This method is similar to the preceding method in that it includes: performing a backpropagation of the residual path based on the ROI, including identifying, for each layer of the residual path, a corresponding input feature map subspace based on the ROI; generating, based on an input image and a residual path of a residual network, a first output corresponding to a region of interest (ROI) of the input image; generating, based on the input image and a skip path of the residual network, a second output; and generating, based on the first output and the second output, an output image.
This method differs from the preceding method in that performing the backpropagation of the residual path based on the ROI includes storing, for each layer of the residual path, subspace metadata describing the corresponding input feature map subspace. For example, assume that the input feature map subspace for a given layer has been identified during the backpropagation. Subspace metadata describing the particular input feature map subspace for that layer is then stored. Thus, after the backpropagation, each layer of the residual path has subspace metadata describing the particular input feature map subspace that affects the ROI of the resulting output.
This method further differs from the preceding method in that generating the first output includes processing, for each layer of the residual path, the corresponding input feature map subspace described in the subspace metadata. As an example, during forward inference on the residual path, a given layer receives an input feature map from a preceding layer. The subspace metadata for the given layer is accessed to determine the input feature map subspace that should be selected from the input feature map. The input feature map subspace is then processed using the given layer, without processing other data from the input feature map that is excluded from the subspace. Thus, computational and power resources are saved by processing, for a given layer, only the data in the input feature map subspace that affects output corresponding to the ROI.
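The forward inference described above can be illustrated with a toy one-axis example. Everything named here is an assumption made for illustration: the 3-tap `KERNEL` shared by every layer, the zero padding, the metadata represented as a list of inclusive `(start, end)` spans, and the helper names. The metadata values are hard-coded as though they had been stored by an earlier backward traversal for three such layers and an ROI covering samples 6 to 9.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive (start, end)
KERNEL = (0.25, 0.5, 0.25)  # hypothetical 3-tap smoothing kernel used by every layer


def conv_at(values: Dict[int, float], index: int, size: int) -> float:
    """One output sample of a 3-tap, stride-1, zero-padded convolution."""
    total = 0.0
    for tap, weight in enumerate(KERNEL):
        src = index + tap - 1
        if 0 <= src < size:  # positions outside the map are zero padding
            total += weight * values[src]
    return total


def run_layers(signal: List[float], metadata: List[Span], roi: Span) -> List[float]:
    """Forward inference that touches only the stored per-layer subspaces."""
    size = len(signal)
    start, end = metadata[0]
    # Select the first layer's input subspace from the full input.
    current = {i: signal[i] for i in range(start, end + 1)}
    for layer_index in range(len(metadata)):
        # A layer's output subspace is the next layer's input subspace,
        # or the ROI itself for the output layer.
        last = layer_index + 1 == len(metadata)
        out_start, out_end = roi if last else metadata[layer_index + 1]
        current = {i: conv_at(current, i, size) for i in range(out_start, out_end + 1)}
    return [current[i] for i in range(roi[0], roi[1] + 1)]


signal = [float(i * i % 7) for i in range(16)]
stored_metadata = [(3, 12), (4, 11), (5, 10)]  # assumed result of the backward pass
roi_result = run_layers(signal, stored_metadata, roi=(6, 9))

# Reference: process the whole signal at every layer, then crop to the ROI.
full_result = run_layers(signal, [(0, 15)] * 3, roi=(0, 15))
assert roi_result == full_result[6:10]
```

In this toy case the ROI-restricted run evaluates 8 + 6 + 4 = 18 output samples across the three layers, whereas the full run evaluates 48, yet both produce the same values inside the ROI.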
In view of the explanations set forth above, readers will recognize that the benefits of region-of-interest (ROI)-based image enhancement using a residual network include improved performance of a computing system by reducing overall power usage and computational resource consumption through selectively processing the region-of-interest by a higher cost residual data path of a residual network and processing the remainder of the image using a lower cost skip path of the residual network.
Exemplary implementations of the present disclosure are described largely in the context of a fully functional computer system for region-of-interest (ROI)-based image enhancement using a residual network. Readers of skill in the art will recognize, however, that the present disclosure also can be embodied in a computer program product disposed upon computer readable storage media for use with any suitable data processing system. Such computer readable storage media can be any storage medium for machine-readable information, including magnetic media, optical media, or other suitable media. Examples of such media include magnetic disks in hard drives or diskettes, compact disks for optical drives, magnetic tape, and others as will occur to those of skill in the art. Persons skilled in the art will immediately recognize that any computer system having suitable programming means will be capable of executing the steps of the method of the disclosure as embodied in a computer program product. Persons skilled in the art will recognize also that, although some of the exemplary implementations described in this specification are oriented to software installed and executing on computer hardware, nevertheless, alternative implementations implemented as firmware or as hardware are well within the scope of the present disclosure.
The present disclosure can be a system, a method, and/or a computer program product. The computer program product can include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.
The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium can be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network can include copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
Computer readable program instructions for carrying out operations of the present disclosure can be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions can execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer can be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection can be made to an external computer (for example, through the Internet using an Internet Service Provider). In some implementations, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) can execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.
Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to implementations of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
These computer readable program instructions can be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions can also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein includes an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
March 10, 2026