An apparatus includes an interface circuit, a digital signal processing pipeline, and a tuning tool. The interface circuit may be configured to receive pixel data for one or more raw images. The digital signal processing pipeline may be configured to generate processed image data based on the pixel data of the one or more raw images and a set of image quality parameters. The tuning tool may be configured to generate the set of image quality parameters based on the one or more raw images by executing a first artificial neural network model that was trained using a set of the raw images, a set of reference images, and a loss function.
Legal claims defining the scope of protection, as filed with the USPTO.
. An apparatus comprising:
. The apparatus according to, wherein said image quality parameters relate to one or more of black level correction, lens distortion correction, sensor defect detection, spatial and temporal noise reduction, Bayer demosaicing, color and tone correction, brightness and saturation, contrast, sharpness, white balance, autofocus statistics, auto-exposure statistics, automatic white balance statistics, lens shading correction, color space conversion, gamma correction, dynamic range, electronic image stabilization, decompanding, anti-aliasing, chromatic aberration, digital gain, vignette compensation, and statistics extraction.
. The apparatus according to, wherein said first artificial neural network model and said second artificial neural network model comprise convolutional neural network models.
. The apparatus according to, further comprising a post processing circuit configured to generate a set of customized image quality parameters for a specific application or device based on said set of image quality parameters.
. The apparatus according to, wherein said post processing circuit customizes said set of image quality parameters for use in an edge device.
. The apparatus according to, wherein said post processing circuit allows a user to fine tune said set of image quality parameters.
. An apparatus comprising:
. The apparatus according to, wherein said tuning tool further comprises a second artificial neural network model configured to generate a loss signal based on said processed image data, said set of reference images, and said loss function.
. The apparatus according to, wherein in a training mode of said apparatus:
. The apparatus according to, wherein said training mode is ended when said loss signal reaches a predetermined threshold.
. A method of tuning image quality parameters of a digital signal processing pipeline comprising:
. The method according to, wherein:
. The method according to, wherein said image quality parameters relate to one or more of black level correction, lens distortion correction, sensor defect detection, spatial and temporal noise reduction, Bayer demosaicing, color and tone correction, brightness and saturation, contrast, sharpness, white balance, autofocus statistics, auto-exposure statistics, automatic white balance statistics, lens shading correction, color space conversion, gamma correction, dynamic range, electronic image stabilization, decompanding, anti-aliasing, chromatic aberration, digital gain, vignette compensation, and statistics extraction.
. The method according to, wherein said first artificial neural network model and said second artificial neural network model comprise convolutional neural network models.
. The method according to, further comprising:
. The method according to, wherein said set of image quality parameters are customized for use in an edge device.
. The method according to, further comprising:
. The method according to, further comprising:
. The method according to, wherein said edge device comprises a digital camera.
Complete technical specification and implementation details from the patent document.
The invention relates to digital image processing generally and, more particularly, to a method and/or apparatus for implementing automatic image quality parameter tuning for user-selected visual characteristics based on machine learning.
When a digital image sensor measures light at individual locations or picture elements (pixels), the measurements are not exact but rather include noise. Picture noise can degrade the subjective quality of a digital picture. Typically, digital image capture devices (e.g., cameras) incorporate some form of image signal processing (ISP) to reduce picture noise based on numerous image quality (IQ) parameters.
Initiating a project involving tuning the IQ parameters of an ISP pipeline can be quite complicated and time consuming. Manually tuning the IQ parameters can take several weeks to months due to the large number of IQ parameters of the ISP pipeline to be optimized and the number of iterations needed to obtain good image quality. Moreover, switching from a system-on-chip (SoC) and camera module to another setup may require a complete re-tuning of the IQ parameters configuring the ISP pipeline, leading to an increase in development work.
For different applications, engineers may need different ISP settings according to specific needs. Therefore, it would be helpful to have a flexible tuning system that allows engineers to choose the characteristics of the resulting image output from the ISP pipeline.
It would be desirable to implement automatic image quality parameter tuning for user-selected visual characteristics based on machine learning.
The invention concerns an apparatus comprising an interface circuit, a digital signal processing pipeline, and a tuning tool. The interface circuit may be configured to receive pixel data for one or more raw images. The digital signal processing pipeline may be configured to generate processed image data based on the pixel data of the one or more raw images and a set of image quality parameters. The tuning tool may be configured to generate the set of image quality parameters based on the one or more raw images by executing a first artificial neural network model that was trained using a set of the raw images, a set of reference images, and a loss function.
Embodiments of the present invention include providing automatic image quality parameter tuning for user-selected visual characteristics based on machine learning that may (i) provide a modern machine learning based (ML-based) image quality parameter tuning tool, (ii) implement a combination of automatic image quality (IQ) tuning with a modern ML-based architecture, (iii) integrate the ML-based tuning tool model with a hardware (HW) implementation (or equivalent software (SW) emulation) of an image digital signal processing (IDSP) pipeline model during a training phase of the model, (iv) insert the actual IDSP pipeline (or equivalent software (SW) emulation) in a feedback loop that trains the ML-based tuning tool model, (v) apply IQ parameters generated by the ML-based tuning tool, via the IDSP pipeline, on raw images to evaluate, via a second ML-based model, whether desired visual characteristics are present on resulting images, (vi) be performed “offline”, thus not consuming any runtime resources, (vii) provide a set of customized IQ tuning parameters to exploit an IDSP pipeline on a selected SoC, (viii) be run on a host computer before implementation on the SoC, (ix) allow a user to choose preferred visual characteristics of the images resulting from the IDSP pipeline, (x) be flexible enough to adapt to any characteristics a user may possibly want, even if the characteristics are not ones typically desired for computer vision applications, (xi) understand, without knowing beforehand what characteristics should be found in the output image, whether the output image contains a specific visual characteristic or not, (xii) provide an IQ parameter tuning tool that may be directly integrated with SoC systems, (xiii) provide an IQ parameter tuning tool that may be configured to generate either generic IQ parameters (e.g., a user has to translate them to SoC-specific parameters) or specific IQ parameters (e.g., ready to be integrated in a specific SoC), (xiv) provide IQ parameters that may be easily integrated with proprietary systems, (xv) provide IQ parameters that may be difficult to integrate with systems of competitors, to voluntarily limit usage, and/or (xvi) be implemented as one or more integrated circuits.
Artificial intelligence (AI) solutions that rely on computer vision need to begin with high-quality video. This is especially true for AI system applications on the road (e.g., ADAS, DMS, autonomous vehicle control, etc.). When a vision-based system uses grainy, low-quality images of traffic and pedestrians, any decisions recommended are suspect and any warnings provided are less reliable. Vision-based surveillance and security systems may be similarly affected when using grainy, low-quality images. However, when the same systems start with high-quality imagery, accuracy is improved significantly. High-quality imagery, for example, enables AI systems to more successfully identify objects in the environment, evaluate complex scenarios, and make predictions as situations change.
Although computer vision applications do not “see” images in the same way humans do, this does not diminish the importance of high-quality image processing techniques. If anything, high-quality image processing techniques become even more important, especially when lives are at stake (e.g., on the road). An image digital signal processing (IDSP) pipeline is generally used to obtain high-quality imagery. The IDSP pipeline generally utilizes a combination of complex processes to transform raw sensor data into pristine imagery. In an example, high dynamic range (HDR) processing allows an advanced driver assistance system (ADAS) to operate successfully in scenarios where a high degree of contrast creates perception challenges, such as when a vehicle emerges from a tunnel. Related techniques may also be applied to help the same ADAS system perform well in low-light environments, rain, snow, or fog.
A number of different image quality (IQ) parameters (e.g., HDR, color, tone mapping, edge detection, sharpness filters, etc.) of the IDSP pipeline may be tuned differently for artificial intelligence applications (e.g., sensing applications) or for human consumption (e.g., viewing applications). Manually tuning the myriad IQ parameters of an ISP pipeline can be quite complicated and time consuming. Moreover, switching from a system-on-chip (SoC) and camera module to another setup may involve a complete re-tuning of the IQ parameters configuring the IDSP pipeline, leading to an increase in development costs. For different applications, engineers may need different ISP settings to meet specific needs.
In various embodiments, an image quality parameter tuning tool is generally provided that enables rapid tuning of an IDSP pipeline to provide the flexibility needed for consumption by both artificial intelligence applications and humans. In various embodiments, an image quality parameter tuning tool may be provided, based on a machine learning technique (or process), that prepares the image quality parameter tuning tool “offline”, thus not consuming runtime resources. In an example, the image quality parameter tuning tool in accordance with an embodiment of the invention may be used to prepare a set of IQ tuning parameters to exploit an IDSP pipeline on a selected SoC. In an example, the image quality parameter tuning tool may be run on a host computer before actual implementation on the selected SoC. In an example, a user may be allowed to choose preferred visual characteristics of the images resulting from the IDSP pipeline. In an example, the image quality parameter tuning tool in accordance with an embodiment of the invention may be flexible enough to adapt to any characteristics a user may possibly want, even if the characteristics are not ones typically desired for computer vision applications. In an example, the image quality parameter tuning tool in accordance with an embodiment of the invention may be configured to automatically extract the visual characteristics a user may want from a set of reference images embodying the visual characteristics, regardless of whether the reference images are ideal.
In an example, the image quality parameter tuning tool does not need to know beforehand what characteristics should be found in the output image. Instead, the image quality parameter tuning tool may understand (e.g., via a set of reference images) whether the output image contains a specific visual characteristic or not. In an example, a developer may utilize the image quality parameter tuning tool in accordance with an embodiment of the invention to develop a new SoC/camera pair and, in a few minutes after some specific inputs are supplied, obtain a set of IQ parameters that allow the IDSP pipeline to produce images with the desired visual characteristics. In some embodiments, the user may also be allowed to manually fine tune some of the parameters after a generic tuning that provides a starting point.
Referring to, a diagram is shown illustrating examples of edge devices that may utilize image quality parameter tuning in accordance with an embodiment of the invention. In an example, edge devices may include low power technology designed to be deployed in embedded platforms at the edge (e.g., battery-powered devices), where power consumption is a critical concern. In an example, edge devices may comprise traffic cameras and intelligent transportation systems (ITS) solutions including automated number plate recognition (ANPR) cameras, traffic cameras, vehicle cameras, access control cameras, automatic teller machine (ATM) cameras, bullet cameras, and dome cameras. In an example, the traffic cameras and intelligent transportation systems (ITS) solutions may be designed to enhance roadway security with a combination of person and vehicle detection, vehicle make/model recognition, and automatic number plate recognition (ANPR) capabilities. In an example, edge devices may further comprise smart phones and smart home internet-of-things (IoT) devices.
In an example, person and vehicle detection, vehicle make/model recognition, and automatic number plate recognition (ANPR) capabilities may be facilitated utilizing image quality parameter tuning in accordance with embodiments of the invention. In an example, access control cameras may comprise security camera applications. In an example, the security camera applications may include battery-powered cameras, doorbell cameras, outdoor cameras, and indoor cameras. In an example, the security camera application edge devices may include low power technology designed to be deployed in embedded platforms at the edge (e.g., battery-powered devices), where power consumption is a critical concern. The security camera applications may realize performance benefits from application of image quality parameter tuning in accordance with embodiments of the invention.
In various embodiments, an image quality parameter tuning tool may be implemented. The image quality parameter tuning tool is generally trained using a dataset provided by an end user. The image quality parameter tuning tool is generally trained to adapt to an image quality loss of an edge device on which the image quality parameter tuning tool will be utilized in conjunction with an image digital signal processing (IDSP) pipeline (also referred to more simply as an image signal processing (ISP) pipeline or a digital signal processing (DSP) pipeline). Training the image quality parameter tuning tool generally involves less effort than redesigning the IDSP pipeline.
Referring to, a diagram is shown illustrating a camera system implementing image quality parameter tuning in accordance with an embodiment of the invention. In an example, a camera system may be configured to implement an image quality parameter tuning process and/or tool in accordance with an embodiment of the invention. In an example, the camera system may be implemented in an edge device. In an example, the camera system may comprise a block (or circuit), and one or more blocks (or circuits). In an example, the block may be implemented as a processor or system-on-chip (processor/SoC). In an example, the blocks may each implement a camera assembly.
In an example, the camera assemblies may comprise lenses and blocks (or circuits). In an example, the circuits may comprise capture devices. The lenses may be attached to the capture devices. The capture devices may be configured to receive light as an input via the lenses. The lenses may be implemented as optical lenses. The lenses may provide a zooming feature and/or a focusing feature. The capture devices and/or the lenses may be implemented, in one example, as a single assembly. In another example, the lenses may be a separate implementation from the capture devices.
The capture devices may be configured to convert the input light into computer readable data. The capture devices may capture data received through the lenses to generate raw pixel data. In some embodiments, the capture devices may capture data received through the lenses to generate bitstreams (e.g., generate video frames). For example, the capture devices may receive focused light from the lenses. The lenses may be directed, tilted, panned, zoomed and/or rotated to provide a targeted view from the camera system. The capture device may generate a signal (e.g., VIDEO). The signal VIDEO may be pixel data (e.g., a sequence of pixels that may be used to generate video frames). In some embodiments, the signal VIDEO may be video data (e.g., a sequence of video frames). The signal VIDEO may be presented to one of the inputs of the processor/SoC. In some embodiments, the pixel data generated by the capture device may be uncompressed and/or raw data generated in response to the focused light from the lens. In some embodiments, the output of the capture devices may be digital video signals.
The lenses (e.g., camera lenses) may be directed to provide a view of an environment surrounding the apparatus. The lenses may be aimed to capture environmental data. The lenses may be wide-angle lenses and/or fish-eye lenses (e.g., lenses capable of capturing a wide field of view). The lenses may be configured to capture and/or focus the light for the capture devices. Generally, the capture devices are located behind the lenses. Based on the captured light from the lenses, the capture devices may generate a bitstream and/or video data (e.g., the signal VIDEO).
In various embodiments, the lenses may be implemented as fixed focus lenses. A fixed focus lens generally facilitates smaller size and low power. In an example, a fixed focus lens may be used in battery powered, doorbell, and other low power camera applications. In some embodiments, the lenses may be directed, tilted, panned, zoomed and/or rotated to capture the environment surrounding the apparatus (e.g., capture data from the field of view). In an example, professional camera models may be implemented with an active lens system for enhanced functionality, remote control, etc.
The capture devices may transform the received light into a digital data stream. In some embodiments, the capture devices may perform an analog to digital conversion. For example, the capture devices may perform a photoelectric conversion of the light received by the lenses. The capture devices may transform the digital data stream into a video data stream (or bitstream), a video file, and/or a number of video frames. In an example, the capture devices may present the video data as a digital video signal (e.g., VIDEO). The digital video signal may comprise the video frames (e.g., sequential digital images and/or audio). In some embodiments, the camera assemblies may comprise a microphone for capturing audio.
The video data captured by the capture devices may be represented as a signal/bitstream/data VIDEO (e.g., a digital video signal). The capture devices may present the signal VIDEO to the processor/SoC. The signal VIDEO may represent the video frames/video data. The signal VIDEO may be a video stream captured by the capture devices. In some embodiments, the signal VIDEO may comprise pixel data that may be operated on by the processor/SoC (e.g., using an image digital signal processor (IDSP), etc.). The processor/SoC may generate video frames in response to the pixel data in the signal VIDEO.
In various embodiments, the capture devices may be configured to generate an RGB video signal, an IR video signal, and/or an RGB-IR video signal. In an infrared light only illuminated field of view, the capture devices may generate a monochrome (B/W) video signal. In a field of view illuminated by both IR light and visible light, the capture devices may be configured to generate color information in addition to the monochrome video signal. In various embodiments, the capture devices may be configured to generate a video signal in response to visible and/or infrared (IR) light. In an example, the circuits may comprise a color (RGB) image sensor, an infrared (IR) image sensor and/or a hybrid RGB-IR image sensor.
In some embodiments, the capture devices may comprise a rolling shutter sensor or a global shutter sensor. In an example, the rolling shutter sensor may implement an RGB-IR sensor. In some embodiments, the capture devices may comprise a rolling shutter IR sensor and an RGB sensor (e.g., implemented as separate components). In an example, the rolling shutter sensor may be implemented as an RGB-IR rolling shutter complementary metal oxide semiconductor (CMOS) image sensor. In one example, the rolling shutter sensor may be configured to assert a signal that indicates a first line exposure time. In one example, the rolling shutter sensor may apply a mask to a monochrome sensor. In an example, the mask may comprise a plurality of units containing one red pixel, one green pixel, one blue pixel, and one IR pixel. The IR pixel may contain red, green, and blue filter materials that effectively absorb all of the light in the visible spectrum, while allowing the longer infrared wavelengths to pass through with minimal loss. With a rolling shutter, as each line (or row) of the sensor starts exposure, all pixels in the line (or row) may start exposure simultaneously.
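The repeating mask of red, green, blue, and IR pixels can be pictured with a short sketch. The 2x2 arrangement of the unit and the names `UNIT`, `channel_at`, and `mask` are assumptions made for illustration; the description states only that each unit contains one pixel of each type, not how the unit is laid out.

```python
# Hypothetical 2x2 unit layout; the description only says that each unit
# holds one red, one green, one blue, and one IR pixel.
UNIT = (("R", "G"),
        ("IR", "B"))


def channel_at(row: int, col: int) -> str:
    """Return the filter channel covering the pixel at (row, col)."""
    return UNIT[row % 2][col % 2]


def mask(rows: int, cols: int) -> list[list[str]]:
    """Tile the unit across a sensor of the given size."""
    return [[channel_at(r, c) for c in range(cols)] for r in range(rows)]


if __name__ == "__main__":
    for line in mask(4, 4):
        print(" ".join(f"{ch:>2}" for ch in line))
```

With this layout, a quarter of the sensor's pixels sample each of the four channels.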
In an example, the circuit may comprise a block (or circuit), a block (or circuit), and a block (or circuit). The circuit may implement a sensor interface. The circuit may implement an image digital signal processing (IDSP) pipeline. The circuit may implement an image quality parameter tuning tool in accordance with an embodiment of the invention. The circuit may be configured to receive data communicated by the camera systems. In an example, the image data signal VIDEO may be presented to an input of the circuit. The circuit may generate video frames in response to the pixel data in the signal VIDEO. The video frames generated by the circuit may comprise raw images formed by the raw pixel data received by the circuit via the signal VIDEO. The circuit may have an output that may present a signal (e.g., RAW IMAGES) that may communicate the raw image data. The signal RAW IMAGES may be presented to a first input of the circuit and a first input of the circuit.
The circuit generally implements an image digital signal processor (IDSP) pipeline of the processor/SoC. The circuit is generally used to obtain high-quality imagery. In an example, the circuit generally utilizes a combination of complex processes (e.g., described below in connection with) to transform the raw image data received in the signal RAW IMAGES into pristine imagery. A set of image quality (IQ) parameters is used by the circuit to allow the IDSP pipeline to produce images with desired visual characteristics. The circuit may have a second input that may receive a signal (e.g., IQPARAMS) that may communicate the set of image quality parameters. The image quality parameters communicated by the signal IQPARAMS may be utilized by the circuit to process the raw image data received via the signal RAW IMAGES. In an example, the signal IQPARAMS may comprise image quality parameters related to one or more of black level correction, lens distortion correction, lens shading correction, sensor defect detection, spatial and temporal noise reduction, Bayer demosaicing, color and tone correction, brightness and saturation, contrast, sharpness, white balance, autofocus statistics, auto-exposure statistics, automatic white balance statistics, color space conversion, gamma correction, dynamic range, electronic image stabilization, decompanding, anti-aliasing, chromatic aberration, digital gain, vignette compensation, and statistics extraction. The circuit may have an output that may present a signal (e.g., FRAMES) that may communicate processed image data. The circuit is generally configured to generate the processed image data presented in the signal FRAMES in response to the raw image data received via the signal RAW IMAGES and the IQ parameters received via the signal IQPARAMS.
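How a set of IQ parameters steers the processing of raw image data can be sketched with a toy example. This is not the IDSP pipeline of the invention; it is a minimal three-stage stand-in (black level correction, digital gain, gamma correction) drawn from the parameter categories listed above. The `IQParams` fields, their default values, and the 10-bit raw range are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class IQParams:
    # Hypothetical parameter set; a real pipeline exposes many more values.
    black_level: float = 64.0    # sensor pedestal, in raw code values
    digital_gain: float = 1.0    # linear multiplier
    gamma: float = 2.2           # display gamma
    white_level: float = 1023.0  # assumed 10-bit raw data


def process_pixel(raw: float, p: IQParams) -> float:
    """Map one raw code value to a display value in [0, 1]."""
    span = p.white_level - p.black_level
    linear = (raw - p.black_level) / span                 # black level correction
    linear = min(max(linear * p.digital_gain, 0.0), 1.0)  # gain, then clip
    return linear ** (1.0 / p.gamma)                      # gamma correction


def process_image(raw_image: list[list[float]], p: IQParams) -> list[list[float]]:
    """Apply the same parameter set to every pixel of a raw image."""
    return [[process_pixel(v, p) for v in row] for row in raw_image]
```

Changing a single field, for example raising `digital_gain`, changes every output pixel, which is why the parameter set as a whole determines the visual characteristics of the processed images.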
The circuit generally implements an image quality parameter tuning tool in accordance with an embodiment of the invention. In various embodiments, the circuit may be configured to generate a set of IQ parameters which, when applied on raw images via the IDSP pipeline of the circuit, allows the circuit to obtain new images with desired visual characteristics. In some embodiments, the IQ parameters provided by the image quality parameter tuning tool may be customized to produce IQ tuning that can only be used with proprietary systems, voluntarily limiting the use of the image quality parameter tuning tool in combination with competing tools and systems. In an example, the image quality parameter tuning tool may include a post-processing embedded block that translates generic IQ parameters to SoC-specific values.
In an example, the circuit may have a first input that may receive a set of raw images from the selected SoC/camera couple, a second input that may receive one or more reference images (e.g., via a signal REFERENCE IMAGES) embedding the desired visual characteristics, and an output that may communicate either the generic or the customized IQ parameters via the signal IQPARAMS. In an example, the signal REFERENCE IMAGES may communicate one or more reference images that exemplify an image quality to be obtained. In various embodiments, the circuit implements a first machine learning based (ML-based) model that, in an operating mode, generates (tunes) the set of IQ parameters communicated via the signal IQPARAMS based on the set of raw images received at the first input and the one or more reference images received at the second input.
The circuit may also implement a training mode in which a second ML-based model may be used to train the first ML-based model. In the training mode, a training process may utilize the second ML-based model cascaded with a hardware (HW) implementation (or equivalent software (SW) emulation) of an image digital signal processing (IDSP) pipeline model. In some embodiments, the training process may utilize the IDSP pipeline of the circuit during the training mode. In embodiments utilizing the circuit, the circuit may have a third input that may receive the processed image data from the circuit via the signal FRAMES.
In the training mode, the training process may utilize the second ML-based model to generate a signal (e.g., LOSS) based on the set of raw images, the one or more reference images, and a loss function. The signal LOSS generally communicates a result (or loss) of the loss function. The second ML-based model is generally configured (trained) to evaluate the results coming from the first ML-based model and to generate the signal LOSS based on a comparison of those results with the one or more reference images received at the second input. The signal LOSS allows an iterative training of the first ML-based model.
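The feedback loop of the training mode can be illustrated with a deliberately simplified sketch. Both neural network models are replaced by stand-ins: the first model is reduced to a single tunable gain, the pipeline to a multiplication, and the second model to a squared difference of mean brightness. The function names, the finite-difference update, the learning rate, and the threshold value are all assumptions. Only the overall structure follows the description: generate parameters, run them through the pipeline, score the result against the reference images, update, and stop once the loss reaches a predetermined threshold.

```python
def pipeline(raw: list[float], gain: float) -> list[float]:
    """Stand-in for the IDSP pipeline: apply one IQ parameter (a gain)."""
    return [min(v * gain, 1.0) for v in raw]


def loss_fn(frames: list[float], reference: list[float]) -> float:
    """Stand-in for the second model: compare mean brightness only."""
    mean_frames = sum(frames) / len(frames)
    mean_reference = sum(reference) / len(reference)
    return (mean_frames - mean_reference) ** 2


def tune(raw, reference, gain=1.0, lr=2.0, threshold=1e-6, max_iters=1000):
    """Iterate until the loss reaches the threshold or max_iters is hit."""
    eps = 1e-4
    loss = loss_fn(pipeline(raw, gain), reference)
    for _ in range(max_iters):
        if loss <= threshold:
            break  # training mode ends at the predetermined threshold
        # Finite-difference gradient taken through the pipeline. This is an
        # assumption; the description does not say how the model is updated.
        grad = (loss_fn(pipeline(raw, gain + eps), reference) - loss) / eps
        gain -= lr * grad
        loss = loss_fn(pipeline(raw, gain), reference)
    return gain, loss


if __name__ == "__main__":
    raw = [0.1, 0.2, 0.3]        # dark raw pixels
    reference = [0.2, 0.4, 0.6]  # reference look: twice as bright
    print(tune(raw, reference))
```

Note that the reference pixels are never compared one-to-one with the processed pixels; only a summary statistic is matched, which loosely mirrors the point that the reference images convey a desired look rather than a target output.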
In various embodiments, the image quality parameter tuning tool may be trained in a semi-supervised fashion, since the model needs to learn an embedded representation of the visual characteristics with some feedback that may be provided by human expertise. In an example, rather than performing model training on request, a continuous improvement approach may be followed that performs model training whenever new images, labels, historical results, etc., become available. In an example, the first ML-based model may be built and trained such that only one image per scenario of interest may be needed (e.g., for automotive application: one image shot on the road during the day, one shot in a garage during the night, etc.). In an example, the second ML-based model may be built and trained such that only one image for each extreme condition of interest may be needed (e.g., one image shot during the day, one during the night, one with a source of light directly in front of the camera, etc.). In general, the set of reference images is not the output goal of the ML-based model. Rather, the set of reference images generally provides the desired visual characteristics that the ML-based model needs to learn to replicate on the set of raw images. As such, the set of reference images may be obtained from an already tuned camera module, or even downloaded from a dataset on the internet.
Referring to, a diagram is shown illustrating an example implementation of an image quality parameter tuning tool circuit in accordance with an embodiment of the invention. In an example, the circuit may comprise a block (or circuit) and a block (or circuit). The circuit may have a first input that may receive the signal RAW IMAGES, a second input that may receive the signal REFERENCE IMAGES, a third input that may receive the loss signal (e.g., LOSS), and an output that may present the signal IQPARAMS. The signal IQPARAMS generally comprises image quality parameter values generated by the circuit based on the signal RAW IMAGES, the signal REFERENCE IMAGES, and the loss signal LOSS. In some embodiments, the signal IQPARAMS may comprise generic image quality parameter values. In some embodiments, the circuit may implement a post processing customization step. In embodiments implementing the post processing customization step, the signal IQPARAMS may comprise customized (e.g., SoC-specific, proprietary, etc.) image quality parameter values.
The circuit may have a first input that may receive the signal RAW IMAGES, a second input that may receive the signal REFERENCE IMAGES, a third input that may receive the signal IQPARAMS, and an output that may present the loss signal LOSS. The circuit is generally configured to generate the loss signal LOSS in response to the signal RAW IMAGES, the signal REFERENCE IMAGES, the signal IQPARAMS, and a loss function. In an example, the circuit may operate similarly to a “discriminator” or a “critic” element of a generative adversarial network (GAN), in that the circuit generally rates the result of applying the image quality parameter values in the signal IQPARAMS on the raw images in the signal RAW IMAGES against the reference images in the signal REFERENCE IMAGES. In an example, the loss function may comprise a min-max loss as used in training GANs. In some embodiments, the loss function may start with the min-max loss and move to a different loss function (e.g., non-saturating GAN loss, Wasserstein Generative Adversarial Network (WGAN) loss, etc.).
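For reference, the min-max, non-saturating, and WGAN losses have standard forms in the GAN literature, sketched below for a single pair of scores. The function names are illustrative and nothing here is taken from the described implementation. `d_real` and `d_fake` are assumed to be discriminator probabilities in (0, 1) for a reference image and a processed image respectively, while `c_real` and `c_fake` are unbounded critic scores.

```python
import math


def minmax_discriminator_loss(d_real: float, d_fake: float) -> float:
    """The discriminator maximizes log D(x) + log(1 - D(G(z))).

    Returned negated so that it can be minimized.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def minmax_generator_loss(d_fake: float) -> float:
    """The generator minimizes log(1 - D(G(z))); flat when d_fake is near 0."""
    return math.log(1.0 - d_fake)


def non_saturating_generator_loss(d_fake: float) -> float:
    """The generator minimizes -log D(G(z)); steep when d_fake is near 0."""
    return -math.log(d_fake)


def wgan_critic_loss(c_real: float, c_fake: float) -> float:
    """The critic minimizes c_fake - c_real (scores, not probabilities)."""
    return c_fake - c_real


def wgan_generator_loss(c_fake: float) -> float:
    """The generator minimizes the negated critic score of its output."""
    return -c_fake
```

In the analogy drawn above, the generator role would be played by the parameter-generating model followed by the pipeline, and the discriminator or critic role by the loss-generating circuit. Starting with the min-max form and switching later is a common response to the weak generator gradients the min-max form gives early in training.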
In an example, the signal RAW IMAGES may communicate a set of raw images. In an example, the set of raw images may be obtained from a selected SoC/camera pair. In an example, when the circuit is ideally configured (built and trained), only one image per scenario of interest may be needed (e.g., for automotive application: one image shot on the road during the day, one shot in a garage during the night, etc.). In an example, the signal REFERENCE IMAGES may communicate one or more reference images embedding the desired visual characteristics. In an example, when the circuits are ideally configured (built and trained), only one image for each extreme condition of interest may be needed (e.g., one image shot during the day, one during the night, one with a source of light directly in front of the camera, etc.). In general, the set of reference images communicated by the signal REFERENCE IMAGES does not provide the output goal of the circuit. Rather, the set of reference images communicated by the signal REFERENCE IMAGES provides the desired visual characteristics that the circuit needs to learn (e.g., via a machine learning technique or process) to replicate on the set of raw images communicated via the signal RAW IMAGES. In an example, the set of reference images communicated by the signal REFERENCE IMAGES may be obtained from an already tuned camera module, downloaded from a dataset on the internet, or obtained from some other source selected by a user.
In various embodiments, the signal IQPARAMS communicates a set of IQ parameters which, if applied to the raw image data communicated by the signal RAW IMAGES via the IDSP pipeline, allows the IDSP pipeline to obtain new images (e.g., processed images) with the desired visual characteristics. In embodiments implementing the post-processing step, the output (e.g., the signal IQPARAMS) provided by the image quality parameter tuning tool may be customized to produce IQ tuning that may only be used with proprietary systems, voluntarily limiting the use of the image quality parameter tuning tool in combination with competing tools and systems. To achieve this goal, the image quality parameter tuning tool may include an embedded post-processing block that translates the generic IQ parameters to SoC-specific values.
In various embodiments, the circuit may comprise a block (or circuit). The block may implement a first artificial neural network model (e.g., MODEL 1). In an example, the first artificial neural network model MODEL 1 may be implemented as a convolutional neural network (CNN) model. The block may have a first input that may receive the signal RAW IMAGES, a second input that may receive the signal REFERENCE IMAGES, a third input that may receive the loss signal LOSS, and an output that may present the signal IQPARAMS comprising a set of generic image quality parameters. The first artificial neural network model MODEL 1 may be configured (trained) to generate the set of generic image quality parameters based on the raw image data communicated via the signal RAW IMAGES, the image data embedding the desired visual characteristics communicated via the signal REFERENCE IMAGES, and the loss signal LOSS.
In some embodiments, the circuit may further comprise a block (or circuit). The block may implement an optional post-processing circuit. The post-processing circuit may have an input that may receive the set of generic image quality parameters and an output that may present a set of customized (e.g., SoC-specific, proprietary, etc.) image quality parameters. The post-processing circuit is generally configured to generate the set of customized image quality parameters from the set of generic image quality parameters. In an example, the post-processing circuit may be configured to translate the set of generic IQ parameters to device-specific values. In an example, the set of customized image quality parameters may be customized for a particular device (e.g., a system-on-chip, edge device, proprietary device, etc.).
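The translation from generic to device-specific values may be sketched as follows. This is a minimal illustration, assuming the generic parameters are normalized to the range [0, 1] and the target device exposes integer registers. The register names, ranges, and the `customize` function are hypothetical and do not describe any real SoC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisterSpec:
    name: str      # hypothetical device register name
    minimum: int   # smallest legal register value
    maximum: int   # largest legal register value


# Hypothetical mapping from generic parameter names to device registers.
DEVICE_SPEC = {
    "sharpness": RegisterSpec("SHARP_STRENGTH", 0, 255),
    "gamma": RegisterSpec("GAMMA_IDX", 0, 15),
}


def customize(generic_params, spec=DEVICE_SPEC):
    """Translate normalized generic IQ parameters to integer register values."""
    custom = {}
    for key, value in generic_params.items():
        reg = spec[key]
        clamped = min(1.0, max(0.0, value))  # keep out-of-range inputs legal
        span = reg.maximum - reg.minimum
        custom[reg.name] = reg.minimum + round(clamped * span)
    return custom
```

For example, `customize({"sharpness": 0.2, "gamma": 1.2})` yields `{"SHARP_STRENGTH": 51, "GAMMA_IDX": 15}`, with the out-of-range gamma value clamped to the top of the register range.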
In an example, the circuit may comprise a first block (or circuit) and a second block (or circuit). The first block may implement a model of an image digital signal processing (IDSP) pipeline. In some embodiments, the first block may implement the model of the IDSP pipeline with a hardware (HW) implementation (or an equivalent software (SW) emulation) of an IDSP pipeline model. In some embodiments, the first block may comprise the actual IDSP pipeline of the circuit. The second block may implement a second artificial neural network model (e.g., MODEL 2). In an example, the second artificial neural network model MODEL 2 may be implemented as a convolutional neural network (CNN) model.
In an example, the block implementing the IDSP pipeline model may have a first input that may receive the signal RAW IMAGES, a second input that may receive the signal IQPARAMS, and an output that may present a signal (e.g., PROCESSED IMAGES). The signal PROCESSED IMAGES may communicate new, processed image data generated by the block. In an example, the block may be configured to generate the processed image data presented in the signal PROCESSED IMAGES in response to the raw image data received via the signal RAW IMAGES and the IQ parameters received via the signal IQPARAMS.
In an example, the block implementing the second artificial neural network model MODEL 2 may have a first input that may receive the signal PROCESSED IMAGES from the block implementing the IDSP pipeline model and a second input that may receive the signal REFERENCE IMAGES. The block may have an output that presents the loss signal LOSS. The second artificial neural network model MODEL 2 implemented by the block is generally configured (trained) to generate the loss signal LOSS in response to the output of the IDSP pipeline model, the signal REFERENCE IMAGES, and the loss function.
Although the actual implementation of the image quality parameter tuning tool may vary during development and test phases, the architecture generally comprises the two cascaded ML-based models. The first artificial neural network model MODEL 1 generally follows an end-to-end approach, as MODEL 1 takes the set of raw images as input and generates the set of IQ parameters as output. The second artificial neural network model MODEL 2 evaluates the results coming from the first artificial neural network model MODEL 1 and generates the loss signal LOSS based on the comparison with the set of reference images. The second artificial neural network model MODEL 2 generally allows the iterative training of the first artificial neural network model MODEL 1 according to the generated loss.
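The cascaded arrangement can be illustrated with a deliberately tiny sketch. All of it is an assumption for illustration: MODEL 1 is reduced to one learnable weight that maps a raw image to a single IQ parameter (a digital gain), the IDSP pipeline model is a per-pixel multiply, MODEL 2 is replaced by a fixed brightness comparison, and the update uses a finite-difference gradient rather than backpropagation. Images are flat lists of pixel values in [0, 1].

```python
def mean(values):
    return sum(values) / len(values)


def model1(raw_image, weight):
    """Stand-in for MODEL 1: raw image in, IQ parameters out."""
    return {"gain": weight / mean(raw_image)}


def pipeline(raw_image, iq_params):
    """Stand-in for the IDSP pipeline model: applies the gain per pixel."""
    return [p * iq_params["gain"] for p in raw_image]


def model2_loss(processed, reference):
    """Stand-in for MODEL 2: compares brightness, not pixel content."""
    return (mean(processed) - mean(reference)) ** 2


def total_loss(weight, raw_images, reference):
    return mean([
        model2_loss(pipeline(raw, model1(raw, weight)), reference)
        for raw in raw_images
    ])


def train(raw_images, reference, weight=0.1, lr=0.25, steps=50, eps=1e-4):
    """Iteratively update MODEL 1 from the loss produced by MODEL 2."""
    for _ in range(steps):
        grad = (total_loss(weight + eps, raw_images, reference)
                - total_loss(weight - eps, raw_images, reference)) / (2 * eps)
        weight -= lr * grad
    return weight
```

In this toy setup the trained weight converges to the mean brightness of the reference image, so every raw image is processed to the same overall brightness regardless of its content.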
In various embodiments, the first artificial neural network model MODEL 1 follows an end-to-end approach. Given the set of raw images and the set of reference images embedding the desired characteristics, the first artificial neural network model MODEL 1 generates the set of IQ parameters. The first artificial neural network model MODEL 1 is trained on the loss signal from the second artificial neural network model MODEL 2. Optionally, instead of a true end-to-end approach based on neural networks, the first artificial neural network model MODEL 1 may be composed of several computer vision blocks, each mimicking a specific image signal processing (ISP) function. In an example, the computer vision blocks may be based on both ML methods and proprietary expertise in computer vision, possibly introducing configuration parameters that may be set by the user or by engineering practice.
In various embodiments, the second artificial neural network model MODEL 2 is paired with the block comprising the IDSP pipeline model that applies the set of IQ parameters (or the set of customized IQ parameters when post-processing is implemented) to the set of raw images. The second artificial neural network model MODEL 2 compares the new images obtained by applying the IQ parameters to the set of raw images with the set of reference images embedding the desired characteristics. To apply the IQ parameters to the set of raw images, the IDSP pipeline model needs to replicate the results of the IDSP pipeline. In various embodiments, this may be achieved by either connecting the IQ parameter tuning tool to the actual HW system or building a software IDSP emulator. In general, a new set of processed images produced by applying the set of IQ parameters is obtained.
Finally, the new set of processed images is compared against the set of reference images embedding the desired characteristics to check whether the two sets share the main visual characteristics. Similar to the first artificial neural network model MODEL 1, the comparison between the two sets of images may be based on a neural network model, on engineering expertise, or on a combination of the two. The second artificial neural network model MODEL 2 is mainly used during the training phase of the first artificial neural network model MODEL 1. However, the second artificial neural network model MODEL 2 may also be exploited at an evaluation stage (e.g., performing the same operations on a new set of raw images). The second artificial neural network model MODEL 2 generally acts as a “teacher” for the first artificial neural network model MODEL 1, by providing the loss signal LOSS that is based only on visual characteristics and not on the actual content of the images.
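A comparison based on visual characteristics rather than content can be sketched with a hand-written measure in the "engineering expertise" style mentioned above. The choice of a normalized tone histogram and an L1 distance is an assumption for illustration, not the comparison used by MODEL 2. Images are flat lists of pixel values in [0, 1] and may differ in size.

```python
def histogram(image, bins=8):
    """Normalized tone histogram; discards where each pixel sits in the frame."""
    counts = [0] * bins
    for pixel in image:
        index = min(bins - 1, int(pixel * bins))  # a pixel value of 1.0 falls in the last bin
        counts[index] += 1
    return [count / len(image) for count in counts]


def characteristic_loss(processed, reference, bins=8):
    """L1 distance between tone histograms of the two images."""
    hist_p = histogram(processed, bins)
    hist_r = histogram(reference, bins)
    return sum(abs(a - b) for a, b in zip(hist_p, hist_r))
```

Two images with different spatial layouts but the same tonal distribution produce a loss of zero, which is the property that lets a reference image depict a different scene from the raw images.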
In various embodiments, the image quality parameter tuning tool may be trained in a semi-supervised fashion, since the ML-based model needs to learn an embedded representation of the visual characteristics with some feedback that may be provided by human expertise. In an example, instead of model training happening on request, a continuous improvement approach may be followed whenever new images, labels, historical results, etc., become available. In an example, when the first artificial neural network model MODEL 1 is properly built and trained, only one image per scenario of interest may be needed (e.g., for an automotive application: one image shot on the road during the day, one shot in a garage during the night, etc.). In an example, when the second artificial neural network model MODEL 2 is properly built and trained, only one image for each extreme condition of interest may be needed (e.g., one image shot during the day, one during the night, one with a source of light directly in front of the camera, etc.). In general, the set of reference images is not the output goal of the second artificial neural network model MODEL 2. Rather, the set of reference images generally provides the desired visual characteristics that the first artificial neural network model MODEL 1 needs to learn to replicate on the set of raw images. As such, the set of reference images may be obtained from an already tuned camera module, or even downloaded from a dataset on the internet.
The circuit may also implement the training mode utilizing the second artificial neural network model MODEL 2 to train the first ML-based model on the SoC. In the training mode, the training process may utilize the second artificial neural network model MODEL 2 with a hardware (HW) implementation (or equivalent software (SW) emulation) of an image digital signal processing (IDSP) pipeline model. In some embodiments, the training process may utilize the IDSP pipeline of the circuit during the training mode. In embodiments utilizing the IDSP pipeline of the circuit, the circuit may have a third input that may receive the processed image data from the IDSP pipeline via the signal FRAMES. In the training mode, the training process may utilize the second artificial neural network model MODEL 2 to generate the signal LOSS based on the set of raw images, the one or more reference images, and the loss function embodied in the training of the second artificial neural network model MODEL 2. The signal LOSS generally communicates a result (or loss) of the loss function. The second artificial neural network model MODEL 2 is generally configured (trained) to evaluate the results coming from the first artificial neural network model MODEL 1 and generate the signal LOSS based on a comparison of those results with the one or more reference images received at the second input. The second artificial neural network model MODEL 2 generally allows an iterative training process of the first artificial neural network model MODEL 1 according to the evaluated loss.
A diagram is shown illustrating an example of the image digital signal processing (IDSP) pipeline model. In an example, the IDSP pipeline model generally models operations of the IDSP pipeline. In various embodiments, the IDSP model may implement an IDSP pipeline for converting raw image data acquired from an image sensor format to a YUV picture format. In some embodiments, the IDSP model may model individual blocks (or circuits) of the IDSP pipeline. A camera system needs to be able to take quality images in a variety of lighting conditions, including, but not limited to, indoors, in strong sunlight, and in darkness. In dim light, video sequences and still images can lose colors and critical details and gain image noise. Many elements make up a camera system and work together to obtain a final image.
Image quality defines how well a camera system performs when reproducing an object or a scene. Various characteristics of the camera system including, but not limited to, sensor type and characteristics, firmware, and lens characteristics may contribute different elements to the overall quality of an image. Image quality parameter tuning is generally needed to achieve the best image/video quality from the camera system. Image quality parameters that may need to be tuned may relate to lens distortion, sensor defects, noise, color response, variations in mechanical, optical systems, and electrical characteristics, measurement criteria including automatic exposure (AE) with brightness and saturation statistics, automatic focus (AF) with contrast statistics, and automatic white balance (AWB) with color statistics, black level correction, lens distortion correction, sensor defect detection, spatial and temporal noise reduction, Bayer demosaicing, color and tone correction, brightness and saturation, contrast, sharpness, white balance, autofocus statistics, auto-exposure statistics, automatic white balance statistics, lens shading correction, color space conversion, gamma correction, dynamic range, electronic image stabilization, decompanding, anti-aliasing, chromatic aberration, digital gain, vignette compensation, and statistics extraction. Image quality parameter tuning may also be needed due to individual subjective image quality preferences. Unprocessed images generally do not accurately depict an actual scene. An IDSP pipeline is used to obtain the highest image quality possible. In general, a camera system pairs a lens module with an image sensor and an IDSP pipeline.
The IDSP pipeline processes the raw image from the image sensor into a final (processed) image. In order to achieve the best image quality, the IQ parameters of the IDSP pipeline need to be configured iteratively for various lighting conditions and scenarios. In an example, an IDSP pipeline may comprise a number of blocks (or modules). Because each block may affect performance of subsequent blocks, image quality parameter tuning needs to be performed for each of the blocks of the IDSP pipeline. The tuning process generally involves calculating different parameters of the camera system (e.g., dark current, sensor RGB color space, noise model, AWB reference values, distortion model, etc.) to derive initial settings for each of the modules of the IDSP pipeline. In an example, the initial camera parameters may be calculated from images of standard test charts taken under specific and controlled lighting conditions.
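Deriving initial settings from calibration captures may be sketched as follows. This is a minimal illustration under stated assumptions: the black level is taken as the mean of a dark frame (a proxy for dark current), and the AWB reference gains are chosen so that a neutral gray chart patch comes out with equal red, green, and blue. Both function names and the input layouts are hypothetical.

```python
def estimate_black_level(dark_frame):
    """Mean sensor value of a frame captured with no light reaching the sensor."""
    return sum(dark_frame) / len(dark_frame)


def estimate_awb_gains(gray_patch_rgb, black_level):
    """Gains that equalize R and B to G on a neutral gray patch.

    gray_patch_rgb is a list of (r, g, b) pixel tuples sampled from the patch.
    """
    red, green, blue = (
        sum(channel) / len(channel) - black_level
        for channel in zip(*gray_patch_rgb)
    )
    return {"r_gain": green / red, "g_gain": 1.0, "b_gain": green / blue}
```

Values obtained this way would only seed the tuning process; the description above notes that the parameters still need to be refined iteratively across lighting conditions and scenarios.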
In various embodiments, the IDSP pipeline model may comprise, but is not limited to, a sequence of steps (or stages). A step may perform black level correction on the raw image data. A step may extract auto focus statistics. A step may perform lens shading correction. A step may perform a white balancing operation. A step may perform bad pixel correction. A step may present color filter array (CFA) formatted image data. A step may extract auto-exposure and/or automatic white balance statistics. A step may perform Bayer noise reduction and demosaicing on the CFA formatted image data to obtain linear RGB (red, green, blue) image data for each picture element (pixel). A step may perform color and tone correction. A step may perform gamma correction on the linear RGB image data to obtain non-linear RGB image data for each pixel. A step may perform RGB to YUV color space conversion. A step may perform edge enhancement. A step may perform YUV noise filtering (e.g., noise reduction, noise correction, etc.). A step may perform sharpening operations. A step may perform lens warping correction. An output of the final step may present the final processed images. The steps may apply the IQ parameters generated in accordance with an embodiment of the invention using conventional techniques for processing the raw image data. Noise reduction and/or sharpening need not be limited to the noise filtering and sharpening steps described above, but may be utilized at one or multiple points in the pipeline. The steps may be implemented as a number of blocks (or circuits) in a hardware (HW) implementation (or equivalent software (SW) emulation) of the image digital signal processing (IDSP) pipeline model.
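A few of these stages can be chained in a short software sketch. It is a simplification under stated assumptions: it operates on a single already-demosaiced RGB pixel with values in [0, 1], covers only black level correction, white balance, gamma correction, and RGB to YUV conversion, and uses the BT.601 luma weights with the classic analog U/V scale factors. The parameter keys in `iq_params` are hypothetical.

```python
def black_level_correct(rgb, black_level):
    """Subtract the sensor pedestal, clipping negative results at zero."""
    return tuple(max(0.0, c - black_level) for c in rgb)


def white_balance(rgb, gains):
    """Apply one multiplicative gain per color channel."""
    return tuple(c * g for c, g in zip(rgb, gains))


def gamma_correct(rgb, gamma):
    """Map linear RGB to non-linear RGB, clipping highlights at 1.0 first."""
    return tuple(min(1.0, c) ** (1.0 / gamma) for c in rgb)


def rgb_to_yuv(rgb):
    """BT.601 luma with U and V as scaled color differences."""
    r, g, b = rgb
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return (y, 0.492 * (b - y), 0.877 * (r - y))


def run_pipeline(rgb, iq_params):
    """Apply the stages in order, each configured by its IQ parameter."""
    rgb = black_level_correct(rgb, iq_params["black_level"])
    rgb = white_balance(rgb, iq_params["wb_gains"])
    rgb = gamma_correct(rgb, iq_params["gamma"])
    return rgb_to_yuv(rgb)
```

Because each stage consumes the output of the previous one, changing one parameter (for example, the white balance gains) shifts the input range seen by every later stage, which is why the stages cannot be tuned in isolation.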
April 21, 2026