US-11949996

Automatic white balance correction for digital images using multi-hypothesis classification

Published: April 2, 2024
Assignee: not available in the USPTO data we have
Inventors: not available in the USPTO data we have
Technical Abstract

A device for estimating a scene illumination color for a source image is configured to: determine a set of candidate illuminants and for each of the candidate illuminants, determine a respective correction of the source image; for each of the candidate illuminants, apply the respective correction to the source image to form a corresponding set of corrected images; for each corrected image from the set of corrected images, implement a trained data-driven model to estimate a respective probability of achromaticity of the respective corrected image; and based on the estimated probabilities of achromaticity for the set of corrected images, obtain a final estimate of the scene illumination color for the source image. This approach allows for the evaluation of multiple candidate illuminants to determine an estimate of the scene illumination color.
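
The steps in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the correction is assumed to be a per-channel division by the candidate illuminant (a diagonal model), `toy_achromaticity_score` is a hypothetical hand-written stand-in for the trained data-driven model, and the final estimate here simply takes the highest-scoring candidate. All names are invented for illustration.

```python
from typing import Callable, List, Sequence, Tuple

RGB = Tuple[float, float, float]
Image = List[RGB]  # a flat list of linear-RGB pixels, kept simple on purpose


def correct(image: Image, illuminant: RGB) -> Image:
    """Assumed correction: divide every channel by the candidate illuminant."""
    return [(px[0] / illuminant[0], px[1] / illuminant[1], px[2] / illuminant[2])
            for px in image]


def toy_achromaticity_score(image: Image) -> float:
    """Hypothetical stand-in for the trained model.

    Returns a value near 1.0 when the mean colour of the image is neutral
    and a lower value when the channel means drift apart.
    """
    n = len(image)
    mean = [sum(px[k] for px in image) / n for k in range(3)]
    gray = sum(mean) / 3.0
    spread = sum(abs(m - gray) for m in mean) / (3.0 * gray)
    return 1.0 / (1.0 + 10.0 * spread)


def estimate_illuminant(
    image: Image,
    candidates: Sequence[RGB],
    score: Callable[[Image], float] = toy_achromaticity_score,
) -> Tuple[RGB, List[float]]:
    """Correct the image once per candidate, score each result, keep the best."""
    corrected = [correct(image, c) for c in candidates]
    probs = [score(img) for img in corrected]
    best = max(range(len(candidates)), key=lambda i: probs[i])
    return candidates[best], probs


# Usage: two gray surfaces lit by a warm light (illustrative values only).
warm = (1.0, 0.8, 0.6)
scene = [(0.5 * warm[0], 0.5 * warm[1], 0.5 * warm[2]),
         (0.2 * warm[0], 0.2 * warm[1], 0.2 * warm[2])]
candidates = [(1.0, 1.0, 1.0), warm, (0.6, 0.8, 1.0)]
estimate, probabilities = estimate_illuminant(scene, candidates)
```

Taking the single best candidate is only one way to obtain the final estimate; several dependent claims instead describe weighting at least two candidates.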

Patent Claims
15 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 2

Original Legal Text

2. The device according to claim 1, wherein the final estimate of the scene illumination color for the source image is obtained using a weighting of at least two candidate illuminants of the set of candidate illuminants.

Plain English Translation

This invention relates to digital image processing, specifically to estimating the color of scene illumination in an image. The problem addressed is accurately determining the true color of illumination in a captured image, which is essential for color correction and enhancing image quality. Existing methods often struggle with complex lighting conditions or shadows, leading to inaccurate color reproduction. The device includes a processor configured to analyze an image and estimate its illumination color. The processor generates a set of candidate illuminants, which are potential color estimates for the scene lighting. The final illumination color estimate is derived by weighting at least two of these candidate illuminants. This weighting process combines multiple estimates to improve accuracy, particularly in challenging lighting scenarios where a single candidate may be unreliable. The weighting may be based on factors such as confidence levels, spatial distribution, or other image features. By integrating multiple illuminant candidates, the device provides a more robust and accurate representation of the true scene lighting, improving color consistency and fidelity in the processed image. This approach is particularly useful in applications like photography, medical imaging, and computer vision, where precise color reproduction is critical.
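
One way to picture the weighting is a probability-weighted average of the candidates. The sketch below assumes the weights are the per-candidate achromaticity probabilities mentioned in the abstract and that illuminants are RGB triples rescaled to sum to one; the claim itself does not fix either choice, and `weighted_illuminant` is a hypothetical name.

```python
from typing import Sequence, Tuple

RGB = Tuple[float, float, float]


def weighted_illuminant(candidates: Sequence[RGB], probs: Sequence[float]) -> RGB:
    """Probability-weighted mean of the candidates, rescaled to unit sum.

    Assumption: the weights are the model's per-candidate probabilities.
    """
    if len(candidates) != len(probs) or len(candidates) < 2:
        raise ValueError("need at least two candidates with matching probabilities")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Weighted average per channel.
    mix = [sum(p * c[k] for p, c in zip(probs, candidates)) / total for k in range(3)]
    # Rescale so the three channels sum to 1 (only the colour matters, not brightness).
    norm = sum(mix)
    return (mix[0] / norm, mix[1] / norm, mix[2] / norm)


# Usage with illustrative numbers: the first candidate is three times as likely.
blend = weighted_illuminant([(0.5, 0.3, 0.2), (0.3, 0.3, 0.4)], [0.75, 0.25])
```

Compared with picking a single winner, a blend can land between candidates, so the estimate is not limited to the values in the candidate set.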

Claim 5

Original Legal Text

5. The device according to claim 1, wherein the target image represents the scene of the source image under a canonical illuminant.

Plain English Translation

This invention relates to image processing, specifically improving image quality by correcting color under varying lighting conditions. The problem addressed is that images captured under non-canonical (non-standard) lighting appear unnatural due to color casts, which distort the true colors of objects in the scene. The invention provides a device that processes an input source image to generate a target image where the scene appears as it would under a canonical illuminant, such as daylight or a standard white light source. The device includes an image acquisition module to capture or receive the source image, a color correction module to analyze the lighting conditions in the source image, and a transformation module to adjust the colors in the source image to match those expected under the canonical illuminant. The color correction module identifies the illuminant in the source image by analyzing pixel data, and the transformation module applies a color transformation to neutralize the effect of the non-canonical lighting. The target image is then output, displaying the scene with accurate, natural colors as if illuminated by the canonical light source. This approach ensures consistent color representation across different lighting environments, improving visual quality for applications such as photography, medical imaging, and computer vision. The device may also include additional modules for enhancing contrast or sharpness in the target image to further refine the output.

Claim 6

Original Legal Text

6. The device according to claim 1, wherein the set of candidate illuminants is determined by sampling at uniform intervals in an illuminant space.

Plain English Translation

This invention relates to a device for determining a set of candidate illuminants in an illuminant space, particularly for applications in color imaging or computer vision. The problem addressed is the need for an efficient and accurate method to identify potential light sources or illuminants that could have produced a given color image, which is essential for tasks like color constancy, object recognition, and scene understanding. The device includes a mechanism for sampling an illuminant space at uniform intervals to generate the set of candidate illuminants. The illuminant space is a multi-dimensional representation where each point corresponds to a possible illuminant, such as a light source with specific spectral properties. By sampling this space uniformly, the device ensures that the candidate illuminants are distributed evenly, reducing computational complexity while maintaining coverage of the space. This approach avoids the need for exhaustive search or heuristic-based sampling, which can be inefficient or biased. The device may also include a module for preprocessing the input image to extract features relevant to illuminant estimation, such as color distributions or statistical properties. These features are then used to refine the set of candidate illuminants, ensuring that only physically plausible illuminants are considered. The uniform sampling method allows for real-time processing, making it suitable for applications requiring rapid illuminant estimation, such as autonomous vehicles or augmented reality systems. The invention improves upon prior art by providing a systematic and computationally efficient way to generate candidate illuminants, enhancing the accuracy and reliability of illuminant estimation in various imaging and vision tasks.
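
As a rough illustration of uniform sampling, the sketch below assumes the illuminant space is a two-dimensional chromaticity plane (r, g) with b = 1 - r - g, and steps through it on an evenly spaced grid. The choice of space, the bounds, and the name `uniform_candidates` are assumptions made for this example; the claim only requires sampling at uniform intervals.

```python
from typing import List, Tuple

RGB = Tuple[float, float, float]


def uniform_candidates(lo: float = 0.2, hi: float = 0.4, steps: int = 3) -> List[RGB]:
    """Sample (r, g) chromaticities on an evenly spaced grid; b = 1 - r - g.

    Grid points whose blue component would not be positive are skipped.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    delta = (hi - lo) / (steps - 1)          # uniform interval along each axis
    axis = [lo + i * delta for i in range(steps)]
    out: List[RGB] = []
    for r in axis:
        for g in axis:
            b = 1.0 - r - g
            if b > 0.0:
                out.append((r, g, b))
    return out


# Usage: a 3 x 3 grid over r, g in [0.2, 0.4].
grid = uniform_candidates(0.2, 0.4, 3)
```

A finer grid covers the space more densely, at the cost of one extra corrected image to score per added candidate.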

Claim 7

Original Legal Text

7. The device according to claim 1, wherein the set of candidate illuminants is determined by K-Means clustering.

Plain English Translation

This claim narrows the device of claim 1 by specifying how the set of candidate illuminants is determined: by K-Means clustering. K-Means is an unsupervised technique that partitions data points into K groups and represents each group by its centroid; the cluster centers would presumably serve as the candidate illuminants. The claim does not say which data is clustered (for example, illuminants observed in training data or colors drawn from the image itself) or how K is chosen. As described in the abstract, each candidate is then used to correct the source image, and a trained model scores the corrected images to arrive at the final estimate of the scene illumination color.
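
To make the clustering idea concrete, here is a small pure-Python K-Means (Lloyd's algorithm). It assumes the points being clustered are two-dimensional illuminant chromaticities, for instance gathered from previously labeled images, and that the cluster centers are used as candidates. The claim does not specify what is clustered, so treat the data, the deterministic initialization, and the name `kmeans` as illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def kmeans(points: Sequence[Point], k: int, iters: int = 20) -> List[Point]:
    """Plain Lloyd's algorithm; initial centres are evenly spaced input points."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    step = len(points) // k
    centres = [points[i * step] for i in range(k)]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centre.
        groups: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centres]
            groups[dists.index(min(dists))].append(p)
        # Update step: move each centre to the mean of its group.
        new_centres = []
        for j, g in enumerate(groups):
            if g:
                new_centres.append((sum(p[0] for p in g) / len(g),
                                    sum(p[1] for p in g) / len(g)))
            else:
                new_centres.append(centres[j])  # keep an empty cluster's centre
        if new_centres == centres:
            break
        centres = new_centres
    return centres


# Usage: made-up chromaticities forming two loose groups.
samples = [(0.30, 0.35), (0.31, 0.34), (0.29, 0.36),
           (0.45, 0.40), (0.46, 0.41), (0.44, 0.39)]
candidate_centres = sorted(kmeans(samples, k=2))
```

Unlike a uniform grid, cluster centers concentrate candidates where illuminants actually occur in the data being clustered.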

Claim 8

Original Legal Text

8. The device according to claim 1, wherein the set of candidate illuminants is determined using a Gaussian mixture model.

Plain English Translation

This claim narrows the device of claim 1 by specifying that the set of candidate illuminants is determined using a Gaussian mixture model (GMM). A GMM is a statistical model that represents a distribution as a weighted sum of several Gaussian components, each with its own mean and covariance. Fitted to a collection of illuminant or color data, its components can supply the candidate illuminants, for example through their means. The claim does not specify what data the model is fitted to, how many components it uses, or how candidates are drawn from it. As described in the abstract, each candidate is then used to correct the source image, and a trained model scores the corrected images to arrive at the final estimate of the scene illumination color.

Claim 9

Original Legal Text

9. The device according to claim 1, wherein the trained data-driven model is trained using a set of training images captured by at least two cameras.

Plain English Translation

This claim narrows the device of claim 1 by specifying that the trained data-driven model is trained using a set of training images captured by at least two cameras. As described in the abstract, that model estimates the probability of achromaticity for each corrected version of the source image, and those probabilities drive the final estimate of the scene illumination color. The claim constrains only where the training images come from; it does not state how the cameras differ or how they are arranged. One plausible motivation is that training on images from more than one camera helps the model cope with differences in color response between sensors.

Claim 10

Original Legal Text

10. The device according to claim 1, wherein the trained data-driven model is a convolutional neural network.

Plain English Translation

This claim narrows the device of claim 1 by specifying the type of trained data-driven model: a convolutional neural network (CNN). As described in the abstract, the model receives each corrected image and estimates the probability that the image is achromatic, which in this context roughly means correctly white-balanced. CNNs are well suited to grid-like data such as images: convolutional layers extract hierarchical features, and pooling and fully connected layers typically turn those features into the final output. The claim does not specify the network's architecture, depth, or training procedure.

Claim 12

Original Legal Text

12. The method according to claim 11, wherein the target image represents the scene of the source image under a canonical illuminant.

Plain English Translation

This invention relates to image processing techniques for adjusting the color appearance of images under different lighting conditions. The problem addressed is the variation in color perception caused by non-canonical illuminants, which can distort the true colors of objects in a scene. The solution involves transforming a source image captured under a non-canonical illuminant to a target image that represents the scene under a canonical illuminant, such as daylight, to achieve color consistency. The method includes capturing a source image of a scene under a non-canonical illuminant and processing the image to estimate the illuminant's spectral properties. Based on this estimation, the image is transformed to correct for the illuminant's influence, producing a target image where colors appear as they would under a standard reference illuminant. This involves spectral modeling to account for the illuminant's effect on the captured colors and applying a transformation to neutralize the distortion. The technique may also incorporate additional steps such as scene analysis to refine the illuminant estimation or adaptive adjustments to preserve natural color relationships. The target image is generated by applying a color transformation derived from the illuminant estimation, ensuring that the output image accurately represents the scene's colors under the canonical illuminant. This approach is useful in applications requiring color consistency, such as medical imaging, digital photography, and computer vision.

Claim 13

Original Legal Text

13. The method according to claim 11, wherein the final estimate of the scene illumination color for the source image is obtained using a weighting of at least two candidate illuminants of the set of candidate illuminants.

Plain English Translation

This invention relates to digital image processing, specifically methods for estimating scene illumination color in images. The problem addressed is accurately determining the true color of illumination in a scene, which is essential for color correction and enhancing image quality. Existing methods often struggle with complex lighting conditions or shadows, leading to inaccurate color reproduction. The method involves analyzing an image to identify a set of candidate illuminants, which are potential estimates of the scene's true illumination color. These candidates are derived from different regions or features within the image, such as shadows, highlights, or neutral areas. The final illumination color estimate is then computed by weighting and combining at least two of these candidate illuminants. This approach improves accuracy by leveraging multiple sources of information rather than relying on a single estimate. The weighting process may consider factors such as the reliability or confidence of each candidate illuminant, ensuring that more trustworthy estimates contribute more significantly to the final result. This method is particularly useful in scenarios with mixed lighting or challenging lighting conditions, where a single illuminant estimate may be insufficient. The technique can be applied in various imaging applications, including photography, computer vision, and image enhancement.

Claim 14

Original Legal Text

14. The method according to claim 11, wherein the trained data-driven model is trained using a set of images captured by at least two cameras.

Plain English Translation

This claim narrows the method of claim 11 by specifying that the trained data-driven model is trained using a set of images captured by at least two cameras. In the approach described in the abstract, that model estimates the probability of achromaticity for each corrected version of the source image. The claim constrains only the origin of the training images; it does not state how the cameras differ or where they are positioned. One plausible motivation is that training on images from more than one camera helps the model handle differences in color response between sensors.

Claim 15

Original Legal Text

15. The method according to claim 11, wherein the trained data-driven model is a convolutional neural network.

Plain English Translation

This claim narrows the method of claim 11 by specifying that the trained data-driven model is a convolutional neural network (CNN). In the approach described in the abstract, the model takes each corrected image and estimates the probability that it is achromatic. A CNN passes its input through convolutional layers whose filters detect features such as edges and textures, usually followed by pooling and fully connected layers that produce the final output, and it is trained by adjusting its weights to minimize prediction error. The claim does not specify the architecture or the training procedure.

Claim 17

Original Legal Text

17. The non-transitory processor-readable medium according to claim 16, wherein the final estimate of the scene illumination color for the source image is obtained using a weighting of at least two candidate illuminants of the set of candidate illuminants.

Plain English Translation

This invention relates to digital image processing, specifically improving color accuracy in images by estimating scene illumination. The problem addressed is the challenge of accurately determining the true color of illumination in a scene, which is essential for correct color reproduction in digital images. Incorrect illumination estimation can lead to color casts or unnatural hues in the final image. The invention involves a method for estimating scene illumination color in a digital image. A set of candidate illuminants is generated based on the image data, representing possible illumination conditions. The final estimate of the scene illumination color is derived by applying a weighting scheme to at least two of these candidate illuminants. This weighting process combines the contributions of multiple illuminant candidates to produce a more accurate and robust estimate of the true scene illumination. The method may also involve refining the candidate illuminants by excluding those that are unlikely to represent the true illumination, such as those that do not match the expected color distribution of natural or artificial light sources. The weighting can be based on factors such as the likelihood of each candidate illuminant, its consistency with the image data, or its similarity to known illumination models. The result is an improved estimation of the scene illumination, leading to more accurate color correction in the final image.

Claim 18

Original Legal Text

18. The non-transitory processor-readable medium according to claim 16, wherein the trained data-driven model is trained using a set of images captured by at least two cameras.

Plain English Translation

This claim narrows the non-transitory processor-readable medium of claim 16 by specifying that the trained data-driven model is trained using a set of images captured by at least two cameras. In the approach described in the abstract, that model estimates the probability of achromaticity for each corrected version of the source image. The claim constrains only the origin of the training images; it does not require aligning or fusing views, and it does not state how the cameras differ. One plausible motivation is that training on images from more than one camera helps the model generalize across sensors with different color responses.

Claim 19

Original Legal Text

19. The non-transitory processor-readable medium according to claim 16, wherein the trained data-driven model is a convolutional neural network.

Plain English Translation

This claim narrows the non-transitory processor-readable medium of claim 16 by specifying that the trained data-driven model is a convolutional neural network (CNN). In the approach described in the abstract, the model processes each corrected image and estimates the probability that it is achromatic. CNNs extract spatial hierarchies of features from images through convolutional and pooling layers, followed by fully connected layers, and learn their filters from training data rather than relying on manual feature engineering. The claim does not specify the architecture or the training procedure.

Claim 20

Original Legal Text

20. The non-transitory processor-readable medium according to claim 16, wherein the target image represents the scene of the source image under a canonical illuminant.

Plain English Translation

A system and method for image processing involves generating a target image from a source image captured under non-canonical lighting conditions. The source image is processed to estimate the scene's reflectance and illuminant properties, allowing the generation of a target image that represents the scene under a canonical illuminant, such as daylight or a standard lighting condition. This technique corrects color distortions caused by non-canonical lighting, improving visual consistency and accuracy in applications like computer vision, medical imaging, and photography. The method may include steps for decomposing the source image into reflectance and illuminant components, applying a transformation to adjust the illuminant to a canonical state, and reconstructing the target image. The system may use machine learning models or statistical techniques to enhance the accuracy of the reflectance and illuminant estimation. The target image is generated without altering the scene's intrinsic reflectance properties, ensuring faithful representation under the canonical illuminant. This approach is useful in scenarios where consistent color representation is critical, such as in medical diagnostics, industrial inspections, or augmented reality.

Patent Metadata

Filing Date

May 12, 2022

Publication Date

April 2, 2024

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Automatic white balance correction for digital images using multi-hypothesis classification” (US-11949996). https://patentable.app/patents/US-11949996

© 2026 Nomic Interactive Technology LLC. Machine-readable context available at /api/llm-context/US-11949996. See llms.txt for full attribution policy.