A method for characterizing image contents automatically or semi-automatically using image acquisition parameters and metadata is presented. The method establishes probabilistic and deterministic relationships between different types of metadata and the semantic attributes and contents of images. It furnishes a mechanism that enables the automatic and semi-automatic classification, annotation, tagging, indexing, searching, identification or retrieval of images based on their contents, semantic properties and metadata characteristics. The method uses, but is not limited to, image capture metadata such as focal length, exposure time, relative aperture, flash information, ISO setting, angle of view, subject distance, timestamp, GPS information as well as other forms of metadata, including but not limited to, captions, keywords, headings, tags, comments, remarks, titles which may be automatically, semi-automatically, or manually generated. The present invention can be applied to image databases, web searching, personal search, community search, broad-based or vertical search engines for internet, intranet, extranet or other usages.
Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A method of semantically characterizing the semantic attributes and contents of digital or non-digital images by using characterization rules to determine image contents and properties from image acquisition metadata, data affiliated with the image, and/or the image itself, the image, image acquisition metadata and data affiliated with the image being stored on a non-transitory computer-readable storage medium, and using probabilistic or deterministic relationships among the metadata and/or affiliated data to create a semantic image characterization which is used to enrich pre-existing metadata and/or affiliated data or is used to index the image contents in a computer readable storage medium in order to facilitate future retrieval of the image using semantic terms via a data processing apparatus, wherein the image acquisition metadata is selected from at least one of focal length, exposure time, relative aperture, flash information, ISO setting, angle of view, subject distance, timestamp, and GPS information, and wherein the characterization rules comprise one or more rules which infer image contents and properties by determining that when focal length belongs to a specific set of values, and exposure time belongs to a specific set of values, and subject distance belongs to a specific set of values, and timestamp belongs to a specific set of values, and relative aperture belongs to a specific set of values, that the image is of a certain type of scene or contains certain types of contents or having certain properties.
A method for automatically understanding the content of images uses rules to analyze image acquisition metadata (like focal length, exposure, aperture, flash, ISO, angle, subject distance, timestamp, GPS) and other related data to figure out what the image shows. The image, metadata, and related data are stored on a computer-readable medium. The method uses probabilistic or deterministic relationships between the metadata to create a description of the image’s semantic content, which means identifying objects, scenes, or properties of the image. This description is then used to add to existing metadata or to create an index that allows users to search for images using descriptive terms. The rules work by looking for specific combinations of metadata values (e.g., focal length, exposure time, subject distance, timestamp, and relative aperture within certain ranges) to infer the image type or content.
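The rule structure in claim 1 can be sketched as a set of per-field membership tests that must all hold. This is a minimal illustration, not the patent's implementation: the `AcquisitionMetadata` and `Rule` classes, the `in_range` helper, and every threshold in `NIGHT_SCENE` are hypothetical and chosen only to show the shape of such a rule.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List


@dataclass
class AcquisitionMetadata:
    """Hypothetical container for the five fields named in the claim 1 rule."""
    focal_length_mm: float
    exposure_time_s: float
    relative_aperture: float   # f-number
    subject_distance_m: float
    timestamp: datetime


def in_range(lo: float, hi: float) -> Callable[[float], bool]:
    """Return a test for membership in the closed interval [lo, hi]."""
    return lambda value: lo <= value <= hi


@dataclass
class Rule:
    label: str
    conditions: Dict[str, Callable[[object], bool]]

    def matches(self, meta: AcquisitionMetadata) -> bool:
        # Every field must belong to its "specific set of values".
        return all(test(getattr(meta, name)) for name, test in self.conditions.items())


# Illustrative thresholds only; the source does not give concrete values.
NIGHT_SCENE = Rule(
    label="night scene",
    conditions={
        "focal_length_mm": in_range(10, 50),
        "exposure_time_s": in_range(0.5, 30),
        "relative_aperture": in_range(1.4, 5.6),
        "subject_distance_m": in_range(10, float("inf")),
        "timestamp": lambda t: t.hour >= 20 or t.hour < 5,
    },
)


def characterize(meta: AcquisitionMetadata, rules: List[Rule]) -> List[str]:
    """Return the labels of all rules whose conditions the metadata satisfies."""
    return [rule.label for rule in rules if rule.matches(meta)]


photo = AcquisitionMetadata(24.0, 2.0, 2.8, 50.0, datetime(2010, 3, 11, 22, 15))
print(characterize(photo, [NIGHT_SCENE]))
```

The returned labels could then be appended to existing tags or written to a search index, which is the enrichment and indexing step the claim describes.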
2. A method according to claim 1 wherein the affiliated data is selected from at least one of contents classification, tags, annotations, captions, keywords, headings, comments, remarks, titles, related texts, surrounding texts, or linked text.
Building on the method of automatically understanding image content using metadata, the "affiliated data" used to characterize the image includes at least one of: content classifications, tags, annotations, captions, keywords, headings, comments, remarks, titles, related texts, surrounding texts, or linked text. This data, in addition to the acquisition metadata, helps to determine the image content and create more descriptive search terms.
3. The method of claim 1 wherein the characterization may be enhanced by extracting and processing the affiliated data.
To improve the method of automatically understanding image content using metadata, the characterization (description of the image) can be enhanced by extracting and processing the "affiliated data," that is, data associated with the image such as tags, captions, keywords, or surrounding text. By analyzing this related data, the system can better understand the image's semantic content.
4. The method of claim 1 wherein the images may be Web images, non-Web images, images located in other public or private image repositories, and the method can be applied to image databases, web searching, personal search, community search, broad-based or vertical search engines for internet, intranet, extranet or other usages.
The method of automatically understanding image content using metadata can be applied to images found on the web, images not on the web, and images in other public or private repositories. This allows the method to be used in image databases, web searching, personal search, community search, and broad-based or vertical (topic-specific) search engines for internet, intranet, extranet, or other environments.
5. The method of claim 1 wherein the characterization may be enhanced by correlating metadata and/or affiliated data with external or internal databases.
To improve the method of automatically understanding image content using metadata, the image characterization (description) is enhanced by comparing the image's metadata (like focal length, exposure, aperture, flash, ISO, angle, subject distance, timestamp, GPS) and associated data to external or internal databases. This correlation helps to refine the understanding of the image.
6. The method of claim 5 wherein the correlation comprises using GPS coordinates and timestamp metadata to determine the weather or news information from a weather or news database for the place and time of the image, or from other databases or geographic information systems.
In the method of automatically understanding image content using metadata and comparing against databases, the correlation step involves using the image's GPS coordinates and timestamp to retrieve weather information or news events from weather or news databases relevant to the location and time the image was taken. This information, gathered from other databases or geographic information systems, is then used to enrich the image characterization.
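The GPS-and-timestamp correlation can be sketched as a nearest-record lookup. This is only a sketch under stated assumptions: `WEATHER_RECORDS` is a hypothetical in-memory stand-in for a real weather database, and the 25 km and 3 hour matching windows are arbitrary illustrative values.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

# Hypothetical stand-in for an external weather database.
WEATHER_RECORDS = [
    {"lat": 40.71, "lon": -74.01, "time": datetime(2010, 3, 11, 14, 0), "weather": "rain"},
    {"lat": 40.71, "lon": -74.01, "time": datetime(2010, 3, 12, 14, 0), "weather": "clear"},
    {"lat": 34.05, "lon": -118.24, "time": datetime(2010, 3, 11, 14, 0), "weather": "clear"},
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def lookup_weather(lat, lon, when, max_km=25.0, max_hours=3.0):
    """Return the weather of the record closest in time within both windows, or None."""
    best = None
    for record in WEATHER_RECORDS:
        distance = haversine_km(lat, lon, record["lat"], record["lon"])
        hours = abs((when - record["time"]).total_seconds()) / 3600
        if distance <= max_km and hours <= max_hours:
            if best is None or hours < best[0]:
                best = (hours, record["weather"])
    return best[1] if best else None


def enrich(tags, lat, lon, when):
    """Return a copy of tags, with a weather tag appended when a record matches."""
    weather = lookup_weather(lat, lon, when)
    enriched = list(tags)
    if weather is not None:
        enriched.append(f"weather:{weather}")
    return enriched


print(enrich(["street"], 40.72, -74.00, datetime(2010, 3, 11, 15, 30)))
```

A news database or geographic information system could be queried the same way, with the lookup keyed on place and time.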
7. The method of claim 1 wherein the semantic characterization includes one or more of objects in the image, relationships among objects in the image, attributes or properties of objects or relationships in the image, scene in the image, environment in the image, context of the image, landmarks in the image, location where the image is taken, time when the image is taken, background in the image, features in the image, occasions in the image, events in the image, reasons why the image was taken, living things and non-living things in the image, mood of people in the image, or actions in the image.
In the method of automatically understanding image content using metadata, the semantic characterization (the description of the image) includes identifying elements such as: objects in the image, the relationships between those objects, attributes or properties of those objects or relationships, the scene depicted, the environment, the context of the image, landmarks present, the location and time the image was taken, the background, notable features, the occasion or event, the reasons the image was taken, living and non-living things present, the mood of people in the image, and any actions occurring.
8. The method of claim 1 wherein the image metadata and/or affiliated data may be automatically, semi-automatically or manually generated.
In the method of automatically understanding image content using metadata, the image metadata (like focal length, exposure, aperture, flash, ISO, angle, subject distance, timestamp, GPS) and the affiliated data (like tags, captions, comments) can be generated automatically by software, semi-automatically with some human input, or manually by a person. This means the system can work with existing metadata, generate new metadata on its own, or allow users to contribute.
9. The method of claim 1 wherein the characterization may be enhanced by applying known image processing and/or face recognition algorithms.
To improve the method of automatically understanding image content using metadata, the characterization of the image (description) is enhanced by applying standard image processing techniques and/or face recognition algorithms. This allows the system to identify and understand features and content within the image itself, rather than relying solely on metadata.
10. The method according to claim 1 wherein the characterization rules makes use of conjunction and/or disjunction in combining properties of the image acquisition metadata.
In the method of automatically understanding image content using metadata, the rules used to characterize the image can combine different properties of the image acquisition metadata using "and" (conjunction) or "or" (disjunction) logic. For example, a rule might state "if focal length is X AND exposure time is Y, then the image is likely a landscape" or "if the flash fired OR the ISO setting is above Z, then the image was likely taken in low light."
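Conjunction and disjunction over acquisition metadata can be sketched with small predicate combinators. The helper names (`all_of`, `any_of`, `between`), the dictionary keys, and the thresholds below are all hypothetical; the claim specifies only that properties are combined with "and" and/or "or".

```python
from typing import Callable, Dict

Metadata = Dict[str, float]
Predicate = Callable[[Metadata], bool]


def between(field: str, lo: float, hi: float) -> Predicate:
    """True when the field is present and its value lies in [lo, hi]."""
    return lambda meta: field in meta and lo <= meta[field] <= hi


def all_of(*predicates: Predicate) -> Predicate:
    """Conjunction: every predicate must hold."""
    return lambda meta: all(p(meta) for p in predicates)


def any_of(*predicates: Predicate) -> Predicate:
    """Disjunction: at least one predicate must hold."""
    return lambda meta: any(p(meta) for p in predicates)


# Illustrative rule: flash fired OR (high ISO AND slow shutter).
likely_low_light = any_of(
    between("flash_fired", 1, 1),
    all_of(between("iso", 800, 102400), between("exposure_time_s", 1 / 30, 60)),
)

print(likely_low_light({"flash_fired": 0, "iso": 1600, "exposure_time_s": 0.1}))
```

Because each combinator returns another predicate, rules can be nested to any depth without changing the evaluation code.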
11. The method of claim 1 wherein the image acquisition metadata comprises EXIF (Exchangeable Image File Format) metadata.
In the method of automatically understanding image content using metadata, the image acquisition metadata used includes EXIF (Exchangeable Image File Format) metadata. This is a common standard for storing metadata within image files, making it easily accessible to the system.
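As a rough sketch of how EXIF values might be normalized before rules are applied, the function below converts a dictionary of EXIF tags into plain numbers. The tag names (`FocalLength`, `ExposureTime`, `FNumber`, `SubjectDistance`, `Flash`, `DateTimeOriginal`, `GPSLatitude`, and so on) come from the EXIF standard, but the input shape is an assumption: it presumes some EXIF reader has already produced a dictionary with rationals as `(numerator, denominator)` pairs. The output key names are hypothetical.

```python
from datetime import datetime
from fractions import Fraction


def rational(value):
    """Convert an EXIF rational given as (numerator, denominator) to a float.

    Zero denominators, which some cameras write, are not handled here.
    """
    numerator, denominator = value
    return float(Fraction(numerator, denominator))


def dms_to_degrees(dms, ref):
    """Convert degrees/minutes/seconds rationals plus N/S/E/W to signed degrees."""
    degrees, minutes, seconds = (rational(part) for part in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    return -decimal if ref in ("S", "W") else decimal


def normalize_exif(tags):
    """Map a subset of EXIF tags to plain Python values (output keys are hypothetical)."""
    out = {}
    rational_fields = (
        ("FocalLength", "focal_length_mm"),
        ("ExposureTime", "exposure_time_s"),
        ("FNumber", "relative_aperture"),
        ("SubjectDistance", "subject_distance_m"),
    )
    for tag, key in rational_fields:
        if tag in tags:
            out[key] = rational(tags[tag])
    if "Flash" in tags:
        out["flash_fired"] = bool(tags["Flash"] & 1)  # bit 0 of Flash: flash fired
    if "DateTimeOriginal" in tags:
        out["timestamp"] = datetime.strptime(tags["DateTimeOriginal"], "%Y:%m:%d %H:%M:%S")
    if "GPSLatitude" in tags and "GPSLatitudeRef" in tags:
        out["latitude"] = dms_to_degrees(tags["GPSLatitude"], tags["GPSLatitudeRef"])
    if "GPSLongitude" in tags and "GPSLongitudeRef" in tags:
        out["longitude"] = dms_to_degrees(tags["GPSLongitude"], tags["GPSLongitudeRef"])
    return out


sample = {
    "FocalLength": (50, 1),
    "ExposureTime": (1, 250),
    "FNumber": (28, 10),
    "Flash": 9,
    "DateTimeOriginal": "2010:03:11 14:30:00",
    "GPSLatitude": ((40, 1), (42, 1), (36, 1)),
    "GPSLatitudeRef": "N",
    "GPSLongitude": ((74, 1), (0, 1), (36, 1)),
    "GPSLongitudeRef": "W",
}
meta = normalize_exif(sample)
print(meta)
```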
12. The method of claim 1 wherein characterization of the image results from analysis of the image comprises the detection of faces, the recognition of faces, the recognition of fingerprints or recognition of other biometric data.
In the method of automatically understanding image content, the characterization of the image comes from analyzing the image itself. This analysis includes detecting faces, recognizing known faces, recognizing fingerprints, or identifying other biometric data present in the image.
13. The method of claim 12 wherein annotations may be added to the image from databases of biometric data.
Building upon the method where image characterization includes biometric analysis, annotations can be automatically added to the image by pulling data from databases of biometric information. This allows the system to tag individuals or provide identifying information.
14. The method of claim 13 wherein the biometric data comprising facial features or fingerprints.
In the method of adding annotations to images based on biometric data, the biometric data used for recognition and annotation includes facial features or fingerprints found in the image.
15. The method of claim 1 wherein characterization of the image comprising analysis of the image using image processing algorithms to determine shape, color, feature, or texture of the image.
In the method of automatically understanding image content, the characterization of the image includes analyzing the image using image processing algorithms to determine the shape, color, features, or texture present in the image. This allows the system to identify elements and characteristics beyond what is described in the metadata.
16. The method of claim 15 wherein the image processing algorithm comprising SIFT (scale-invariant feature transformation).
In the method of automatically understanding image content by analyzing the image, the image processing algorithms used to determine shape, color, features, or texture include SIFT (usually expanded as "scale-invariant feature transform"; the claim writes "transformation"). SIFT is a specific algorithm for detecting and describing local features in images, robust to changes in scale and orientation.
17. The method of claim 1 wherein enriching pre-existing metadata comprises automatically populating one or more fields within image processing and classification standards such as the MPEG-7 standard.
In the method of automatically understanding image content using metadata, enriching pre-existing metadata involves automatically filling in fields within image processing and classification standards like the MPEG-7 standard. This ensures that the generated image descriptions are compatible with established formats.
18. The method of claim 17 wherein the one or more fields are selected from datatypes comprising Structured Annotation Datatype, Keyword Annotation Datatype, and Text Annotation Datatype.
In the method of enriching metadata by populating fields in standards such as MPEG-7, the fields that are automatically populated include datatypes such as Structured Annotation Datatype (for structured information), Keyword Annotation Datatype (for keywords), and Text Annotation Datatype (for textual descriptions).
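Populating the three annotation types might look roughly like the following. This is a sketch under stated assumptions: the element names (`TextAnnotation`, `FreeTextAnnotation`, `KeywordAnnotation`, `StructuredAnnotation`, and the `Where`/`When`/`WhatObject` slots) reflect a reading of the MPEG-7 description schemes and should be checked against the standard; namespaces and schema validation are omitted, and `build_text_annotation` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def build_text_annotation(free_text, keywords, structured):
    """Build an unvalidated, namespace-free annotation fragment.

    structured maps a slot name (for example "Where") to a plain-text name.
    """
    root = ET.Element("TextAnnotation")
    ET.SubElement(root, "FreeTextAnnotation").text = free_text

    keyword_annotation = ET.SubElement(root, "KeywordAnnotation")
    for word in keywords:
        ET.SubElement(keyword_annotation, "Keyword").text = word

    structured_annotation = ET.SubElement(root, "StructuredAnnotation")
    for slot, name in structured.items():
        slot_element = ET.SubElement(structured_annotation, slot)
        ET.SubElement(slot_element, "Name").text = name
    return root


annotation = build_text_annotation(
    free_text="Night street scene in the rain",
    keywords=["night scene", "rain", "street"],
    structured={"Where": "New York", "When": "2010-03-11", "WhatObject": "street"},
)
print(ET.tostring(annotation, encoding="unicode"))
```

The keyword and structured values here would come from the inferred characterization, so the same labels that drive search can also be written into a standard metadata record.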
March 11, 2010
August 27, 2013