10839552

Image Processing Apparatus, Tracking Method, and Program

Published: November 17, 2020
Assignee: not available in USPTO data we have
Inventors: Takuya OGAWA

Patent Claims
11 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. An image processing apparatus comprising: a processor configured to execute: a detecting unit configured to detect an object region from an input image; an estimating unit configured to estimate a number of objects included in the detected object region; and a tracking unit configured to track objects included in the object region using the estimated number of objects, wherein the tracking unit tracks the objects included in the object region in the input image on the basis of tracking information which indicates a result of tracking of the objects in an input image temporally prior to the input image, and in which identifiers corresponding to the number of objects are associated with the object region, and a feature amount of the object region is associated with each identifier.

Plain English Translation

Image processing for object tracking. This invention addresses the challenge of tracking multiple objects in an image when a single detected region may contain more than one object. The apparatus includes a processor that executes three units. A detecting unit detects an object region in an input image. An estimating unit estimates the number of objects included in that region. A tracking unit then tracks the objects in the region using the estimated count. Tracking relies on tracking information, which records the result of tracking in a temporally prior input image. In that tracking information, the object region is associated with as many identifiers as there are estimated objects, and a feature amount of the object region is associated with each identifier. Carrying these identifiers and features forward lets the apparatus follow objects across sequential images.
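The tracking information described in claim 1 can be pictured as a small data structure. The following is a minimal sketch, not the patent's implementation: the names `TrackedRegion` and `make_region`, the `(x, y, width, height)` box format, and the list-of-floats feature representation are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height): an assumed region format
Feature = List[float]            # an assumed representation of a feature amount


@dataclass
class TrackedRegion:
    """One object region carrying one identifier per estimated object."""
    box: Box
    # Each identifier maps to a feature amount of the region.
    features_by_id: Dict[int, Feature] = field(default_factory=dict)

    @property
    def estimated_count(self) -> int:
        # The number of identifiers corresponds to the estimated number of objects.
        return len(self.features_by_id)


def make_region(box: Box, estimated_count: int, feature: Feature, next_id: int) -> TrackedRegion:
    """Issue `estimated_count` identifiers and attach the region's feature to each."""
    ids = range(next_id, next_id + estimated_count)
    return TrackedRegion(box=box, features_by_id={i: list(feature) for i in ids})


region = make_region(box=(40, 60, 120, 200), estimated_count=3,
                     feature=[0.2, 0.5, 0.3], next_id=7)
print(sorted(region.features_by_id))  # [7, 8, 9]
print(region.estimated_count)         # 3
```

Storing the feature per identifier, rather than once per region, mirrors the claim wording that a feature amount of the object region is associated with each identifier.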

Claim 2

Original Legal Text

2. The image processing apparatus according to claim 1 , wherein, in the case where the number of objects estimated with respect to a first object region which includes the tracked objects and which is the object region on the input image is larger than the number of objects estimated with respect to a second object region which is an object region in an input image temporally the closest to the input image among the tracking information, the tracking unit associates a feature amount extracted from the second object region as a result of tracking with the first object region.

Plain English Translation

This invention relates to image processing for object tracking, specifically the case where the number of objects estimated in the current frame is larger than the number estimated in the prior frame. The first object region is the region in the current input image that includes the tracked objects. The second object region is the region, recorded in the tracking information, from the input image temporally closest to the current one. If the number of objects estimated for the first region exceeds the number estimated for the second region, the tracking unit associates the feature amount extracted from the second object region with the first object region as the tracking result. In other words, when the estimated count grows, the feature from the prior frame is carried forward rather than replaced. A plausible motivation, not stated in the claim, is that a region whose count has grown (for example, because objects have merged or begun to occlude one another) may yield a less representative feature than the earlier region.

Claim 3

Original Legal Text

3. The image processing apparatus according to claim 1 , wherein, in the case where the number of objects estimated with respect to a first object region which includes the tracked objects and which is the object region on the input image is equal to or smaller than the number of objects estimated with respect to a second object region which is an object region in an input image temporally the closest to the input image among the tracking information, the tracking unit associates a feature amount extracted from the first object region as a result of tracking with the first object region.

Plain English Translation

This invention relates to image processing for object tracking in video sequences. It covers the case where the number of objects estimated in the current frame is equal to or smaller than the number estimated in the prior frame. The first object region is the region in the current input image that includes the tracked objects; the second object region is the region from the input image temporally closest to the current one in the tracking information. If the number of objects estimated for the first region is equal to or smaller than the number estimated for the second region, the tracking unit associates the feature amount extracted from the first object region with the first object region as the tracking result. In this case the stored feature is refreshed from the current frame rather than carried over from the prior frame, which complements the rule in claim 2. The claim does not specify what the feature amount contains.
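Claims 2 and 3 together describe a single decision about which feature amount to associate with the current region. The sketch below expresses that decision as one function. The function name `select_feature` and the list-of-floats feature type are assumptions for illustration; the claims do not prescribe any particular representation.

```python
from typing import List

Feature = List[float]  # an assumed representation of a feature amount


def select_feature(current_count: int, current_feature: Feature,
                   previous_count: int, previous_feature: Feature) -> Feature:
    """Pick the feature amount to associate with the current (first) object region."""
    if current_count > previous_count:
        # Claim 2 case: more objects are estimated now than in the temporally
        # closest prior region, so keep the feature from that prior (second) region.
        return previous_feature
    # Claim 3 case: equal or fewer objects, so use the feature from the current region.
    return current_feature


old, new = [0.9, 0.1], [0.4, 0.6]
print(select_feature(3, new, 2, old))  # [0.9, 0.1]
print(select_feature(2, new, 2, old))  # [0.4, 0.6]
print(select_feature(1, new, 2, old))  # [0.4, 0.6]
```

Only the comparison of the two estimated counts drives the choice; how the counts and features are computed is left open by the claims.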

Claim 4

Original Legal Text

4. The image processing apparatus according to claim 1 , wherein the processor further configured to execute: an output unit configured to output the result of tracking by the tracking unit, the result being superimposed on the input image.

Plain English Translation

This invention relates to image processing systems that track objects and display the tracking results overlaid on the input image. In addition to the detecting, estimating, and tracking units of claim 1, the processor executes an output unit that outputs the result of tracking superimposed on the input image, so users can see the tracked objects annotated directly on the image. The claim does not specify the form of the overlay; bounding boxes, identifiers, or trajectories are typical examples. Such overlays are useful in applications like surveillance, where tracking results need to be monitored visually.
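One simple way to superimpose a tracking result is to draw the outline of each tracked region onto a copy of the input image. The sketch below does this with plain Python lists so that it runs without any imaging library. The function name `superimpose_box`, the grayscale row-major image layout, and the `(x, y, width, height)` box format are assumptions; the claim does not say what the overlay looks like.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale pixels, row-major (assumed layout)
Box = Tuple[int, int, int, int]  # (x, y, width, height), an assumed region format


def superimpose_box(image: Image, box: Box, value: int = 255) -> Image:
    """Return a copy of `image` with the outline of `box` drawn on it."""
    out = [row[:] for row in image]  # copy so the input image is left unchanged
    x, y, w, h = box
    height, width = len(out), len(out[0])
    # Top and bottom edges.
    for cx in range(x, x + w):
        for cy in (y, y + h - 1):
            if 0 <= cy < height and 0 <= cx < width:
                out[cy][cx] = value
    # Left and right edges.
    for cy in range(y, y + h):
        for cx in (x, x + w - 1):
            if 0 <= cy < height and 0 <= cx < width:
                out[cy][cx] = value
    return out


frame = [[0] * 6 for _ in range(5)]
annotated = superimpose_box(frame, box=(1, 1, 4, 3))
for row in annotated:
    print("".join("#" if pixel else "." for pixel in row))
```

Drawing only the outline leaves the pixels inside the region untouched, so the tracked object itself remains visible.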

Claim 5

Original Legal Text

5. A tracking method comprising: detecting an object region from an input image; estimating a number of objects included in the detected object region; and tracking objects included in the object region using the estimated number of objects, wherein the objects included in the object region in the input image are tracked on the basis of tracking information which indicates a result of tracking of the objects in an input image temporally prior to the input image and in which identifiers corresponding to the number of objects are associated with the object region, and a feature amount of the object region is associated with each identifier.

Plain English Translation

This invention relates to object tracking in images, addressing challenges in accurately identifying and tracking multiple objects within a detected region. The method involves detecting an object region in an input image, estimating the number of objects within that region, and tracking the objects based on the estimated count. Tracking relies on prior tracking information from a temporally preceding image, where identifiers corresponding to the number of objects are linked to the object region, and feature data for the region is associated with each identifier. This approach ensures that objects are tracked consistently across frames, even when they appear in close proximity or overlap. The method improves tracking accuracy by leveraging both spatial and temporal information, reducing errors caused by occlusions or similar appearances. The feature data helps distinguish between individual objects, while the identifiers maintain continuity in tracking. This technique is particularly useful in applications like surveillance, autonomous navigation, and video analysis where reliable object tracking is critical.
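The order of the three method steps (detect, estimate, track) can be sketched as a per-frame function. Everything below is hypothetical scaffolding: `process_frame`, the stand-in `detect`, `estimate`, and `track` functions, and the identifier-reuse policy are assumptions, and feature amounts are left out to keep the focus on step order.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]      # (x, y, width, height), an assumed region format
TrackingInfo = Dict[Box, List[int]]  # region -> identifiers (assumed layout)


def process_frame(frame, previous: TrackingInfo,
                  detect: Callable, estimate: Callable, track: Callable) -> TrackingInfo:
    regions = detect(frame)                                        # step 1: detect object regions
    counts = {region: estimate(frame, region) for region in regions}  # step 2: estimate counts
    return track(counts, previous)                                 # step 3: track using prior info


def detect(frame) -> List[Box]:
    return [(0, 0, 10, 10)]  # stand-in: one fixed region per frame


def estimate(frame, region: Box) -> int:
    return 2                 # stand-in: always two objects


def track(counts: Dict[Box, int], previous: TrackingInfo) -> TrackingInfo:
    """Assumed policy: reuse prior identifiers for a known region, issue new ones otherwise."""
    info: TrackingInfo = {}
    next_id = 1 + max((i for ids in previous.values() for i in ids), default=0)
    for region, count in counts.items():
        ids = list(previous.get(region, []))[:count]
        while len(ids) < count:
            ids.append(next_id)
            next_id += 1
        info[region] = ids
    return info


info: TrackingInfo = {}
for frame in ["frame-0", "frame-1"]:
    info = process_frame(frame, info, detect, estimate, track)
print(info)  # {(0, 0, 10, 10): [1, 2]}
```

The second frame reuses identifiers 1 and 2 because the prior frame's tracking information is passed back in, which is the dependency on earlier results that the claim describes.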

Claim 6

Original Legal Text

6. The tracking method according to claim 5 , wherein, in the case where the number of objects estimated with respect to a first object region which includes the tracked objects and which is the object region on the input image is larger than the number of objects estimated with respect to a second object region which is an object region in an input image temporally the closest to the input image among the tracking information, a feature amount extracted from the second object region as a result of tracking is associated with the first object region.

Plain English Translation

This invention relates to object tracking in video or image sequences, addressing the case where the number of objects estimated in the current frame differs from the number in a prior frame. The method associates feature data from the previous frame with the current frame when the estimated object count in the current frame exceeds that in the closest prior frame. Specifically, if the number of objects estimated for a first object region (in the current frame) is greater than the number estimated for a second object region (in the closest prior frame), the feature amount extracted from the second region is associated with the first region as the tracking result. This carries tracking information from the prior frame forward, which can help maintain object identity when the count changes because of occlusions or detection inconsistencies. The method is particularly useful in applications like surveillance, autonomous navigation, and video analysis where reliable object tracking is critical.

Claim 7

Original Legal Text

7. The tracking method according to claim 5 , comprising: outputting the result of tracking, the result being superimposed on the input image.

Plain English Translation

This method extends the tracking method of claim 5, which detects an object region in an input image, estimates the number of objects in that region, and tracks the objects using the estimated count and prior tracking information. Claim 7 adds a step of outputting the result of tracking superimposed on the input image, so that users can visually confirm the tracked objects in the original image. The claim does not specify the form of the overlay; bounding boxes or labels are typical examples.

Claim 8

Original Legal Text

8. A non-transitory computer readable medium having stored thereon program for causing a computer to execute: a process of detecting an object region from an input image; a process of estimating a number of objects included in the detected object region; and a process of tracking objects included in the object region using the estimated number of objects, wherein the objects included in the object region in the input image are tracked in the process of tracking on the basis of tracking information which indicates a result of tracking of the objects in an input image temporally prior to the input image and in which identifiers corresponding to the number of objects are associated with the object region and a feature amount of the object region is associated with each identifier.

Plain English Translation

This invention relates to computer vision systems for object detection, counting, and tracking in images. The problem addressed is accurately tracking multiple objects in a sequence of images, particularly when objects may overlap or occlude each other, leading to tracking errors. The system processes an input image by first detecting an object region containing one or more objects. It then estimates the number of distinct objects within that region. During tracking, the system uses prior tracking information from earlier images, where identifiers are assigned to each object in the region and a feature representation (e.g., appearance or motion characteristics) is associated with each identifier. This allows the system to maintain consistent tracking across frames, even when objects move or interact. The key innovation is the combination of object counting and feature-based tracking, where the estimated number of objects guides the tracking process. This helps resolve ambiguities in tracking, such as when objects merge or split, by ensuring the system accounts for the correct number of distinct objects. The approach improves robustness in dynamic scenes where traditional tracking methods may fail due to occlusions or overlapping objects. The system is implemented as a computer program stored on a non-transitory medium, executing the detection, counting, and tracking processes sequentially.

Claim 9

Original Legal Text

9. The non-transitory computer readable medium according to claim 8 , wherein, in the case where the number of objects estimated with respect to a first object region which includes the tracked objects and which is the object region on the input image is larger than the number of objects estimated with respect to a second object region which is an object region in an input image temporally the closest to the input image among the tracking information, a feature amount extracted from the second object region as a result of tracking is associated with the first object region.

Plain English Translation

This invention relates to object tracking in image processing, specifically the case where the estimated number of objects changes between frames. The stored program tracks objects in a sequence of input images by analyzing object regions and associating feature amounts with them. When the number of objects estimated for the first object region in the current frame exceeds the number estimated for the second object region, which comes from the temporally closest prior frame in the tracking information, the feature amount extracted from the second region is associated with the first region as the tracking result. Carrying the earlier feature forward can help resolve tracking inconsistencies when the count grows, for example when objects merge or occlude one another. This approach is particularly useful in applications like surveillance, autonomous navigation, and video analysis where reliable object tracking is critical.

Claim 10

Original Legal Text

10. The non-transitory computer readable medium according to claim 8 , wherein, in the case where the number of objects estimated with respect to a first object region which includes the tracked objects and which is the object region on the input image is equal to or smaller than the number of objects estimated with respect to a second object region which is an object region in an input image temporally the closest to the input image among the tracking information, a feature amount extracted from the first object region as a result of tracking is associated with the first object region.

Plain English Translation

This invention relates to computer vision systems for tracking objects in a sequence of images, particularly the case where the estimated number of objects changes between frames. The program estimates the number of objects in a first object region of the current input image and compares it to the number estimated for a second object region, taken from the input image temporally closest to the current one in the tracking information. If the first region contains fewer or an equal number of objects, the feature amount extracted from the first region is associated with that region as the tracking result. The claim does not specify what the feature amount contains; an appearance descriptor is one possibility. This approach supports tracking in applications like surveillance, autonomous vehicles, and video analysis. The program is stored on a non-transitory computer-readable medium and causes a computer to execute the tracking logic.

Claim 11

Original Legal Text

11. The non-transitory computer readable medium according to claim 8 causing the computer to execute: a process of outputting the result of tracking, the result being superimposed on the input image.

Plain English Translation

This invention relates to computer vision systems that track objects and present the results visually. In addition to the detecting, estimating, and tracking processes of claim 8, the stored program causes the computer to output the result of tracking superimposed on the input image. The claim does not specify the form of the overlay; bounding boxes, labels, or motion paths are typical examples. Overlaying the result lets users monitor tracked objects while keeping the context of the original image, which is useful in applications like surveillance.

Patent Metadata

Filing Date

Unknown

Publication Date

November 17, 2020

Inventors

Takuya OGAWA

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “IMAGE PROCESSING APPARATUS, TRACKING METHOD, AND PROGRAM” (10839552). https://patentable.app/patents/10839552

© 2026 Nomic Interactive Technology LLC. Machine-readable context available at /api/llm-context/10839552. See llms.txt for full attribution policy.
