10915792

Domain Adaptation for Instance Detection and Segmentation

Published: February 9, 2021
Assignee: not available in the USPTO data we have
Technical Abstract

Patent Claims
20 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A method for domain adaptation, comprising: aligning image level features between a source domain and a target domain based on an adversarial learning process while training a domain discriminator; selecting, using the domain discriminator, unlabeled samples from the target domain that are furthest away from existing annotated samples from the target domain; selecting, by a processor device, based on a prediction score of each of the unlabeled samples, samples with lower prediction scores; and annotating the samples with the lower prediction scores.

Plain English translation pending...
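
The two-stage selection recited in claim 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the function name `select_for_annotation`, the field names `disc_labeled_prob` and `pred_score`, and the two budget parameters are hypothetical. The sketch also assumes the domain discriminator outputs a probability that a sample resembles the already-annotated set, so that a low value is read as "far away".

```python
def select_for_annotation(samples, n_diverse, n_annotate):
    """Pick unlabeled target samples to send for annotation.

    samples: list of dicts with hypothetical keys
      'id'                - sample identifier
      'disc_labeled_prob' - assumed discriminator output: probability that the
                            sample looks like the already-annotated set
      'pred_score'        - prediction score of the task model for the sample
    """
    # Stage 1 (diversity): keep the samples the discriminator considers
    # least similar to the annotated set.
    diverse = sorted(samples, key=lambda s: s["disc_labeled_prob"])[:n_diverse]
    # Stage 2 (uncertainty): among those, keep the lowest prediction scores.
    uncertain = sorted(diverse, key=lambda s: s["pred_score"])[:n_annotate]
    return [s["id"] for s in uncertain]


pool = [
    {"id": "a", "disc_labeled_prob": 0.9, "pred_score": 0.1},
    {"id": "b", "disc_labeled_prob": 0.2, "pred_score": 0.8},
    {"id": "c", "disc_labeled_prob": 0.1, "pred_score": 0.3},
    {"id": "d", "disc_labeled_prob": 0.3, "pred_score": 0.2},
]
print(select_for_annotation(pool, n_diverse=3, n_annotate=2))  # ['d', 'c']
```

Sample "a" has the lowest prediction score in the pool but is dropped in the first stage because it is close to the annotated set, which shows why the order of the two selections matters.
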
Claim 2

Original Legal Text

2. The method as recited in claim 1, further comprising: iteratively retraining a model that annotates the unlabeled samples based on the annotated samples with the lower prediction scores, wherein the model implements at least one predetermined task.

Plain English translation pending...
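
The iterative retraining in claim 2 has the shape of an active learning loop. The sketch below is a hedged outline only: `train`, `select`, and `annotate` are hypothetical callables standing in for model training, sample selection, and the annotation step, and `rounds` is an assumed stopping rule that the claim does not specify.

```python
def active_adaptation_loop(labeled, unlabeled, train, select, annotate, rounds):
    """Alternate between selecting, annotating, and retraining.

    All three callables are hypothetical stand-ins:
      train(labeled)            -> model
      select(model, unlabeled)  -> subset of unlabeled samples to annotate
      annotate(sample)          -> labeled sample (e.g. from a human annotator)
    """
    labeled = list(labeled)
    unlabeled = list(unlabeled)
    model = train(labeled)
    for _ in range(rounds):
        if not unlabeled:
            break
        picked = select(model, unlabeled)
        labeled.extend(annotate(s) for s in picked)
        unlabeled = [s for s in unlabeled if s not in picked]
        model = train(labeled)  # retrain on the updated label set
    return model, labeled, unlabeled
```

Each pass grows the label set with newly annotated samples and retrains on it, so later selections are made by a model that has already seen the earlier annotations.
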
Claim 3

Original Legal Text

3. The method as recited in claim 2, wherein the at least one predetermined task includes at least one of instance object detection and segmentation.

Plain English translation pending...
Claim 4

Original Legal Text

4. The method as recited in claim 2, wherein retraining the model further comprises: inputting an updated label set including the annotated samples with the lower prediction scores into an image-level convolutional neural network (CNN) to generate at least one feature; based on the at least one feature, propagating the updated label set to a region of interest level (ROI-level) CNN; and generating output bounding boxes as at least one object detection.

Plain English translation pending...
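
Claim 4 does not say how image-level features reach the ROI-level CNN. One common bridge in two-stage detectors is ROI max pooling, which is swapped in here purely as an illustration and is not named in the claim. The function name `roi_max_pool`, the half-open box convention, and `out_size` are assumptions of this sketch.

```python
def roi_max_pool(feature_map, box, out_size=2):
    """Max-pool the part of a 2D feature map inside `box` to out_size x out_size.

    feature_map: list of rows (a single-channel image-level feature map)
    box: (x0, y0, x1, y1), half-open, in feature-map cells (assumed convention)
    """
    x0, y0, x1, y1 = box
    h, w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_size):
        ys = y0 + (i * h) // out_size
        ye = max(ys + 1, y0 + ((i + 1) * h) // out_size)  # at least one row
        row = []
        for j in range(out_size):
            xs = x0 + (j * w) // out_size
            xe = max(xs + 1, x0 + ((j + 1) * w) // out_size)  # at least one col
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled


fmap = [[r * 4 + c for c in range(4)] for r in range(4)]  # values 0..15
print(roi_max_pool(fmap, (0, 0, 4, 4)))  # [[5, 7], [13, 15]]
```

Pooling every region to the same fixed size lets a single ROI-level network process candidate boxes of different shapes.
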
Claim 5

Original Legal Text

5. The method as recited in claim 4, further comprising: predicting an instance segmentation map within each bounding box.

Plain English Translation

This claim builds on claims 1, 2, and 4. Claim 4 retrains the detection model on the updated label set by passing it through an image-level convolutional neural network (CNN), propagating the result to a region-of-interest-level (ROI-level) CNN, and outputting bounding boxes as object detections. Claim 5 adds one step: predicting an instance segmentation map within each of those bounding boxes. A bounding box localizes an object only with a rectangle, which is imprecise for irregular shapes and for objects that overlap. The segmentation map marks which pixels inside the box belong to the detected object, separating it from neighboring objects and from the background. Each object is therefore both localized and segmented, which supports downstream tasks such as object tracking, scene understanding, and autonomous navigation. The claim specifies only that the map is predicted within each bounding box; it does not name a segmentation architecture.
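
As a small illustration of what a segmentation map "within each bounding box" can look like in practice, the sketch below places a box-local binary mask into a full-size instance map. It is not taken from the patent: the function name `paste_instance_mask`, the half-open `(x0, y0, x1, y1)` box convention, and the use of an integer `instance_id` are assumptions, and the mask itself would come from a model that is not shown.

```python
def paste_instance_mask(height, width, box, box_mask, instance_id):
    """Place a box-local binary mask into a full-size instance map.

    box: (x0, y0, x1, y1), half-open, in pixel coordinates (assumed convention)
    box_mask: (y1 - y0) rows by (x1 - x0) columns of 0/1 values
    Returns a height x width grid where object pixels hold instance_id
    and all other pixels hold 0.
    """
    x0, y0, x1, y1 = box
    instance_map = [[0] * width for _ in range(height)]
    for dy, row in enumerate(box_mask):
        for dx, inside in enumerate(row):
            if inside:
                instance_map[y0 + dy][x0 + dx] = instance_id
    return instance_map


result = paste_instance_mask(3, 4, (1, 0, 3, 2), [[1, 0], [1, 1]], instance_id=7)
for row in result:
    print(row)
# [0, 7, 0, 0]
# [0, 7, 7, 0]
# [0, 0, 0, 0]
```

The box covers four pixels, but only three are marked as the object, which is the extra precision a per-box mask adds over the box alone.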

Claim 6

Original Legal Text

6. The method as recited in claim 1, wherein aligning the image level features between the source domain and the target domain based on the adversarial learning process further comprises: applying an adversarial loss function to encourage a distribution of labeled samples and the unlabeled samples from a label set; selecting, by the processor device, at least one higher diversity score unlabeled sample from the unlabeled samples; and selecting at least one lower prediction score higher diversity score unlabeled sample from the at least one higher diversity score unlabeled sample.

Plain English translation pending...
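
Claim 6 recites an adversarial loss function but gives no formula. A standard binary cross-entropy formulation from adversarial training is sketched below as one plausible reading, not as the patent's definition. The function names, and the convention that the discriminator outputs the probability that a sample is labeled, are assumptions.

```python
import math


def bce(p, target):
    """Binary cross-entropy for one probability p and a 0/1 target."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(p_labeled, p_unlabeled):
    """Train the discriminator to output 1 for labeled, 0 for unlabeled samples."""
    terms = [bce(p, 1.0) for p in p_labeled] + [bce(p, 0.0) for p in p_unlabeled]
    return sum(terms) / len(terms)


def feature_adversarial_loss(p_unlabeled):
    """Train the feature extractor so unlabeled samples look labeled (target 1)."""
    return sum(bce(p, 1.0) for p in p_unlabeled) / len(p_unlabeled)
```

The two losses pull in opposite directions: the discriminator is rewarded for telling the two sets apart, while the feature extractor is rewarded when the discriminator can no longer do so, which is what pushes the two feature distributions together.
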
Claim 7

Original Legal Text

7. The method as recited in claim 6, further comprising: annotating the at least one lower prediction score higher diversity score unlabeled sample; and updating the label set with at least one annotated lower prediction score higher diversity score unlabeled sample to form an updated labeled set.

Plain English translation pending...
Claim 8

Original Legal Text

8. The method as recited in claim 6, wherein selecting the at least one lower prediction score higher diversity score unlabeled sample from the unlabeled samples further comprises: using prediction scores of the unlabeled samples as confidence scores.

Plain English translation pending...
Claim 9

Original Legal Text

9. The method as recited in claim 1, wherein the source domain and the target domain are selected from at least one of different geographical areas, different weather conditions and different lighting conditions.

Plain English Translation

This claim narrows the domain adaptation method of claim 1 by specifying how the source and target domains differ: they are selected from at least one of different geographical areas, different weather conditions, and different lighting conditions. In claim 1, image-level features are aligned between the two domains through adversarial learning while a domain discriminator is trained; the discriminator then picks unlabeled target-domain samples that are furthest from the target samples already annotated, and those with lower prediction scores are annotated. Claim 9 applies that process to domain gaps caused by environmental variation, for example a model trained on images from one city or in clear daylight and then adapted to another city, to rain, or to night scenes. Such gaps are common in computer vision tasks such as autonomous driving and remote sensing, where a model must generalize across conditions that its source training data does not cover.

Claim 10

Original Legal Text

10. The method as recited in claim 1, wherein selecting the at least one higher diversity score unlabeled sample from the unlabeled samples further comprises: selecting unlabeled images that are furthest away from existing annotated images in the label set.

Plain English translation pending...
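
Claim 10 does not define the distance it relies on, and claim 1 ties the selection to the domain discriminator. Purely to make "furthest away from existing annotated images" concrete, the sketch below swaps in a different and simpler technique: Euclidean distance in a feature space, ranking each unlabeled image by its distance to the nearest annotated one. The function name, the feature vectors, and the budget `k` are all hypothetical.

```python
import math


def farthest_from_labeled(unlabeled, labeled, k):
    """Return the k unlabeled names whose features are furthest from the label set.

    unlabeled: dict mapping image name -> feature vector (tuple of floats)
    labeled:   list of feature vectors of already-annotated images
    Distance to the label set is the distance to the nearest annotated image.
    """
    def dist_to_labeled(feat):
        return min(math.dist(feat, ref) for ref in labeled)

    ranked = sorted(unlabeled.items(),
                    key=lambda item: dist_to_labeled(item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]


labeled = [(0.0, 0.0)]
unlabeled = {"near": (1.0, 0.0), "mid": (3.0, 4.0), "far": (6.0, 8.0)}
print(farthest_from_labeled(unlabeled, labeled, k=2))  # ['far', 'mid']
```
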
Claim 11

Original Legal Text

11. The method as recited in claim 1, further comprising: using a supervised loss function and ground truth labels from the source domain and the target domain to train at least one image-level convolutional neural network (CNN).

Plain English Translation

This claim adds a supervised training step to the domain adaptation method of claim 1. At least one image-level convolutional neural network (CNN) is trained with a supervised loss function using ground truth labels from both the source domain and the target domain. The problem addressed is domain shift: a CNN trained on one dataset (the source domain) loses accuracy on another dataset (the target domain) whose characteristics differ, for example in lighting or imaging conditions. Because claim 1 annotates selected target-domain samples, labeled target data is available alongside the labeled source data, and the supervised loss lets the network learn from both. Including labeled target examples in training helps the CNN fit the target domain's features, which reduces the effect of the domain shift.
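
One simple way to read "a supervised loss function and ground truth labels from the source domain and the target domain" is a single cross-entropy averaged over labeled examples pooled from both domains. The claim does not name the loss, so cross-entropy, the equal weighting of the two domains, and the function names below are assumptions of this sketch.

```python
import math


def cross_entropy(probs, label):
    """Negative log-probability assigned to the ground truth class."""
    return -math.log(max(probs[label], 1e-12))


def supervised_loss(source_batch, target_batch):
    """Mean cross-entropy over labeled source and labeled target examples.

    Each batch is a list of (predicted class probabilities, ground truth index).
    """
    batch = list(source_batch) + list(target_batch)
    return sum(cross_entropy(probs, label) for probs, label in batch) / len(batch)


source = [([0.5, 0.5], 0)]    # one labeled source example
target = [([0.25, 0.75], 1)]  # one annotated target example
print(round(supervised_loss(source, target), 4))  # 0.4904
```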

Claim 12

Original Legal Text

12. A computer system for domain adaptation, comprising: a processor device operatively coupled to a memory device, the processor device being configured to: align image level features between a source domain and a target domain based on an adversarial learning process while training a domain discriminator; select, using the domain discriminator, unlabeled samples from the target domain that are far away from existing annotated samples from the target domain; select based on a prediction score of each of the unlabeled samples, samples with lower prediction scores; and annotate the samples with the lower prediction scores.

Plain English translation pending...
Claim 13

Original Legal Text

13. The system as recited in claim 12, wherein the processor device is further configured to: iteratively retrain a model that annotates the unlabeled samples based on the annotated samples with the lower prediction scores, wherein the model implements at least one predetermined task.

Plain English translation pending...
Claim 14

Original Legal Text

14. The system as recited in claim 13, wherein the at least one predetermined task includes at least one of instance object detection and segmentation.

Plain English Translation

This claim narrows the system of claims 12 and 13 by naming the task the model performs: at least one of instance object detection and segmentation. In claim 13, the processor device iteratively retrains a model that annotates unlabeled samples, using the annotated samples with the lower prediction scores. Instance object detection locates and classifies individual objects in an image, typically with one bounding box per object, and distinguishes overlapping or adjacent objects from one another. Segmentation assigns image regions to objects at the pixel level, giving precise object boundaries. The claim adds no other components; it restricts the predetermined task of claim 13 to these options.

Claim 15

Original Legal Text

15. The system as recited in claim 13, wherein, when retraining the model, the processor device is further configured to: input an updated label set including the annotated samples with the lower prediction scores into an image-level convolutional neural network (CNN) to generate at least one feature; based on the at least one feature, propagate the updated label set to a region of interest level (ROI-level) CNN; and generate output bounding boxes as at least one object detection.

Plain English translation pending...
Claim 16

Original Legal Text

16. The system as recited in claim 15, wherein the processor device is further configured to: predict an instance segmentation map within each bounding box.

Plain English translation pending...
Claim 17

Original Legal Text

17. The system as recited in claim 13, wherein, when aligning the image level features between the source domain and the target domain based on the adversarial learning process, the processor device is further configured to: apply an adversarial loss function to encourage a distribution of labeled samples and the unlabeled samples from a label set; select at least one higher diversity score unlabeled sample from the unlabeled samples; and selecting at least one lower prediction score higher diversity score unlabeled sample from the at least one higher diversity score unlabeled sample.

Plain English translation pending...
Claim 18

Original Legal Text

18. The system as recited in claim 12, wherein the source domain and the target domain are selected from at least one of different geographical areas, different weather conditions and different lighting conditions.

Plain English translation pending...
Claim 19

Original Legal Text

19. The system as recited in claim 12, wherein the processor device is further configured to: use a supervised loss function and ground truth labels from the source domain and the target domain to train at least one image-level convolutional neural network (CNN).

Plain English translation pending...
Claim 20

Original Legal Text

20. A computer program product for domain adaptation, the computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computing device to cause the computing device to perform the method comprising: aligning image level features between a source domain and a target domain based on an adversarial learning process while training a domain discriminator; selecting, using the domain discriminator, unlabeled samples from the target domain that are far away from existing annotated samples from the target domain; selecting, by a processor device, based on a prediction score of each of the unlabeled samples, samples with lower prediction scores; and annotating the samples with the lower prediction scores.

Plain English translation pending...
Patent Metadata

Filing Date

Unknown

Publication Date

February 9, 2021

Inventors

Yi-Hsuan Tsai
Kihyuk Sohn
Buyu Liu
Manmohan Chandraker
Jong-Chyi Su

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “DOMAIN ADAPTATION FOR INSTANCE DETECTION AND SEGMENTATION” (10915792). https://patentable.app/patents/10915792
