US-11494616

Decoupling category-wise independence and relevance with self-attention for multi-label image classification

Published: November 8, 2022
Assignee: not available in USPTO data we have
Inventors: not available in USPTO data we have
Technical Abstract

Methods and systems are provided for generating a multi-label classification system. The multi-label classification system can use a multi-label classification neural network system to identify one or more labels for an image. The multi-label classification system can explicitly take into account the relationship between classes in identifying labels. A relevance sub-network of the multi-label classification neural network system can capture relevance information between the classes. Such a relevance sub-network can decouple independence between classes to focus learning on relevance between the classes.

Patent Claims
12 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 5

Original Legal Text

5. The non-transitory computer-readable media of claim 4, wherein the error is determined using binary cross-entropy loss.
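
As an illustration of the loss named in this claim, the following is a minimal sketch of binary cross-entropy for a multi-label prediction, where each class gets its own independent probability. The function name, the example probabilities, and the clamping constant `eps` are assumptions made for this sketch and do not come from the patent.

```python
import math

def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy over independent per-class probabilities.

    probs   -- predicted probability for each class, in [0, 1]
    targets -- ground-truth label for each class, 0 or 1
    eps     -- hypothetical clamping constant that avoids log(0)
    """
    if len(probs) != len(targets):
        raise ValueError("probs and targets must have the same length")
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # keep p strictly inside (0, 1)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)

# Hypothetical example: three classes, the image contains classes 0 and 2.
predicted = [0.9, 0.2, 0.6]
ground_truth = [1, 0, 1]
loss = binary_cross_entropy(predicted, ground_truth)
```

Because every class is scored separately, an image can carry several positive labels at once, which is what distinguishes this loss from a softmax cross-entropy over mutually exclusive classes.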

Claim 6

Original Legal Text

6. The non-transitory computer-readable media of claim 1, wherein the plurality of channel-wise attention maps are generated using a relevance sub-network to process the plurality of class-wise feature maps and enable determination of relevance between the plurality of classes based at least in part on class-independent features of the image.
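
To make the idea of "relevance between classes" concrete, here is a rough sketch that scores how strongly each class-wise feature map relates to every other one using scaled dot-product attention. This is one common form of self-attention, chosen for illustration; the claim text shown here does not specify the internal layers of the relevance sub-network, and the function names and toy inputs are assumptions.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]

def class_relevance(class_maps):
    """Return a C x C matrix of attention weights between classes.

    class_maps -- list of C flattened class-wise feature maps,
                  each a list of H*W floats (hypothetical layout)
    """
    dim = len(class_maps[0])
    scale = math.sqrt(dim)  # standard scaling for dot-product attention
    scores = [
        [sum(a * b for a, b in zip(query, key)) / scale for key in class_maps]
        for query in class_maps
    ]
    return [softmax(row) for row in scores]

# Hypothetical toy input: classes 0 and 1 respond at similar locations,
# class 2 responds somewhere else.
maps = [
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
relevance = class_relevance(maps)
```

In this toy input, row 0 of `relevance` gives more weight to class 1 than to class 2, reflecting that the first two maps overlap spatially.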

Claim 7

Original Legal Text

7. The non-transitory computer-readable media of claim 1, wherein the plurality of channel-wise attention maps are generated using a relevance sub-network to process the plurality of convolutional feature maps and enable determination of relevance between the plurality of classes based at least in part on local detailed information relating to features of the image.

Claim 8

Original Legal Text

8. The non-transitory computer-readable media of claim 1, wherein the coupling of combined class-wise feature maps with the plurality of channel-wise attention maps produces a category-wise feature map that dynamically re-weights the combined class-wise feature maps based on the plurality of channel-wise attention maps.
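
The re-weighting described in this claim can be pictured as scaling each class channel by an attention weight. The sketch below assumes, purely for illustration, that each channel-wise attention map has been reduced to a single scalar weight per class; the claim does not state the shape of the attention maps or the exact coupling operation, and the function name is hypothetical.

```python
def reweight_channels(class_maps, channel_weights):
    """Scale each class-wise feature map by its channel weight.

    class_maps      -- list of C maps, each an H x W nested list of floats
    channel_weights -- list of C scalar attention weights (an assumption;
                       the claim does not fix the shape of the attention maps)
    """
    if len(class_maps) != len(channel_weights):
        raise ValueError("need exactly one weight per class-wise feature map")
    return [
        [[weight * value for value in row] for row in fmap]
        for fmap, weight in zip(class_maps, channel_weights)
    ]

# Hypothetical example: two classes, 2 x 2 maps.
maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
weights = [0.5, 2.0]
category_wise = reweight_channels(maps, weights)
```

Since the weights are computed from the image itself, the scaling changes from image to image, which is one reading of "dynamically" in the claim.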

Claim 12

Original Legal Text

12. The computer-implemented method of claim 9, wherein the plurality of class-wise feature maps have a decreased spatial resolution, the decreased spatial resolution being a spatial resolution of the convolutional feature maps.

Plain English Translation

This claim narrows the computer-implemented method of claim 9, which belongs to a patent on multi-label image classification with a neural network. It specifies that the class-wise feature maps have a decreased spatial resolution, and that this resolution is the spatial resolution of the convolutional feature maps. In other words, the class-wise feature maps are not upsampled to a larger size; they have the same height and width as the convolutional feature maps. The claim text does not state what the resolution is decreased relative to. The input image is a natural reading, since convolutional feature maps are typically smaller than the image that produced them.
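
One operation that yields class-wise feature maps at exactly the resolution of the convolutional feature maps is a 1 x 1 (pointwise) convolution, which mixes channels at each location without changing height or width. The sketch below uses it as an assumed example; the claim text shown here does not say which operation produces the class-wise maps, and the function name, weights, and sizes are hypothetical.

```python
def pointwise_conv(feature_maps, weights):
    """Apply a 1 x 1 convolution: D input channels -> C class channels.

    feature_maps -- D maps, each an H x W nested list of floats
    weights      -- C rows of D floats (hypothetical, untrained values)
    Returns C maps with the same H x W as the input.
    """
    depth = len(feature_maps)
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    output = []
    for class_weights in weights:
        fmap = [
            [
                sum(class_weights[d] * feature_maps[d][y][x] for d in range(depth))
                for x in range(width)
            ]
            for y in range(height)
        ]
        output.append(fmap)
    return output

# Hypothetical example: 2 convolutional channels of size 2 x 3, 3 classes.
conv_maps = [
    [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
    [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]],
]
class_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
class_maps = pointwise_conv(conv_maps, class_weights)
```

The output has one map per class, and each map is 2 x 3, the same spatial size as the input maps.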

Claim 13

Original Legal Text

13. The computer-implemented method of claim 9, wherein the plurality of channel-wise attention maps are generated using a relevance sub-network to process the plurality of class-wise feature maps and enable determination of relevance between the plurality of classes based at least in part on class-independent features of the image.

Claim 14

Original Legal Text

14. The computer-implemented method of claim 9, wherein the plurality of channel-wise attention maps are generated using a relevance sub-network to process the plurality of convolutional feature maps and enable determination of relevance between the plurality of classes based at least in part on local detailed information relating to features of the image.

Claim 15

Original Legal Text

15. The computer-implemented method of claim 9, wherein coupling of the category-wise feature map with the plurality of channel-wise attention maps dynamically re-weights the category-wise attention map based on the channel-wise attention maps.

Claim 17

Original Legal Text

17. The system of claim 16, wherein the output is a plurality of labels each based on a probability for a class based on the coupled category-wise feature map and the plurality of channel-wise attention maps in relation to a predefined threshold level score.
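
The thresholding step in this claim can be sketched as follows: each class receives a probability, and a label is emitted when that probability meets a predefined threshold. The sigmoid mapping, the class names, the raw scores, and the default threshold of 0.5 are all assumptions for illustration; the claim only requires comparison against a predefined threshold level score.

```python
import math

def sigmoid(x):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_labels(scores, class_names, threshold=0.5):
    """Return every class whose probability meets the threshold.

    scores      -- one raw score per class (hypothetical network output)
    class_names -- label text for each class, same order as scores
    threshold   -- predefined cut-off; 0.5 is an assumed default
    """
    probabilities = [sigmoid(s) for s in scores]
    return [
        name
        for name, prob in zip(class_names, probabilities)
        if prob >= threshold
    ]

# Hypothetical example with three classes.
names = ["person", "dog", "bicycle"]
raw_scores = [2.0, -1.0, 0.3]
labels = predict_labels(raw_scores, names)
strict_labels = predict_labels(raw_scores, names, threshold=0.6)
```

Raising the threshold trades recall for precision: with the assumed scores above, the stricter cut-off keeps only the most confident label.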

Claim 18

Original Legal Text

18. The system of claim 16, wherein the output further comprises a heat map based at least in part on the coupled category-wise feature map and the plurality of channel-wise attention maps, wherein the heat map provides a visualization mapping locations of a feature of the image corresponding to one of the plurality of labels.

Claim 19

Original Legal Text

19. The system of claim 16, wherein the plurality of channel-wise attention maps are generated using a relevance sub-network to process the plurality of class-wise feature maps and enable determination of relevance between the plurality of classes based at least in part on class-independent features of the image.

Claim 20

Original Legal Text

20. The system of claim 16, wherein the plurality of channel-wise attention maps are generated using a relevance sub-network to process the plurality of convolutional feature maps and enable determination of relevance between the plurality of classes based at least in part on local detailed information relating to features of the image.

Patent Metadata

Filing Date

July 11, 2019

Publication Date

November 8, 2022

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Decoupling category-wise independence and relevance with self-attention for multi-label image classification” (US-11494616). https://patentable.app/patents/US-11494616