11238070

Dense Cluster Filtering

Published: February 1, 2022
Assignee: not available in USPTO data we have

Patent Claims
20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A computer-implemented method for assigning a label to an image comprising: obtaining a collection of images; determining first feature vectors for individual images in the collection of images; analyzing the first feature vectors to generate second feature vectors for the individual images, wherein the second feature vectors have a lower dimensionality than the first feature vectors; executing a mode seeking algorithm using the second feature vectors to identify a group of images having a lower image quality compared to images of the collection of images, the group of images having a feature space density greater than a density threshold; generating a filtered collection of images by removing the group of images having the lower image quality from the collection of images; performing a cluster analysis algorithm on the filtered collection of images to generate one or more identified clusters; assigning a label to the images of a cluster of the one or more identified clusters; obtaining a query image from a client device; determining that the query image belongs to the cluster of the one or more identified clusters; and sending the label to the client device.
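A minimal sketch of the reduction and filtering steps recited in claim 1, assuming NumPy. PCA (via SVD) stands in for the unspecified dimensionality reduction, and a fixed-radius neighbour count stands in for the mode seeking algorithm; the function names, `radius`, `density_threshold`, and the synthetic data are all hypothetical and do not come from the patent.

```python
import numpy as np

def reduce_dimensionality(first_vectors, n_components):
    """Project vectors onto their top principal components (PCA via SVD)."""
    centered = first_vectors - first_vectors.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def dense_mask(second_vectors, radius, density_threshold):
    """Flag points whose neighbour count within `radius` exceeds the threshold."""
    diffs = second_vectors[:, None, :] - second_vectors[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    neighbour_counts = (dists <= radius).sum(axis=1) - 1  # exclude the point itself
    return neighbour_counts > density_threshold

rng = np.random.default_rng(0)
# Synthetic stand-in for image features: 40 near-duplicate vectors
# (a very dense blob) plus 60 spread-out vectors, in 16 dimensions.
tight = rng.normal(loc=5.0, scale=0.01, size=(40, 16))
spread = rng.normal(loc=0.0, scale=3.0, size=(60, 16))
first_vectors = np.vstack([tight, spread])

second_vectors = reduce_dimensionality(first_vectors, n_components=4)
mask = dense_mask(second_vectors, radius=0.5, density_threshold=20)

# The filtered collection keeps only items outside the overly dense group.
filtered = first_vectors[~mask]
```

The filtered vectors would then go to the cluster analysis step; the density test runs in the reduced space while the removal applies to the original collection.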

2. The computer-implemented method of claim 1, further comprising: identifying a second group of images of the collection of images having a feature space density greater than the density threshold, wherein the filtered collection of images is further generated by removing the second group of images from the collection of images.

3. The computer-implemented method of claim 1, wherein the cluster analysis algorithm employs a Hierarchical Density Based Spatial Clustering of Applications with Noise (HDBSCAN) algorithm.

4. A computer-implemented method for determining clusters in a dataset, comprising: obtaining a collection of content items; determining respective item features for individual content items of the collection of content items; determining a density metric for a subset of content items of the collection of content items based at least in part upon the respective item features; determining that the density metric for the subset of content items is greater than a density threshold; generating a filtered collection of content items by removing the subset of content items having the density metric greater than the density threshold from the collection of content items; performing a clustering analysis on the filtered collection of content items to generate one or more identified clusters; and assigning a label to a cluster of the one or more identified clusters.

5. The computer-implemented method of claim 4, wherein determining the respective item features comprises: obtaining a respective set of characteristics for each item of the collection of content items; and determining a feature of respective item features for a first content item using a large-margin nearest neighbor classifier based at least upon the respective set of characteristics for each item of the collection of content items.

6. The computer-implemented method of claim 4, wherein determining the density metric comprises: applying a mean-shift algorithm to the collection of content items; and determining that the quantity of items of the subset of content items that reside within a window of the mean-shift algorithm is above a quantity threshold.
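The window-count test in claim 6 can be illustrated with a hand-rolled flat-kernel mean shift, assuming NumPy. This is only a sketch: `bandwidth`, `quantity_threshold`, the starting point, and the sample data are hypothetical choices, and the patent does not specify a kernel or convergence rule.

```python
import numpy as np

def mean_shift_mode(points, start, bandwidth, max_iter=100, tol=1e-6):
    """Shift a flat-kernel window from `start` until it settles on a mode."""
    center = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        in_window = np.linalg.norm(points - center, axis=1) <= bandwidth
        if not in_window.any():
            break
        # Move the window to the mean of the points it currently covers.
        new_center = points[in_window].mean(axis=0)
        if np.linalg.norm(new_center - center) < tol:
            center = new_center
            break
        center = new_center
    return center

def dense_subset(points, start, bandwidth, quantity_threshold):
    """Return indices inside the converged window if their count exceeds the threshold."""
    mode = mean_shift_mode(points, start, bandwidth)
    members = np.flatnonzero(np.linalg.norm(points - mode, axis=1) <= bandwidth)
    if members.size > quantity_threshold:
        return members
    return np.array([], dtype=int)

rng = np.random.default_rng(1)
dense = rng.normal(loc=[2.0, 2.0], scale=0.05, size=(30, 2))   # tight blob
sparse = rng.uniform(low=-10.0, high=10.0, size=(20, 2))       # scattered items
points = np.vstack([dense, sparse])

subset = dense_subset(points, start=points[0], bandwidth=0.5, quantity_threshold=10)
filtered = np.delete(points, subset, axis=0)
```

Starting the window at a point inside the blob makes it converge on the blob's mode, and all 30 blob items end up inside the window, so the subset is removed from the collection.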

7. The computer-implemented method of claim 4, further comprising: determining a second subset of content items having a second density metric that is greater than the density threshold, wherein the filtered collection of content items is further generated by removing the second subset of content items from the collection of content items.

8. The computer-implemented method of claim 4, further comprising: determining a second subset of content items having a sparsity metric that is greater than a sparsity threshold, wherein the filtered collection of content items is further generated by removing the second subset of content items from the collection of content items.
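Claim 8 does not define the sparsity metric. One plausible reading, sketched here with NumPy, scores each item by its mean distance to its k nearest neighbours and drops items above a cutoff. The per-item scoring, `k`, and the threshold of `1.0` are assumptions made for illustration.

```python
import numpy as np

def knn_sparsity(points, k):
    """Sparsity metric: mean distance from each point to its k nearest neighbours."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # ignore each point's distance to itself
    nearest = np.sort(dists, axis=1)[:, :k]
    return nearest.mean(axis=1)

# Four items close together and one isolated item.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]])

sparsity = knn_sparsity(points, k=2)
sparse_mask = sparsity > 1.0  # hypothetical sparsity threshold
kept = points[~sparse_mask]
```

Here only the isolated item at `[5.0, 5.0]` exceeds the threshold, so it is removed along with any dense subsets before clustering.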

9. The computer-implemented method of claim 4, wherein the cluster analysis employs a Hierarchical Density Based Spatial Clustering of Applications with Noise (HDBSCAN) algorithm.

10. The computer-implemented method of claim 4, further comprising: obtaining an additional content item; determining that the additional content item is dissimilar to the subset of content items; and including the additional content item with the filtered collection of content items.

11. The computer-implemented method of claim 4, further comprising: presenting a representation of one of the content items of the cluster of content items; and receiving the label as an input.

12. The computer-implemented method of claim 4, further comprising: obtaining an additional content item; determining that a similarity score of the additional content item to the cluster of content items is greater than a predetermined amount; and assigning the label to the additional content item.
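The similarity test in claim 12 could look like the following sketch, assuming NumPy and using cosine similarity to the cluster centroid as the score. The claim does not name a similarity measure, so that choice, the `assign_label` helper, the `"sneaker"` label, and the `0.9` cutoff are all hypothetical.

```python
import numpy as np

def assign_label(item_vector, cluster_vectors, label, min_similarity):
    """Return (`label` or None, score) based on cosine similarity to the cluster centroid."""
    centroid = cluster_vectors.mean(axis=0)
    similarity = float(
        item_vector @ centroid
        / (np.linalg.norm(item_vector) * np.linalg.norm(centroid))
    )
    # Only items scoring above the predetermined amount inherit the label.
    return (label if similarity > min_similarity else None), similarity

# Feature vectors of items already in a labeled cluster.
cluster = np.array([[1.0, 0.0], [0.9, 0.1], [1.0, 0.2]])

label_a, score_a = assign_label(np.array([2.0, 0.2]), cluster, "sneaker", 0.9)
label_b, score_b = assign_label(np.array([0.0, 1.0]), cluster, "sneaker", 0.9)
```

The first item points in nearly the same direction as the centroid and receives the label; the second is nearly orthogonal to it and does not.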

13. The computer-implemented method of claim 4, wherein the content items include images, instances of image data, frames of video content, or audio.

14. A system, comprising: at least one processor; and memory including instructions that, when executed by the at least one processor, cause the system to: obtain a collection of content items; determine respective item features for individual content items of the collection of content items; determine a density metric for a subset of content items of the collection of content items based at least in part upon the respective item features; determine that the density metric for the subset of content items is greater than a density threshold; generate a filtered collection of content items by removing the subset of content items having the density metric greater than the density threshold from the collection of content items; perform a clustering analysis on the filtered collection of content items to generate one or more identified clusters; and assign a label to a cluster of the one or more identified clusters.

15. The system of claim 14, wherein the instructions that when executed cause the system to determine the respective item feature further cause the system to: obtain a respective set of characteristics for each item of the collection of content items; and determine a feature of respective item features for a first content item using a large-margin nearest neighbor classifier based at least upon the respective set of characteristics for each item of the collection of content items.

16. The system of claim 14, wherein the instructions that when executed cause the system to determine the density metric further cause the system to: apply a mean-shift algorithm to the collection of content items; and determine that the quantity of items of the subset of content items that reside within a window of the mean-shift algorithm is above a quantity threshold.

17. The system of claim 14, wherein the instructions when executed further cause the system to: determine a second subset of content items having a second density metric that is greater than the density threshold, wherein the filtered collection of content items is further generated by removing the second subset of content items from the collection of content items.

18. The system of claim 14, wherein the instructions when executed further cause the system to: determine a second subset of content items having a sparsity metric that is greater than a sparsity threshold, wherein the filtered collection of content items is further generated by removing the second subset of content items from the collection of content items.

19. The system of claim 14, wherein the cluster analysis employs a Hierarchical Density Based Spatial Clustering of Applications with Noise (HDBSCAN) algorithm.

20. The system of claim 14, wherein the instructions when executed further cause the system to: obtain an additional content item; determine that the additional content item is dissimilar to the subset of content items; and include the additional content item with the filtered collection of content items.

Patent Metadata

Filing Date

Unknown

Publication Date

February 1, 2022

Inventors

Shabnam Ghadar

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “DENSE CLUSTER FILTERING” (11238070). https://patentable.app/patents/11238070