10007863

Logo Recognition in Images and Videos

Published: June 26, 2018
Assignee: not available in the USPTO data we have
Patent Claims
20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method to detect a logo in images in video frames selected from a video stream, comprising: applying a saliency analysis and segmentation of selected regions in a selected video frame to determine segmented likely logo regions; processing the segmented likely logo regions with feature matching using correlation to generate a first match, neural network classification using a convolutional neural network to generate a second match, and text recognition using character segmentation and string matching to generate a third match; and deciding a most likely logo match by combining results from the first match, the second match, and the third match.
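Claim 1 does not say how the three matches are combined. One plausible reading is a weighted vote over candidate labels. The sketch below assumes each matcher returns a (label, score) pair with scores in [0, 1]; the function name `decide_logo`, the weights, and the sample labels are hypothetical and not taken from the patent.

```python
from collections import defaultdict

def decide_logo(matches, weights=(1.0, 1.0, 1.0)):
    """Combine (label, score) pairs from the three matchers by weighted vote.

    matches: three (label, score) tuples, or None where a matcher found
    nothing. The weights are illustrative assumptions.
    """
    totals = defaultdict(float)
    for match, weight in zip(matches, weights):
        if match is None:
            continue
        label, score = match
        totals[label] += weight * score
    if not totals:
        return None
    # The label with the highest accumulated weighted score wins.
    return max(totals.items(), key=lambda item: item[1])

feature_match = ("acme", 0.62)  # first match: feature correlation
cnn_match = ("acme", 0.80)      # second match: CNN classification
text_match = ("apex", 0.90)     # third match: text recognition
best = decide_logo([feature_match, cnn_match, text_match])
```

Here two moderately confident matchers agreeing on one label outweigh a single confident matcher that disagrees, which is one reason a combined decision can be more robust than any single matcher.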

2. The method of claim 1, wherein the saliency analysis comprises: applying a discrete cosine transform (DCT) on the segmented likely logo regions of an image in a selected video frame to determine spectral saliency of each segmented likely logo region.

3. The method of claim 1, wherein saliency detection comprises: applying a discrete cosine transform (DCT) on the segmented likely logo regions of an image in a selected video frame to determine spectral saliency of each likely logo region; and measuring multi-scale similarity at two higher scales and a smaller scale of the spectral saliency of each likely logo region.
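The claims do not spell out the saliency formula. A common DCT-based formulation, which may differ from the patented one, keeps only the sign of each DCT coefficient, inverts the transform, and squares the result. The sketch below implements that idea in pure Python for a small square grayscale patch; all function names are hypothetical, and the multi-scale step of claim 3 is not covered.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def sign(v):
    return (v > 0) - (v < 0)

def spectral_saliency(gray):
    """Saliency map for a square grayscale patch (list of equal-length rows)."""
    n = len(gray)
    c = dct_matrix(n)
    ct = transpose(c)
    coeffs = matmul(matmul(c, gray), ct)             # forward 2-D DCT
    signature = [[sign(v) for v in row] for row in coeffs]
    recon = matmul(matmul(ct, signature), c)         # inverse 2-D DCT
    return [[v * v for v in row] for row in recon]

patch = [[0.0] * 8 for _ in range(8)]
patch[2][5] = 1.0  # a single bright pixel on a flat background
saliency = spectral_saliency(patch)
peak = max(((r, col) for r in range(8) for col in range(8)),
           key=lambda rc: saliency[rc[0]][rc[1]])
```

A practical version would usually smooth the squared map and use a fast DCT from a numerical library; the naive matrix products here are only meant to make the steps visible.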

4. The method of claim 3, wherein the multi-scale similarity measures include orientation gradient histograms, hue, saturation, value (HSV) histograms, and stroke width transform (SWT) statistics which include total number of strokes, number of horizontal strokes, number of vertical strokes, stroke density, and number of loops.
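Of the similarity measures listed in claim 4, the HSV histogram is the simplest to illustrate. The sketch below builds per-channel histograms with the standard-library `colorsys` module and compares two regions by histogram intersection. The bin count, the choice of intersection as the comparison, and the function names are assumptions for illustration; the claim does not specify them.

```python
import colorsys

def hsv_histogram(pixels, bins=8):
    """Normalized hue, saturation, and value histograms for RGB pixels in 0..255."""
    hist = [[0.0] * bins for _ in range(3)]
    for r, g, b in pixels:
        hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        for channel, value in enumerate(hsv):
            # Clamp so that a value of exactly 1.0 falls in the last bin.
            index = min(int(value * bins), bins - 1)
            hist[channel][index] += 1.0
    total = float(len(pixels)) or 1.0
    return [[count / total for count in channel] for channel in hist]

def histogram_intersection(a, b):
    """Mean per-channel intersection; 1.0 means identical histograms."""
    per_channel = [sum(min(x, y) for x, y in zip(ca, cb)) for ca, cb in zip(a, b)]
    return sum(per_channel) / len(per_channel)

red_region = hsv_histogram([(255, 0, 0)] * 4)
blue_region = hsv_histogram([(0, 0, 255)] * 4)
same = histogram_intersection(red_region, red_region)
different = histogram_intersection(red_region, blue_region)
```

Pure red and pure blue share saturation and value but not hue, so their intersection score drops by exactly the hue channel's share.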

5. The method of claim 1, wherein segmentation comprises: applying a stroke width transform (SWT) analysis to the selected regions to generate SWT statistics; applying a graph based segmentation algorithm to establish word boxes around likely logo character strings; and analyzing each of the word boxes to produce a set of character segmentations to delineate the characters in the likely logo character strings.
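The graph-based step in claim 5 can be pictured as building a graph whose nodes are character-like boxes and whose edges join boxes that plausibly belong to the same word, then taking connected components as word boxes. The sketch below assumes boxes are (x, y, w, h) tuples and links them on horizontal gap, vertical overlap, and height similarity. The thresholds and the name `group_into_words` are hypothetical, and the SWT statistics the claim mentions are not modeled here.

```python
def group_into_words(boxes, max_gap_ratio=0.6, max_height_ratio=1.5):
    """Group character boxes (x, y, w, h) into word boxes via connected components."""
    parent = list(range(len(boxes)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def linked(a, b):
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        gap = max(bx - (ax + aw), ax - (bx + bw), 0)
        similar_height = max(ah, bh) <= max_height_ratio * min(ah, bh)
        overlap = min(ay + ah, by + bh) - max(ay, by)
        return similar_height and overlap > 0 and gap <= max_gap_ratio * min(ah, bh)

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if linked(boxes[i], boxes[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)

    words = []
    for members in groups.values():
        x0 = min(b[0] for b in members)
        y0 = min(b[1] for b in members)
        x1 = max(b[0] + b[2] for b in members)
        y1 = max(b[1] + b[3] for b in members)
        words.append((x0, y0, x1 - x0, y1 - y0))
    return sorted(words)

chars = [(0, 0, 10, 20), (12, 0, 10, 20), (24, 1, 10, 20),
         (80, 0, 10, 20), (92, 0, 10, 20)]
word_boxes = group_into_words(chars)
```

The first and third boxes are too far apart to link directly but still end up in one word through the middle box, which is the main benefit of treating the grouping as a graph problem.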

6. The method of claim 1 further comprising: combining neighboring keypoint regions with consistent aspect ratios and size to generate a new keypoint and region.
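As a rough illustration of claim 6, the sketch below merges two keypoint regions when they are close together and agree in area and aspect ratio, producing a new region whose center serves as the new keypoint. Regions are assumed to be (cx, cy, w, h) tuples; the tolerances and the name `merge_regions` are hypothetical, since the claim does not define "consistent" numerically.

```python
def merge_regions(a, b, size_tol=0.25, aspect_tol=0.25, max_dist_factor=1.5):
    """Return a merged (cx, cy, w, h) region, or None if a and b are inconsistent."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    area_a, area_b = aw * ah, bw * bh
    aspect_a, aspect_b = aw / ah, bw / bh
    size_ok = abs(area_a - area_b) <= size_tol * max(area_a, area_b)
    aspect_ok = abs(aspect_a - aspect_b) <= aspect_tol * max(aspect_a, aspect_b)
    dist = ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
    near = dist <= max_dist_factor * max(aw, ah, bw, bh)
    if not (size_ok and aspect_ok and near):
        return None
    # The merged region is the bounding box of both inputs.
    left = min(ax - aw / 2, bx - bw / 2)
    right = max(ax + aw / 2, bx + bw / 2)
    top = min(ay - ah / 2, by - bh / 2)
    bottom = max(ay + ah / 2, by + bh / 2)
    return ((left + right) / 2, (top + bottom) / 2, right - left, bottom - top)

merged = merge_regions((10, 10, 8, 8), (18, 10, 8, 8))
rejected = merge_regions((10, 10, 8, 8), (18, 10, 8, 24))
```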

7. The method of claim 1 further comprising: detecting and combining edge segments in a keypoint region; and binning sample points on selected edges according to angle and distance with reference to a dominant orientation of the selected edges.
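The binning step in claim 7 resembles a polar histogram that is rotated to align with a dominant orientation, so the descriptor stays stable when the logo is rotated. The sketch below assumes the edge sample points, the region center, the region radius, and the dominant angle are already known; the bin counts and the name `bin_edge_points` are hypothetical, and edge detection itself is out of scope.

```python
import math

def bin_edge_points(points, center, dominant_angle, radius, angle_bins=8, dist_bins=3):
    """Histogram of points over (angle, distance) bins relative to a dominant orientation."""
    hist = [[0] * dist_bins for _ in range(angle_bins)]
    cx, cy = center
    for x, y in points:
        dx, dy = x - cx, y - cy
        dist = math.hypot(dx, dy)
        if dist >= radius:
            continue  # outside the keypoint region
        # Angle relative to the dominant orientation, wrapped to [0, 2*pi).
        rel = (math.atan2(dy, dx) - dominant_angle) % (2 * math.pi)
        a = min(int(rel / (2 * math.pi) * angle_bins), angle_bins - 1)
        d = int(dist / radius * dist_bins)
        hist[a][d] += 1
    return hist

def rotate(point, theta):
    """Rotate a point about the origin by theta radians."""
    x, y = point
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

samples = [(3.0, 1.0), (-2.0, 4.0), (0.5, -6.0)]
upright = bin_edge_points(samples, (0.0, 0.0), 0.0, 10.0)
turned = bin_edge_points([rotate(p, 0.7) for p in samples], (0.0, 0.0), 0.7, 10.0)
```

In this example the rotated points, binned against the rotated dominant angle, fall into the same bins as the upright points.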

8. The method of claim 1 further comprising: using multiple text classifiers for robust logo text detection.

9. The method of claim 1 further comprising: using stroke heuristics to select the text classifier.

10. The method of claim 1 further comprising: using N-gram matching to recognize a segment.
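N-gram matching tolerates recognition errors because a single wrong character only disturbs the few n-grams that contain it. The sketch below scores a recognized string against known logo names using character trigrams and the Dice coefficient. The choice of trigrams, the Dice score, the padding scheme, and the sample names are assumptions for illustration; claim 10 does not specify them.

```python
def char_ngrams(text, n=3):
    """Set of character n-grams of the lowercased text, padded at both ends."""
    padded = " " * (n - 1) + text.lower() + " " * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def ngram_similarity(a, b, n=3):
    """Dice coefficient between the n-gram sets of two strings."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return 2.0 * len(ga & gb) / (len(ga) + len(gb))

def best_ngram_match(recognized, known_names, n=3):
    """Return (name, score) for the known name closest to the recognized text."""
    return max(((name, ngram_similarity(recognized, name, n)) for name in known_names),
               key=lambda pair: pair[1])

# "ACM3" stands in for a recognition result with one misread character.
name, score = best_ngram_match("ACM3", ["Acme", "Apex", "Zenith"])
```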

11. An apparatus comprising: at least one processor; and a memory in communication with the at least one processor, the memory including non-transitory computer-readable code which, when executed, cause the at least one processor to at least: apply a saliency analysis and segmentation of selected regions in a selected video frame to determine segmented likely logo regions; process the segmented likely logo regions with feature matching using correlation to generate a first match, neural network classification using a convolutional neural network to generate a second match, and text recognition using character segmentation and string matching to generate a third match; and decide a most likely logo match by combining results from the first match, the second match, and the third match.

12. The apparatus of claim 11, wherein the saliency analysis comprises: applying a discrete cosine transform (DCT) on the segmented likely logo regions of an image in a selected video frame to determine spectral saliency of each segmented likely logo region.

13. The apparatus of claim 11, wherein saliency detection comprises: applying a discrete cosine transform (DCT) on the segmented likely logo regions of an image in a selected video frame to determine spectral saliency of each likely logo region; and measuring multi-scale similarity at two higher scales and a smaller scale of the spectral saliency of each likely logo region.

14. The apparatus of claim 13, wherein the multi-scale similarity measures include orientation gradient histograms, hue, saturation, value (HSV) histograms, and stroke width transform (SWT) statistics which include total number of strokes, number of horizontal strokes, number of vertical strokes, stroke density, and number of loops.

15. A non-transitory computer-readable storage medium storing code which, when executed, cause a machine to at least: apply a saliency analysis and segmentation of selected regions in a selected video frame to determine segmented likely logo regions; process the segmented likely logo regions with feature matching using correlation to generate a first match, neural network classification using a convolutional neural network to generate a second match, and text recognition using character segmentation and string matching to generate a third match; and decide a most likely logo match by combining results from the first match, the second match, and the third match.

16. The computer-readable storage medium of claim 15, wherein segmentation comprises: applying a stroke width transform (SWT) analysis to the selected regions to generate SWT statistics; applying a graph based segmentation algorithm to establish word boxes around likely logo character strings; and analyzing each of the word boxes to produce a set of character segmentations to delineate the characters in the likely logo character strings.

17. The computer-readable storage medium of claim 15 further comprising: combining neighboring keypoint regions with consistent aspect ratios and size to generate a new keypoint and region.

18. The computer-readable storage medium of claim 15 further comprising detecting and combining edge segments in a keypoint region; and binning sample points on selected edges according to angle and distance with reference to a dominant orientation of the selected edges.

19. The computer-readable storage medium of claim 15 further comprising: using multiple text classifiers for robust logo text detection.

20. The computer-readable storage medium of claim 15 further comprising: using stroke heuristics to select the text classifier.

Patent Metadata

Filing Date

Unknown

Publication Date

June 26, 2018

Inventors

Jose Pio Pereira
Kyle Brocklehurst
Sunil Suresh Kulkarni
Peter Wendt

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Logo Recognition in Images and Videos” (10007863). https://patentable.app/patents/10007863
