US-10915798

Systems and methods for hierarchical webly supervised training for recognizing emotions in images

Published: February 9, 2021
Assignee: not available in the USPTO data we have
Inventors: not available in the USPTO data we have
Technical Abstract

Disclosed herein are embodiments of systems, methods, and products for webly supervised training of a convolutional neural network (CNN) to predict emotion in images. A computer may query one or more image repositories using search keywords generated based on the tertiary emotion classes of Parrott's emotion wheel. The computer may filter images received in response to the query to generate a weakly labeled training dataset. Labels associated with the images that are noisy or wrong may be cleaned prior to training of the CNN. The computer may iteratively train the CNN, leveraging the hierarchy of emotion classes by increasing the complexity of the labels (tags) for each iteration. Such curriculum-guided training may generate a trained CNN that is more accurate than conventionally trained neural networks.
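The first steps in the abstract (keywords drawn from the finest level of an emotion hierarchy, then weak labels for whatever the search returns) can be sketched as follows. This is only an illustration: the `HIERARCHY` excerpt, the function names, and the shape of `search_results` are assumptions made for the example, and the class names are a small hand-picked sample rather than the class lists used in the patent.

```python
# Hypothetical three-level excerpt (primary -> secondary -> tertiary).
# Illustrative names only; not the patent's actual class lists.
HIERARCHY = {
    "joy": {
        "cheerfulness": ["amusement", "bliss"],
        "zest": ["enthusiasm", "excitement"],
    },
    "anger": {
        "rage": ["fury", "outrage"],
        "irritation": ["annoyance", "aggravation"],
    },
}


def keyword_paths(hierarchy):
    """Map each tertiary keyword to its (primary, secondary, tertiary) path."""
    paths = {}
    for primary, secondaries in hierarchy.items():
        for secondary, tertiaries in secondaries.items():
            for tertiary in tertiaries:
                paths[tertiary] = (primary, secondary, tertiary)
    return paths


def weakly_label(search_results, paths):
    """Give every image returned for a keyword that keyword's label path.

    search_results is assumed to be {keyword: [image_id, ...]}; the labels
    are "weak" because nothing checks that the image really shows the emotion.
    """
    dataset = []
    for keyword, image_ids in search_results.items():
        for image_id in image_ids:
            dataset.append({"image": image_id, "labels": paths[keyword]})
    return dataset


paths = keyword_paths(HIERARCHY)
dataset = weakly_label({"bliss": ["img_001", "img_002"], "fury": ["img_003"]}, paths)
```

Because each weak label carries its full path, the same dataset can later supply coarse or fine targets without being relabeled.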

Patent Claims
20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A computer implemented method comprising: retrieving, by a computer, a set of images and associated tags from a data repository, at least one of the tags being indicative of emotion; selecting, by the computer, a subset of the set of images by removing images based on associated tags; training, by the computer, a convolutional neural network at a first training stage, the first training stage includes applying the subset of images to the convolutional neural network configured to identify a probability that each image is associated with each class of a set of first classes of emotion, thereby adjusting at least one weighting within the convolutional neural network; training, by the computer, the convolutional neural network at a second training stage, the second training stage includes applying the subset of images to the convolutional neural network configured to identify a probability that each image is associated with each class of a set of second classes of emotion, and the set of second classes is greater than the set of first classes; training, by the computer, the convolutional neural network at a third training stage, the third training stage includes applying the subset of images to the convolutional neural network configured to identify a probability that each image is associated with each class of a set of third classes of emotion, and the set of third classes is greater than the set of second classes, each of the first, second, and third training stages causing at least one weighting of the convolutional neural network to be adjusted; inputting, by the computer, a new image into the convolutional neural network; and labelling, by the computer, the new image based on a probability of at least one class from the set of third classes of emotion for the new image.
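The three training stages of claim 1 can be illustrated with a toy model. The sketch below is not the patented implementation: a small hand-written network (`TinyNet`, a hypothetical name) stands in for the CNN, four-number vectors stand in for images, and replacing the output layer at each stage is an assumption about how the network might be reconfigured for a larger class set. What it does show is the claimed pattern: the same samples are applied at every stage, the class set grows from stage to stage, and each stage adjusts the shared weights.

```python
import math
import random

# Hypothetical label paths (primary, secondary, tertiary); illustrative only.
PATHS = [
    ("joy", "cheerfulness", "amusement"), ("joy", "cheerfulness", "bliss"),
    ("joy", "zest", "enthusiasm"), ("joy", "zest", "excitement"),
    ("anger", "rage", "fury"), ("anger", "rage", "outrage"),
    ("anger", "irritation", "annoyance"), ("anger", "irritation", "aggravation"),
]
# Toy "images": three sign bits plus a bias term, one sample per tertiary class.
SAMPLES = [([1.0 if i >> b & 1 else -1.0 for b in (2, 1, 0)] + [1.0], PATHS[i])
           for i in range(8)]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


class TinyNet:
    """Toy stand-in for a CNN: one shared tanh layer and a softmax head."""

    def __init__(self, n_in, n_hidden, rng):
        self.rng = rng
        self.shared = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                       for _ in range(n_hidden)]
        self.head = []

    def new_head(self, n_classes):
        """Replace the output layer; the shared layer keeps its weights."""
        self.head = [[self.rng.uniform(-0.5, 0.5) for _ in self.shared]
                     for _ in range(n_classes)]

    def forward(self, x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in self.shared]
        p = softmax([sum(w * v for w, v in zip(row, h)) for row in self.head])
        return h, p

    def step(self, x, y, lr):
        """One SGD step on the cross-entropy loss for sample (x, y)."""
        h, p = self.forward(x)
        dz = [p[c] - (1.0 if c == y else 0.0) for c in range(len(p))]
        # Gradient w.r.t. hidden activations, taken before the head changes.
        dh = [sum(dz[c] * self.head[c][j] for c in range(len(p)))
              for j in range(len(h))]
        for c, row in enumerate(self.head):
            for j in range(len(h)):
                row[j] -= lr * dz[c] * h[j]
        for j, row in enumerate(self.shared):
            g = dh[j] * (1.0 - h[j] ** 2)  # derivative of tanh
            for i in range(len(x)):
                row[i] -= lr * g * x[i]


def train_curriculum(net, samples, n_levels=3, epochs=200, lr=0.1):
    """Train on the coarsest labels first, then on progressively finer ones."""
    classes = []
    for level in range(n_levels):
        classes = sorted({path[level] for _, path in samples})
        net.new_head(len(classes))  # output layer sized for this stage
        for _ in range(epochs):
            for x, path in samples:
                net.step(x, classes.index(path[level]), lr)
    return classes  # class list of the final, finest stage


net = TinyNet(n_in=4, n_hidden=6, rng=random.Random(0))
final_classes = train_curriculum(net, SAMPLES)
_, probs = net.forward(SAMPLES[0][0])          # "new image" stand-in
label = final_classes[probs.index(max(probs))]  # most probable fine class
```

With this toy hierarchy the stages use 2, 4, and 8 classes. Only the shared layer carries information from one stage to the next here; a real CNN would carry its convolutional weights in the same role.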

2. The method of claim 1, wherein selecting the subset of the set of images comprises: cleaning, by the computer, associated tags to generate a second set of associated tags.

3. The method of claim 2, wherein cleaning associated tags comprises: removing, by the computer, associated tags that are unrelated to an image in the subset of the set of images.

4. The method of claim 2, wherein cleaning associated tags comprises: removing, by the computer, non-English tags from the associated tags.
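The tag cleaning of claims 2 to 4, together with the image selection step of claim 1, might look roughly like the sketch below. The claims do not say how "non-English" or "unrelated" is decided, so both tests here are stand-in assumptions: a tag counts as non-English if it is not made of plain ASCII letters, and as unrelated if it is missing from a caller-supplied vocabulary. The function names and the `{"id": ..., "tags": [...]}` record shape are also hypothetical.

```python
def clean_tags(tags, vocabulary):
    """Return a second, cleaned tag list: lowercased, deduplicated, filtered."""
    cleaned = []
    for tag in tags:
        t = tag.strip().lower()
        # Assumption: crude non-English filter, ASCII letters and spaces only.
        if not t.isascii() or not t.replace(" ", "").isalpha():
            continue
        # Assumption: a tag outside the vocabulary is treated as unrelated.
        if t not in vocabulary:
            continue
        if t not in cleaned:
            cleaned.append(t)
    return cleaned


def select_subset(images, vocabulary):
    """Keep only images that still have at least one tag after cleaning."""
    subset = []
    for image in images:
        tags = clean_tags(image["tags"], vocabulary)
        if tags:
            subset.append({"id": image["id"], "tags": tags})
    return subset


vocabulary = {"happy", "sad", "angry"}
images = [
    {"id": "a", "tags": ["Happy ", "счастье", "IMG_2041", "happy", "beach"]},
    {"id": "b", "tags": ["beach", "sunset"]},
]
subset = select_subset(images, vocabulary)
```

A production filter would likely use a language-identification step and a richer relevance test; the point here is only the order of operations, cleaning tags first and then dropping images left with none.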

5. The method of claim 1, further comprising: querying, by the computer, the data repository through an internet search using one or more search keywords.

6. The method of claim 1, wherein the associated tags include user-generated tags.

7. The method of claim 1, wherein the set of second classes of emotion is more granular than the set of first classes of emotion.

8. The method of claim 1, wherein the set of third classes of emotion is more granular than the set of second classes of emotion.

9. The method of claim 1, further comprising: receiving, by a computer, a second set of images; inputting, by the computer, the second set of images into the convolutional neural network; labelling, by the computer, each image in the second set of images with one or more classes of emotion; and sorting, by the computer, the second set of images based on the one or more classes of emotions labelled to each image in the second set of images.
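One simple reading of the sorting step in claim 9 is to group images by their predicted class and order the groups by class name. The sketch below assumes a `predict` callable that returns a single emotion label per image; it is a placeholder for the trained network, and the lookup table used in the example is made up.

```python
from collections import defaultdict


def sort_by_emotion(image_ids, predict):
    """Group images by predicted emotion, with groups in alphabetical order."""
    groups = defaultdict(list)
    for image_id in image_ids:
        groups[predict(image_id)].append(image_id)
    return {emotion: groups[emotion] for emotion in sorted(groups)}


# Hypothetical predictions standing in for the trained network's output.
fake_predictions = {"img_1": "joy", "img_2": "anger", "img_3": "joy"}
albums = sort_by_emotion(["img_1", "img_2", "img_3"], fake_predictions.get)
```

If the network returns several classes per image, the same loop could append the image to each predicted group instead of one.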

10. The method of claim 1, further comprising: receiving, by the computer, a real time video feed; selecting, by the computer, a frame in the real time video feed; inputting, by the computer, the selected frame in the real time video feed into the convolutional neural network; and labelling, by the computer, the selected frame with at least one class of emotion.
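Claim 10 does not say how a frame is selected from the feed. As a minimal sketch, assuming every Nth frame is taken, the loop could be written as a generator; `frames` is any iterable of frames and `predict` is again a placeholder for the trained network.

```python
def label_stream(frames, predict, every_n=30):
    """Yield (frame_index, emotion) for every Nth frame of a feed.

    Sampling every Nth frame is an assumption; any other selection rule
    (scene changes, face detection, fixed time intervals) would fit the
    same structure.
    """
    for index, frame in enumerate(frames):
        if index % every_n == 0:
            yield index, predict(frame)


# Integers stand in for frames, and a lambda stands in for the network.
labels = list(label_stream(range(5), lambda f: "joy" if f < 2 else "anger", every_n=2))
```

Because the function is a generator, it processes frames as they arrive and never needs the whole feed in memory.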

11. A computer implemented method comprising: receiving, by a convolutional neural network hosted on a computer, an input from a graphical user interface of a new image and an associated tag; generating, by the convolutional neural network hosted on the computer, for the new image an emotion tag corresponding to one or more classes of emotions, whereby the convolutional neural network is trained with a plurality of stages that identify a probability that a training image is associated with a hierarchical class of emotions comprised of a lower and a higher hierarchical set of classes of emotions, wherein the lower hierarchical set is more granular than the higher hierarchical set, each stage causing an adjusted weighting of the convolutional neural network, and whereby the convolutional neural network uses the associated tag to generate a probability that the new image is associated with a class of emotion within the hierarchical class of emotions; and outputting, by the computer, the emotion tag for display on a revised graphical user interface.

12. The method of claim 11, further comprising: cleaning, by the computer, a set of unfiltered tags associated with the new image to generate a set of filtered tags containing the associated tag.

13. The method of claim 12, wherein cleaning further comprises: removing, by the computer, one or more unfiltered tags that are unrelated to the new image.

14. The method of claim 12, wherein cleaning further comprises: removing, by the computer, non-English tags from the unfiltered tags.

15. The method of claim 12, wherein the set of unfiltered tags includes user-generated tags.

16. The method of claim 11, wherein the plurality of stages comprises at least three stages.

17. The method of claim 11, wherein within the hierarchical class of emotions, a different hierarchical set of classes of emotion is more granular than the lower hierarchical set of classes of emotions.

18. The method of claim 11, wherein the emotion tag is the same as the associated tag.

19. A computer readable non-transitory medium containing one or more computer instructions, which when executed by a processor cause the processor to: select a subset of a set of images by filtering images based on associated tags; iteratively train a neural network at a plurality of stages, each stage applying the convolutional neural network to identify a probability that each image is associated with an emotion in a class of emotions, each stage applying a new class of emotions with more emotions than in a previous stage, and each stage causing an adjusted weighting of the neural network; and label a new image inputted into the neural network based on a probability of at least one emotion from a class of emotions from the latest stage.

20. The computer readable non-transitory medium of claim 19, wherein when selecting the subset of the set of images, the one or more computer instructions further cause the processor to clean associated tags to generate a second set of associated tags.

Patent Metadata

Filing Date: May 15, 2018
Publication Date: February 9, 2021

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Systems and methods for hierarchical webly supervised training for recognizing emotions in images” (US-10915798). https://patentable.app/patents/US-10915798
