{"schema_version":"1.0","canonical_url":"https://patentable.app/patents/US-11244205","patent":{"patent_number":"US-11244205","title":"Generating multi modal image representation for an image","assignee":null,"inventors":[],"filing_date":"2019-03-29T00:00:00.000Z","publication_date":"2022-02-08T00:00:00.000Z","cpc_codes":["G06N","G06F","G06F","G06F","G06N","G06N","G06N","G06V","G06V","G06V","G06V","G06V","G06V","G06V"],"num_claims":20,"abstract":"Technologies for generating a multi-modal representation of an image based on the image content are provided. The disclosed techniques include receiving an image, to be classified, that comprises one or more embedded text characters. The one or more embedded text characters are identified from the image and a first machine learning model is used to generate a text vector that represents a numerical representation of the one or more embedded text characters. A second machine learning model is used to generate an image vector that represents a numerical representation of the graphical portion of the image. The text vector and the image vector are used as input to generate a multi-modal vector that contains information from both the text vector and the image vector. The image may be classified into one of a plurality of image classifications based upon the information in the multi-modal vector."},"analysis":{"summary":null,"layman_explanation":null,"technical_analysis":null,"business_analysis":null,"faqs":null,"topics":[],"tech_cluster":null},"seo":{"title":"Generating multi modal image representation for an image","description":"Technologies for generating a multi-modal representation of an image based on the image content are provided. The disclosed techniques include receiving an image, to be classified, that comprises one ","keywords":[]},"attribution":{"source":"Patentable","source_url":"https://patentable.app","canonical_url":"https://patentable.app/patents/US-11244205","license":"CC-BY-4.0-like","license_terms":"AI-generated analysis on this page (summary, layman_explanation, technical_analysis, business_analysis, faqs) may be reused with attribution and a visible link back to the canonical URL above. Patent abstracts, claims, and bibliographic data are USPTO public domain.","required_link":"https://patentable.app/patents/US-11244205","citation_suggestion":"Patentable. \"Generating multi modal image representation for an image\" (US-11244205). https://patentable.app/patents/US-11244205","copyright_holder":"Nomic Interactive Technology LLC"},"links":{"html":"https://patentable.app/patents/US-11244205","json":"https://patentable.app/api/llm-context/US-11244205","site":"https://patentable.app","llms_txt":"https://patentable.app/llms.txt"},"generated_at":"2026-05-30T16:38:07.902Z"}