{"schema_version":"1.0","canonical_url":"https://patentable.app/patents/US-11521372","patent":{"patent_number":"US-11521372","title":"Utilizing machine learning models, position based extraction, and automated data labeling to process image-based documents","assignee":null,"inventors":[],"filing_date":"2020-03-20T00:00:00.000Z","publication_date":"2022-12-06T00:00:00.000Z","cpc_codes":["G06V","G06F","G06F","G06N","G06N","G06N","G06N","G06V","G06V","G06V","G06V","G06V","G06V","G06V","G06V","G06N","G06N","G06N","G06N","G06V","G06V"],"num_claims":20,"abstract":"A device may receive image data that includes an image of a document and lexicon data identifying a lexicon, and may perform an extraction technique on the image data to identify at least one field in the document. The device may utilize form segmentation to automatically generate label data identifying labels for the image data, and may process the image data, the label data, and data identifying the at least one field, with a first model, to identify visual features. The device may process the image data and the visual features, with a second model, to identify sequences of characters, and may process the image data and the sequences of characters, with a third model, to identify strings of characters. The device may compare the lexicon data and the strings of characters to generate verified strings of characters that may be utilized to generate a digitized document."},"analysis":{"summary":null,"layman_explanation":null,"technical_analysis":null,"business_analysis":null,"faqs":null,"topics":[],"tech_cluster":null},"seo":{"title":"Utilizing machine learning models, position based extraction, and automated data labeling to process image-based documents","description":"A device may receive image data that includes an image of a document and lexicon data identifying a lexicon, and may perform an extraction technique on the image data to identify at least one field in","keywords":[]},"attribution":{"source":"Patentable","source_url":"https://patentable.app","canonical_url":"https://patentable.app/patents/US-11521372","license":"CC-BY-4.0-like","license_terms":"AI-generated analysis on this page (summary, layman_explanation, technical_analysis, business_analysis, faqs) may be reused with attribution and a visible link back to the canonical URL above. Patent abstracts, claims, and bibliographic data are USPTO public domain.","required_link":"https://patentable.app/patents/US-11521372","citation_suggestion":"Patentable. \"Utilizing machine learning models, position based extraction, and automated data labeling to process image-based documents\" (US-11521372). https://patentable.app/patents/US-11521372","copyright_holder":"Nomic Interactive Technology LLC"},"links":{"html":"https://patentable.app/patents/US-11521372","json":"https://patentable.app/api/llm-context/US-11521372","site":"https://patentable.app","llms_txt":"https://patentable.app/llms.txt"},"generated_at":"2026-05-31T08:37:10.860Z"}