{"schema_version":"1.0","canonical_url":"https://patentable.app/patents/US-8509525","patent":{"patent_number":"US-8509525","title":"Clustering of forms from large-scale scanned-document collection","assignee":null,"inventors":[],"filing_date":"2011-04-06T00:00:00.000Z","publication_date":"2013-08-13T00:00:00.000Z","cpc_codes":["G06V","G06F","G06V","G06V"],"num_claims":19,"abstract":"Techniques for identifying documents sharing common underlying structures in a large collection of documents and processing the documents using the identified structures are disclosed. Images of the document collection are processed to detect occurrences of a predetermined set of image features that are common or similar among forms. The images are then indexed in an image index based on the detected image features. A graph of nodes is built. Nodes in the graph represent images and are connected to nodes representing similar document images by edges. Documents sharing common underlying structures are identified by gathering strongly inter-connected nodes in the graph. The identified documents are processed based at least in part on the resulting clusters."},"analysis":{"summary":null,"layman_explanation":null,"technical_analysis":null,"business_analysis":null,"faqs":null,"topics":[],"tech_cluster":null},"seo":{"title":"Clustering of forms from large-scale scanned-document collection","description":"Techniques for identifying documents sharing common underlying structures in a large collection of documents and processing the documents using the identified structures are disclosed. Images of the d","keywords":[]},"attribution":{"source":"Patentable","source_url":"https://patentable.app","canonical_url":"https://patentable.app/patents/US-8509525","license":"CC-BY-4.0-like","license_terms":"AI-generated analysis on this page (summary, layman_explanation, technical_analysis, business_analysis, faqs) may be reused with attribution and a visible link back to the canonical URL above. \nPatent abstracts, claims, and bibliographic data are USPTO public domain.","required_link":"https://patentable.app/patents/US-8509525","citation_suggestion":"Patentable. \"Clustering of forms from large-scale scanned-document collection\" (US-8509525). https://patentable.app/patents/US-8509525","copyright_holder":"Nomic Interactive Technology LLC"},"links":{"html":"https://patentable.app/patents/US-8509525","json":"https://patentable.app/api/llm-context/US-8509525","site":"https://patentable.app","llms_txt":"https://patentable.app/llms.txt"},"generated_at":"2026-05-30T16:46:08.091Z"}