{"schema_version":"1.0","canonical_url":"https://patentable.app/patents/US-9753964","patent":{"patent_number":"US-9753964","title":"Similarity clustering in linear time with error-free retrieval using signature overlap with signature size matching","assignee":null,"inventors":[],"filing_date":"2017-01-19T00:00:00.000Z","publication_date":"2017-09-05T00:00:00.000Z","cpc_codes":["G06F","G06F","G06F","G06F","G06F"],"num_claims":27,"abstract":"A method for a processing device to determine whether to assign a data item to at least one cluster of data items is disclosed. The processing device may identify a signature of the data item, the signature including a set of elements. The processing device may derive a first size value of the number of elements of the identified signature based on a set of size values of signatures that includes a maximum size value representing the largest number of elements in a signature. The processing device may derive a second size value of the number of elements of a second signature that is similar to the identified signature based on the set of size values of signatures. The processing device may select a subset of the set of elements of the identified signature to form at least one partial signature of the identified signature wherein the number of elements in the partial signature represents the number of elements in common between a signature having the first size value and a second similar signature having the second size value. The processing device may combine the selected subset of elements into at least one token. The processing device may determine whether the at least one token is present in a memory, the memory configured to contain an existing set of tokens. The processing device may determine whether to assign the data item to at least one cluster based on whether the at least one token is present in the memory."},"analysis":{"summary":null,"layman_explanation":null,"technical_analysis":null,"business_analysis":null,"faqs":null,"topics":[],"tech_cluster":null},"seo":{"title":"Similarity clustering in linear time with error-free retrieval using signature overlap with signature size matching","description":"A method for a processing device to determine whether to assign a data item to at least one cluster of data items is disclosed. The processing device may identify a signature of the data item, the sig","keywords":[]},"attribution":{"source":"Patentable","source_url":"https://patentable.app","canonical_url":"https://patentable.app/patents/US-9753964","license":"CC-BY-4.0-like","license_terms":"AI-generated analysis on this page (summary, layman_explanation, technical_analysis, business_analysis, faqs) may be reused with attribution and a visible link back to the canonical URL above. Patent abstracts, claims, and bibliographic data are USPTO public domain.","required_link":"https://patentable.app/patents/US-9753964","citation_suggestion":"Patentable. \"Similarity clustering in linear time with error-free retrieval using signature overlap with signature size matching\" (US-9753964). https://patentable.app/patents/US-9753964","copyright_holder":"Nomic Interactive Technology LLC"},"links":{"html":"https://patentable.app/patents/US-9753964","json":"https://patentable.app/api/llm-context/US-9753964","site":"https://patentable.app","llms_txt":"https://patentable.app/llms.txt"},"generated_at":"2026-06-06T05:36:44.427Z"}