{"schema_version":"1.0","canonical_url":"https://patentable.app/patents/US-8489645","patent":{"patent_number":"US-8489645","title":"Techniques for estimating item frequencies in large data sets","assignee":null,"inventors":[],"filing_date":"2004-09-27T00:00:00.000Z","publication_date":"2013-07-16T00:00:00.000Z","cpc_codes":["G06Q","G06F","G06Q"],"num_claims":14,"abstract":"Techniques for estimating items (e.g., data item or objects) frequencies in large data sets are disclosed. For example, a technique for determining items and their frequencies at multiple levels of interest in a collection of nested bags includes the following steps. A hierarchy of a plurality of levels of nested bags and the levels of interest are inputted. Among the plurality of levels, a subset of bags is sampled from at least one level. At each level of interest, the frequency is counted of each distinct item in the bags obtained in the sampling step. At each level of interest, the item frequencies obtained in the counting step are extrapolated based on sampling ratios associated with the sampling step. At each level of interest, the items are sorted according to their frequencies obtained from the extrapolating step and those items with highest frequencies are retained. A bag may refer to one or more subsets or groups of data items or objects. Also, a bag may, itself, contain one or more other bags."},"analysis":{"summary":null,"layman_explanation":null,"technical_analysis":null,"business_analysis":null,"faqs":null,"topics":[],"tech_cluster":null},"seo":{"title":"Techniques for estimating item frequencies in large data sets","description":"Techniques for estimating items (e.g., data item or objects) frequencies in large data sets are disclosed. For example, a technique for determining items and their frequencies at multiple levels of in","keywords":[]},"attribution":{"source":"Patentable","source_url":"https://patentable.app","canonical_url":"https://patentable.app/patents/US-8489645","license":"CC-BY-4.0-like","license_terms":"AI-generated analysis on this page (summary, layman_explanation, technical_analysis, business_analysis, faqs) may be reused with attribution and a visible link back to the canonical URL above. Patent abstracts, claims, and bibliographic data are USPTO public domain.","required_link":"https://patentable.app/patents/US-8489645","citation_suggestion":"Patentable. \"Techniques for estimating item frequencies in large data sets\" (US-8489645). https://patentable.app/patents/US-8489645","copyright_holder":"Nomic Interactive Technology LLC"},"links":{"html":"https://patentable.app/patents/US-8489645","json":"https://patentable.app/api/llm-context/US-8489645","site":"https://patentable.app","llms_txt":"https://patentable.app/llms.txt"},"generated_at":"2026-05-30T15:35:45.658Z"}