{"schema_version":"1.0","canonical_url":"https://patentable.app/patents/US-8515745","patent":{"patent_number":"US-8515745","title":"Selecting speech data for speech recognition vocabulary","assignee":null,"inventors":[],"filing_date":"2012-08-24T00:00:00.000Z","publication_date":"2013-08-20T00:00:00.000Z","cpc_codes":["G10L","G06F","G10L","G06F"],"num_claims":24,"abstract":"Methods, systems, and apparatus for selecting training data. In an aspect, a method comprises: obtaining search session data comprising search sessions that include search queries, wherein each search query comprises words; determining a threshold out of vocabulary rate indicating a rate at which a word in a search query is not included in a vocabulary; determining a threshold session out of vocabulary rate, the session out of vocabulary rate indicating a rate at which search sessions have an out of vocabulary rate that meets the threshold out of vocabulary rate; selecting a vocabulary of words that, for a set of test data, has a session out of vocabulary rate that meets the threshold session out of vocabulary rate, the vocabulary of words being selected from the one or more words included in each of the search queries included in the search sessions."},"analysis":{"summary":null,"layman_explanation":null,"technical_analysis":null,"business_analysis":null,"faqs":null,"topics":[],"tech_cluster":null},"seo":{"title":"Selecting speech data for speech recognition vocabulary","description":"Methods, systems, and apparatus for selecting training data. In an aspect, a method comprises: obtaining search session data comprising search sessions that include search queries, wherein each search","keywords":[]},"attribution":{"source":"Patentable","source_url":"https://patentable.app","canonical_url":"https://patentable.app/patents/US-8515745","license":"CC-BY-4.0-like","license_terms":"AI-generated analysis on this page (summary, layman_explanation, technical_analysis, business_analysis, faqs) may be reused with attribution and a visible link back to the canonical URL above. Patent abstracts, claims, and bibliographic data are USPTO public domain.","required_link":"https://patentable.app/patents/US-8515745","citation_suggestion":"Patentable. \"Selecting speech data for speech recognition vocabulary\" (US-8515745). https://patentable.app/patents/US-8515745","copyright_holder":"Nomic Interactive Technology LLC"},"links":{"html":"https://patentable.app/patents/US-8515745","json":"https://patentable.app/api/llm-context/US-8515745","site":"https://patentable.app","llms_txt":"https://patentable.app/llms.txt"},"generated_at":"2026-05-30T16:38:06.895Z"}