{"schema_version":"1.0","canonical_url":"https://patentable.app/patents/US-11514691","patent":{"patent_number":"US-11514691","title":"Generating training sets to train machine learning models","assignee":null,"inventors":[],"filing_date":"2019-06-12T00:00:00.000Z","publication_date":"2022-11-29T00:00:00.000Z","cpc_codes":["G06N","G06F","G06F","G06F","G06F","G06N","G06V","G06V","G06V"],"num_claims":18,"abstract":"A computer system trains a machine learning model. A vector representation is generated for each document in a collection of documents. The documents are clustered based on the vector representations of the documents to produce a plurality of clusters. A training set is produced by selecting one or more documents from each cluster, wherein the selected documents represent a sample of the collection of documents to train the machine learning model. The machine learning model is trained by applying the training set to the machine learning model. Embodiments of the present invention further include a method and program product for training a machine learning model in substantially the same manner described above."},"analysis":{"summary":null,"layman_explanation":null,"technical_analysis":null,"business_analysis":null,"faqs":null,"topics":[],"tech_cluster":null},"seo":{"title":"Generating training sets to train machine learning models","description":"A computer system trains a machine learning model. A vector representation is generated for each document in a collection of documents. The documents are clustered based on the vector representations ","keywords":[]},"attribution":{"source":"Patentable","source_url":"https://patentable.app","canonical_url":"https://patentable.app/patents/US-11514691","license":"CC-BY-4.0-like","license_terms":"AI-generated analysis on this page (summary, layman_explanation, technical_analysis, business_analysis, faqs) may be reused with attribution and a visible link back to the canonical URL above. \nPatent abstracts, claims, and bibliographic data are USPTO public domain.","required_link":"https://patentable.app/patents/US-11514691","citation_suggestion":"Patentable. \"Generating training sets to train machine learning models\" (US-11514691). https://patentable.app/patents/US-11514691","copyright_holder":"Nomic Interactive Technology LLC"},"links":{"html":"https://patentable.app/patents/US-11514691","json":"https://patentable.app/api/llm-context/US-11514691","site":"https://patentable.app","llms_txt":"https://patentable.app/llms.txt"},"generated_at":"2026-05-30T15:57:02.611Z"}