{"schema_version":"1.0","canonical_url":"https://patentable.app/patents/US-11514370","patent":{"patent_number":"US-11514370","title":"Selective batching for inference system for transformer-based generation tasks","assignee":null,"inventors":[],"filing_date":"2021-12-03T00:00:00.000Z","publication_date":"2022-11-29T00:00:00.000Z","cpc_codes":["G06N","G06N","G06F","G06F","G06N","G06N"],"num_claims":18,"abstract":"An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a subset of operations in the transformer model but processing requests in the batch individually for a subset of operations in the transformer model. In one embodiment, the operation to be processed individually is an attention operation of an encoder or a decoder of the transformer model. By selective batching, the inference system can allow batching operations to be performed for a batch of requests with variable input or target length or internal state length to utilize the parallel computation capabilities of hardware accelerators while preventing unnecessary computations that occur for workarounds that restrain the data of a batch of requests to a same length."},"analysis":{"summary":null,"layman_explanation":null,"technical_analysis":null,"business_analysis":null,"faqs":null,"topics":[],"tech_cluster":null},"seo":{"title":"Selective batching for inference system for transformer-based generation tasks","description":"An inference system applies a machine-learning transformer model to a batch of requests with variable input length or variable target length or variable internal state length by selectively batching a","keywords":[]},"attribution":{"source":"Patentable","source_url":"https://patentable.app","canonical_url":"https://patentable.app/patents/US-11514370","license":"CC-BY-4.0-like","license_terms":"AI-generated analysis on this page (summary, layman_explanation, technical_analysis, business_analysis, faqs) may be reused with attribution and a visible link back to the canonical URL above. Patent abstracts, claims, and bibliographic data are USPTO public domain.","required_link":"https://patentable.app/patents/US-11514370","citation_suggestion":"Patentable. \"Selective batching for inference system for transformer-based generation tasks\" (US-11514370). https://patentable.app/patents/US-11514370","copyright_holder":"Nomic Interactive Technology LLC"},"links":{"html":"https://patentable.app/patents/US-11514370","json":"https://patentable.app/api/llm-context/US-11514370","site":"https://patentable.app","llms_txt":"https://patentable.app/llms.txt"},"generated_at":"2026-05-30T22:39:22.485Z"}