{"schema_version":"1.0","canonical_url":"https://patentable.app/patents/US-11487999","patent":{"patent_number":"US-11487999","title":"Spatial-temporal reasoning through pretrained language models for video-grounded dialogues","assignee":null,"inventors":[],"filing_date":"2020-04-28T00:00:00.000Z","publication_date":"2022-11-01T00:00:00.000Z","cpc_codes":["H04N","G06F","G06N","G06N","G06N","G06N","G06V","G06V","G06V","G06V","H04N","H04N","H04N","G06F"],"num_claims":20,"abstract":"A system and method for generating a response in a video grounded dialogue are provided. A video-grounded dialogue neural network language model receives video input and text input. The text input includes a dialogue history between the model and a human user and a current utterance by the user. Encoded video input is generated using video encoding layers. Encoded text input is generated using text encoding layers. The encoded video input and the encoded text input are concatenated into a single input sequence. A generative pre-trained transformer model generates the response to the current utterance from the single input sequence."},"analysis":{"summary":null,"layman_explanation":null,"technical_analysis":null,"business_analysis":null,"faqs":null,"topics":[],"tech_cluster":null},"seo":{"title":"Spatial-temporal reasoning through pretrained language models for video-grounded dialogues","description":"A system and method for generating a response in a video grounded dialogue are provided. A video-grounded dialogue neural network language model receives video input and text input. The text input inc","keywords":[]},"attribution":{"source":"Patentable","source_url":"https://patentable.app","canonical_url":"https://patentable.app/patents/US-11487999","license":"CC-BY-4.0-like","license_terms":"AI-generated analysis on this page (summary, layman_explanation, technical_analysis, business_analysis, faqs) may be reused with attribution and a visible link back to the canonical URL above.\nPatent abstracts, claims, and bibliographic data are USPTO public domain.","required_link":"https://patentable.app/patents/US-11487999","citation_suggestion":"Patentable. \"Spatial-temporal reasoning through pretrained language models for video-grounded dialogues\" (US-11487999). https://patentable.app/patents/US-11487999","copyright_holder":"Nomic Interactive Technology LLC"},"links":{"html":"https://patentable.app/patents/US-11487999","json":"https://patentable.app/api/llm-context/US-11487999","site":"https://patentable.app","llms_txt":"https://patentable.app/llms.txt"},"generated_at":"2026-05-30T21:22:38.898Z"}