US-11521608

Methods and systems for correcting, based on speech, input generated using automatic speech recognition

Published: December 6, 2022
Assignee: not available in USPTO data we have
Inventors: not available in USPTO data we have
Technical Abstract

Methods and systems for correcting, based on subsequent second speech, an error in an input generated from first speech using automatic speech recognition, without an explicit indication in the second speech that a user intended to correct the input with the second speech, include determining that a time difference between when search results in response to the input were displayed and when the second speech was received is less than a threshold time, and based on the determination, correcting the input based on the second speech. The methods and systems also include determining that a difference in acceleration of a user input device, used to input the first speech and second speech, between when the search results in response to the input were displayed and when the second speech was received is less than a threshold acceleration, and based on the determination, correcting the input based on the second speech.
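The two checks described in the abstract can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names, the use of seconds as the time unit, and the use of a single scalar acceleration reading are all assumptions made for the example.

```python
def within_time_threshold(results_displayed_at: float,
                          second_speech_at: float,
                          threshold_time: float) -> bool:
    """True when the second speech arrived soon enough after the results
    were displayed. Times are in seconds (an assumption of this sketch)."""
    elapsed = second_speech_at - results_displayed_at
    # Speech that precedes the display cannot be a reaction to it.
    return 0.0 <= elapsed < threshold_time


def within_acceleration_threshold(accel_at_display: float,
                                  accel_at_speech: float,
                                  threshold_accel: float) -> bool:
    """True when the input device's acceleration changed by less than the
    threshold between the display time and the second speech.
    A scalar reading is assumed here for simplicity."""
    return abs(accel_at_speech - accel_at_display) < threshold_accel


# Hypothetical readings: results shown at t=10.0 s, second speech at t=12.5 s.
if within_time_threshold(10.0, 12.5, threshold_time=5.0):
    print("treat the second speech as a correction of the input")
```

When either check passes, the abstract says the input is corrected based on the second speech; how the corrected input is produced is not described on this page and is left out of the sketch.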

Patent Claims
8 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 2

Original Legal Text

2. The method of claim 1, wherein determining that no input associated with the browsing search results was received via the user input device between the first time and the second time comprises determining that no input to scroll through the browsing search results, read descriptions of the browsing search results, open the browsing search results, or play the browsing search results was received via the user input device between the first time and the second time.

Plain English Translation

This claim narrows the step in claim 1 that checks whether the user interacted with the displayed search results before speaking again. Per the abstract, the patent concerns deciding whether a second utterance is meant to correct an input that automatic speech recognition produced from a first utterance, without the user explicitly saying so. Under claim 2, determining that no input associated with browsing the search results was received between the first time and the second time means confirming that the user input device received no input to scroll through the results, read their descriptions, open them, or play them during that interval. Claim 1, which defines the first time and the second time, is not reproduced on this page; claims 8 and 9 tie the second time to when the second speech was received and the first time to when the search results were generated for display.
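The check in claim 2 can be sketched as a filter over timestamped input events. The event type, the action labels, and the strict treatment of "between" are assumptions of this example and are not taken from the patent.

```python
from typing import Iterable, NamedTuple


class InputEvent(NamedTuple):
    timestamp: float  # seconds; the unit is an assumption of this sketch
    action: str       # hypothetical label for what the input did


# The four kinds of browsing input named in claim 2 (labels are hypothetical).
BROWSING_ACTIONS = frozenset({"scroll", "read_description", "open", "play"})


def no_browsing_input_between(events: Iterable[InputEvent],
                              first_time: float,
                              second_time: float) -> bool:
    """True when no scroll/read/open/play input falls strictly between
    first_time and second_time."""
    return not any(
        first_time < event.timestamp < second_time
        and event.action in BROWSING_ACTIONS
        for event in events
    )


events = [InputEvent(11.0, "volume_up"), InputEvent(20.0, "scroll")]
print(no_browsing_input_between(events, 10.0, 15.0))  # the scroll is outside the interval
```

Inputs unrelated to browsing the results, such as the hypothetical `volume_up` event, do not count against the determination in this sketch.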

Claim 6

Original Legal Text

6. The method of claim 1, further comprising adjusting the threshold time based on an average time between inputs associated with a user.

Plain English Translation

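This claim adds a step to the method of claim 1: the threshold time is adjusted based on the average time between inputs associated with a user. Per the abstract, the threshold time is the limit against which the delay between displaying the search results and receiving the second speech is compared, so the window adapts to how quickly that user typically provides inputs.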
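One way to adjust the threshold is sketched below. The claim only says the adjustment is based on the average time between a user's inputs; the scaling factor, the floor and ceiling, and the fallback to a base threshold are hypothetical choices made for this example.

```python
from statistics import mean
from typing import Sequence


def adjusted_threshold(input_times: Sequence[float],
                       base_threshold: float,
                       scale: float = 1.5,
                       floor: float = 1.0,
                       ceiling: float = 30.0) -> float:
    """Return a threshold time derived from the user's average gap between
    inputs. scale, floor, and ceiling are hypothetical tuning parameters."""
    if len(input_times) < 2:
        # Not enough history to compute a gap; keep the existing threshold.
        return base_threshold
    ordered = sorted(input_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    # Clamp so that one unusually long or short gap cannot dominate.
    return min(ceiling, max(floor, scale * mean(gaps)))


# A user who provides an input every 2 seconds gets a 3-second window.
print(adjusted_threshold([0.0, 2.0, 4.0], base_threshold=5.0))
```

Clamping is a design choice of the sketch: it keeps the window usable for users with very sparse or very rapid input histories.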

Claim 8

Original Legal Text

8. The method of claim 1, wherein determining the second time when the second speech was received comprises measuring, via the user input device, a time when an earliest pronunciation subsequent to the first time was received.

Plain English Translation

This claim specifies how the second time in claim 1 is determined. The second time, meaning when the second speech was received, is found by measuring, via the user input device, the time at which the earliest pronunciation after the first time was received. In other words, the second speech is timestamped at its onset. Per the abstract, this timestamp is compared with the time the search results were displayed to decide whether the second speech should be used to correct the input generated from the first speech.
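The measurement in claim 8 can be illustrated with a simple onset search over timestamped audio frames. The claim does not say how a pronunciation is detected; the energy threshold used here is a stand-in technique chosen for the example, and the frame format is hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Each frame is (timestamp_seconds, energy); both fields are assumptions.
Frame = Tuple[float, float]


def earliest_pronunciation_time(frames: Sequence[Frame],
                                first_time: float,
                                energy_threshold: float = 0.1) -> Optional[float]:
    """Return the timestamp of the first frame after first_time whose
    energy reaches the threshold, or None when there is no such frame."""
    for timestamp, energy in sorted(frames):
        if timestamp > first_time and energy >= energy_threshold:
            return timestamp
    return None


frames = [(0.0, 0.5), (1.0, 0.02), (2.0, 0.3), (3.0, 0.6)]
# Speech before first_time (the frame at 0.0) is ignored.
print(earliest_pronunciation_time(frames, first_time=0.5))
```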

Claim 9

Original Legal Text

9. The method of claim 1, wherein determining the first time when the browsing search results were generated for display comprises detecting, using the control circuitry, a time when signals transmitted to pixels of a display screen first changed subsequent to the first time.

Plain English Translation

This claim specifies how the first time in claim 1 is determined. The first time, meaning when the browsing search results were generated for display, is found by detecting, using the control circuitry, the time at which the signals transmitted to the pixels of a display screen first changed. The claim as reproduced here describes that change as occurring "subsequent to the first time". Per the abstract, the display time is one end of the interval that is compared against the threshold time when deciding whether the second speech should correct the input.
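The detection in claim 9 can be approximated in software by comparing successive snapshots of pixel values. This is only an illustration: the claim refers to signals transmitted to display pixels by control circuitry, whereas the sketch assumes hypothetical time-ordered `(timestamp, pixels)` snapshots held in memory.

```python
from typing import Optional, Sequence, Tuple

# Each snapshot is (timestamp_seconds, pixel_values); the format is hypothetical.
Snapshot = Tuple[float, Tuple[int, ...]]


def first_display_change_time(snapshots: Sequence[Snapshot],
                              after_time: float) -> Optional[float]:
    """Return the timestamp of the first snapshot later than after_time
    whose pixels differ from the preceding snapshot, or None.
    Snapshots are assumed to be in time order."""
    previous_pixels = None
    for timestamp, pixels in snapshots:
        changed = previous_pixels is not None and pixels != previous_pixels
        if changed and timestamp > after_time:
            return timestamp
        previous_pixels = pixels
    return None


snapshots = [(0.0, (0, 0)), (1.0, (0, 0)), (2.0, (0, 255)), (3.0, (255, 255))]
print(first_display_change_time(snapshots, after_time=0.5))
```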

Claim 11

Original Legal Text

11. The system of claim 10, wherein the control circuitry is configured, when determining that no input associated with the browsing search results was received via the user input device between the first time and the second time, to determine that no input to scroll through the browsing search results, read descriptions of the browsing search results, open the browsing search results, or play the browsing search results was received via the user input device between the first time and the second time.

Plain English Translation

This is the system counterpart of claim 2. In the system of claim 10, the control circuitry determines that no input associated with browsing the search results was received between the first time and the second time by confirming that the user input device received no input to scroll through the results, read their descriptions, open them, or play them during that interval. Per the abstract, the surrounding system decides whether a second utterance should correct an input that automatic speech recognition generated from a first utterance, without an explicit indication from the user. Claim 10 is not reproduced on this page.

Claim 15

Original Legal Text

15. The system of claim 10, wherein the control circuitry is further configured to adjust the threshold time based on an average time between inputs associated with a user.

Plain English Translation

This is the system counterpart of claim 6. In the system of claim 10, the control circuitry is further configured to adjust the threshold time based on an average time between inputs associated with a user. Per the abstract, the threshold time is the limit against which the delay between displaying the search results and receiving the second speech is compared, so the system adapts that window to how quickly the user typically provides inputs.

Claim 17

Original Legal Text

17. The system of claim 10, wherein the control circuitry is configured, when determining the second time when the second speech was received, to measure, via the user input device, a time when an earliest pronunciation subsequent to the first time was received.

Plain English Translation

This is the system counterpart of claim 8. In the system of claim 10, the control circuitry determines the second time, meaning when the second speech was received, by measuring via the user input device the time at which the earliest pronunciation after the first time was received. The second speech is therefore timestamped at its onset. Per the abstract, that timestamp is compared with the time the search results were displayed to decide whether the second speech should correct the input generated from the first speech.

Claim 18

Original Legal Text

18. The system of claim 10, wherein the control circuitry is configured, when determining the first time when the browsing search results were generated for display, to detect a time when signals transmitted to pixels of a display screen first changed subsequent to the first time.

Plain English Translation

This is the system counterpart of claim 9. In the system of claim 10, the control circuitry determines the first time, meaning when the browsing search results were generated for display, by detecting the time at which the signals transmitted to the pixels of a display screen first changed. The claim as reproduced here describes that change as occurring "subsequent to the first time". Per the abstract, the display time is one end of the interval that is compared against the threshold time when deciding whether the second speech should correct the input.

Patent Metadata

Filing Date

May 24, 2017

Publication Date

December 6, 2022

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Methods and systems for correcting, based on speech, input generated using automatic speech recognition” (US-11521608). https://patentable.app/patents/US-11521608

© 2026 Nomic Interactive Technology LLC. Machine-readable context available at /api/llm-context/US-11521608. See llms.txt for full attribution policy.