Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A method for identifying a possible error made by a speech recognition system comprising: with an apparatus using at least one hardware-implemented processor, identifying an instance where the system rejects a first hypothesis of a first utterance, followed by the system accepting a second hypothesis of a second utterance, wherein the first and second hypotheses substantially match word-for-word.
A processor-based method flags a possible speech recognition error when the system rejects its hypothesis for a first utterance and then accepts its hypothesis for a second utterance, and the two hypotheses substantially match word-for-word. Such a sequence suggests that the system may have wrongly rejected a correct first hypothesis.
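The reject-then-accept pattern can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Result` record and the `find_reject_then_accept` function are hypothetical names, the two utterances are assumed to be adjacent in a log, and exact word equality stands in for the simplest case of "substantially match".

```python
from dataclasses import dataclass


@dataclass
class Result:
    """Hypothetical record of one recognition attempt."""
    words: list[str]  # recognized hypothesis, one entry per word
    accepted: bool    # True if the recognizer accepted the hypothesis


def find_reject_then_accept(results: list[Result]) -> list[int]:
    """Return each index i where results[i] was rejected and results[i + 1]
    was accepted with exactly the same words."""
    flagged = []
    for i in range(len(results) - 1):
        first, second = results[i], results[i + 1]
        if not first.accepted and second.accepted and first.words == second.words:
            flagged.append(i)
    return flagged


# Hypothetical log: the same phrase is rejected once, then accepted.
log = [
    Result(["one", "two", "three"], accepted=False),
    Result(["one", "two", "three"], accepted=True),
    Result(["four", "five"], accepted=True),
]
flagged = find_reject_then_accept(log)  # [0]
```

Index 0 is flagged because the rejected hypothesis reappears, accepted, in the very next result.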
2. The method of claim 1 , wherein the first and second hypotheses substantially match word-for-word by matching word-for-word, except that one of the first and second hypotheses includes at least one additional recognized model that is negligible for purposes of identifying the possible error.
This method identifies potential errors in a speech recognition system. Using a hardware-implemented processor, the system detects a specific event sequence: first, the speech recognition system rejects a proposed transcription (first hypothesis) for a spoken phrase (first utterance). Subsequently, it accepts another proposed transcription (second hypothesis) for a different spoken phrase (second utterance). A possible error is flagged when these two hypotheses "substantially match word-for-word." This "substantially matching" condition means the hypotheses are identical word-for-word, with the allowance that one of them may include one or more additional recognized models or words that are considered insignificant for the purpose of identifying the potential error.
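One way to picture the "negligible model" allowance is to drop the negligible items before comparing. The sketch below is an assumption-laden illustration: the labels in `NEGLIGIBLE` and the function name `substantially_match` are hypothetical, and for simplicity it strips negligible items from both hypotheses, which is slightly more permissive than the claim wording (where only one hypothesis carries the extra model).

```python
# Hypothetical labels for recognized models that carry no word content.
NEGLIGIBLE = frozenset({"<noise>", "<breath>"})


def substantially_match(first: list[str], second: list[str],
                        negligible: frozenset = NEGLIGIBLE) -> bool:
    """Compare two hypotheses word-for-word after removing negligible models."""
    def strip(words: list[str]) -> list[str]:
        return [w for w in words if w not in negligible]

    return strip(first) == strip(second)


same = substantially_match(["one", "<noise>", "two"], ["one", "two"])  # True
different = substantially_match(["one", "two"], ["one", "three"])      # False
```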
3. The method of claim 1 , wherein a count of occurrences of the possible error is used in adjusting an adaptation of a model for a word associated with the possible error.
The speech recognition error detection method, which identifies potential errors when the system rejects a first hypothesis and accepts a similar second hypothesis for different utterances, uses a count of these identified error occurrences to adjust the acoustic model of the word associated with the error. This allows the system to learn from past mistakes and improve recognition accuracy for specific words by analyzing where it has made these errors.
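A minimal sketch of how a per-word error count might feed an adaptation decision follows. The claim does not say how the count is used, so the rule here (adapt once a word reaches `min_errors` possible errors) and the names `ErrorTracker`, `record`, and `should_adapt` are purely hypothetical.

```python
from collections import Counter


class ErrorTracker:
    """Counts possible errors per word (hypothetical helper)."""

    def __init__(self, min_errors: int = 3):
        # Assumed policy: adapt a word's model once it reaches min_errors.
        self.min_errors = min_errors
        self.counts = Counter()

    def record(self, word: str) -> None:
        """Register one possible error associated with `word`."""
        self.counts[word] += 1

    def should_adapt(self, word: str) -> bool:
        """True when enough possible errors have accumulated for `word`."""
        return self.counts[word] >= self.min_errors


tracker = ErrorTracker(min_errors=3)
for _ in range(3):
    tracker.record("seven")
tracker.record("nine")

adapt_seven = tracker.should_adapt("seven")  # True: three possible errors
adapt_nine = tracker.should_adapt("nine")    # False: only one so far
```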
4. The method of claim 1 , wherein the system rejects the first hypothesis due to a confidence factor of the first hypothesis not exceeding an acceptance threshold.
In the error detection method for speech recognition, the system rejects the initial hypothesis when its confidence score is below a set threshold. The method then identifies cases where the system later accepts a nearly identical hypothesis for a subsequent utterance, suggesting that the initial rejection may have been incorrect due to a low confidence factor, even if the hypothesis was correct.
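The threshold test can be illustrated as follows. The confidence values and the threshold of 0.5 are made-up numbers for illustration, and `is_accepted` is a hypothetical helper; the only detail taken from the claim is that a hypothesis whose confidence factor does not exceed the threshold is rejected.

```python
def is_accepted(confidence: float, acceptance_threshold: float) -> bool:
    """Accept only when the confidence factor exceeds the threshold."""
    return confidence > acceptance_threshold


ACCEPTANCE_THRESHOLD = 0.5  # hypothetical value

first_accepted = is_accepted(0.48, ACCEPTANCE_THRESHOLD)   # first utterance
second_accepted = is_accepted(0.71, ACCEPTANCE_THRESHOLD)  # repeated utterance

# Rejected, then accepted: a possible error if the hypotheses also match.
possible_error = (not first_accepted) and second_accepted
```

A confidence exactly equal to the threshold does not exceed it, so that hypothesis is rejected as well.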
5. The method of claim 1 , wherein the system rejects at least one word in the first hypothesis.
In the error detection method for speech recognition that flags substantially matching rejected and accepted hypotheses, the system rejects at least one word in the first hypothesis. Even though that word (or words) was rejected, the second utterance produced a substantially matching hypothesis that was accepted, indicating that the rejection in the first hypothesis may have been incorrect.
6. The method of claim 1 , wherein the first and second utterances are spoken consecutively, proximately or within a predetermined amount of time.
The speech recognition error detection method, which analyzes substantially matching rejected and accepted hypotheses from separate utterances, applies when the two utterances are spoken consecutively, close together, or within a predetermined amount of time. This proximity increases the likelihood that the utterances refer to the same intended phrase, making the comparison more meaningful.
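The "within a predetermined amount of time" condition can be sketched as a simple timestamp check. The function name `within_window`, the use of seconds, and the default gap of 10 seconds are assumptions for illustration; the claim does not specify a value.

```python
def within_window(t_first: float, t_second: float,
                  max_gap_seconds: float = 10.0) -> bool:
    """True if the second utterance starts no more than max_gap_seconds
    after the first (timestamps in seconds, second utterance not earlier)."""
    return 0 <= t_second - t_first <= max_gap_seconds


close = within_window(12.0, 15.5)  # True: 3.5 s apart
far = within_window(12.0, 40.0)    # False: 28 s apart
```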
7. A method for identifying a possible error made by a speech recognition system comprising: with an apparatus using at least one hardware-implemented processor, identifying when the system generates first and second hypotheses of two utterances and the system accepts the second hypothesis, wherein the two hypotheses do not match word-for-word, but the hypotheses mostly match word-for-word.
A speech recognition error detection method using a processor compares the system's generated hypotheses for two separate utterances. It identifies a possible error when the first and second hypotheses are not identical but mostly match word-for-word, and the system accepts the second hypothesis. This highlights cases where the system may have misrecognized a word in one of the utterances.
8. The method of claim 7 , wherein the hypotheses mostly match word-for-word by matching word-for-word except for a predetermined number of words.
In the method for speech recognition error detection, the first and second hypotheses from different utterances are considered to "mostly match word-for-word" if they are identical except for a limited, pre-defined number of words. This allows for some variation in the utterances while still enabling the system to identify likely errors in the initial rejection.
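The "mostly match" test can be sketched with a word-level edit distance. This is one possible reading, not the patent's stated algorithm: the names `word_edit_distance` and `mostly_match` and the default limit of one differing word are assumptions. Identical hypotheses return False, since the parent claim requires that the hypotheses do not match word-for-word.

```python
def word_edit_distance(a: list[str], b: list[str]) -> int:
    """Levenshtein distance counted in whole words
    (insertions, deletions, and substitutions each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def mostly_match(a: list[str], b: list[str], max_diff: int = 1) -> bool:
    """True when the hypotheses differ, but by at most max_diff words."""
    distance = word_edit_distance(a, b)
    return 0 < distance <= max_diff


near = mostly_match(["one", "two", "three"], ["one", "too", "three"])       # True
identical = mostly_match(["one", "two", "three"], ["one", "two", "three"])  # False
unrelated = mostly_match(["one", "two", "three"], ["four", "five", "six"])  # False
```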
9. The method of claim 7 , wherein a count of occurrences of the possible error is used in adjusting an adaptation a model for a word associated with the possible error.
The method for speech recognition error detection, which compares mostly similar but non-identical hypotheses from different utterances, uses a count of the identified error occurrences to improve the acoustic model for a word associated with the potential error. This allows the system to learn from the identified mistakes and adapt its speech recognition parameters for specific words and phrases.
10. The method of claim 7 , wherein the two utterances are spoken consecutively, proximately, or within a predetermined amount of time.
The speech recognition error detection method, comparing mostly similar hypotheses from different utterances, applies when the two utterances are spoken consecutively, proximately, or within a predetermined time period. Such proximity suggests that the utterances are related, for example a repetition of the same phrase.
11. The method of claim 7 , wherein there are no intervening utterances that indicate that the first of the two utterances was correctly recognized by the system.
The speech recognition error detection method, which compares mostly matching but non-identical hypotheses from different utterances, applies only when no intervening utterance indicates that the first utterance was correctly recognized. This avoids flagging a possible error in cases where the first hypothesis was in fact correct.
12. The method of claim 7 , wherein the two hypotheses differ in that a word in the second hypothesis is substituted with a word in the first hypothesis.
In the speech recognition error detection method based on similar but not identical hypotheses for separate utterances, the difference between the first and second hypotheses is that a word in the accepted second hypothesis appears as a different word in the first hypothesis. This enables the detection of word substitution errors.
13. The method of claim 7 , wherein the hypotheses differ in that a word in the second hypothesis is substituted with garbage in the first hypothesis.
In the method of speech recognition error detection comparing slightly different hypotheses for different utterances, the difference between the first and second hypotheses is that a real word in the accepted second hypothesis appears as "garbage" (an unrecognized word) in the first hypothesis. This identifies cases where the system may have failed to recognize a word.
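The two kinds of difference described in claims 12 and 13 (a word substituted with another word, or with garbage) can be told apart with a position-by-position comparison. The sketch below assumes equal-length hypotheses that differ in exactly one position; the `<garbage>` token and the function name `classify_difference` are hypothetical.

```python
from typing import Optional

GARBAGE = "<garbage>"  # hypothetical token for an unrecognized word


def classify_difference(first: list[str], second: list[str]) -> Optional[str]:
    """Classify a single-word difference between two hypotheses.

    Returns "garbage" if the differing word in the first hypothesis is the
    garbage token, "substitution" if it is an ordinary word, and None if the
    hypotheses differ in length or do not differ in exactly one position.
    """
    if len(first) != len(second):
        return None
    diffs = [(f, s) for f, s in zip(first, second) if f != s]
    if len(diffs) != 1:
        return None
    first_word, _ = diffs[0]
    return "garbage" if first_word == GARBAGE else "substitution"


kind_a = classify_difference(["one", GARBAGE, "three"], ["one", "two", "three"])
kind_b = classify_difference(["one", "too", "three"], ["one", "two", "three"])
# kind_a is "garbage", kind_b is "substitution"
```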
14. A method for identifying a possible error made by a speech recognition system, comprising: with an apparatus using at least one hardware-implemented processor, identifying when a hypothesis generated by the system does not match an expected response word-for-word, but the hypothesis mostly matches the expected response word-for-word.
A speech recognition error detection method uses a processor to compare a hypothesis generated by the system against an expected response. It identifies a potential error when the generated hypothesis doesn't perfectly match the expected response word-for-word, but mostly does. This helps detect instances where the system made a minor mistake but understood the overall meaning.
15. The method of claim 14 , wherein a count of occurrences of the possible errors is used in adjusting adaptation of a model for a word associated with the possible errors.
The method for speech recognition error detection by comparing a generated hypothesis to an expected response, where the generated hypothesis mostly matches the expected response, uses a count of the detected error occurrences to adjust the model for the word associated with the error. This improves recognition of specific words over time by learning from mismatches against known correct responses.
16. The method of claim 14 , wherein the hypothesis mostly matches the expected response word-for-word by matching word-for-word except for one word.
In the error detection method for speech recognition, where a generated hypothesis is compared against an expected response, the hypothesis "mostly matches" when it is identical to the expected response except for only one differing word. For example, if the expected response is "turn on the light" and the generated hypothesis is "turn of the light," this method flags that as a possible error.
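The "turn on the light" example can be checked with a short sketch. It assumes whitespace-separated words and equal-length phrases, and the function name `differs_by_one_word` is hypothetical.

```python
def differs_by_one_word(hypothesis: str, expected: str) -> bool:
    """True when the hypothesis matches the expected response
    word-for-word except for exactly one word."""
    hyp_words, exp_words = hypothesis.split(), expected.split()
    if len(hyp_words) != len(exp_words):
        return False
    mismatches = sum(h != e for h, e in zip(hyp_words, exp_words))
    return mismatches == 1


flag = differs_by_one_word("turn of the light", "turn on the light")   # True
exact = differs_by_one_word("turn on the light", "turn on the light")  # False
```

An exact match is not flagged, because the claim applies only when the hypothesis does not match the expected response word-for-word.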
17. An apparatus for identifying a possible error made by a speech recognition system comprising: a processor that is operable to identify an instance where the system rejects a first hypothesis of a first utterance, followed by the system accepting a second hypothesis of a second utterance, wherein the first and second hypotheses substantially match word-for-word.
A speech recognition error detection apparatus includes a processor programmed to identify potential errors in the speech recognition system by detecting when the system rejects a first hypothesis for a first spoken utterance and then accepts a second hypothesis for a second spoken utterance, where the rejected and accepted hypotheses substantially match word-for-word.
18. The apparatus of claim 17 , wherein the first and second hypotheses substantially match word-for-word by matching word-for-word, except that one of the first and second hypotheses includes at least one additional recognized model that is negligible for purposes of identifying the possible error.
The speech recognition error detection apparatus, which identifies substantially matching rejected and accepted hypotheses, defines the matching criterion to allow slight variations. Specifically, the first and second hypotheses are considered to substantially match even if one of them includes at least one additional recognized model that is deemed "negligible" for error identification.
19. The apparatus of claim 17 , wherein a count of occurrences of the possible error is used in adjusting an adaptation of a model for a word associated with the possible error.
The speech recognition error detection apparatus, finding possible errors by identifying similar rejected and accepted hypotheses, uses a count of the detected error events to adjust the acoustic model for the word associated with the error. This adaptation enhances future recognition accuracy for words where such errors frequently occur.
20. The apparatus of claim 17 , wherein the system rejects the first hypothesis due to a confidence factor of the first hypothesis not exceeding an acceptance threshold.
The speech recognition error detection apparatus, comparing nearly identical rejected and accepted hypotheses, rejects the first hypothesis due to its confidence score falling below a predefined acceptance threshold. The system flags this because a subsequent similar hypothesis was accepted, indicating a potential incorrect rejection due to confidence issues.
21. The apparatus of claim 17 , wherein the system rejects at least one word in the first hypothesis.
The speech recognition error detection apparatus covers the case where the speech recognition system rejects at least one word in the first hypothesis, which substantially matches the accepted second hypothesis from another utterance. Even though a word was rejected in the first hypothesis, a substantially matching hypothesis was accepted later, which points to a possible incorrect rejection.
22. The apparatus of claim 17 , wherein the first and second utterances are spoken consecutively, proximately or within a predetermined amount of time.
The speech recognition error detection apparatus, designed to find errors through similar rejected/accepted hypotheses of distinct utterances, applies where the first and second utterances are spoken consecutively, proximately, or within a set time limit. These conditions suggest that the phrases are related or the same.
23. An apparatus for identifying a possible error made by a speech recognition system comprising: a processor that is operable to identify when the system generates first and second hypotheses of two utterances and the system accepts the second hypothesis, wherein the two hypotheses do not match word-for-word, but the hypotheses mostly match word-for-word.
A speech recognition error detection apparatus includes a processor that identifies when the speech recognition system generates first and second hypotheses for two utterances and accepts the second. When the hypotheses do not match word-for-word but mostly match, the apparatus flags a possible error.
24. The apparatus of claim 23 , wherein the hypotheses mostly match word-for-word by matching word-for-word except for a predetermined number of words.
The apparatus for speech recognition error detection, detecting mostly-matching hypotheses, defines "mostly match" as matching word-for-word except for a predetermined number of words. This enables flexibility in error identification by accepting minor variations in the compared hypotheses.
25. The apparatus of claim 23 , wherein a count of occurrences of the possible error is used in adjusting an adaptation a model for a word associated with the possible error.
The speech recognition error detection apparatus, which compares two mostly matching hypotheses (with the second accepted), tracks how often these possible errors occur and uses the count to adjust the adaptation of the model for the word associated with them. This improves recognition accuracy over time for the affected words.
26. The apparatus of claim 23 , wherein the two utterances are spoken consecutively, proximately, or within a predetermined amount of time.
The speech recognition error detection apparatus, which analyzes different hypotheses of separate utterances, checks that the utterances occur consecutively, close together, or within a predetermined time period. Utterances that occur close in time are more likely to be repetitions of the same phrase.
27. The apparatus of claim 23 , wherein there are no intervening utterances that indicate that the first of the two utterances was correctly recognized by the system.
The apparatus for speech recognition error detection, which compares mostly matching hypotheses, flags a possible error only when no intervening utterance indicates that the first of the two utterances was correctly recognized. This avoids flagging a first utterance that the system actually recognized correctly.
28. The apparatus of claim 23 , wherein the two hypotheses differ in that a word in the second hypothesis is substituted with a word in the first hypothesis.
The speech recognition error detection apparatus, which compares the first and second hypotheses, identifies that a word in the accepted second hypothesis is a replacement for a word in the rejected first hypothesis. This enables the apparatus to detect errors in word substitutions.
29. The apparatus of claim 23 , wherein the hypothesis differ in that a word in the second hypothesis is substituted with garbage in the first hypothesis.
The speech recognition error detection apparatus, which compares the first and second hypotheses of different utterances, covers the case where a word in the accepted second hypothesis appears as garbage (an unrecognized word) in the first hypothesis. This identifies instances where the system failed to recognize a word.
30. An apparatus for identifying a possible error rate made by a speech recognition system, comprising: a processor that is operable to identify when a hypothesis generated by the system does not match an expected response word-for-word, but the hypothesis mostly matches the expected response word-for-word.
A speech recognition error detection apparatus uses a processor to compare a hypothesis generated by the speech recognition system to an expected response. The processor identifies a potential error rate when the generated hypothesis does not precisely match the expected response, but is mostly the same, indicating an instance when a minor error may have occurred.
31. The apparatus of claim 30 , wherein a count of occurrences of the possible error is used in adjusting an adaptation of a model for a word associated with the possible error.
The apparatus for speech recognition error detection, which compares a hypothesis against an expected response, counts how often possible errors occur and uses that count to adjust the adaptation of the model for the associated word. This improves recognition of that word and reduces errors over time.
32. The apparatus of claim 30 , wherein the hypothesis mostly matches the expected response word-for-word by matching word-for-word except for a predetermined number of words.
The error detection apparatus, which finds hypotheses mostly resembling an expected answer, finds them similar if they match word-for-word except for a defined number of words. This enables the system to ignore insignificant variances when discovering mistakes.
October 21, 2014