Voice User Interface

Published: May 3, 2022

Assignee: not available in the USPTO data we have

Inventors: Carlos VAQUERO AVILÉS-CASCO, David MARTÍNEZ GONZÁLEZ, Ryan ROBERTS

Patent Claims

19 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method of speaker authentication, comprising:
   receiving a speech signal;
   while the speech signal is received, dividing the speech signal into segments;
   following each segment, obtaining an authentication score based on said segment and previously received segments, wherein the authentication score represents a probability that the speech signal comes from a specific registered speaker, wherein obtaining the authentication score following each segment comprises:
      obtaining a first authentication score based on a first segment of the segments; and
      upon receipt of each subsequent segment of the speech signal received after the first segment, obtaining a respective subsequent authentication score based on said subsequent segment and the previously received segments of the speech signal by forming a weighted sum of the first authentication score and each subsequent authentication score; and
   outputting an authentication result based on the first authentication score and each subsequent authentication score in response to an authentication request;
   wherein the weighted sum is formed by performing one of the following:
      applying weights that depend on respective signal-to-noise ratios applicable to the respective segments; and
      applying weights that depend on respective quantities of speech present in the respective segments.

2. A method according to claim 1, wherein the authentication score is obtained by comparing features of the speech signal with a model generated during enrolment of the specific registered speaker.

3. A method according to claim 1, wherein the speech signal represents multiple discrete sections of speech.

4. A method according to claim 1, wherein the first segment represents a trigger phrase, the method comprising performing the steps of obtaining the authentication score and outputting the authentication result in response to detecting that the trigger phrase has been spoken.

5. A method according to claim 4, comprising, after the trigger phrase, dividing the speech signal into segments of equal lengths.

6. A method according to claim 5, comprising, after the trigger phrase, dividing the speech signal into segments comprising equal durations of net speech.

7. A method according to claim 1, comprising comparing the authentication score with a first threshold score, and determining a positive authentication result if the authentication score exceeds the first threshold score.

8. A method according to claim 7, wherein the first threshold score is set in response to a signal received from a separate process.

9. A method according to claim 8, comprising receiving the signal from the separate process, and selecting the first threshold score from a plurality of available threshold scores.

10. A method according to claim 8, wherein the signal received from the separate process indicates a requested level of security.

11. A method according to claim 8, wherein the separate process is a speech recognition process.

12. A method according to claim 1, wherein the authentication request requests that the authentication result be output when the authentication score exceeds a threshold.

13. A method according to claim 1, wherein the step of, following each segment of the segments, obtaining an authentication score based on said segment and previously received segments comprises: obtaining the authentication score by merging the first authentication score and each subsequent authentication score.

14. A method according to claim 13, comprising forming the weighted sum of the first authentication score and each subsequent authentication score by performing one or more of the following: disregarding some or all outlier scores, and disregarding low outlier scores while retaining high outlier scores.

15. A method according to claim 1, wherein the step of, following each segment of the segments, obtaining an authentication score based on said segment and previously received segments comprises: following each subsequent segment of the speech signal, combining the subsequent segment of the speech signal with each previously received segment of the speech signal to form a new combined speech signal; and obtaining the authentication score based on said new combined speech signal.

16. A method according to claim 1, wherein the step of, following each segment, obtaining an authentication score based on said segment and previously received segments comprises: extracting features from each segment of the segments; obtaining the first authentication score based on the extracted features of the first segment of the speech signal; and following each subsequent segment of the speech signal, combining the extracted features of the subsequent segment of the speech signal with the extracted features of each previously received segment of the speech signal; and obtaining the authentication score based on said combined extracted features.

17. A method according to claim 1, comprising, after determining a positive authentication result: starting a timer that runs for a predetermined period of time; and treating the specific registered speaker as authenticated for as long as the timer is running.

18. A non-transitory computer readable storage medium having computer-executable instructions stored thereon that, when executed by processor circuitry, cause the processor circuitry to perform a method according to claim 1.

19. A device for processing a received signal representing a user's speech, for performing speaker recognition, wherein the device is configured to:
   receive a speech signal;
   whilst the speech signal is received, divide the speech signal into segments;
   following each segment, obtain an authentication score based on said segment and previously received segments, wherein the authentication score represents a probability that the speech signal comes from a specific registered speaker, wherein obtaining the authentication score following each segment comprises:
      obtaining a first authentication score based on a first segment of the segments; and
      upon receipt of each subsequent segment of the speech signal received after the first segment, obtaining a respective subsequent authentication score based on said subsequent segment and the previously received segments of the speech signal by forming a weighted sum of the first authentication score and each subsequent authentication score; and
   output an authentication result based on the first authentication score and each subsequent authentication score in response to an authentication request;
   wherein the weighted sum is formed by performing one of the following:
      applying weights that depend on respective signal-to-noise ratios applicable to the respective segments; and
      applying weights that depend on respective quantities of speech present in the respective segments.
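
Illustrative sketch (not part of the patent text). The running, weighted score fusion recited in claims 1 and 19, together with the threshold comparison of claims 7 to 10, might look roughly like the Python below. This is a minimal sketch under stated assumptions, not the patentees' implementation: the names Segment, RunningAuthenticator, snr_weight, speech_weight and THRESHOLDS are hypothetical, per-segment scores are assumed to lie between 0 and 1, the weight mappings (dB value floored at zero, seconds of net speech) are arbitrary choices, and the weighted sum is normalised by the total weight so that the fused score stays on the same scale as the individual scores. The claims do not specify any of these details.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Segment:
    score: float         # per-segment authentication score (assumed scale 0..1)
    snr_db: float        # estimated signal-to-noise ratio of the segment, in dB
    net_speech_s: float  # seconds of actual speech detected in the segment


def snr_weight(seg: Segment) -> float:
    # Assumed mapping: weight equals the SNR in dB, floored at zero,
    # so that very noisy segments contribute nothing.
    return max(seg.snr_db, 0.0)


def speech_weight(seg: Segment) -> float:
    # Assumed mapping: weight equals the quantity of speech in the segment.
    return max(seg.net_speech_s, 0.0)


# Hypothetical thresholds keyed by a requested security level (claims 9 and 10).
THRESHOLDS = {"low": 0.6, "high": 0.85}


class RunningAuthenticator:
    """Fuses segment scores into one score that is updated after each segment."""

    def __init__(self, weight_fn: Callable[[Segment], float]) -> None:
        self.weight_fn = weight_fn
        self._weighted_total = 0.0  # sum of weight * score over segments so far
        self._weight_total = 0.0    # sum of weights over segments so far

    def add_segment(self, seg: Segment) -> float:
        """Fold a new segment into the running sums and return the fused score."""
        w = self.weight_fn(seg)
        self._weighted_total += w * seg.score
        self._weight_total += w
        return self.score

    @property
    def score(self) -> float:
        # Normalised weighted sum; 0.0 until some segment carries weight.
        if self._weight_total == 0.0:
            return 0.0
        return self._weighted_total / self._weight_total

    def result(self, security_level: str) -> bool:
        """Answer an authentication request using the threshold for this level."""
        return self.score > THRESHOLDS[security_level]


if __name__ == "__main__":
    auth = RunningAuthenticator(snr_weight)
    for seg in (Segment(0.9, 20.0, 1.0), Segment(0.5, 5.0, 2.0), Segment(0.8, 15.0, 1.0)):
        print(round(auth.add_segment(seg), 4))
    print(auth.result("low"), auth.result("high"))
```

Because only two running totals are kept, a fused score is available after every segment and an authentication request can be answered at any point while speech is still arriving. Passing speech_weight instead of snr_weight switches to the second weighting alternative recited in the claims.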

Patent Metadata

Filing Date

Unknown

Publication Date

May 3, 2022

Inventors

Carlos VAQUERO AVILÉS-CASCO

David MARTÍNEZ GONZÁLEZ

Ryan ROBERTS
