Speaker Adaptation of Vocabulary for Speech Recognition

Published: October 25, 2011

Assignee: not available in the USPTO data we have

Technical Abstract

Patent Claims

15 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A computer system for speaker adaptation of vocabulary for a speech recognition system, the computer system comprising: at least one processor adapted to: identify a pronunciation style of at least one phoneme for a speaker from a plurality of pronunciation styles for the at least one phoneme, the identification of the pronunciation style being based upon a reading of an enrollment script by the speaker, the enrollment script including at least one keyword, wherein the at least one keyword contains the at least one phoneme and is representative of a category of words having the at least one phoneme; determine at least one pronunciation style for each of the words in the category consistent with the identified pronunciation style; and restrict a vocabulary used in the speech recognition system to pronunciations consistent with the identified pronunciation style.

2. The computer system as claimed in claim 1, further comprising a component operable for recording an enrollment audio corresponding to a reading of the enrollment script by the speaker, the enrollment audio including predetermined keywords including the at least one keyword.

3. The computer system as claimed in claim 2, wherein the predetermined keywords are representative of words having a phonetically similar baseform.

4. The computer system as claimed in claim 1, wherein the at least one processor is further adapted to align phonetic units from the enrollment audio to corresponding phonetic units of the enrollment script.

5. The computer system as claimed in claim 1, wherein the at least one processor is further adapted to: retain, in a recognized vocabulary for the speaker, one of plural alternative pronunciations, the one of plural alternative pronunciations being selected based upon a best match with a phonetic unit from the reading of the enrollment script; and exclude, from the recognized vocabulary for the speaker, remaining alternative pronunciations of the plural alternative pronunciations.

6. The computer system as claimed in claim 1, wherein the at least one processor is further adapted to identify pronunciation styles of plural phonemes for the speaker, the identification of the pronunciation styles being based upon the reading of the enrollment script.

7. A computer-readable storage device storing instructions that when executed by at least one processor cause the at least one processor to execute acts of: identifying a pronunciation style of at least one phoneme for a speaker from a plurality of pronunciation styles for the at least one phoneme, the identification of the pronunciation style being based upon a reading of an enrollment script by the speaker, the enrollment script including at least one keyword, wherein the at least one keyword contains the at least one phoneme and is representative of a category of words having the at least one phoneme; determining at least one pronunciation style for each of the words in the category consistent with the identified pronunciation style; and restricting a vocabulary used in the speech recognition system to pronunciations consistent with the identified pronunciation style.

8. The computer-readable storage device of claim 7, further comprising instructions that when executed by the at least one processor cause the at least one processor to record an enrollment audio corresponding to a reading of the enrollment script by the speaker, the enrollment audio including predetermined keywords including the at least one keyword.

9. The computer-readable storage device of claim 8, wherein the keywords are representative of words having a phonetically similar baseform.

10. The computer-readable storage device of claim 8 further comprising instructions that when executed by the at least one processor cause the at least one processor to align phonetic units from the enrollment audio to corresponding phonetic units of the enrollment script.

11. The computer-readable storage device of claim 8 further comprising instructions that when executed by the at least one processor cause the at least one processor to perform acts of: retaining in a recognized vocabulary for the speaker one of plural alternative pronunciations, the one of plural alternative pronunciations being selected based upon a best match with a phonetic unit from the reading of the enrollment script; and excluding from the recognized vocabulary for the speaker remaining alternative pronunciations of the plural alternative pronunciations.

12. A method for adapting a speaker vocabulary in a speech recognition system, the method comprising: identifying a pronunciation style of at least one phoneme for a speaker from a plurality of pronunciation styles for the at least one phoneme, the identification of the pronunciation style being based upon a reading of an enrollment script by the speaker, the enrollment script including at least one keyword, wherein the at least one keyword contains the at least one phoneme and is representative of a category of words having the at least one phoneme; determining at least one pronunciation style for each of the words in the category consistent with the identified pronunciation style; and restricting a vocabulary used in the speech recognition system to pronunciations consistent with the identified pronunciation style.

13. The method of claim 12, further comprising recording an enrollment audio corresponding to a reading of the enrollment script by the speaker, the enrollment audio including predetermined keywords having a phonetically similar baseform.

14. The method of claim 13, further comprising aligning phonetic units from the enrollment audio to corresponding phonetic units of the enrollment script.

15. The method of claim 13, further comprising acts of: retaining in a recognized vocabulary for the speaker one of plural alternative pronunciations, the one of plural alternative pronunciations being selected based upon a best match with a phonetic unit from the reading of the enrollment script; and excluding from the recognized vocabulary for the speaker remaining alternative pronunciations of the plural alternative pronunciations.
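
Illustrative sketch (not part of the patent text). The steps recited in the independent claims (identify a pronunciation style from a keyword in the enrollment script, determine the matching pronunciation for every word in the keyword's category, restrict the vocabulary) and the best-match selection of claims 5, 11 and 15 might be sketched roughly as follows. Everything here is an assumption made for illustration: the lexicon layout, the style names "short" and "broad", the ARPAbet-like phone symbols, the use of edit distance as the "best match" criterion, and the premise that the enrollment audio has already been aligned and decoded into a phone sequence per keyword. The claims do not specify any of these details.

```python
from typing import Dict, Sequence, Tuple

Phones = Tuple[str, ...]
Lexicon = Dict[str, Dict[str, Phones]]  # word -> {style name: phone sequence}

# Hypothetical lexicon with alternative pronunciations per word.
LEXICON: Lexicon = {
    "bath":  {"short": ("B", "AE", "TH"),     "broad": ("B", "AA", "TH")},
    "path":  {"short": ("P", "AE", "TH"),     "broad": ("P", "AA", "TH")},
    "glass": {"short": ("G", "L", "AE", "S"), "broad": ("G", "L", "AA", "S")},
    "either": {"ee": ("IY", "DH", "ER"),      "eye": ("AY", "DH", "ER")},
}

# Hypothetical categories: keyword -> words it is representative of.
CATEGORIES: Dict[str, Tuple[str, ...]] = {
    "bath": ("bath", "path", "glass"),
}


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]


def identify_style(keyword: str, observed: Sequence[str], lexicon: Lexicon) -> str:
    """Pick the style whose keyword pronunciation best matches the observed phones."""
    candidates = lexicon[keyword]
    return min(candidates, key=lambda s: edit_distance(observed, candidates[s]))


def restrict_vocabulary(lexicon: Lexicon,
                        categories: Dict[str, Tuple[str, ...]],
                        enrollment: Dict[str, Phones]) -> Lexicon:
    """Keep, for each word in an enrolled category, only the identified style."""
    restricted = {word: dict(prons) for word, prons in lexicon.items()}
    for keyword, words in categories.items():
        if keyword not in enrollment:
            continue  # assumption: categories without enrollment data are left untouched
        style = identify_style(keyword, enrollment[keyword], lexicon)
        for word in words:
            if style in lexicon[word]:
                restricted[word] = {style: lexicon[word][style]}
    return restricted


# Hypothetical enrollment result: the speaker read "bath" with a broad vowel.
enrollment = {"bath": ("B", "AA", "TH")}
restricted = restrict_vocabulary(LEXICON, CATEGORIES, enrollment)
```

In this sketch, words in the "bath" category retain only the "broad" pronunciation, while "either", which belongs to no enrolled category, keeps both alternatives. A practical recognizer would presumably score pronunciations against acoustic models rather than compare symbol strings.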

Patent Metadata

Filing Date

Unknown

Publication Date

October 25, 2011

Inventors

Nitendra Rajput

Ashish Verma
