7024362

Objective Measure for Estimating Mean Opinion Score of Synthesized Speech

Published: April 4, 2006
Assignee: not available in the USPTO data we have

Patent Claims
29 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method for estimating naturalness of synthesized speech, wherein naturalness is a subjective quality of synthesized speech, the method comprising: generating a set of synthesized utterances from textual information; subjectively rating each of the synthesized utterances; calculating a score for each of the synthesized utterances using an objective measure, the objective measure being a function of textual information used to form the utterances; ascertaining a relationship between the scores of the objective measure and subjective ratings of the synthesized utterances; and using the relationship to estimate naturalness of synthesized speech.

2. The method of claim 1, wherein the objective measure is a function of a concatenative cost of the textual information used to form words in the utterances.

3. The method of claim 2, wherein the objective measure comprises an indication of a position of a speech unit in a phrase.

4. The method of claim 2, wherein the objective measure comprises an indication of a position of a speech unit in a word.

5. The method of claim 2, wherein the objective measure comprises an indication of a category for a phoneme preceding a speech unit.

6. The method of claim 2, wherein the objective measure comprises an indication of a category for a phoneme following a speech unit.

7. The method of claim 2, wherein the objective measure comprises an indication of a category for the tone of a preceding speech unit.

8. The method of claim 2, wherein the objective measure comprises an indication of a category for the tone of a following speech unit.

9. The method of claim 2, wherein the objective measure comprises an indication of a prosodic mismatch between successive speech units.

10. The method of claim 2, wherein the objective measure comprises an indication of level of stress of a speech unit.

11. The method of claim 2, wherein the objective measure score for each synthesized utterance is a function of a length of said each synthesized utterance.

12. The method of claim 11, wherein the length comprises a number of speech units in an utterance.

13. The method of claim 2, wherein calculating a score includes generating context vectors for each synthesized utterance wherein the context vectors comprise at least two coordinates of textual information from a set including: an indication of a position of a speech unit in a phrase; an indication of a position of a speech unit in a word; an indication of a category for a phoneme preceding a speech unit; an indication of a category for a phoneme following a speech unit; an indication of a category for the tone of a preceding speech unit; an indication of a category for the tone of a following speech unit; and an indication of a level of stress of a speech unit; and an indication of a degree of coupling with a neighboring speech unit.

14. The method of claim 13, wherein calculating a score includes generating context vectors for each of the synthesized utterances wherein the context vectors comprise at least three coordinates of textual information from the set.

15. The method of claim 13, wherein calculating a score includes generating context vectors for each of the synthesized utterances wherein the context vectors comprise at least four coordinates of textual information from the set.

16. The method of claim 13, wherein calculating a score includes generating context vectors for each of the synthesized utterances wherein the context vectors comprise at least six coordinates of textual information from the set.

17. The method of claim 13, wherein the objective measure includes an indication of prosodic mismatch of successive speech units.

18. The method of claim 13, wherein the coordinates are weighted.

19. A method for developing a speech synthesizer, the method comprising: obtaining a set of synthesized utterances based on textual information from the speech synthesizer; subjectively rating naturalness of each of the synthesized utterances; calculating a score for each of the synthesized utterances using an objective measure, the objective measure being a function of textual information of speech units for each of the utterances; ascertaining a relationship between the scores of the objective measure and ratings of the synthesized utterances; varying a parameter of the speech synthesizer; obtaining speech units for another utterance after the parameter of the speech synthesizer has been varied; and calculating a second score for said another utterance using the objective measure; and using the relationship and the second score to estimate naturalness of said another utterance.

20. The method of claim 19, wherein the objective measure is a function of a concatenative cost of the textual information used to form a word in each utterance.

21. The method of claim 20, wherein obtaining speech units for another utterance includes obtaining speech units for a second set of utterances, wherein calculating a second score includes calculating corresponding scores for each of the utterances of the second set of utterances, and wherein using the relationship includes using the relationship to estimate naturalness of each of said second set of utterances.

22. The method of claim 20, wherein the parameter comprises an amount of speech units available for synthesis.

23. The method of claim 20, wherein the parameter comprises an algorithm for selecting speech units.

24. The method of claim 20, wherein calculating a score includes generating context vectors for each synthesized utterance wherein the context vectors comprise at least two coordinates of textual information from a set including: an indication of a position of a speech unit in a phrase; an indication of a position of a speech unit in a word; an indication of a category for a phoneme preceding a speech unit; an indication of a category for a phoneme following a speech unit; an indication of a category for the tone of a preceding speech unit; an indication of a category for a tone of a following speech unit; and an indication of a level of stress of a speech unit.

25. The method of claim 24, wherein calculating a score includes generating context vectors for each of the synthesized utterances wherein the context vectors comprise at least three coordinates of textual information from the set.

26. The method of claim 24, wherein calculating a score includes generating context vectors for each of the synthesized utterances wherein the context vectors comprise at least four coordinates of textual information from the set.

27. The method of claim 24, wherein calculating a score includes generating context vectors for each of the synthesized utterances wherein the context vectors comprise at least five coordinates of textual information from the set.

28. The method of claim 24, wherein the objective measure includes an indication of prosodic mismatch of successive speech units.

29. The method of claim 24, wherein the coordinates are weighted.
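
The flow recited in claims 1, 11 to 13, and 18 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claims do not specify how the concatenative cost is computed or what form the relationship takes, so the sketch assumes a weighted count of mismatched context-vector coordinates as the cost and a least-squares straight line as the relationship. All names (`WEIGHTS`, `unit_cost`, `utterance_score`, `fit_line`, `estimate_mos`), the coordinate keys, and the weight values are hypothetical.

```python
from statistics import mean

# Hypothetical weights for context-vector coordinates (claims 13 and 18).
# The keys mirror the coordinates listed in the claims; the values are made up.
WEIGHTS = {
    "pos_in_phrase": 1.0,
    "pos_in_word": 1.0,
    "left_phone": 0.5,
    "right_phone": 0.5,
    "left_tone": 0.5,
    "right_tone": 0.5,
    "stress": 1.0,
}

def unit_cost(target, selected, weights=WEIGHTS):
    """Assumed cost: sum of the weights of coordinates where the selected
    unit's context differs from the context required by the text."""
    return sum(w for key, w in weights.items()
               if target.get(key) != selected.get(key))

def utterance_score(pairs, weights=WEIGHTS):
    """Average cost per speech unit, so the score depends on utterance
    length measured in units (claims 11 and 12)."""
    return mean(unit_cost(t, s, weights) for t, s in pairs)

def fit_line(scores, ratings):
    """Least-squares fit of ratings ~ a * score + b. A straight line is an
    assumption; the claims only require 'a relationship'."""
    mx, my = mean(scores), mean(ratings)
    sxx = sum((x - mx) ** 2 for x in scores)
    sxy = sum((x - mx) * (y - my) for x, y in zip(scores, ratings))
    a = sxy / sxx
    return a, my - a * mx

def estimate_mos(score, a, b):
    """Apply the fitted relationship to a new objective score."""
    return a * score + b
```

For the method of claim 19, the same fitted pair `(a, b)` would be reused after a synthesizer parameter is varied: recompute `utterance_score` for the newly selected units and pass it to `estimate_mos`, with no new listening test. With toy data such as scores `[0.0, 1.0, 2.0]` and ratings `[4.5, 3.5, 2.5]`, the fitted line has slope -1.0 and intercept 4.5.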

Patent Metadata

Filing Date

Unknown

Publication Date

April 4, 2006

Inventors

Min Chu
Hu Peng

Cite as: Patentable. “OBJECTIVE MEASURE FOR ESTIMATING MEAN OPINION SCORE OF SYNTHESIZED SPEECH” (7024362). https://patentable.app/patents/7024362
