Artificial Intelligence-Based Text-To-Speech System and Method

Published: August 22, 2023

Assignee: not available in the USPTO data we have

Technical Abstract

Patent Claims

17 claims

Legal claims defining the scope of protection, as filed with the USPTO.

2. The TTS system of claim 1 wherein the operations further comprise converting a frequency domain signal combined from a neural network and the pre-existing knowledgebase into the corrected speech signal.

3. The TTS system of claim 1 wherein the data in the pre-existing knowledgebase of phonemes comprises average basic acoustic signal data of how a speaker speaks derived from the recorded audible speech.

4. The TTS system of claim 1 wherein the operations further comprise correcting for psychoacoustic perceived speech signal distortions of the pre-existing knowledgebase of phonemes.

5. The TTS system of claim 1 wherein the operations further comprise upsampling a frequency of an input vector to another frequency of an intermediate vector.

6. The TTS system of claim 5, wherein the input vector comprises at least one of a base frequency, a phoneme duration, and a phoneme sequence.

7. The TTS system of claim 1 wherein the operations further comprise correcting voiced phonemes of the pre-existing knowledgebase of phonemes.

8. The TTS system of claim 1 wherein the operations further comprise correcting unvoiced phonemes of the pre-existing knowledgebase of phonemes.

9. The TTS system of claim 1 wherein a neural network is configured based on psychoacoustic modeling of phonemes.

11. The method of claim 10 wherein the operations further comprise converting a frequency domain signal combined from a neural network and the pre-existing knowledgebase into the corrected speech signal.

12. The method of claim 10 wherein the data in the pre-existing knowledgebase of phonemes comprises average basic acoustic signal data of how a speaker speaks derived from the recorded audible speech.

13. The method of claim 10 wherein the operations further comprise correcting for psychoacoustic perceived speech signal distortions of the pre-existing knowledgebase of phonemes.

14. The method of claim 10 wherein the operations further comprise upsampling a frequency of an input vector to another frequency of an intermediate vector.

15. The method of claim 14, wherein the input vector comprises at least one of a base frequency, a phoneme duration, and a phoneme sequence.

16. The method of claim 10 wherein the operations further comprise correcting voiced phonemes of the pre-existing knowledgebase of phonemes.

17. The method of claim 10 wherein the operations further comprise correcting unvoiced phonemes of the pre-existing knowledgebase of phonemes.

18. The method of claim 10 wherein a neural network is configured based on psychoacoustic modeling of phonemes.

20. The non-transitory computer-readable medium of claim 19 wherein the operations further comprise correcting for psychoacoustic perceived speech signal distortions of the pre-existing knowledgebase of phonemes.
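Claims 5, 6, 14, and 15 recite upsampling an input vector (a base frequency, a phoneme duration, and/or a phoneme sequence) from one frequency to another. The claims as listed here do not say how the upsampling is performed, so the following Python sketch is only one plausible reading, not the patented method. It assumes phoneme-rate features are expanded to a higher frame rate; the function names, the integer upsampling factor, and the choice of hold versus linear interpolation are all hypothetical.

```python
from typing import List, Sequence


def upsample_hold(values: Sequence[float], factor: int) -> List[float]:
    """Zero-order-hold upsampling: repeat each sample `factor` times.

    Assumption: `factor` is the integer ratio between the output rate
    and the input rate (not specified in the claims).
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return [v for v in values for _ in range(factor)]


def upsample_linear(values: Sequence[float], factor: int) -> List[float]:
    """Linear-interpolation upsampling between neighbouring samples.

    Produces (len(values) - 1) * factor + 1 samples, ending exactly on
    the last input sample.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    if len(values) < 2:
        return list(values)
    out: List[float] = []
    for a, b in zip(values, values[1:]):
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    out.append(values[-1])
    return out


def expand_by_duration(phonemes: Sequence[str],
                       durations_frames: Sequence[int]) -> List[str]:
    """Expand a phoneme sequence to frame rate using per-phoneme durations.

    Assumption: durations are given as whole numbers of output frames.
    """
    if len(phonemes) != len(durations_frames):
        raise ValueError("one duration is required per phoneme")
    if any(d < 0 for d in durations_frames):
        raise ValueError("durations must be non-negative")
    return [p for p, d in zip(phonemes, durations_frames) for _ in range(d)]
```

For example, expand_by_duration(["h", "i"], [2, 3]) yields the frame-level sequence h, h, i, i, i, and upsample_linear([100.0, 120.0, 110.0], 2) yields a base-frequency contour with one interpolated value between each pair of inputs. Hold upsampling suits categorical inputs such as phoneme identity, while interpolation suits continuous inputs such as base frequency; which, if either, the patent uses is not stated in the claims shown here.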

Patent Metadata

Filing Date

Unknown

Publication Date

August 22, 2023

Inventors

Martin Reber

Vijeta Avijeet
