Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A method, comprising: receiving, with a service computing system from an application developer computing system, voice user interface (VUI) training data of an application that is under development within the application developer computing system; identifying a training phrase within the VUI training data; generating, with training phrase synonym analyzer within the application developer computing system, a training phrase synonym data structure comprising a plurality of linked data pairs, each linked data pair comprising the identified training phrase and a unique synonym to the identified training phrase; receiving, with an acoustic module within the application developer computing system, a selected background noise; receiving, with a language module within the application developer computing system, a selected speaker voice, selected speaker language, and a selected speaker dialect; generating an audio output comprising synthesized human speech of each unique synonym of the identified training phrase in the selected speaker voice, selected speaker language, and selected speaker dialect, the audio output further comprising the selected background noise in the background to the synthesized human speech; audibly presenting the audio output upon a speaker of the application developer computing system and simultaneously capturing the synthesized human speech of each unique synonym of the identified training phrase with a microphone of the application developer computing system; converting the captured synthesized human speech of each unique synonym of the identified training phrase into text (textualized training phrase synonym) with a selected speech to text framework; comparing text of each unique synonym of the identified training phrase with the textualized training phrase synonym corresponding thereto; scoring each textualized training phrase synonym based upon similarity of the textualized training phrase synonym to the text of the unique synonym 
corresponding thereto; generating an output training data score data structure comprising the score of each textualized training phrase synonym, the text of the training phrase, the text of each textualized training phrase synonym, the selected background noise, the selected speaker voice, the selected speaker language, and the selected speaker dialect, the output training data score data structure ranking those training phrase synonyms that are most misunderstood by the selected speech to text framework to those training phrase synonyms that are accurately understood by the selected speech to text framework; and sending the output training data score data structure to the application developer computing system.
This invention relates to testing the voice user interface (VUI) of an application that is still under development. The method checks how well a selected speech-to-text framework recognizes alternative wordings of the application's training phrases under selected acoustic conditions and speaker characteristics. A service computing system receives VUI training data from an application developer computing system and identifies a training phrase in it. A synonym analyzer generates a data structure of linked pairs, each pairing the training phrase with one unique synonym. An acoustic module receives a selected background noise, and a language module receives a selected speaker voice, language, and dialect. The system then synthesizes speech for each synonym in that voice, language, and dialect, with the selected noise in the background. The audio is played through a speaker of the developer system and simultaneously captured by its microphone. The speech-to-text framework converts the captured audio into text, which is compared with the original synonym text. Each transcription is scored by its similarity to the synonym it came from. The results are compiled into an output data structure that holds the scores, the training phrase, the transcribed text, and the selected noise, voice, language, and dialect, ranked from the most misunderstood synonyms to those accurately understood. This ranked structure is sent to the developer computing system, where it shows which phrasings the framework handles poorly under the tested conditions.
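The scoring and ranking steps can be illustrated with a minimal Python sketch. Everything named here (`ScoreRow`, `build_score_table`, the field names, and the use of `difflib.SequenceMatcher` as the similarity measure) is an assumption made for illustration; the claim does not specify a similarity measure or a data layout. Transcriptions are passed in as plain strings, so no speech synthesis or speech-to-text library is involved.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class ScoreRow:
    """One row of a hypothetical output training data score structure."""
    training_phrase: str
    synonym: str
    transcription: str  # the "textualized training phrase synonym"
    score: float        # 1.0 means the transcription matches the synonym exactly
    background_noise: str
    voice: str
    language: str
    dialect: str


def build_score_table(training_phrase, transcriptions, noise, voice, language, dialect):
    """Score each synonym/transcription pair and rank the worst ones first."""
    rows = []
    for synonym, heard in transcriptions.items():
        # Assumed similarity measure: ratio of matching characters, 0.0 to 1.0.
        score = SequenceMatcher(None, synonym.lower(), heard.lower()).ratio()
        rows.append(ScoreRow(training_phrase, synonym, heard, score,
                             noise, voice, language, dialect))
    # Most misunderstood (lowest similarity) first.
    return sorted(rows, key=lambda row: row.score)


# Made-up example data: synonym text mapped to what the framework "heard".
table = build_score_table(
    "turn on the lights",
    {
        "switch on the lights": "switch on the lights",
        "illuminate the room": "eliminate the room",
    },
    noise="cafe", voice="voice-a", language="en", dialect="en-GB",
)
```

In this made-up example the misheard synonym sorts to the top of `table`, which is the ordering the claim describes: most misunderstood first, accurately understood last.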
2. The method of claim 1, further comprising: displaying the output training data score data structure within a graphical user interface (GUI) upon a display of the application developer computing system; modifying the VUI of the application (original VUI) to create a modified VUI of the application within the application developer computing system based upon the output training data score data structure; and sending the application with the modified VUI to an application user computing system, wherein the modified VUI, when called by the application user computing system, has increased accurately understood speech input relative to the original VUI, when called by the application user computing system.
This claim adds the developer-side steps that follow the method of claim 1. The output training data score data structure is displayed in a graphical user interface (GUI) on a display of the application developer computing system. The VUI of the application (the original VUI) is then modified within the developer system, based on that score data structure, to create a modified VUI. The application with the modified VUI is sent to an application user computing system. When called by the user system, the modified VUI accurately understands more speech input than the original VUI does. The scores that guide the modification come from the synthesized-speech test of claim 1.
3. The method of claim 1, wherein comparing text of each unique synonym of the identified training phrase with the textualized training phrase synonym corresponding thereto comprises a character comparison between the text of each unique synonym with the textualized training phrase synonym corresponding thereto.
This claim narrows the comparison step of claim 1. After each unique synonym has been synthesized as speech, played, captured, and converted back to text by the selected speech-to-text framework, the text of the synonym is compared with its transcription (the textualized training phrase synonym) by a character comparison. The claim does not name a particular character-comparison algorithm. The comparison is between what was meant to be said and what the framework transcribed; it does not judge whether a synonym is a valid variant of the training phrase. The similarity it finds is the basis for the score that claim 1 assigns to each transcription.
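One common way to perform a character comparison is Levenshtein edit distance, sketched below in Python. This is an assumed technique chosen for illustration, not one named in the claim, and the function names `edit_distance` and `char_similarity` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character edits that turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca with cb
        prev = curr
    return prev[-1]


def char_similarity(synonym: str, transcription: str) -> float:
    """Map edit distance to a similarity from 0.0 to 1.0 (1.0 means identical)."""
    longest = max(len(synonym), len(transcription))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(synonym, transcription) / longest
```

A character-level measure like this treats every differing letter as an error, so a transcription that differs only in spelling or capitalization still loses similarity unless the texts are normalized first.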
4. The method of claim 1, wherein comparing text of each unique synonym of the identified training phrase with the textualized training phrase synonym corresponding thereto comprises a natural language comparison between the text of each unique synonym with the textualized training phrase synonym corresponding thereto.
This claim narrows the comparison step of claim 1 in a different way from claim 3. The text of each unique synonym is compared with its transcription by the selected speech-to-text framework (the textualized training phrase synonym) using a natural language comparison rather than a character comparison. The claim does not specify the technique; word-level or meaning-level similarity measures are possible examples, not requirements. The comparison is between each synonym and its own transcription, not between the synonym and the original training phrase, and the claim does not describe filtering synonyms against a threshold. The similarity it finds is the basis for the score that claim 1 assigns to each transcription.
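As a rough illustration of how a natural language comparison can differ from a character comparison, the Python sketch below normalizes both texts and compares them word by word. The claim does not define the technique, so this is only one assumed reading; `CONTRACTIONS`, `normalize`, and `word_similarity` are hypothetical names, and the contraction table is deliberately tiny.

```python
import re
from difflib import SequenceMatcher

# Hypothetical normalization table; a real one would be far larger.
CONTRACTIONS = {"what's": "what is", "i'm": "i am", "don't": "do not"}


def normalize(text):
    """Lowercase, expand a few contractions, drop punctuation, split into words."""
    text = text.lower()
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    return re.findall(r"[a-z0-9]+", text)


def word_similarity(synonym, transcription):
    """Similarity of the two word sequences, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(synonym), normalize(transcription)).ratio()
```

With this sketch, "What's the weather?" and "what is the weather" compare as identical, whereas a raw character comparison would report several differences.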
5. The method of claim 1, wherein the speaker converts an electrical signal of the audio output to audio of the audio output and wherein the microphone converts the audio of the audio output to an electrical signal of the audio output.
This claim specifies the roles of the speaker and microphone used in claim 1. The speaker converts an electrical signal of the audio output into audio, that is, audible sound. The microphone converts that audio back into an electrical signal. The synthesized speech is therefore played aloud and picked up acoustically at the application developer computing system before the speech-to-text framework transcribes it. The claim does not recite amplifiers, filters, or noise cancellation.
6. A computer program product comprising a first computer readable storage medium having first program instructions embodied therewith, the first program instructions readable by a service computing system to cause the service computing system to: receive, from an application developer computing system, voice user interface (VUI) training data of an application that is under development within the application developer computing system; identify a training phrase within the VUI training data; generate a training phrase synonym data structure comprising a plurality of linked data pairs, each linked data pair comprising the identified training phrase and a unique synonym to the identified training phrase; receive a selected background noise; receive a selected speaker voice, selected speaker language, and a selected speaker dialect; generate an audio output comprising synthesized human speech of each unique synonym of the identified training phrase in the selected speaker voice, selected speaker language, and selected speaker dialect, the audio output further comprising the selected background noise in the background to the synthesized human speech; audibly present the audio output upon a speaker of the application developer computing system and simultaneously capture the synthesized human speech of each unique synonym of the identified training phrase with a microphone of the application developer computing system; convert the captured synthesized human speech of each unique synonym of the identified training phrase into text (textualized training phrase synonym) with a selected speech to text framework; compare text of each unique synonym of the identified training phrase with the textualized training phrase synonym corresponding thereto; score each textualized training phrase synonym based upon similarity of the textualized training phrase synonym to the text of the unique synonym corresponding thereto; generate an output training data score data structure 
comprising the score of the textualized training phrase synonym, the text of the training phrase, the text of each textualized training phrase synonym, the selected background noise, the selected speaker voice, the selected speaker language, and the selected speaker dialect, the output training data score data structure ranking those training phrase synonyms that are most misunderstood by the selected speech to text framework to those training phrase synonyms that are accurately understood by the selected speech to text framework; and send the output training data score data structure to the application developer computing system.
This claim covers the same process as claim 1 in the form of a computer program product: a computer readable storage medium holding program instructions that a service computing system reads and carries out. The instructions cause the service computing system to receive VUI training data from an application developer computing system, identify a training phrase in it, and generate a data structure of linked pairs, each pairing the training phrase with one unique synonym. The system receives a selected background noise and a selected speaker voice, language, and dialect, then synthesizes speech for each synonym under those selections with the noise in the background. The synthesized speech is played through a speaker of the developer system and simultaneously captured by its microphone. The captured audio is converted to text using a selected speech-to-text framework, and each transcription is compared with the text of the synonym it came from. The system scores each transcription by similarity and ranks the synonyms from most misunderstood to accurately understood. The results, including the scores, the training phrase, the transcribed text, and the selected conditions, are sent back to the developer system. This helps the developer identify phrasings that the speech-to-text framework struggles with.
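Placing background noise behind synthesized speech can be pictured as adding two sample streams together. The Python sketch below is a simplified assumption: it treats audio as lists of floating-point samples between -1.0 and 1.0, and the function name `mix_with_noise` and the `noise_gain` parameter are hypothetical. The claim does not say how the noise and speech are combined.

```python
def mix_with_noise(speech, noise, noise_gain=0.3):
    """Overlay looped background noise on speech samples (floats in [-1.0, 1.0])."""
    if not noise:
        return list(speech)
    mixed = []
    for i, sample in enumerate(speech):
        # Loop the noise clip if it is shorter than the speech.
        background = noise[i % len(noise)] * noise_gain
        # Clip the sum so it stays within the valid sample range.
        mixed.append(max(-1.0, min(1.0, sample + background)))
    return mixed
```

Raising `noise_gain` makes the background louder relative to the speech, which is one way a test run could be made harder for the speech-to-text framework.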
7. The computer program product of claim 6, further comprising a second computer readable storage medium having second program instructions embodied therewith, the second program instructions readable by the application developer computing system to cause the application developer computing system to: display the output training data score data structure within a graphical user interface (GUI) upon a display of the application developer computing system; modify the VUI of the application (original VUI) to create a modified VUI of the application within the application developer computing system based upon the output training data score data structure; and send the application with the modified VUI to an application user computing system, wherein the modified VUI, when called by the application user computing system, has increased accurately understood speech input relative to the original VUI, when called by the application user computing system.
This claim adds a second computer readable storage medium to the computer program product of claim 6. Its program instructions are read by the application developer computing system rather than by the service computing system. They cause the developer system to display the output training data score data structure in a graphical user interface (GUI) on its display, to modify the VUI of the application (the original VUI) based on that score data structure to create a modified VUI, and to send the application with the modified VUI to an application user computing system. When called by the user system, the modified VUI accurately understands more speech input than the original VUI does.
8. The computer program product of claim 6, wherein the comparison of each unique synonym of the identified training phrase with the textualized training phrase synonym corresponding thereto comprises a character comparison between the text of each unique synonym with the textualized training phrase synonym corresponding thereto.
This claim narrows the comparison step of the computer program product of claim 6, mirroring claim 3. The text of each unique synonym is compared with its transcription by the selected speech-to-text framework (the textualized training phrase synonym) using a character comparison. The claim does not name a particular algorithm, and it does not recite storing the comparisons in a database. The similarity found by this comparison is the basis for the score that claim 6 assigns to each transcription.
9. The computer program product of claim 6, wherein the comparison of each unique synonym of the identified training phrase with the textualized training phrase synonym corresponding thereto comprises a natural language comparison between the text of each unique synonym with the textualized training phrase synonym corresponding thereto.
This claim narrows the comparison step of the computer program product of claim 6, mirroring claim 4. The text of each unique synonym is compared with its transcription by the selected speech-to-text framework (the textualized training phrase synonym) using a natural language comparison. The claim does not specify the technique; steps such as tokenization, normalization, or semantic similarity scoring are possible examples rather than requirements. The similarity found by this comparison is the basis for the score that claim 6 assigns to each transcription.
10. The computer program product of claim 6, wherein the speaker converts an electrical signal of the audio output to audio of the audio output and wherein the microphone converts the audio of the audio output to an electrical signal of the audio output.
This claim specifies the roles of the speaker and microphone in the computer program product of claim 6, mirroring claim 5. The speaker converts an electrical signal of the audio output into audio, that is, audible sound, and the microphone converts that audio back into an electrical signal. The synthesized speech is therefore played aloud and picked up acoustically at the application developer computing system before the speech-to-text framework transcribes it. The claim does not recite audio calibration or noise cancellation.
11. A service computing system comprising a first processor and a first memory, the first memory comprising first program instructions embodied therewith that are readable by the first processor to cause the first processor to: receive, from an application developer computing system, voice user interface (VUI) training data of an application that is under development within the application developer computing system; identify a training phrase within the VUI training data; generate a training phrase synonym data structure comprising a plurality of linked data pairs, each linked data pair comprising the identified training phrase and a unique synonym to the identified training phrase; receive a selected background noise; receive a selected speaker voice, selected speaker language, and a selected speaker dialect; generate an audio output comprising synthesized human speech of each unique synonym of the identified training phrase in the selected speaker voice, selected speaker language, and selected speaker dialect, the audio output further comprising the selected background noise in the background to the synthesized human speech; audibly present the audio output upon a speaker of the application developer computing system and simultaneously capture the synthesized human speech of each unique synonym of the identified training phrase with a microphone of the application developer computing system; convert the captured synthesized human speech of each unique synonym of the identified training phrase into text (textualized training phrase synonym) with a selected speech to text framework; compare text of each unique synonym of the identified training phrase with the textualized training phrase synonym corresponding thereto score each textualized training phrase synonym based upon similarity of the textualized training phrase synonym to the text of the unique synonym corresponding thereto; generate an output training data score data structure comprising the score of the 
textualized training phrase synonym, the text of the training phrase, the text of each textualized training phrase synonym, the selected background noise, the selected speaker voice, the selected speaker language, and the selected speaker dialect, the output training data score data structure ranking those training phrase synonyms that are most misunderstood by the selected speech to text framework to those training phrase synonyms that are accurately understood by the selected speech to text framework; and send the output training data score data structure to the application developer computing system.
This claim covers the same process as claim 1 in the form of a service computing system with a processor and a memory holding program instructions. The system evaluates how well a selected speech-to-text framework recognizes alternative wordings of a training phrase under selected acoustic and speaker conditions. It receives VUI training data from an application developer computing system, identifies a training phrase in it, and generates a data structure of linked pairs, each pairing the training phrase with one unique synonym. The system then synthesizes speech for each synonym using a selected speaker voice, language, and dialect, with a selected background noise behind the speech. The synthesized speech is played through a speaker of the developer system and simultaneously captured by its microphone. The captured speech is converted to text using the selected speech-to-text framework. The system compares the original synonym text with the converted text and scores each transcription by similarity, which shows which synonyms are most misunderstood. The results, including the scores, the training phrase, the transcribed text, and the selected conditions, are compiled into a ranked data structure and sent back to the developer system. This helps developers refine their VUI training data.
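The data structure that links a training phrase with its synonyms can be pictured as a list of pairs. The Python sketch below is illustrative only: `LinkedPair` and `build_synonym_pairs` are hypothetical names, the candidate synonyms are supplied by the caller rather than generated, and treating case and surrounding whitespace as irrelevant to uniqueness is an assumption.

```python
from typing import NamedTuple


class LinkedPair(NamedTuple):
    """One linked data pair: the training phrase and one unique synonym."""
    training_phrase: str
    synonym: str


def build_synonym_pairs(training_phrase, candidate_synonyms):
    """Pair the phrase with each unique synonym, keeping first-seen order."""
    phrase_key = training_phrase.strip().lower()
    seen = set()
    pairs = []
    for candidate in candidate_synonyms:
        key = candidate.strip().lower()
        if not key or key == phrase_key or key in seen:
            continue  # skip blanks, the phrase itself, and duplicates
        seen.add(key)
        pairs.append(LinkedPair(training_phrase, candidate.strip()))
    return pairs


pairs = build_synonym_pairs(
    "turn on the lights",
    ["switch on the lights", "Switch on the lights ", "turn on the lights", "lights on"],
)
```

Each pair can then be carried through synthesis, playback, transcription, and scoring, so a score can always be traced back to the training phrase it belongs to.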
12. The service computing system of claim 11, further comprising a second processor within the application developer computing system and a second memory within the application developer computing system, the second memory comprising second program instructions embodied therewith that are readable by the second processor to cause the second processor to: display the output training data score data structure within a graphical user interface (GUI) upon a display of the application developer computing system; modify the VUI of the application (original VUI) to create a modified VUI of the application within the application developer computing system based upon the output training data score data structure; and send the application with the modified VUI to an application user computing system, wherein the modified VUI, when called by the application user computing system, has increased accurately understood speech input relative to the original VUI, when called by the application user computing system.
This claim adds a second processor and a second memory, both within the application developer computing system, to the service computing system of claim 11. The second memory holds program instructions that cause the second processor to display the output training data score data structure in a graphical user interface (GUI) on the developer system's display. The instructions also cause the VUI of the application (the original VUI) to be modified within the developer system, based on that score data structure, to create a modified VUI. The application with the modified VUI is then sent to an application user computing system. When called by the user system, the modified VUI accurately understands more speech input than the original VUI does.
13. The service computing system of claim 11, wherein the comparison of each unique synonym of the identified training phrase with the textualized training phrase synonym corresponding thereto comprises a character comparison between the text of each unique synonym with the textualized training phrase synonym corresponding thereto.
This claim narrows the comparison step of the service computing system of claim 11, mirroring claims 3 and 8. The text of each unique synonym is compared with its transcription by the selected speech-to-text framework (the textualized training phrase synonym) using a character comparison. The claim does not name a particular algorithm, and it does not recite a training phrase database or a separate synonym comparison module. The similarity found by this comparison is the basis for the score that claim 11 assigns to each transcription.
14. The service computing system of claim 11, wherein the comparison of each unique synonym of the identified training phrase with the textualized training phrase synonym corresponding thereto comprises a natural language comparison between the text of each unique synonym with the textualized training phrase synonym corresponding thereto.
This claim narrows the comparison step of the service computing system of claim 11, mirroring claims 4 and 9. The text of each unique synonym is compared with its transcription by the selected speech-to-text framework (the textualized training phrase synonym) using a natural language comparison. The claim does not specify the technique, and it does not recite a training phrase database, a phrase selection module, or a synonym extraction module. The similarity found by this comparison is the basis for the score that claim 11 assigns to each transcription.
February 18, 2020