Implementing a Whole Sentence Recurrent Neural Network Language Model for Natural Language Processing

Published: October 1, 2019

Assignee: not available in the USPTO data we have

Inventors: Yinghui Huang, Abhinav Sethy, Kartik Audhkhasi, Bhuvana Ramabhadran

Patent Claims

20 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A method comprising: providing, by a computer system, a whole sentence recurrent neural network language model for estimating a probability of likelihood of each whole sentence processed by natural language processing being correct; applying, by the computer system, a noise contrastive estimation sampler against at least one entire sentence from a corpus of a plurality of sentences to generate at least one incorrect sentence; training, by the computer system, the whole sentence recurrent neural network language model, using the at least one entire sentence from the corpus and the at least one incorrect sentence, to distinguish the at least one entire sentence as correct; and applying, by the computer system, the whole sentence recurrent neural network language model to estimate the probability of likelihood of each whole sentence processed by natural language processing being correct.

Plain English Translation

This invention addresses the problem of assessing whether entire sentences processed by natural language processing systems are correct. A computer system implements a method for training a whole sentence recurrent neural network language model, which estimates the probability that a given whole sentence is correct. Training draws on a corpus of multiple sentences. A noise contrastive estimation sampler is applied to at least one entire sentence from the corpus to generate at least one incorrect sentence. The model is then trained on both the original corpus sentences and the generated incorrect sentences, so that it learns to identify the original sentences as correct. Finally, the trained model is applied to sentences processed by natural language processing, and for each one it estimates the probability that the sentence is correct.
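The training objective described above can be illustrated with a minimal sketch. It assumes the model emits one real-valued score per sentence and trains it as a binary classifier with a logistic loss. The function names `softplus` and `contrastive_loss` are hypothetical, and the sketch omits the noise-distribution correction term used in full noise contrastive estimation; the claim itself does not spell out the loss.

```python
import math


def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def contrastive_loss(true_score, noise_scores):
    """Logistic loss that rewards a high score for the corpus sentence
    and low scores for the sampled incorrect sentences.

    Assumption: scores are unnormalized log-odds of "sentence is correct".
    """
    # -log(sigmoid(true_score)) equals softplus(-true_score).
    loss = softplus(-true_score)
    # -log(1 - sigmoid(s)) equals softplus(s) for each incorrect sentence.
    for s in noise_scores:
        loss += softplus(s)
    return loss
```

Under these assumptions the loss shrinks as the corpus sentence's score rises and the incorrect sentences' scores fall, which is the behavior the training step is meant to encourage.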

Claim 2

Original Legal Text

2. The method according to claim 1 , wherein applying, by the computer system, the whole sentence recurrent neural network language model to estimate the probability of likelihood of each whole sentence processed by natural language processing further comprises: applying, by the computer system, the whole sentence recurrent neural network language model for the natural language processing comprising one of conversational interaction, conversational telephony speech transcription, multimedia captioning, and translation.

Plain English Translation

This invention relates to natural language processing (NLP) systems that use a whole-sentence recurrent neural network (RNN) language model to estimate the probability that a sentence is correct. This claim narrows the method of claim 1 by specifying the NLP task in which the model is applied: one of conversational interaction, conversational telephony speech transcription, multimedia captioning, and translation. In each case the model evaluates entire sentences rather than individual words or fragments, so its estimate reflects dependencies and patterns across the whole sentence. In conversational interaction this applies to sentence-level responses in a dialogue system, in telephony speech transcription to the text produced from spoken language, in multimedia captioning to captions for videos or images, and in translation to translated sentences. The claim does not specify how the resulting probability estimates are used within each application.

Claim 3

Original Legal Text

3. The method according to claim 1 , wherein providing, by the computer system, a whole sentence recurrent neural network language model for estimating a probability of likelihood of each whole sentence processed by natural language processing further comprises: providing, by the computer system, the whole sentence recurrent neural network language model on a recurrent neural network long short-term memory architecture.

Plain English Translation

This invention relates to natural language processing (NLP) systems that use recurrent neural network (RNN) language models to estimate the probability of entire sentences. The problem addressed is improving the accuracy and efficiency of sentence-level language modeling by leveraging long-term dependencies in text data. Traditional RNNs struggle with capturing long-range relationships in sequences, leading to degraded performance in sentence probability estimation. The solution involves implementing a whole-sentence RNN language model using a long short-term memory (LSTM) architecture. LSTMs are a specialized type of RNN designed to mitigate the vanishing gradient problem, allowing them to retain and utilize information over longer sequences. By applying this architecture, the system can better model the likelihood of entire sentences, enhancing tasks such as text generation, translation, and sentiment analysis. The LSTM-based model processes input sentences sequentially, updating its hidden state to incorporate contextual information from earlier parts of the sentence while maintaining relevance to later parts. This approach improves the model's ability to predict sentence-level probabilities accurately, addressing limitations of simpler RNN structures. The invention is particularly useful in applications requiring robust sentence-level language understanding, such as automated content generation and machine translation.

Claim 4

Original Legal Text

4. The method according to claim 1 , wherein applying, by the computer system, the whole sentence recurrent neural network language model to estimate the probability of likelihood of each whole sentence processed by natural language processing further comprises: scoring, by the computer system, by the whole sentence recurrent neural network language model, the probability of each whole sentence directly without independently computing conditional probabilities for each separate word in each whole sentence.

Plain English Translation

This invention relates to natural language processing (NLP) systems that use recurrent neural network (RNN) language models to evaluate the likelihood of entire sentences. Conventional language models typically compute a conditional probability for each word given the words that precede it and combine those values to score the sentence, which can be computationally expensive and ties the sentence score to local, word-level predictions. This claim instead applies a whole-sentence RNN language model that scores the probability of an entire sentence directly, without independently computing conditional probabilities for each separate word. The model is trained to assess the plausibility of complete sentences rather than to predict words one at a time, so no intermediate word-level probability calculation is needed when a sentence is scored. Avoiding those calculations can reduce computational overhead, which matters in applications such as machine translation and text generation where sentences must be scored quickly.
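The contrast drawn in this claim can be written out as a short sketch. The notation is illustrative and not taken from the patent: $w_t$ is the $t$-th word, $T$ is the sentence length, $f_\theta$ is the network's scoring function, and $\tilde{p}$ is an unnormalized score. The exponential form of the whole-sentence score is an assumption made for illustration.

```latex
% Conventional word-by-word scoring (chain rule over conditional probabilities):
P(w_1, \dots, w_T) = \prod_{t=1}^{T} P(w_t \mid w_1, \dots, w_{t-1})

% Whole-sentence scoring: one score for the sentence s = (w_1, \dots, w_T),
% with no per-word conditional probabilities computed:
\tilde{p}(s) = \exp\bigl(f_\theta(s)\bigr)
```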

Claim 5

Original Legal Text

5. The method according to claim 1 , wherein applying, by the computer system, a noise contrastive estimation sampler against at least one entire sentence from a corpus to generate at least one incorrect sentence further comprises: applying, by the computer system, the noise contrastive estimation sampler against the at least one entire sentence from the corpus by performing one of a substitution, an insertion, and a deletion of one or more words in the at least one entire sentence to generate the at least one incorrect sentence.

Plain English Translation

This invention relates to natural language processing and machine learning, specifically improving text generation models by training them to distinguish between correct and incorrect sentences. The problem addressed is the challenge of training language models to generate coherent and contextually accurate text, often hindered by the inability to effectively differentiate between valid and invalid sentence structures. The method involves using a noise contrastive estimation (NCE) sampler to generate incorrect sentences from a corpus of correct sentences. The NCE sampler applies transformations such as word substitution, insertion, or deletion to modify the original sentences, creating incorrect versions. These incorrect sentences are then used alongside the original correct sentences to train a language model, helping it learn to distinguish between valid and invalid text. This approach enhances the model's ability to generate more accurate and contextually appropriate sentences by exposing it to both correct and deliberately corrupted examples during training. The transformations ensure that the incorrect sentences retain some structural similarity to the original, making the learning process more effective. This method is particularly useful in applications like machine translation, text summarization, and conversational AI, where generating coherent and contextually relevant text is critical.
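A minimal sketch of the "one of a substitution, an insertion, and a deletion" step might look like the following. The function name `corrupt_once`, the uniform choice of edit type, and the flat word list used as `vocab` are all assumptions for illustration; the claim does not say how the edit or the replacement word is chosen.

```python
import random


def corrupt_once(tokens, vocab, rng):
    """Return (operation, new_tokens) after one random word-level edit.

    Assumptions: `tokens` is non-empty and `vocab` holds at least two
    distinct words, so a substitution can always change the word.
    """
    if not tokens:
        raise ValueError("tokens must be non-empty")
    ops = ["substitution", "insertion"]
    if len(tokens) > 1:
        ops.append("deletion")  # never delete the only remaining word
    op = rng.choice(ops)
    out = list(tokens)  # copy so the corpus sentence is left untouched
    if op == "substitution":
        i = rng.randrange(len(out))
        out[i] = rng.choice([w for w in vocab if w != out[i]])
    elif op == "insertion":
        i = rng.randrange(len(out) + 1)
        out.insert(i, rng.choice(vocab))
    else:
        del out[rng.randrange(len(out))]
    return op, out
```

Passing a seeded `random.Random` instance as `rng` keeps the generated incorrect sentences reproducible between training runs.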

Claim 6

Original Legal Text

6. The method according to claim 1 , wherein applying, by the computer system, a noise contrastive estimation sampler against at least one entire sentence from a corpus to generate at least one incorrect sentence further comprises: randomly selecting, by the computer system, a plurality of positions in the at least one entire sentence from the corpus to introduce a substitution, an insertion, and a deletion of one or more words in the at least one entire sentence to generate the at least one incorrect sentence.

Plain English Translation

This invention relates to natural language processing and machine learning, specifically improving text generation models by training them to distinguish between correct and incorrect sentences. The problem addressed is the challenge of training models to robustly handle noisy or corrupted text, which is common in real-world applications. The solution involves generating incorrect sentences from a corpus of correct sentences using a noise contrastive estimation sampler. This process randomly selects multiple positions within a sentence to introduce substitutions, insertions, and deletions of one or more words, creating corrupted versions of the original text. These incorrect sentences are then used alongside the original correct sentences to train the model, enhancing its ability to recognize and correct errors. The method ensures diverse and realistic noise patterns by varying the types and locations of modifications, improving the model's generalization to different error types. This approach is particularly useful for applications like text correction, machine translation, and speech recognition, where models must handle imperfect input data. The technique leverages randomness to create a wide range of corrupted sentences, ensuring the model learns to distinguish between valid and invalid text patterns effectively.
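The random selection of several positions can be sketched as below. This is an illustration under stated assumptions, not the patented sampler: the name `corrupt_many`, the `num_edits` parameter, and the uniform choice of edit type at each position are hypothetical. Positions are processed from right to left so that an insertion or deletion never shifts a position that has not been handled yet.

```python
import random

OPS = ("substitution", "insertion", "deletion")


def corrupt_many(tokens, vocab, num_edits, rng):
    """Edit `num_edits` randomly chosen positions of a sentence.

    Assumptions: `vocab` holds at least two distinct words, and
    1 <= num_edits <= len(tokens).
    """
    if not 1 <= num_edits <= len(tokens):
        raise ValueError("num_edits must be between 1 and len(tokens)")
    # Distinct positions, highest first, so earlier indices stay valid.
    positions = sorted(rng.sample(range(len(tokens)), num_edits), reverse=True)
    out = list(tokens)
    for i in positions:
        op = rng.choice(OPS)
        if op == "deletion" and len(out) == 1:
            op = "substitution"  # keep at least one word in the sentence
        if op == "substitution":
            out[i] = rng.choice([w for w in vocab if w != out[i]])
        elif op == "insertion":
            out.insert(i, rng.choice(vocab))
        else:
            del out[i]
    return out
```

Because edits at different positions can occasionally cancel out (for example, a deletion followed by inserting the same word), a practical sampler would probably discard any output identical to the original sentence.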

Claim 7

Original Legal Text

7. The method according to claim 1 , wherein applying, by the computer system, the whole sentence recurrent neural network language model to estimate the probability of likelihood of each whole sentence processed by natural language processing being correct further comprises: selecting, by the computer system, a selection of the plurality of sentences from the corpus; applying, by the computer system, the noise contrastive estimation sampler against each sentence in the selection of the plurality of sentences to generate a plurality of imposter sentences; applying, by the computer system, each separate set of each sentence in the selection of the plurality of sentences and a selection of imposter sentences of the plurality of imposter sentences generated for each sentence to the whole sentence recurrent neural network language model; generating, by the computer system, through the whole sentence recurrent neural network language model, a first score for each sentence and at least one additional score for the selection of imposter sentences; applying, by the computer system, a linear boundary to classify the first score and the additional score in one of two classes in a linear space, wherein the two classes represent an incorrect sentence and a correct sentence; and evaluating, by the computer system, an accuracy of the natural language processing system in performing sequential classification tasks based on an accuracy of the classifications of the first score in the class of the correct sentence and at least one additional score as an incorrect sentence.

Plain English Translation

This invention relates to natural language processing (NLP) systems that use recurrent neural network (RNN) language models to evaluate sentence correctness. The problem addressed is improving the accuracy of NLP systems in classifying sentences as correct or incorrect, particularly in sequential classification tasks. The method involves selecting a subset of sentences from a corpus and generating imposter sentences using a noise contrastive estimation sampler. Each original sentence and its corresponding imposter sentences are processed through a whole-sentence RNN language model to generate scores. A linear boundary is then applied to classify these scores into two classes: correct sentences and incorrect sentences. The system evaluates the NLP system's accuracy by comparing the classification results of the original sentences against the imposter sentences. This approach enhances the model's ability to distinguish between valid and invalid sentences, improving overall performance in sequential classification tasks. The method leverages contrastive learning to refine the model's decision-making process, ensuring more reliable sentence evaluations.
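The classification and evaluation steps can be sketched as follows, assuming each sentence has already been reduced to a single score by the model. With one-dimensional scores, a linear boundary is just a weighted threshold. The names `classify` and `boundary_accuracy` and the default `weight` and `bias` values are hypothetical.

```python
def classify(score, weight=1.0, bias=0.0):
    """Linear boundary in score space: 1 = correct sentence, 0 = incorrect."""
    return 1 if weight * score + bias > 0 else 0


def boundary_accuracy(scored_sets, weight=1.0, bias=0.0):
    """Fraction of scores placed in the expected class.

    `scored_sets` is a list of (true_score, imposter_scores) pairs, one
    pair per corpus sentence. The true score should land in class 1 and
    every imposter score in class 0.
    """
    hits = 0
    total = 0
    for true_score, imposter_scores in scored_sets:
        hits += classify(true_score, weight, bias) == 1
        total += 1
        for s in imposter_scores:
            hits += classify(s, weight, bias) == 0
            total += 1
    if total == 0:
        raise ValueError("no scores to evaluate")
    return hits / total
```

For example, `boundary_accuracy([(2.0, [-1.0, -0.5]), (0.3, [0.4, -2.0])])` counts five of six scores on the expected side of the boundary, since only the imposter scored at 0.4 is misclassified as correct.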

Claim 8

Original Legal Text

8. The method according to claim 1 , wherein training, by the computer system, the whole sentence recurrent neural network language model, using the at least one entire sentence from the corpus and the at least one incorrect sentence, to distinguish the at least one entire sentence as correct further comprises: applying, by the computer system, each of the at least one entire sentence from the corpus and the at least one incorrect sentence to at least one recurrent neural network layer comprising a plurality of long short-term memory for holding data for an arbitrary period of time; pushing, by the computer system, an output from each of the plurality of long short-term memory to a neural network scorer for each of the at least one entire sentence and the at least one incorrect sentence; generating, by the neural network scorer, a separate output score assigned by the at least one recurrent neural network layer for each of the at least one entire sentence and the at least one incorrect sentence representing an unnormalized probability of each sentence; and evaluating, by a neural network layer receiving output from the whole sentence recurrent neural network language model, each separate output score as an output of a digital 1 if the output score is a probability indicating the entire sentence is correct and an output of a digital 0 if the output score is a probability indicating the entire sentence is not correct.

Plain English Translation

This invention relates to natural language processing and machine learning, specifically improving the accuracy of recurrent neural network (RNN) language models in distinguishing correct sentences from incorrect ones. The problem addressed is the difficulty in training RNNs to reliably identify grammatically or semantically correct sentences within a corpus, particularly when presented with incorrect or distorted sentences. The method involves training a whole-sentence RNN language model using both correct sentences from a corpus and artificially generated incorrect sentences. The training process applies each sentence to at least one RNN layer containing long short-term memory (LSTM) units, which can hold data for an arbitrary period of time. The LSTM outputs are then passed to a neural network scorer, which generates an unnormalized probability score for each sentence, indicating its likelihood of being correct. These scores are evaluated by a subsequent neural network layer, which outputs a binary classification: a digital 1 if the score indicates the sentence is correct or a digital 0 if the score indicates it is not. This approach enhances the model's ability to distinguish between correct and incorrect sentences by leveraging LSTM-based sequence processing and probabilistic scoring. The method improves language model robustness in applications like grammar checking, text generation, and error detection.
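The data flow in this claim (LSTM layer, then scorer, then a 1/0 decision) can be illustrated with a small pure-Python sketch. Everything here is an assumption made for illustration: the class name `TinyLSTMScorer`, the layer sizes, the random untrained weights, the omission of bias terms, and the choice to sum the LSTM outputs before a linear scorer. It shows the shape of the computation, not the patented implementation.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyLSTMScorer:
    def __init__(self, vocab, embed_dim=4, hidden_dim=5, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.hidden_dim = hidden_dim
        self.embed = {w: [rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
                      for w in vocab}
        # One weight matrix per gate, applied to [word embedding ; previous h].
        self.gates = {g: rand_matrix(hidden_dim, embed_dim + hidden_dim)
                      for g in "ifog"}
        # Linear scorer applied to the summed LSTM outputs.
        self.scorer = [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]

    @staticmethod
    def _matvec(matrix, vector):
        return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

    def score(self, tokens):
        """Unnormalized whole-sentence score (a single real number)."""
        h = [0.0] * self.hidden_dim  # hidden state (LSTM output)
        c = [0.0] * self.hidden_dim  # cell state (long-lived memory)
        pooled = [0.0] * self.hidden_dim
        for word in tokens:
            x = self.embed[word] + h
            i = [sigmoid(z) for z in self._matvec(self.gates["i"], x)]
            f = [sigmoid(z) for z in self._matvec(self.gates["f"], x)]
            o = [sigmoid(z) for z in self._matvec(self.gates["o"], x)]
            g = [math.tanh(z) for z in self._matvec(self.gates["g"], x)]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
            # Push every time step's output toward the scorer.
            pooled = [p + hj for p, hj in zip(pooled, h)]
        return sum(s * p for s, p in zip(self.scorer, pooled))

    def label(self, tokens):
        """Digital 1 if the score indicates a correct sentence, else 0."""
        return 1 if sigmoid(self.score(tokens)) > 0.5 else 0
```

With untrained weights the labels are arbitrary; in the scheme the claim describes, training on corpus sentences and generated incorrect sentences is what would push correct sentences toward 1 and incorrect ones toward 0.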

Claim 9

Original Legal Text

9. A computer system comprising one or more processors, one or more computer-readable memories, one or more computer-readable storage devices, and program instructions, stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, the stored program instructions comprising: program instructions to provide a whole sentence recurrent neural network language model for estimating a probability of likelihood of each whole sentence processed by natural language processing being correct; program instructions to apply a noise contrastive estimation sampler against at least one entire sentence from a corpus of a plurality of sentences to generate at least one incorrect sentence; program instructions to train the whole sentence recurrent neural network language model, using the at least one entire sentence from the corpus and the at least one incorrect sentence, to distinguish the at least one entire sentence as correct; and program instructions to apply the whole sentence recurrent neural network language model to estimate the probability of likelihood of each whole sentence processed by natural language processing being correct.

Plain English Translation

This invention relates to natural language processing (NLP) and specifically to improving the accuracy of sentence-level language models. The problem addressed is the difficulty in assessing the correctness of entire sentences in NLP systems, where traditional models often struggle with contextual coherence and grammatical correctness across full sentences. The system includes a recurrent neural network (RNN) language model designed to evaluate the likelihood that a given sentence is correct. The model is trained using a noise contrastive estimation (NCE) sampler, which generates incorrect sentences by altering valid sentences from a corpus. The training process involves comparing these incorrect sentences with the original correct sentences, enabling the model to learn distinctions between valid and invalid sentence structures. Once trained, the model can estimate the probability that a processed sentence is correct, improving NLP applications such as grammar checking, machine translation, and text generation. The approach enhances sentence-level evaluation by leveraging full-sentence context rather than relying on word-level or partial-sentence analysis.

Claim 10

Original Legal Text

10. The computer system according to claim 9 , wherein the program instructions to apply the whole sentence recurrent neural network language model to estimate the probability of likelihood of each whole sentence processed by natural language processing further comprise: program instructions to apply the whole sentence recurrent neural network language model for the natural language processing comprising one of conversational interaction, conversational telephony speech transcription, multimedia captioning, and translation.

Plain English Translation

This invention relates to a computer system for natural language processing (NLP) that uses a whole-sentence recurrent neural network (RNN) language model to estimate the probability that a sentence is correct. This claim narrows the system of claim 9 in the same way that claim 2 narrows the method: the program instructions apply the model to NLP comprising one of conversational interaction, conversational telephony speech transcription, multimedia captioning, and translation. The model evaluates entire sentences rather than individual words or phrases, so its estimate reflects the full context of the sentence, including dependencies between distant words. This makes the system suited to applications such as virtual assistants, automated transcription services, and translation tools, where ambiguous or complex sentences are a common source of errors for word-level or phrase-level models.

Claim 11

Original Legal Text

11. The computer system according to claim 9 , wherein the program instructions to provide a whole sentence recurrent neural network language model for estimating a probability of likelihood of each whole sentence processed by natural language processing further comprise: program instructions to provide the whole sentence recurrent neural network language model on a recurrent neural network long short-term memory architecture.

Plain English Translation

This invention relates to natural language processing (NLP) systems that use recurrent neural networks (RNNs) to estimate the likelihood of entire sentences. The system addresses the challenge of accurately modeling the probability of complete sentences in NLP tasks, which is critical for applications like machine translation, text generation, and speech recognition. Traditional language models often struggle with capturing long-range dependencies and contextual nuances in sentences, leading to poor performance in tasks requiring sentence-level understanding. The system employs a whole-sentence RNN language model to estimate the probability of entire sentences rather than individual words or phrases, so that the estimate takes the full sentence structure into account. The RNN is implemented using a long short-term memory (LSTM) architecture, which enhances the model's ability to retain and use long-term dependencies within sentences. LSTMs are particularly effective for sequential data because they mitigate the vanishing gradient problem common in standard RNNs, allowing the model to learn and retain information over extended sequences. The system processes each input sentence through the LSTM-based RNN, which produces a score representing the likelihood that the entire sentence is correct. That score can be used in various NLP applications, for example to compare candidate sentences in text generation or to evaluate translated sentences in machine translation.

Claim 12

Original Legal Text

12. The computer system according to claim 9 , wherein the program instructions to apply the whole sentence recurrent neural network language model to estimate the probability of likelihood of each whole sentence processed by natural language processing further comprise: program instructions to score, by the whole sentence recurrent neural network language model, the probability of each whole sentence directly without independently computing conditional probabilities for each separate word in each whole sentence.

Plain English Translation

This invention relates to natural language processing (NLP) systems that use recurrent neural network (RNN) language models to evaluate the likelihood of entire sentences. Traditional NLP approaches often break sentences into individual words and compute conditional probabilities for each word separately, which can be computationally expensive and may not capture the full contextual meaning of a sentence. The invention addresses this by applying a whole-sentence RNN language model that directly scores the probability of an entire sentence without decomposing it into individual word-level probabilities. This approach improves efficiency and accuracy by leveraging the model's ability to process and understand sentences as complete units rather than isolated words. The system includes program instructions that enable the RNN to generate a probability score for each sentence based on its overall structure and meaning, rather than relying on sequential word-by-word analysis. This method enhances the performance of NLP tasks such as text generation, translation, and sentiment analysis by reducing computational overhead and improving contextual understanding. The invention is particularly useful in applications requiring real-time processing of large volumes of text, where efficiency and accuracy are critical.

Claim 13

Original Legal Text

13. The computer system according to claim 9 , the program instructions to apply a noise contrastive estimation sampler against at least one entire sentence from a corpus to generate at least one incorrect sentence further comprise: program instructions to apply the noise contrastive estimation sampler against the at least one entire sentence from the corpus by performing one of a substitution, an insertion, and a deletion of one or more words in the at least one entire sentence to generate the at least one incorrect sentence.

Plain English Translation

This invention relates to natural language processing (NLP) and machine learning, specifically improving language models by generating incorrect sentences for training. The problem addressed is the need for robust training data to enhance the accuracy and generalization of language models. The system uses a noise contrastive estimation (NCE) sampler to create incorrect sentences from a corpus by performing one of a substitution, an insertion, and a deletion of one or more words in an entire sentence. These incorrect sentences serve as negative examples, helping the model distinguish between correct and incorrect language patterns. Because each incorrect sentence differs from a real sentence only by word-level edits, the negative examples stay close to natural language and are harder to tell apart from correct sentences, which makes for a more demanding training set. This approach improves the model's ability to recognize errors, leading to better performance in tasks like text generation, translation, and error detection. The method is particularly useful for training models that require high accuracy in understanding and generating natural language.

Claim 14

Original Legal Text

14. The computer system according to claim 9 , wherein the program instructions to apply a noise contrastive estimation sampler against at least one entire sentence from a corpus to generate at least one incorrect sentence further comprise: program instructions to randomly select a plurality of positions in the at least one entire sentence from the corpus to introduce a substitution, an insertion, and a deletion of one or more words in the at least one entire sentence to generate the at least one incorrect sentence.

Plain English Translation

This invention relates to natural language processing (NLP) and machine learning, specifically improving text generation models by training them to distinguish between correct and incorrect sentences. The problem addressed is the challenge of training models to generate coherent and grammatically accurate text, which often requires distinguishing between valid and invalid sentence structures. The solution involves using a noise contrastive estimation (NCE) sampler to generate incorrect sentences by introducing errors into correct sentences from a corpus. The program instructions randomly select multiple positions within a sentence at which to introduce a substitution, an insertion, and a deletion of one or more words, creating distorted versions of the original sentence. These incorrect sentences are then used alongside correct sentences to train the model, helping it learn to differentiate between valid and invalid text patterns. The approach enhances the model's ability to generate high-quality, coherent text by exposing it to both correct and intentionally corrupted examples during training. This technique is particularly useful in improving the robustness and accuracy of language models in applications like machine translation, text summarization, and conversational AI.

Claim 15

Original Legal Text

15. The computer system according to claim 9 , wherein the program instructions to apply the whole sentence recurrent neural network language model to estimate the probability of likelihood of each whole sentence processed by natural language processing being correct further comprise: program instructions to select a selection of the plurality of sentences from the corpus; program instructions to apply the noise contrastive estimation sampler against each sentence in the selection of the plurality of sentences to generate a plurality of imposter sentences; program instructions to apply each separate set of each sentence in the selection of the plurality of sentences and a selection of imposter sentences of the plurality of imposter sentences generated for each sentence to the whole sentence recurrent neural network language model; program instructions to generate, by the whole sentence recurrent neural network language model, a first score for each sentence and at least one additional score for the selection of imposter sentences; program instructions to apply a linear boundary to classify the first score and the additional score in one of two classes in a linear space, wherein the two classes represent an incorrect sentence and a correct sentence; and program instructions to evaluate an accuracy of the natural language processing system in performing sequential classification tasks based on an accuracy of the classifications of the first score in the class of the correct sentence and at least one additional score as an incorrect sentence.

Plain English Translation

This invention relates to natural language processing (NLP) systems that use recurrent neural network (RNN) language models to evaluate sentence correctness. The system addresses the challenge of accurately determining whether a sentence processed by an NLP system is correct or incorrect, which is critical for tasks like text generation, translation, and error detection. The system applies a whole-sentence RNN language model to estimate the probability of a sentence being correct. It selects a subset of sentences from a corpus and generates imposter sentences for each sentence using a noise contrastive estimation sampler. The system then applies the RNN model to each original sentence and its corresponding imposter sentences, producing a score for the original sentence and additional scores for the imposter sentences. A linear boundary is used to classify these scores into two classes: correct sentences and incorrect sentences. The system evaluates the NLP system's accuracy in sequential classification tasks by comparing the classification of the original sentence's score (as correct) against the imposter sentences' scores (as incorrect). This approach improves the reliability of NLP systems by distinguishing between valid and invalid sentences more effectively.

Claim 16

Original Legal Text

16. The computer system according to claim 9, wherein the program instructions to train the whole sentence recurrent neural network language model, using the at least one entire sentence from the corpus and the at least one incorrect sentence, to distinguish the at least one entire sentence as correct further comprise: program instructions to apply each of the at least one entire sentence from the corpus and the at least one incorrect sentence to at least one recurrent neural network layer comprising a plurality of long short-term memory for holding data for an arbitrary period of time; program instructions to push an output from each of the plurality of long short-term memory to a neural network scorer for each of the at least one entire sentence and the at least one incorrect sentence; program instructions to generate, by the neural network scorer, a separate output score assigned by the at least one recurrent neural network layer for each of the at least one entire sentence and the at least one incorrect sentence representing an unnormalized probability of each sentence; and program instructions to evaluate, by a neural network layer receiving output from the whole sentence recurrent neural network language model, each separate output score as an output of a digital 1 if the output score is a probability indicating the entire sentence is correct and an output of a digital 0 if the output score is a probability indicating the entire sentence is not correct.

Plain English Translation

This invention relates to natural language processing and machine learning, specifically training a language model to distinguish correct sentences from incorrect ones. Each entire sentence from the corpus and each generated incorrect sentence is applied to at least one recurrent neural network (RNN) layer made up of multiple long short-term memory (LSTM) units, which can hold data for an arbitrary period of time and so let the model capture long-range dependencies within a sentence. The output of each LSTM unit is pushed to a neural network scorer, which generates a separate score for each sentence representing its unnormalized probability. A further neural network layer, which receives output from the whole sentence RNN language model, evaluates each score and outputs a digital 1 if the score indicates the sentence is correct and a digital 0 if it indicates the sentence is not correct. Training on both correct and incorrect examples is what teaches the model to separate the two.
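To make the data flow concrete, here is a toy, dependency-free Python sketch of an LSTM layer feeding a scorer and a binary output. Everything specific in it is an assumption for illustration: the class name `TinyLSTMScorer`, the hidden size, the random untrained weights, the averaging of LSTM outputs before scoring, and the zero threshold in `to_binary` are not taken from the patent.

```python
import math
import random
from typing import List, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMScorer:
    """Toy LSTM layer plus linear scorer; sizes and weights are hypothetical."""

    def __init__(self, vocab_size: int, hidden: int = 4, seed: int = 0) -> None:
        rng = random.Random(seed)

        def matrix(rows: int, cols: int) -> List[List[float]]:
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.hidden = hidden
        self.embed = matrix(vocab_size, hidden)
        # One weight matrix per gate, applied to [input ; previous hidden state].
        self.gates = {name: matrix(hidden, 2 * hidden)
                      for name in ("i", "f", "o", "g")}
        self.score_w = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]

    def _linear(self, name: str, z: List[float]) -> List[float]:
        return [sum(w * v for w, v in zip(row, z)) for row in self.gates[name]]

    def _step(self, x: List[float], h: List[float],
              c: List[float]) -> Tuple[List[float], List[float]]:
        z = x + h  # list concatenation of input and previous hidden state
        i = [sigmoid(v) for v in self._linear("i", z)]    # input gate
        f = [sigmoid(v) for v in self._linear("f", z)]    # forget gate
        o = [sigmoid(v) for v in self._linear("o", z)]    # output gate
        g = [math.tanh(v) for v in self._linear("g", z)]  # candidate memory
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h, c

    def score(self, token_ids: List[int]) -> float:
        """Return one unnormalized score for the whole sentence."""
        h = [0.0] * self.hidden
        c = [0.0] * self.hidden
        pooled = [0.0] * self.hidden
        for token in token_ids:
            h, c = self._step(self.embed[token], h, c)
            pooled = [p + v for p, v in zip(pooled, h)]
        steps = max(len(token_ids), 1)
        # Assumption: average the LSTM outputs, then apply a linear scorer.
        return sum(w * p / steps for w, p in zip(self.score_w, pooled))


def to_binary(score: float, threshold: float = 0.0) -> int:
    """Map a score to 1 (correct) or 0 (not correct); threshold is assumed."""
    return 1 if score > threshold else 0


model = TinyLSTMScorer(vocab_size=5)
score = model.score([0, 3, 1])   # token ids of a hypothetical sentence
label = to_binary(score)
```

Because the weights here are random rather than trained, the resulting label carries no meaning; the sketch only shows how a sentence passes through the LSTM steps, the scorer, and the final 1/0 decision.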

Claim 17

Original Legal Text

17. A computer program product comprises a computer readable storage medium having program instructions embodied therewith, wherein the computer readable storage medium is not a transitory signal per se, the program instructions executable by a computer to cause the computer to: provide, by a computer, a whole sentence recurrent neural network language model for estimating a probability of likelihood of each whole sentence processed by natural language processing being correct; apply, by the computer, a noise contrastive estimation sampler against at least one entire sentence from a corpus of a plurality of sentences to generate at least one incorrect sentence; train, by the computer, the whole sentence recurrent neural network language model, using the at least one entire sentence from the corpus and the at least one incorrect sentence, to distinguish the at least one entire sentence as correct; and apply, by the computer, the whole sentence recurrent neural network language model to estimate the probability of likelihood of each whole sentence processed by natural language processing being correct.

Plain English Translation

This invention relates to natural language processing (NLP), specifically to training sentence-level language models with noise contrastive estimation. The problem addressed is estimating the likelihood that a given sentence is correct, which matters for applications such as speech transcription and translation. Conventional language models work from word-level or n-gram probabilities rather than scoring a sentence as a whole. The solution is a computer program product whose storage medium, which is not a transitory signal, holds instructions for a computer to carry out four steps. First, a whole sentence recurrent neural network (RNN) language model is provided to estimate the probability that a processed sentence is correct. Next, a noise contrastive estimation (NCE) sampler is applied to at least one sentence from a corpus to generate at least one incorrect sentence. The RNN model is then trained using both the correct sentence(s) from the corpus and the generated incorrect sentence(s), so that it learns to distinguish correct sentences from incorrect ones. Finally, the trained model is applied to estimate the probability that each processed sentence is correct.
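The sampling and training steps can be illustrated with a short Python sketch. The claim does not say how the sampler builds an incorrect sentence, so the single-word substitution in `make_imposter` is purely an assumption, as are the function names and example values. `nce_loss` follows the standard NCE objective, in which each score is offset by the log of the number of noise samples and by the log-probability of the sentence under the noise distribution; those log-probabilities are supplied by the caller here.

```python
import math
import random
from typing import List, Sequence


def make_imposter(sentence: Sequence[str], vocab: Sequence[str],
                  rng: random.Random) -> List[str]:
    """Hypothetical sampler: replace one randomly chosen word with a
    different word from the vocabulary."""
    words = list(sentence)
    if not words:
        raise ValueError("sentence must contain at least one word")
    pos = rng.randrange(len(words))
    choices = [w for w in vocab if w != words[pos]]
    words[pos] = rng.choice(choices)
    return words


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def nce_loss(true_score: float, true_log_noise: float,
             imposter_scores: Sequence[float],
             imposter_log_noise: Sequence[float]) -> float:
    """NCE loss for one corpus sentence and its k imposters (k >= 1).

    Scores are unnormalized log-probabilities from the model; the
    *_log_noise values are log-probabilities under the noise distribution.
    """
    log_k = math.log(len(imposter_scores))
    # The corpus sentence should be classified as data, not noise.
    loss = -log_sigmoid(true_score - log_k - true_log_noise)
    # Each imposter should be classified as noise.
    for score, log_noise in zip(imposter_scores, imposter_log_noise):
        loss -= log_sigmoid(-(score - log_k - log_noise))
    return loss


rng = random.Random(0)
sentence = ["the", "cat", "sat"]
vocab = ["the", "cat", "sat", "dog", "ran"]
imposter = make_imposter(sentence, vocab, rng)

# Hypothetical scores: a model that separates the pair well has lower loss.
well_separated = nce_loss(5.0, 0.0, [-5.0], [0.0])
poorly_separated = nce_loss(-5.0, 0.0, [5.0], [0.0])
```

Minimizing this loss pushes the score of the corpus sentence up and the scores of its imposters down, which is the sense in which the model learns to distinguish the correct sentence.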

Claim 18

Original Legal Text

18. The computer program product according to claim 17, further comprising the program instructions executable by a computer to cause the computer to: apply, by the computer, the whole sentence recurrent neural network language model for the natural language processing comprising one of conversational interaction, conversational telephony speech transcription, multimedia captioning, and translation.

Plain English Translation

This invention relates to a computer program product for natural language processing (NLP) using a whole sentence recurrent neural network (RNN) language model. The model considers entire sentences rather than individual words or phrases, which helps in tasks where full-sentence context matters. This claim adds program instructions to apply the model of claim 17 to NLP that comprises one of four applications: conversational interaction (for example, dialogue systems), conversational telephony speech transcription (converting spoken conversation to text), multimedia captioning (generating captions for media such as video), and translation (converting text or speech from one language to another).

Claim 19

Original Legal Text

19. The computer program product according to claim 17, further comprising the program instructions executable by a computer to cause the computer to: provide, by the computer, the whole sentence recurrent neural network language model on a recurrent neural network long short-term memory architecture.

Plain English Translation

This invention relates to natural language processing (NLP) and machine learning, specifically language models built on recurrent neural networks (RNNs). Plain RNNs have difficulty capturing long-term dependencies in sequential data, in part because of the vanishing gradient problem, and this limits language modeling accuracy. This claim adds program instructions to provide the whole sentence RNN language model of claim 17 on a long short-term memory (LSTM) architecture. LSTM units can retain contextual information over extended sequences, which helps the model learn long-range dependencies across a whole sentence. The claim specifies only the architecture; it does not add training, parameter optimization, or deployment steps beyond those of claim 17.

Claim 20

Original Legal Text

20. The computer program product according to claim 17, further comprising the program instructions executable by a computer to cause the computer to: score, by the computer, by the whole sentence recurrent neural network language model, the probability of each whole sentence directly without independently computing conditional probabilities for each separate word in each whole sentence.

Plain English Translation

This invention relates to natural language processing (NLP) and machine learning, specifically sentence-level language modeling. Conventional recurrent neural network (RNN) language models obtain the probability of a sentence by computing a conditional probability for each word given the words before it and combining those values. This claim adds program instructions under which the whole sentence RNN language model scores the probability of each whole sentence directly, without independently computing a conditional probability for each separate word. Because the sentence is the unit being scored, the model can draw on context from the full sentence. The claim itself states no speed or accuracy figures; the potential efficiency benefit is that the per-word probability computations are skipped.
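The contrast can be written out in notation. The first line is the standard chain-rule factorization used by word-by-word language models. The second line is only an assumed way of writing a whole-sentence score: the symbols \phi_\theta and \tilde{P}, and the exponential form, are introduced here for illustration and do not appear in the claim.

```latex
% Conventional language model: one conditional probability per word
P(w_1, \dots, w_n) = \prod_{t=1}^{n} P(w_t \mid w_1, \dots, w_{t-1})

% Whole-sentence model (assumed notation): a single unnormalized score
% for the sentence s = (w_1, \dots, w_n), with no per-word factors
\tilde{P}(s) = \exp\bigl(\phi_\theta(s)\bigr)
```

In the first form, every factor is a distribution over the vocabulary that must be normalized; in the second, the model emits one number per sentence.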

Patent Metadata

Filing Date

Unknown

Publication Date

October 1, 2019

Inventors

Yinghui Huang

Abhinav Sethy

Kartik Audhkhasi

Bhuvana Ramabhadran
