Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A method performed by one or more computers, the method comprising: receiving a network input; and generating a network output from the network input, wherein the network output comprises a plurality of outputs from a vocabulary of outputs arranged according to an output order, the generating comprising, at each of a plurality of generation time steps: identifying a current partial network output that has already been generated as of the generation time step, the current partial network output comprising zero or more outputs from the vocabulary of outputs arranged according to a partial output order; generating, using a decoder neural network conditioned on (i) at least a portion of the network input and (ii) any outputs in the current partial network output, a decoder output that defines, for each of a plurality of insertion locations, a respective score distribution over the vocabulary of outputs, wherein each insertion location is a different new location in the partial output order at which there is no output in the current partial network output, wherein the decoder neural network is an attention-based neural network that is configured to generate the decoder output by applying an attention mechanism over an encoded representation of the network input and a self-attention mechanism over the outputs in the current partial network output; selecting, using the decoder output, one or more of the insertion locations and, for each selected insertion location, an inserted output from the vocabulary; and generating a new partial network output that comprises (i) the zero or more outputs in the current partial network output and (ii) for each selected insertion location, the inserted output from the vocabulary inserted at the corresponding new location in the partial output order.
This invention relates to natural language processing and addresses the problem of generating ordered network outputs, such as sequences of words or tokens, from a network input. The method is performed by one or more computers and builds the output over a plurality of generation time steps. At each step, the system identifies the current partial output, which holds zero or more vocabulary outputs arranged in a partial output order. A decoder neural network, conditioned on at least a portion of the network input and on any outputs already in the partial output, produces a score distribution over the vocabulary for each of several insertion locations. Each insertion location is a new position in the partial output order that no existing output occupies. The decoder is attention-based: it applies an attention mechanism over an encoded representation of the input and a self-attention mechanism over the outputs in the partial output. Using the decoder output, the system selects one or more insertion locations and, for each one, an output from the vocabulary. It then forms a new partial output that contains the existing outputs plus each selected output inserted at its new location. The finished network output is a plurality of vocabulary outputs arranged according to an output order.
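The generation loop of claim 1 can be sketched as follows. This is a minimal illustration and not the patented implementation: the `Decoder` type, the function names, the `"<eos>"` stopping convention, and the choice of a single best insertion per step are all assumptions made for the example, and the decoder here is just a callable that returns one score distribution per insertion location.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for the decoder neural network: maps a partial output
# to one score distribution (token -> score) per insertion location.
Decoder = Callable[[List[str]], List[Dict[str, float]]]


def apply_insertions(partial: List[str], chosen: List[Tuple[int, str]]) -> List[str]:
    """Insert each (slot, token) pair; slot i is the gap just before partial[i]."""
    result = list(partial)
    # Insert at the rightmost slot first so the remaining slot indices stay valid.
    for slot, token in sorted(chosen, key=lambda c: c[0], reverse=True):
        result.insert(slot, token)
    return result


def generate(decoder: Decoder, max_steps: int = 10) -> List[str]:
    partial: List[str] = []  # zero outputs at the first generation time step
    for _ in range(max_steps):
        scores = decoder(partial)  # len(partial) + 1 insertion locations
        # Simplest selection rule: the single best (slot, token) pair overall.
        slot, token, _ = max(
            ((s, t, p) for s, dist in enumerate(scores) for t, p in dist.items()),
            key=lambda c: c[2],
        )
        if token == "<eos>":  # assumed stopping convention, not recited in claim 1
            break
        partial = apply_insertions(partial, [(slot, token)])
    return partial
```

Because `apply_insertions` accepts a list of pairs, the same helper also covers selection rules that insert at several locations in one step.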
2. The method of claim 1, wherein generating the decoder output using the decoder neural network comprises: generating a decoder input that includes the encoded representation of the network input and the outputs in the current partial network output arranged according to the partial output order.
This claim narrows claim 1 by specifying what the decoder neural network receives. The decoder input includes the encoded representation of the network input together with the outputs in the current partial network output, arranged according to the partial output order. The decoder therefore sees both the source-side information and the partial output in its current order when it generates the decoder output, that is, the score distributions over the vocabulary for the insertion locations. The claim does not limit the kind of output being generated; token sequences in natural language processing are one example.
3. The method of claim 2, wherein generating the decoder input further comprises adding two marker outputs to the current partial network output, wherein the decoder neural network is configured to generate a respective representation vector for each location in the partial output order after the two marker outputs have been added, and wherein generating the decoder output comprises: generating a respective slot representation for each insertion location by concatenating the representation vectors for each adjacent pair of locations in the partial output order; and generating a score distribution for each insertion location from at least the slot representation for the insertion location.
This claim adds detail on how the decoder of claim 2 scores insertion locations. When the decoder input is built, two marker outputs are added to the current partial network output. The decoder neural network then generates a representation vector for each location in the partial output order, counting the two markers. For each insertion location, a slot representation is formed by concatenating the representation vectors of an adjacent pair of locations, and a score distribution over the vocabulary is generated from at least that slot representation. The claim does not say where the markers are placed; if they sit at the two ends of the partial output, a partial output with n outputs yields n + 2 representation vectors and n + 1 adjacent pairs, one for each gap into which a new output could be inserted. Scoring every slot in one pass lets the model consider all insertion locations at the same time, in contrast to decoding that only ever appends at the end of the sequence.
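The slot construction in claim 3 can be sketched in a few lines. The function name and the plain-list vector type are assumptions for illustration, and the example assumes the two marker outputs sit at the two ends of the partial output, which the claim does not state.

```python
from typing import List

Vector = List[float]


def slot_representations(vectors: List[Vector]) -> List[Vector]:
    """Concatenate the representation vectors of each adjacent pair of locations.

    `vectors` holds one vector per location after the two marker outputs have
    been added, so n real outputs give n + 2 vectors and n + 1 slots.
    """
    return [left + right for left, right in zip(vectors, vectors[1:])]


# Two real outputs plus two markers (assumed to be at the ends): 4 vectors, 3 slots.
vectors = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]
slots = slot_representations(vectors)
```

Each slot vector is twice the width of a representation vector, since it joins the vectors on both sides of the gap.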
4. The method of claim 1, wherein the vocabulary includes an end-of-sequence token.
This claim adds that the vocabulary of outputs includes an end-of-sequence token. Because the decoder produces a score distribution over the vocabulary at every insertion location, the token gives the model a way to indicate that a location needs no further insertion. Claims 5 to 7 build on this by using the token's score when selecting insertion locations and outputs. The claim itself requires only that the token be part of the vocabulary; it does not specify how generation terminates.
5. The method of claim 4, wherein selecting, using the decoder output, one or more of the insertion locations and, for each selected insertion location, an inserted output from the vocabulary comprises: determining that an insertion location-output combination with a highest score across all insertion location-output combinations does not include the end-of-sequence token; and in response, selecting only the insertion location-output combination with a highest score across all insertion location-output combinations.
This claim describes a greedy selection rule that makes one insertion per generation time step. The decoder output assigns a score to every insertion location-output combination. The system finds the combination with the highest score across all locations and all vocabulary outputs. If that top-scoring combination does not include the end-of-sequence token, the system selects only that combination, so exactly one output is inserted at that step. The claim does not state what happens when the top-scoring combination does include the end-of-sequence token.
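A minimal sketch of the claim 5 selection rule, assuming the decoder output is available as a nested list `scores[l][v]` and that the end-of-sequence token has vocabulary index 0. Both are conventions chosen for this example, as is returning `None` when the best combination is the end-of-sequence token.

```python
from typing import List, Optional, Tuple

EOS = 0  # hypothetical vocabulary index of the end-of-sequence token


def select_global_best(scores: List[List[float]]) -> Optional[Tuple[int, int]]:
    """Return the single best (location, output) pair, or None if it is EOS.

    scores[l][v] is the score for inserting vocabulary item v at location l.
    """
    pairs = [(l, v) for l, row in enumerate(scores) for v in range(len(row))]
    loc, out = max(pairs, key=lambda lv: scores[lv[0]][lv[1]])
    # The claim only covers the non-EOS case; returning None otherwise is
    # this sketch's own convention.
    return None if out == EOS else (loc, out)


best = select_global_best([[0.1, 0.2, 0.05], [0.05, 0.1, 0.5]])  # (1, 2)
```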
6. The method of claim 4, wherein selecting, using the decoder output, one or more of the insertion locations and, for each selected insertion location, an inserted output from the vocabulary comprises: determining that there is at least one insertion location for which the output with the highest score is not the end-of-sequence token; and in response, selecting only an insertion location-output combination with a highest score across all insertion location-output combinations that include an insertion location for which the output with the highest score is not the end-of-sequence token.
This claim describes a second selection rule that makes one insertion per step but sets aside locations the model treats as finished. For each insertion location, the system looks at the output with the highest score at that location. If at least one location has a highest-scoring output that is not the end-of-sequence token, the system restricts attention to such locations and selects only the highest-scoring insertion location-output combination among them. Unlike the rule in claim 5, a high end-of-sequence score at one location cannot prevent an insertion at another location that still calls for a real output.
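The claim 6 rule can be sketched as a filter followed by a maximum. As before, the nested-list layout `scores[l][v]`, the index 0 for the end-of-sequence token, the function names, and the `None` return value are assumptions for illustration only.

```python
from typing import List, Optional, Tuple

EOS = 0  # hypothetical vocabulary index of the end-of-sequence token


def argmax(row: List[float]) -> int:
    return max(range(len(row)), key=row.__getitem__)


def select_among_open_locations(scores: List[List[float]]) -> Optional[Tuple[int, int]]:
    """Pick the best (location, output) pair among locations whose top output is not EOS."""
    open_locs = [l for l, row in enumerate(scores) if argmax(row) != EOS]
    if not open_locs:
        return None  # case not covered by the claim; None is this sketch's convention
    best_loc = max(open_locs, key=lambda l: max(scores[l]))
    return best_loc, argmax(scores[best_loc])


# Location 0 strongly prefers EOS, yet location 1 still receives an insertion.
choice = select_among_open_locations([[0.9, 0.05, 0.05], [0.1, 0.3, 0.2]])  # (1, 1)
```

In this example the highest score overall belongs to the end-of-sequence token at location 0, so a purely global rule would make no insertion, while this rule still inserts at location 1.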
7. The method of claim 4, wherein selecting, using the decoder output, one or more of the insertion locations and, for each selected insertion location, an inserted output from the vocabulary comprises: identifying, from the decoder output and for each insertion location, an output that has a highest score for the insertion location; determining that there is at least one insertion location for which the output with the highest score is not the end-of-sequence token; and in response, selecting each insertion location for which the output with the highest score is not the end-of-sequence token and the corresponding output that has the highest score for the insertion location.
This claim describes a parallel selection rule that can insert several outputs in a single generation time step. For each insertion location, the system identifies the output with the highest score at that location. If at least one location has a highest-scoring output that is not the end-of-sequence token, the system selects every such location together with its highest-scoring output. Locations whose highest-scoring output is the end-of-sequence token receive no insertion at that step. Because all selected insertions happen at once, the output can be completed in fewer steps than with the one-insertion rules of claims 5 and 6.
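The parallel rule of claim 7 can be sketched as one pass over the locations. The `scores[l][v]` layout, the end-of-sequence index of 0, and the function name are assumptions made for this example.

```python
from typing import List, Tuple

EOS = 0  # hypothetical vocabulary index of the end-of-sequence token


def select_parallel(scores: List[List[float]]) -> List[Tuple[int, int]]:
    """Select every location whose top-scoring output is not EOS, with that output."""
    chosen = []
    for loc, row in enumerate(scores):
        best = max(range(len(row)), key=row.__getitem__)
        if best != EOS:
            chosen.append((loc, best))
    return chosen


# Locations 0 and 2 get an insertion in the same step; location 1 predicts EOS.
picks = select_parallel([[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.2, 0.3, 0.5]])
```

An empty result means every location predicted the end-of-sequence token, a case the claim itself does not address.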
8. The method of claim 1, wherein the decoder neural network is configured to generate a respective slot representation for each insertion location.
This claim adds that the decoder neural network is configured to generate a respective slot representation for each insertion location, that is, one representation per gap in the partial output where a new output could be inserted. The claim does not prescribe how the slot representations are computed. Claim 3 gives one way, concatenating the representation vectors of adjacent locations, and claims 9 to 11 describe different ways of turning slot representations into the decoder output.
9. The method of claim 8, wherein generating the decoder output comprises: projecting a decoder hidden state matrix generated from the slot representations using a projection matrix to generate a content-location logit matrix; flattening the content-location logit matrix into a content-location logit vector; and applying a softmax over the content-location logit vector to generate a probability distribution over all insertion location-output combinations.
This claim describes one way to turn the slot representations of claim 8 into a single probability distribution over every insertion location-output combination. A decoder hidden state matrix, generated from the slot representations, is projected with a projection matrix to give a content-location logit matrix. If, for example, the matrix has one row per insertion location and one column per vocabulary output, each entry is the logit for inserting that output at that location. The matrix is flattened into a content-location logit vector, and a softmax over the whole vector yields a probability distribution over all insertion location-output combinations. Because a single softmax covers every combination, what to insert and where to insert it are modeled jointly.
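The joint distribution of claim 9 can be sketched with plain lists. The names `H` (hidden state matrix, one row per insertion location) and `W` (projection matrix, hidden size by vocabulary size) and the row and column layout are assumptions for illustration. Reshaping the result back into a matrix at the end is a convenience of this sketch and is not part of the claim.

```python
import math
from typing import List

Matrix = List[List[float]]


def joint_distribution(H: Matrix, W: Matrix) -> Matrix:
    """Project, flatten, and apply one softmax over all (location, output) pairs."""
    # Content-location logit matrix: one row per location, one column per output.
    logits = [[sum(h * w for h, w in zip(row, col)) for col in zip(*W)] for row in H]
    flat = [x for row in logits for x in row]  # content-location logit vector
    peak = max(flat)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in flat]
    total = sum(exps)
    probs = [e / total for e in exps]
    vocab = len(W[0])
    # Reshape for readability: probs_matrix[l][v] = P(location l, output v).
    return [probs[i * vocab:(i + 1) * vocab] for i in range(len(H))]


H = [[1.0, 0.0], [0.0, 1.0]]            # 2 insertion locations, hidden size 2
W = [[2.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # hidden size 2, vocabulary size 3
P = joint_distribution(H, W)
```

All entries of `P` together sum to 1, which is what distinguishes this joint form from a separate softmax per location.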
10. The method of claim 8, wherein generating the decoder output comprises: generating a respective probability for each location by applying a softmax to a product of a decoder hidden state matrix generated from the slot representations and a learned query vector; for each location: projecting the slot representation for the location into a score vector that includes a respective score for each output in the vocabulary using a projection matrix; applying a softmax over the score vector to generate an initial probability for each output in the vocabulary; and multiplying each initial probability by the probability for the location to generate a final probability for each output in the vocabulary.
This claim describes a factorized alternative to the joint softmax of claim 9: the model first scores where to insert and then what to insert there. A probability for each location is generated by applying a softmax to the product of a decoder hidden state matrix, generated from the slot representations, and a learned query vector. Then, for each location, the slot representation is projected with a projection matrix into a score vector with one score per vocabulary output, and a softmax over that score vector gives an initial probability for each output. Each initial probability is multiplied by the probability for the location to give the final probability of that output at that location.
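The factorized computation of claim 10 can be sketched as follows. The names `H`, `q`, and `W`, the list-based matrices, and the use of the rows of `H` as the per-location slot representations are assumptions for this example.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def softmax(xs: Sequence[float]) -> List[float]:
    peak = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def factorized_distribution(H: Matrix, q: List[float], W: Matrix) -> Matrix:
    """Return final[l][v] = P(location l) * P(output v | location l)."""
    loc_probs = softmax([dot(h, q) for h in H])  # softmax over the product H q
    final = []
    for h, p_loc in zip(H, loc_probs):
        scores = [dot(h, col) for col in zip(*W)]  # one score per vocabulary output
        final.append([p_loc * p for p in softmax(scores)])
    return final


H = [[1.0, 0.0], [0.0, 1.0]]            # 2 locations, hidden size 2
q = [0.0, 0.0]                          # query vector; zeros give uniform locations
W = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]  # hidden size 2, vocabulary size 3
P = factorized_distribution(H, q, W)
```

Each row of `P` sums to the probability of its location, and the rows together sum to 1.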
11. The method of claim 8, wherein generating the decoder output comprises: generating a context vector by applying max pooling over the slot representations; generating a bias vector from the context vector that includes a respective bias value for each output in the vocabulary; and generating the decoder output from the bias vector and the slot representations.
This claim describes a way to share information across all insertion locations when scoring outputs. A context vector is generated by applying max pooling over the slot representations, so the context vector summarizes the slots as a group. From the context vector, a bias vector is generated that holds one bias value for each output in the vocabulary. The decoder output is then generated from both the bias vector and the slot representations. The claim does not state how the two are combined; one reading is that the bias values are added to the per-location scores, so that an output favored by the overall context is favored at every location.
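One possible reading of claim 11 is sketched below. The claim leaves open how the bias vector is derived from the context vector and how it is combined with the slot representations, so the linear map `V`, the projection `W`, and the addition of the bias to every location's logits are all assumptions of this example and not details taken from the claim.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def biased_logits(H: Matrix, V: Matrix, W: Matrix) -> Matrix:
    """Add a shared, context-derived vocabulary bias to each location's logits."""
    # Max pooling over the slot representations, taken feature by feature.
    context = [max(col) for col in zip(*H)]
    # Assumed linear map from the context vector to one bias value per output.
    bias = [dot(context, col) for col in zip(*V)]
    # Assumed combination: per-location logits plus the shared bias.
    return [[dot(h, col) + b for col, b in zip(zip(*W), bias)] for h in H]


H = [[1.0, -1.0], [0.0, 2.0]]  # 2 slot representations, hidden size 2
V = [[1.0, 0.0], [0.0, 1.0]]   # hidden size 2 -> vocabulary size 2 (bias map)
W = [[1.0, 0.0], [0.0, 1.0]]   # hidden size 2 -> vocabulary size 2 (projection)
logits = biased_logits(H, V, W)  # [[2.0, 1.0], [1.0, 4.0]]
```

Here the pooled context is `[1.0, 2.0]`, so the second vocabulary output receives the larger bias at both locations.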
12. One or more non-transitory computer-readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations comprising: receiving a network input; and generating a network output from the network input, wherein the network output comprises a plurality of outputs from a vocabulary of outputs arranged according to an output order, the generating comprising, at each of a plurality of generation time steps: identifying a current partial network output that has already been generated as of the generation time step, the current partial network output comprising zero or more outputs from the vocabulary of outputs arranged according to a partial output order; generating, using a decoder neural network conditioned on (i) at least a portion of the network input and (ii) any outputs in the current partial network output, a decoder output that defines, for each of a plurality of insertion locations, a respective score distribution over the vocabulary of outputs, wherein each insertion location is a different new location in the partial output order at which there is no output in the current partial network output, wherein the decoder neural network is an attention-based neural network that is configured to generate the decoder output by applying an attention mechanism over an encoded representation of the network input and a self-attention mechanism over the outputs in the current partial network output; selecting, using the decoder output, one or more of the insertion locations and, for each selected insertion location, an inserted output from the vocabulary; and generating a new partial network output that comprises (i) the zero or more outputs in the current partial network output and (ii) for each selected insertion location, the inserted output from the vocabulary inserted at the corresponding new location in the partial output order.
This claim covers the same technique as claim 1 in the form of one or more non-transitory computer-readable storage media. The media store instructions that, when executed by one or more computers, cause them to receive a network input and generate a network output made up of vocabulary outputs arranged according to an output order. At each generation time step, the computers identify the current partial output, use an attention-based decoder neural network to generate a score distribution over the vocabulary for each insertion location, select one or more insertion locations and an output for each, and form a new partial output with the selected outputs inserted. The decoder is conditioned on at least a portion of the input and on any outputs already generated, applying an attention mechanism over an encoded representation of the input and a self-attention mechanism over the outputs in the partial output. Each insertion location is a new position in the partial output order that no existing output occupies.
13. A system comprising one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to perform operations comprising: receiving a network input; and generating a network output from the network input, wherein the network output comprises a plurality of outputs from a vocabulary of outputs arranged according to an output order, the generating comprising, at each of a plurality of generation time steps: identifying a current partial network output that has already been generated as of the generation time step, the current partial network output comprising zero or more outputs from the vocabulary of outputs arranged according to a partial output order; generating, using a decoder neural network conditioned on (i) at least a portion of the network input and (ii) any outputs in the current partial network output, a decoder output that defines, for each of a plurality of insertion locations, a respective score distribution over the vocabulary of outputs, wherein each insertion location is a different new location in the partial output order at which there is no output in the current partial network output, wherein the decoder neural network is an attention-based neural network that is configured to generate the decoder output by applying an attention mechanism over an encoded representation of the network input and a self-attention mechanism over the outputs in the current partial network output; selecting, using the decoder output, one or more of the insertion locations and, for each selected insertion location, an inserted output from the vocabulary; and generating a new partial network output that comprises (i) the zero or more outputs in the current partial network output and (ii) for each selected insertion location, the inserted output from the vocabulary inserted at the corresponding new location in the partial output order.
This claim covers the same technique as claim 1 in the form of a system: one or more computers and one or more storage devices storing instructions that, when executed, cause the computers to perform the operations. The system receives a network input and produces a network output of vocabulary outputs arranged in an output order, building it iteratively by inserting new outputs into a partial output. At each generation time step, the system identifies the current partial output and uses a decoder neural network, conditioned on at least a portion of the input and on any outputs already generated, to produce a score distribution over the vocabulary for each insertion location. The decoder is attention-based, applying an attention mechanism over an encoded representation of the input and a self-attention mechanism over the outputs in the partial output. The system then selects one or more insertion locations and an output for each, and forms a new partial output with the selected outputs inserted at their new locations.
14. The system of claim 13, wherein generating the decoder output using the decoder neural network comprises: generating a decoder input that includes the encoded representation of the network input and the outputs in the current partial network output arranged according to the partial output order.
This claim applies the limitation of claim 2 to the system of claim 13. The decoder input includes the encoded representation of the network input together with the outputs in the current partial network output, arranged according to the partial output order. The claim does not recite a separate encoder neural network or any reordering of the partial output by the decoder; it requires only that the decoder input contain the encoded representation and the partial output in its current order.
15. The system of claim 14, wherein generating the decoder input further comprises adding two marker outputs to the current partial network output, wherein the decoder neural network is configured to generate a respective representation vector for each location in the partial output order after the two marker outputs have been added, and wherein generating the decoder output comprises: generating a respective slot representation for each insertion location by concatenating the representation vectors for each adjacent pair of locations in the partial output order; and generating a score distribution for each insertion location from at least the slot representation for the insertion location.
This claim applies the limitation of claim 3 to the system of claim 14. When the decoder input is built, two marker outputs are added to the current partial network output; the claim does not say where they are placed or what they signify. The decoder neural network generates a representation vector for each location in the partial output order after the markers have been added. For each insertion location, the system forms a slot representation by concatenating the representation vectors of an adjacent pair of locations, and it generates a score distribution over the vocabulary for that insertion location from at least the slot representation.
16. The system of claim 14, wherein the vocabulary includes an end-of-sequence token.
This claim adds to the system of claim 14 that the vocabulary of outputs includes an end-of-sequence token. Because the decoder produces a score distribution over the vocabulary at every insertion location, the token gives the model a way to indicate that a location needs no further insertion. Claims 17 and 18 use the token's score when selecting insertion locations and outputs. The claim itself requires only that the token be part of the vocabulary; it does not recite other special tokens or specify how generation terminates.
17. The system of claim 16, wherein selecting, using the decoder output, one or more of the insertion locations and, for each selected insertion location, an inserted output from the vocabulary comprises: determining that an insertion location-output combination with a highest score across all insertion location-output combinations does not include the end-of-sequence token; and in response, selecting only the insertion location-output combination with a highest score across all insertion location-output combinations.
This invention relates to a system that generates sequences, such as text or code, by inserting tokens into a partial output using a decoder neural network. At each generation step the decoder output assigns a score to every insertion location-output combination, that is, every pairing of an insertion location with an output from the vocabulary. The system finds the single highest-scoring combination across all locations and all outputs. If that combination does not include the end-of-sequence token, the system selects only that combination, so exactly one output is inserted in the step. Tying the check to the end-of-sequence token means generation continues for as long as the best available move is a real insertion rather than a signal to stop. The approach is useful in applications like natural language processing, machine translation, or code synthesis, where precise and coherent sequence generation is critical.
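The selection rule of claim 17 can be sketched as follows. This is a minimal illustration under stated assumptions: the token name `<eos>`, the function names, the dictionary layout of the scores, and the toy numbers are all hypothetical, and returning `None` when the top combination is the end-of-sequence token is an assumption, since the claim does not cover that case.

```python
EOS = "<eos>"  # hypothetical name for the end-of-sequence token


def select_single_best(scores):
    """scores[loc][tok] is the score of inserting `tok` at location `loc`."""
    best_loc, best_tok = max(
        ((loc, tok) for loc, dist in enumerate(scores) for tok in dist),
        key=lambda pair: scores[pair[0]][pair[1]],
    )
    if best_tok == EOS:
        return None  # assumption: this case is outside the claim
    return best_loc, best_tok


def insert(partial, loc, tok):
    """Return a new partial output with `tok` placed at slot `loc`."""
    return partial[:loc] + [tok] + partial[loc:]


partial = ["the", "cat"]  # 2 outputs -> 3 insertion locations
scores = [
    {"a": 0.05, EOS: 0.10},
    {"black": 0.60, EOS: 0.05},
    {"sat": 0.15, EOS: 0.05},
]
choice = select_single_best(scores)
new_partial = insert(partial, *choice)
print(new_partial)  # ['the', 'black', 'cat']
```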
18. The system of claim 16, wherein selecting, using the decoder output, one or more of the insertion locations and, for each selected insertion location, an inserted output from the vocabulary comprises: determining that there is at least one insertion location for which the output with the highest score is not the end-of-sequence token; and in response, selecting only an insertion location-output combination with a highest score across all insertion location-output combinations that include an insertion location for which the output with the highest score is not the end-of-sequence token.
This invention relates to natural language processing systems, specifically improving text generation by optimizing insertion locations in sequences. The problem addressed is inefficient or suboptimal text generation where insertion points may not be selected based on the most relevant or highest-scoring outputs, leading to lower-quality generated text. The system processes an input sequence and generates a set of insertion locations where new tokens can be inserted. A decoder evaluates possible outputs for each insertion location, assigning scores to each potential token from a predefined vocabulary. The system then selects insertion locations and corresponding outputs based on these scores, prioritizing combinations where the highest-scored output is not an end-of-sequence token. This ensures that only the most relevant insertion points are chosen, improving the coherence and accuracy of the generated text. The selection process involves first identifying insertion locations where the highest-scoring output is not an end-of-sequence token. Only these locations are considered for further processing. Among these, the system selects the insertion location-output combination with the highest overall score, ensuring optimal text generation by focusing on the most promising candidates. This approach enhances the efficiency and quality of text generation in natural language processing applications.
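The two-stage selection of claim 18 can be sketched as follows. This is an illustration only: the token name `<eos>`, the function name, the score layout, and the numbers are hypothetical, and returning `None` when every location prefers the end-of-sequence token is an assumption not stated in the claim.

```python
EOS = "<eos>"  # hypothetical name for the end-of-sequence token


def select_best_among_open_slots(scores):
    """scores[loc][tok] is the score of inserting `tok` at location `loc`."""
    # Stage 1: keep only locations whose own top-scoring output is not EOS.
    open_slots = [
        loc for loc, dist in enumerate(scores)
        if max(dist, key=dist.get) != EOS
    ]
    if not open_slots:
        return None  # assumption: every location prefers EOS
    # Stage 2: best combination restricted to the remaining locations.
    return max(
        ((loc, tok) for loc in open_slots for tok in scores[loc]),
        key=lambda pair: scores[pair[0]][pair[1]],
    )


scores = [
    {"a": 0.05, EOS: 0.50},      # this location prefers EOS, so it is excluded
    {"black": 0.30, EOS: 0.02},
    {"sat": 0.10, EOS: 0.03},
]
choice = select_best_among_open_slots(scores)
print(choice)  # (1, 'black')
```

In this toy input the globally highest score belongs to the end-of-sequence token at location 0, yet an insertion is still selected because locations 1 and 2 each prefer a real output.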
19. The system of claim 16, wherein selecting, using the decoder output, one or more of the insertion locations and, for each selected insertion location, an inserted output from the vocabulary comprises: identifying, from the decoder output and for each insertion location, an output that has a highest score for the insertion location; determining that there is at least one insertion location for which the output with the highest score is not the end-of-sequence token; and in response, selecting each insertion location for which the output with the highest score is not the end-of-sequence token and the corresponding output that has the highest score for the insertion location.
This invention relates to natural language processing systems, specifically to selecting several insertion locations in a single step of sequence generation. For each insertion location, the system identifies from the decoder output the token with the highest score at that location. If at least one location has a highest-scoring token that is not the end-of-sequence token, the system selects every such location together with its highest-scoring token. Locations whose highest-scoring token is the end-of-sequence token are skipped, so the sequence is not terminated while some location still calls for a real token. Because every qualifying location is filled in the same step, several tokens can be inserted at once rather than one per step. This approach is useful in applications like machine translation, text summarization, and conversational AI, where precise and contextually appropriate token selection is critical.
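The per-location selection of claim 19 can be sketched as follows. This is an illustration under stated assumptions: the token name `<eos>`, the function names, the score layout, and the numbers are hypothetical. Applying the insertions from right to left is an implementation choice made here so that earlier slot indices stay valid.

```python
EOS = "<eos>"  # hypothetical name for the end-of-sequence token


def select_parallel(scores):
    """Pick the top output at every location whose top output is not EOS."""
    picks = []
    for loc, dist in enumerate(scores):
        top = max(dist, key=dist.get)
        if top != EOS:
            picks.append((loc, top))
    return picks


def insert_all(partial, picks):
    """Apply all selected insertions to a copy of the partial output."""
    out = list(partial)
    # Work right to left so that earlier slot indices remain valid.
    for loc, tok in sorted(picks, reverse=True):
        out.insert(loc, tok)
    return out


partial = ["the", "cat"]  # 2 outputs -> 3 insertion locations
scores = [
    {"a": 0.05, EOS: 0.50},      # top output is EOS: nothing inserted here
    {"black": 0.30, EOS: 0.02},
    {"sat": 0.10, EOS: 0.03},
]
picks = select_parallel(scores)
print(insert_all(partial, picks))  # ['the', 'black', 'cat', 'sat']
```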
20. The system of claim 16, wherein the decoder neural network is configured to generate a respective slot representation for each insertion location, and wherein generating the decoder output comprises: projecting a decoder hidden state matrix generated from the slot representations using a projection matrix to generate a content-location logit matrix; flattening the content-location logit matrix into a content-location logit vector; and applying a softmax over the content-location logit vector to generate a probability distribution over all insertion location-output combinations.
This invention relates to neural network-based systems for generating structured outputs, particularly in tasks like text generation or sequence prediction where content must be placed into specific slots or positions. The problem addressed is efficiently generating outputs where both the content and its placement must be determined, often requiring complex modeling of dependencies between content and location. The system includes a decoder neural network that processes input data to produce structured outputs. The decoder generates a slot representation for each possible insertion location in the output. These representations capture contextual information relevant to each slot. To produce the final output, the system projects a hidden state matrix derived from the slot representations using a projection matrix, resulting in a content-location logit matrix. This matrix is then flattened into a content-location logit vector, which is processed with a softmax function to yield a probability distribution over all possible combinations of insertion locations and output content. This approach allows the system to jointly model content generation and placement, improving accuracy and coherence in structured output tasks. The method is particularly useful in applications like dialogue systems, machine translation, or structured data generation where precise placement of content is critical.
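The project, flatten, and softmax steps can be sketched in plain Python. This is a toy illustration: the helper names, the matrix sizes (3 slots, hidden width 2, vocabulary of 4), and the numbers are assumptions, and the row-major flattening order is one possible convention rather than something the claim specifies.

```python
import math


def matmul(a, b):
    """Multiply matrix a (rows x d) by matrix b (d x cols)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def softmax(values):
    """Numerically stable softmax over a flat list."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def joint_distribution(hidden, projection):
    """Return one probability per (insertion location, output) pair."""
    logits = matmul(hidden, projection)        # content-location logit matrix
    flat = [x for row in logits for x in row]  # row-major flattening
    return softmax(flat)


hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 slots, width 2
projection = [[0.5, -0.5, 0.0, 1.0],           # width 2 -> 4 vocabulary entries
              [0.2, 0.3, -1.0, 0.5]]
probs = joint_distribution(hidden, projection)

# Map the best flat index back to (slot, vocabulary id).
vocab_size = len(projection[0])
best = max(range(len(probs)), key=probs.__getitem__)
best_slot, best_token_id = divmod(best, vocab_size)
print(best_slot, best_token_id)  # 2 3
```

Because a single softmax covers every slot and every vocabulary entry, the probabilities sum to one across all insertion location-output combinations rather than within each slot.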
August 11, 2020