10761962

Automated Software Program Repair

Published: September 1, 2020
Assignee: not available in USPTO data we have
Technical Abstract

Patent Claims
20 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A method comprising: generating a first abstract syntax tree with respect to a first iteration of first source code of a first software program, the first iteration excluding a particular change in a particular portion of the first source code; generating a second abstract syntax tree with respect to a second iteration of the first source code, the second iteration including the particular change in the particular portion, the particular change including a plurality of modifications made with respect to the particular portion of the first source code; identifying a first sub-tree of the first abstract syntax tree that corresponds to the particular portion with respect to the first iteration of the first source code; identifying a plurality of second sub-trees of the second abstract syntax tree that correspond to the particular portion with respect to the second iteration of the first source code; generating a first textual representation of the first sub-tree; generating a plurality of second textual representations in which a respective second textual representation is generated for each of the second sub-trees; performing a difference determination between the first textual representation and each of the second textual representations; identifying, from the second textual representations based on the difference determination, one or more differing textual representations that differ from the first textual representation, each differing textual representation corresponding to one or more respective modifications of the particular change; determining a smallest-sized set of the differing textual representations that corresponds to a same particular event as the particular change, the particular event occurring with respect to the first source code from the first iteration to the second iteration; identifying, as secondary textual representations, the differing textual representations that are outside of the smallest-sized set, the secondary textual 
representations corresponding to secondary modifications of the plurality of modifications; identifying, as secondary trees, the second sub-trees that correspond to the secondary textual representations; modifying the second abstract syntax tree by removing the secondary trees from the second abstract syntax tree; obtaining a third iteration of the first source code by regenerating the first source code based on the modified second abstract syntax tree; and performing repair operations with respect to one or more of the first source code and second source code of a second software program based on the third iteration of the first source code.

Plain English Translation

This invention relates to software program analysis and repair, specifically addressing the challenge of isolating and repairing specific changes in source code while minimizing unintended modifications. The method involves analyzing two iterations of a software program's source code: one before and one after a particular change. Abstract syntax trees (ASTs) are generated for both iterations to represent the program's structure. The ASTs are then analyzed to identify sub-trees corresponding to the changed portion of the code. Textual representations of these sub-trees are generated and compared to determine differences caused by the change. The method isolates the smallest set of modifications that represent the intended change, while identifying and removing secondary modifications that do not contribute to the primary event. The AST is modified by removing these secondary modifications, and the source code is regenerated from the modified AST. This refined version of the code is then used to perform repair operations on the original or another software program, ensuring that only the intended changes are applied. The approach improves code repair accuracy by isolating relevant modifications and discarding extraneous changes.
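The pipeline described above can be illustrated with a minimal sketch using Python's standard `ast` module. This is not the patented implementation: it assumes the "particular portion" is a single function, treats each top-level statement of that function as a sub-tree, and uses exact text comparison as the difference determination. The function names (`find_function`, `differing_texts`, `regenerate_without`) and the sample code are hypothetical, and the choice of which differing text is secondary is hard-coded here rather than derived from an event check.

```python
import ast

BEFORE = '''
def area(w, h):
    total = w * h
    return total
'''

AFTER = '''
def area(w, h):
    print("computing")
    total = abs(w) * abs(h)
    return total
'''


def find_function(tree: ast.Module, name: str) -> ast.FunctionDef:
    """Locate the sub-tree for the 'particular portion' (here, one function)."""
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == name:
            return node
    raise ValueError(f"function {name!r} not found")


def differing_texts(before: str, after: str, name: str) -> list[str]:
    """Texts of second-iteration statements that do not appear in the first."""
    first = {ast.unparse(s) for s in find_function(ast.parse(before), name).body}
    second = find_function(ast.parse(after), name).body
    return [ast.unparse(s) for s in second if ast.unparse(s) not in first]


def regenerate_without(after: str, name: str, secondary: set[str]) -> str:
    """Remove the secondary sub-trees and regenerate source (the 'third iteration')."""
    tree = ast.parse(after)
    func = find_function(tree, name)
    func.body = [s for s in func.body if ast.unparse(s) not in secondary]
    return ast.unparse(tree)


# Two statements differ; the print call is assumed to be the secondary one.
diffs = differing_texts(BEFORE, AFTER, "area")
third_iteration = regenerate_without(AFTER, "area", {"print('computing')"})
```

In this sketch the regenerated third iteration keeps the `abs(...)` change and drops the logging statement. Requires Python 3.9 or later for `ast.unparse`.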

Claim 2

Original Legal Text

2. The method of claim 1 , wherein performing the repair operations with respect to the second source code includes: identifying one or more errors in the second source code of based on executing a test suite with respect to the second source code; and identifying one or more repair candidates for the one or more errors based on the third iteration of the first source code.

Plain English Translation

This claim narrows the repair step of claim 1 for the case where the code being repaired is the second source code, that is, the source code of a second software program. A test suite is executed against the second source code to identify one or more errors in it. Repair candidates for those errors are then identified based on the third iteration of the first source code, which is the version regenerated from the pruned abstract syntax tree and therefore contains only the modifications tied to the particular event. The third iteration is not an earlier or historical version of the code: it is the changed code with the secondary modifications removed, which makes it a cleaner reference for candidate fixes than the unpruned second iteration.

Claim 3

Original Legal Text

3. The method of claim 2 , wherein identifying the one or more repair candidates based on the third iteration of the first source code is based on the one or more repair candidates having a code pattern similar to that of the third iteration of the first source code.

Plain English Translation

This claim narrows claim 2 by specifying how repair candidates are chosen. A repair candidate is identified on the basis that its code pattern is similar to that of the third iteration of the first source code, the version regenerated from the pruned abstract syntax tree. Because the third iteration retains only the modifications tied to the particular event, matching against it steers the selection toward candidates that resemble the essential change rather than its incidental edits. The claim does not specify how similarity between code patterns is measured.
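One way to picture pattern-based candidate selection is sketched below. The claim leaves the similarity measure open, so this example swaps in an assumption: it compares the `ast.dump` structure of each candidate with that of a reference snippet using `difflib.SequenceMatcher`. The helper names `shape` and `rank_candidates` and the sample snippets are hypothetical.

```python
import ast
import difflib


def shape(code: str) -> str:
    """Structural fingerprint of a snippet: its AST dump without positions."""
    return ast.dump(ast.parse(code))


def rank_candidates(reference: str, candidates: list[str]) -> list[str]:
    """Order candidates from most to least similar to the reference pattern."""
    ref = shape(reference)
    scored = [
        (difflib.SequenceMatcher(None, ref, shape(c), autojunk=False).ratio(), c)
        for c in candidates
    ]
    return [c for _, c in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Reference: the essential change kept in the third iteration (assumed).
ranked = rank_candidates(
    "total = abs(w) * abs(h)",
    ["print('hello')", "size = abs(a) * abs(b)"],
)
```

Here the candidate with the same assignment-of-product structure ranks first even though its identifiers differ. A production system would likely normalize identifiers or use a tree-based distance instead of string similarity.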

Claim 4

Original Legal Text

4. The method of claim 1 , further comprising: identifying a particular second sub-tree that corresponds to a particular differing textual representation that is included in the smallest-sized set, the identifying of the particular second sub-tree being based on the particular second sub-tree having a larger number of levels than the other second sub-trees that correspond to the other differing textual representations included in the smallest-sized set; identifying a plurality of additional sub-trees that are sub-trees of the particular second sub-tree; generating a plurality of additional textual representations in which a respective additional textual representation is generated for each of the additional sub-trees; performing an additional difference determination between the first textual representation and each of the additional textual representations; identifying, based on the additional difference determination, one or more additional differing textual representations that differ from the first textual representation, each additional differing textual representation corresponding to one or more respective modifications of the particular change; determining an additional smallest-sized set of the differing textual representations that corresponds to the same particular event as the first textual representation; identifying, as additional secondary textual representations, the additional differing textual representations that are outside of the additional smallest-sized set, the additional secondary textual representations corresponding to the secondary modifications of the plurality of modifications; and identifying, as additional secondary trees, the additional sub-trees that correspond to the additional secondary textual representations; wherein modifying the second abstract syntax tree further includes removing the additional secondary trees from the second abstract syntax tree.

Plain English Translation

This claim adds a further refinement pass to the method of claim 1. Among the second sub-trees whose differing textual representations fall within the smallest-sized set, the method selects the one with the largest number of levels, that is, the deepest sub-tree. It then identifies the sub-trees of that sub-tree, generates a textual representation for each, and compares each against the first textual representation to find additional differing textual representations. An additional smallest-sized set that corresponds to the same particular event is determined, and the additional differing representations outside that set are classified as additional secondary textual representations. The sub-trees that correspond to them are removed from the second abstract syntax tree along with the secondary trees of claim 1. The effect is that secondary modifications nested inside a retained sub-tree are also pruned before the source code is regenerated.
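The "largest number of levels" selection and the descent into that sub-tree can be sketched as follows. This is an illustrative assumption about how levels might be counted (every AST node, including context nodes, counts as one level); the helper names `levels`, `deepest`, and `additional_texts` are hypothetical.

```python
import ast


def levels(node: ast.AST) -> int:
    """Number of levels in the sub-tree rooted at node (a leaf has one level)."""
    children = list(ast.iter_child_nodes(node))
    return 1 + max((levels(child) for child in children), default=0)


def deepest(subtrees: list[ast.AST]) -> ast.AST:
    """Pick the sub-tree with the most levels."""
    return max(subtrees, key=levels)


def additional_texts(node: ast.AST) -> list[str]:
    """Textual representations of the statement and expression children."""
    return [
        ast.unparse(child)
        for child in ast.iter_child_nodes(node)
        if isinstance(child, (ast.stmt, ast.expr))
    ]


simple = ast.parse("x = 1").body[0]
guarded = ast.parse("if a > 0:\n    x = abs(a)").body[0]

chosen = deepest([simple, guarded])
texts = additional_texts(chosen)
```

The `if` statement is selected because it nests more deeply than the plain assignment, and its condition and body become the additional textual representations to be compared in the next pass.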

Claim 5

Original Legal Text

5. The method of claim 1 , wherein determining the smallest-sized set includes: performing an event correspondence determination with respect to the particular change, the event correspondence determination identifying the particular event as corresponding to the particular change; performing the event correspondence determination with respect to each possible set of a plurality of possible sets of differing textual representations in which each possible set of differing textual representations includes one or more differing textual representation; identifying, as matching sets and based on the event correspondence determinations made with respect to the plurality of possible sets, which of the plurality of possible sets of differing textual representations correspond to the particular event; and identifying, as the smallest-sized set, a particular matching set of the plurality of possible sets that includes the fewest number of differing textual representations.

Plain English Translation

This invention relates to identifying the smallest set of differing textual representations corresponding to a particular event in a system where changes are tracked. The problem addressed is efficiently determining which minimal set of textual variations accurately reflects a specific event, reducing redundancy and improving accuracy in change tracking. The method involves analyzing a particular change and determining its correspondence to a specific event. This is done by performing an event correspondence determination for the change, which confirms that the event matches the change. The method then evaluates multiple possible sets of differing textual representations, where each set contains one or more variations of the text. For each possible set, an event correspondence determination is performed to check if the set corresponds to the event. The sets that match the event are identified as matching sets. Among these, the smallest-sized set—the one with the fewest differing textual representations—is selected as the optimal set representing the event. This ensures that the minimal necessary variations are used, improving efficiency and accuracy in tracking changes.
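The search for the smallest-sized set can be sketched as an enumeration over subsets. The event correspondence determination is modeled here as a caller-supplied predicate, `corresponds_to_event`, which is a hypothetical stand-in for whatever check (tests, static analysis, builds) the system actually runs. The sketch visits subsets in order of increasing size and stops at the first match, which yields a set with the fewest members without evaluating every possible set.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Optional, Sequence


def smallest_matching_set(
    differing: Sequence[str],
    corresponds_to_event: Callable[[FrozenSet[str]], bool],
) -> Optional[FrozenSet[str]]:
    """Return a fewest-member subset that corresponds to the event, or None."""
    for size in range(1, len(differing) + 1):
        for subset in combinations(differing, size):
            candidate = frozenset(subset)
            if corresponds_to_event(candidate):
                return candidate
    return None


differing = ["print('computing')", "total = abs(w) * abs(h)"]


def toy_oracle(subset: FrozenSet[str]) -> bool:
    """Assumed event check: the event occurs whenever the abs() change is present."""
    return "total = abs(w) * abs(h)" in subset


smallest = smallest_matching_set(differing, toy_oracle)
secondary = set(differing) - set(smallest)
```

The number of subsets grows exponentially with the number of differing representations, so a practical system would need to bound or prune this search.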

Claim 6

Original Legal Text

6. The method of claim 5 , wherein performing the event correspondence determination with respect to the particular change includes: identifying the particular event as a fault introduction event that corresponds to the particular change based on identifying a first software test of the first source code that passed without the particular change included in the first source code and that failed with the particular change included in the first source code; identifying the particular event as a fault correction event that corresponds to the particular change based on identifying a second software test of the first source code that failed without the particular change included in the first source code and that passed with the particular change included in the first source code; identifying the particular event as a defect introduction event that corresponds to the particular change based on a first defect not being identified from a first static analysis performed on the first source code without the particular change being included in the first source code and based on the first defect being identified from a second static analysis performed on the first source code with the particular change included in the first source code; identifying the particular event as a defect correction event that corresponds to the particular change based on a second defect that is identified from a third static analysis performed on the first source code with the particular change included in the first source code and based on the second defect not being identified from a fourth static analysis performed on the first source code with the particular change included in the first source code; or identifying the particular event as a platform migration event that corresponds to the particular change based on a first build of the first source code with the particular change included therein having an error that is omitted with respect to a second build of the first source code with the 
particular change included therein, the first build being performed using a first version of a particular platform and the second build being performed using a second version of the particular platform.

Plain English Translation

This invention relates to software development and testing, specifically to methods for determining the correspondence between code changes and software events. The problem addressed is the difficulty in identifying whether a specific code change introduced or fixed a fault, defect, or platform compatibility issue. The method analyzes test results and static analysis outcomes to correlate changes with events. For fault introduction, it checks if a test passed before a change and failed after. For fault correction, it verifies if a test failed before and passed after. For defect introduction, it detects a defect in static analysis only after the change. For defect correction, it confirms a defect is no longer detected after the change. For platform migration, it identifies build errors in one platform version that are resolved in another. This approach automates the tracking of change impacts, improving debugging and quality assurance in software development.
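The five event types can be sketched as a classification over before-and-after observations. The data shapes are assumptions: test results as name-to-pass mappings, static-analysis findings as sets of defect identifiers, and build errors as sets keyed by platform version. The names `Event` and `classify_change` are hypothetical.

```python
from enum import Enum
from typing import AbstractSet, Mapping, Set


class Event(Enum):
    FAULT_INTRODUCTION = "fault introduction"
    FAULT_CORRECTION = "fault correction"
    DEFECT_INTRODUCTION = "defect introduction"
    DEFECT_CORRECTION = "defect correction"
    PLATFORM_MIGRATION = "platform migration"


def classify_change(
    tests_before: Mapping[str, bool],
    tests_after: Mapping[str, bool],
    defects_before: AbstractSet[str] = frozenset(),
    defects_after: AbstractSet[str] = frozenset(),
    build_errors_v1: AbstractSet[str] = frozenset(),
    build_errors_v2: AbstractSet[str] = frozenset(),
) -> Set[Event]:
    """Return every event type that the observations associate with the change."""
    events: Set[Event] = set()
    for name, passed_before in tests_before.items():
        if name not in tests_after:
            continue  # only tests run on both iterations are comparable
        passed_after = tests_after[name]
        if passed_before and not passed_after:
            events.add(Event.FAULT_INTRODUCTION)
        if not passed_before and passed_after:
            events.add(Event.FAULT_CORRECTION)
    if defects_after - defects_before:
        events.add(Event.DEFECT_INTRODUCTION)
    if defects_before - defects_after:
        events.add(Event.DEFECT_CORRECTION)
    # A build error seen on one platform version but not on the other.
    if build_errors_v1 - build_errors_v2:
        events.add(Event.PLATFORM_MIGRATION)
    return events


found = classify_change({"t1": True, "t2": False}, {"t1": False, "t2": True})
```

Returning a set reflects that a single change may match more than one event type; the claim itself lists the alternatives with "or".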

Claim 7

Original Legal Text

7. The method of claim 1 , wherein the particular change introduces a particular error in the first source code and the method further comprises: determining that a sub-portion of the particular portion corresponds to the particular error based on a comparison between the first iteration of the first source code and the third iteration of the first source code; wherein performing the repair operations includes modifying the sub-portion in response to determining that the sub-portion corresponds to the particular error.

Plain English Translation

This claim covers the case where the particular change introduces a particular error in the first source code. The method compares the first iteration of the first source code (before the change) with the third iteration (the change with secondary modifications removed) to determine which sub-portion of the changed portion corresponds to the error. Because the third iteration contains only the modifications tied to the event, the comparison points at the code responsible for the error rather than at incidental edits. The repair operations then modify that sub-portion, which keeps the correction targeted and avoids touching unaffected parts of the code.
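The comparison that isolates the sub-portion can be sketched with a line-level diff from Python's standard `difflib` module. This is an assumption about granularity; the claim does not say whether the comparison is textual or tree-based. The function name `changed_lines` and the sample iterations are hypothetical.

```python
import difflib
from typing import List, Tuple

FIRST_ITERATION = """def area(w, h):
    total = w * h
    return total"""

THIRD_ITERATION = """def area(w, h):
    total = abs(w) * abs(h)
    return total"""


def changed_lines(first: str, third: str) -> List[Tuple[int, str]]:
    """Lines of the third iteration that were inserted or replaced.

    Each result is (1-based line number in the third iteration, line text).
    """
    a = first.splitlines()
    b = third.splitlines()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    result: List[Tuple[int, str]] = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            result.extend((j + 1, b[j]) for j in range(j1, j2))
    return result


suspect = changed_lines(FIRST_ITERATION, THIRD_ITERATION)
```

In this example only the second line is reported, so a repair would be confined to that statement.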

Claim 8

Original Legal Text

8. One or more non-transitory computer-readable storage media configured to store instructions that, in response to being executed, cause a system to perform operations, the operations comprising: generating a first abstract syntax tree with respect to a first iteration of first source code of a first software program, the first iteration excluding a particular change in a particular portion of the first source code; generating a second abstract syntax tree with respect to a second iteration of the first source code, the second iteration including the particular change in the particular portion, the particular change including a plurality of modifications made with respect to the particular portion of the first source code; identifying a first sub-tree of the first abstract syntax tree that corresponds to the particular portion with respect to the first iteration of the first source code; identifying a plurality of second sub-trees of the second abstract syntax tree that correspond to the particular portion with respect to the second iteration of the first source code; generating a first textual representation of the first sub-tree; generating a plurality of second textual representations in which a respective second textual representation is generated for each of the second sub-trees; performing a difference determination between the first textual representation and each of the second textual representations; identifying, from the second textual representations based on the difference determination, one or more differing textual representations that differ from the first textual representation, each differing textual representation corresponding to one or more respective modifications of the particular change; determining a smallest-sized set of the differing textual representations that corresponds to a same particular event as the particular change, the particular event occurring with respect to the first source code from the first iteration to the second 
iteration; identifying, as secondary textual representations, the differing textual representations that are outside of the smallest-sized set, the secondary textual representations corresponding to secondary modifications of the plurality of modifications; identifying, as secondary trees, the second sub-trees that correspond to the secondary textual representations; modifying the second abstract syntax tree by removing the secondary trees from the second abstract syntax tree; obtaining a third iteration of the first source code by regenerating the first source code based on the modified second abstract syntax tree; and performing repair operations with respect to one or more of the first source code and second source code of a second software program based on the third iteration of the first source code.

Plain English Translation

This invention relates to software analysis and repair, specifically for identifying and isolating meaningful changes in source code iterations to facilitate debugging or repair operations. The problem addressed is the difficulty in distinguishing between intentional, event-driven modifications and secondary, unrelated changes when analyzing code differences between iterations. The solution involves generating abstract syntax trees (ASTs) for two iterations of source code—one before and one after a particular change. The ASTs are parsed to identify sub-trees corresponding to the modified portion of the code. Textual representations of these sub-trees are generated and compared to determine differences. The system then isolates the smallest set of differing textual representations that correspond to the intended event, filtering out secondary modifications. The AST is modified by removing sub-trees associated with secondary changes, and the source code is regenerated from the cleaned AST. This refined iteration is used to perform repair operations on the original or another software program. The approach improves the accuracy of code analysis by focusing on meaningful changes while discarding noise from unrelated modifications.

Claim 9

Original Legal Text

9. The one or more computer-readable storage media of claim 8 , wherein performing the repair operations with respect to the second source code includes: identifying one or more errors in the second source code of based on executing a test suite with respect to the second source code; and identifying one or more repair candidates for the one or more errors based on the third iteration of the first source code.

Plain English Translation

This claim is the storage-media counterpart of claim 2. It narrows the repair operations of claim 8 for the case where the code being repaired is the second source code of a second software program. The system executes a test suite against the second source code to identify one or more errors, then identifies repair candidates for those errors based on the third iteration of the first source code. The third iteration is the version of the first source code regenerated from the pruned abstract syntax tree, so it reflects the change with its secondary modifications removed. The claim does not recite how the candidates are evaluated or applied after they are identified.

Claim 10

Original Legal Text

10. The one or more computer-readable storage media of claim 9 , wherein identifying the one or more repair candidates based on the third iteration of the first source code is based on the one or more repair candidates having a code pattern similar to that of the third iteration of the first source code.

Plain English Translation

This claim is the storage-media counterpart of claim 3 and narrows claim 9. Repair candidates are identified on the basis that their code pattern is similar to that of the third iteration of the first source code, which is the version regenerated after the secondary sub-trees are removed from the second abstract syntax tree. Matching candidates against this pruned version, rather than against the full change, favors fixes that resemble the essential modification. The claim does not specify how code-pattern similarity is measured.

Claim 11

Original Legal Text

11. The one or more computer-readable storage media of claim 8 , wherein the operations further comprise: identifying a particular second sub-tree that corresponds to a particular differing textual representation that is included in the smallest-sized set, the identifying of the particular second sub-tree being based on the particular second sub-tree having a larger number of levels than the other second sub-trees that correspond to the other differing textual representations included in the smallest-sized set; identifying a plurality of additional sub-trees that are sub-trees of the particular second sub-tree; generating a plurality of additional textual representations in which a respective additional textual representation is generated for each of the additional sub-trees; performing an additional difference determination between the first textual representation and each of the additional textual representations; identifying, based on the additional difference determination, one or more additional differing textual representations that differ from the first textual representation, each additional differing textual representation corresponding to one or more respective modifications of the particular change; determining an additional smallest-sized set of the differing textual representations that corresponds to the same particular event as the first textual representation; identifying, as additional secondary textual representations, the additional differing textual representations that are outside of the additional smallest-sized set, the additional secondary textual representations corresponding to the secondary modifications of the plurality of modifications; and identifying, as additional secondary trees, the additional sub-trees that correspond to the additional secondary textual representations; wherein modifying the second abstract syntax tree further includes removing the additional secondary trees from the second abstract syntax tree.

Plain English Translation

This claim is the storage-media counterpart of claim 4 and concerns software source code, not natural-language text. It adds a further refinement pass to the operations of claim 8. Among the second sub-trees whose differing textual representations fall within the smallest-sized set, the system selects the one with the largest number of levels, that is, the deepest sub-tree. That sub-tree is decomposed into its own sub-trees, a textual representation is generated for each, and each is compared against the first textual representation to find additional differing textual representations. The system determines an additional smallest-sized set that corresponds to the same particular event and classifies the additional differing representations outside that set as additional secondary textual representations. The sub-trees that correspond to them are removed from the second abstract syntax tree, so that secondary modifications nested inside a retained sub-tree are also pruned before the source code is regenerated.

Claim 12

Original Legal Text

12. The one or more computer-readable storage media of claim 8 , wherein determining the smallest-sized set includes: performing an event correspondence determination with respect to the particular change, the event correspondence determination identifying the particular event as corresponding to the particular change; performing the event correspondence determination with respect to each possible set of a plurality of possible sets of differing textual representations in which each possible set of differing textual representations includes one or more differing textual representation; identifying, as matching sets and based on the event correspondence determinations made with respect to the plurality of possible sets, which of the plurality of possible sets of differing textual representations correspond to the particular event; and identifying, as the smallest-sized set, a particular matching set of the plurality of possible sets that includes the fewest number of differing textual representations.

Plain English Translation

This claim is the storage-media counterpart of claim 5 and specifies how the smallest-sized set of claim 8 is determined. The system first performs an event correspondence determination on the particular change, which identifies the particular event that the change corresponds to. It then performs the same determination on each of a plurality of possible sets of differing textual representations, where each possible set contains one or more differing textual representations. The possible sets that correspond to the same particular event are identified as matching sets. Among the matching sets, the one with the fewest differing textual representations is selected as the smallest-sized set, so that only the modifications needed to reproduce the event are retained.

Claim 13

Original Legal Text

13. The one or more computer-readable storage media of claim 12, wherein performing the event correspondence determination with respect to the particular change includes: identifying the particular event as a fault introduction event that corresponds to the particular change based on identifying a first software test of the first source code that passed without the particular change included in the first source code and that failed with the particular change included in the first source code; identifying the particular event as a fault correction event that corresponds to the particular change based on identifying a second software test of the first source code that failed without the particular change included in the first source code and that passed with the particular change included in the first source code; identifying the particular event as a defect introduction event that corresponds to the particular change based on a first defect not being identified from a first static analysis performed on the first source code without the particular change being included in the first source code and based on the first defect being identified from a second static analysis performed on the first source code with the particular change included in the first source code; identifying the particular event as a defect correction event that corresponds to the particular change based on a second defect that is identified from a third static analysis performed on the first source code with the particular change included in the first source code and based on the second defect not being identified from a fourth static analysis performed on the first source code with the particular change included in the first source code; or identifying the particular event as a platform migration event that corresponds to the particular change based on a first build of the first source code with the particular change included therein having an error that is omitted with respect to a second build of the first source code with the particular change included therein, the first build being performed using a first version of a particular platform and the second build being performed using a second version of the particular platform.

Plain English Translation

This claim specifies how the stored instructions decide which kind of event a code change corresponds to. There are five alternatives: fault introduction, fault correction, defect introduction, defect correction, and platform migration. A fault introduction event is identified when a software test passes without the change and fails with it. A fault correction event is the reverse: a test fails without the change and passes with it. A defect introduction event is identified when static analysis finds no defect without the change but finds one with the change. A defect correction event is identified when a defect reported by one static analysis is not reported by another. A platform migration event is identified when two builds of the changed code, performed with different versions of the same platform, disagree: the first build has an error that the second build does not. In each case the system uses test results, static analysis output, or build results as the evidence that ties the change to the event.
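The test-based and static-analysis-based checks can be sketched as a small classifier. This is only an illustration under stated assumptions: the names `Event` and `classify_event`, the dictionary and set inputs, and the rule of returning the first event found are all hypothetical and do not come from the patent. The defect correction branch follows the before/after reading used in the translation above.

```python
from enum import Enum
from typing import Optional


class Event(Enum):
    FAULT_INTRODUCTION = "fault introduction"
    FAULT_CORRECTION = "fault correction"
    DEFECT_INTRODUCTION = "defect introduction"
    DEFECT_CORRECTION = "defect correction"


def classify_event(
    tests_before: dict[str, bool],
    tests_after: dict[str, bool],
    defects_before: set[str],
    defects_after: set[str],
) -> Optional[Event]:
    """Return the first event type supported by the evidence, or None.

    tests_*: test name -> passed?, without ("before") and with ("after") the change.
    defects_*: defect identifiers reported by static analysis (hypothetical format).
    """
    for name, passed_before in tests_before.items():
        passed_after = tests_after.get(name)
        if passed_after is None:
            continue  # test was not run on both versions
        if passed_before and not passed_after:
            return Event.FAULT_INTRODUCTION
        if not passed_before and passed_after:
            return Event.FAULT_CORRECTION
    if defects_after - defects_before:
        return Event.DEFECT_INTRODUCTION
    if defects_before - defects_after:
        return Event.DEFECT_CORRECTION
    return None
```

For example, `classify_event({"t1": True}, {"t1": False}, set(), set())` returns `Event.FAULT_INTRODUCTION`, because `t1` passed without the change and failed with it.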

Claim 14

Original Legal Text

14. The one or more computer-readable storage media of claim 8, wherein the particular change introduces a particular error in the first source code and the operations further comprise: determining that a sub-portion of the particular portion corresponds to the particular error based on a comparison between the first iteration of the first source code and the third iteration of the first source code; wherein performing the repair operations includes modifying the sub-portion in response to determining that the sub-portion corresponds to the particular error.

Plain English Translation

This claim covers the case where the change introduces an error into the source code. The instructions compare the first iteration of the code (without the change) against the third iteration (the regenerated code, which keeps the change but has the secondary modifications removed) to determine which sub-portion of the changed portion corresponds to the error. The repair operations then modify that sub-portion. Because the third iteration no longer contains modifications unrelated to the event, the comparison points at a narrower region of code than a comparison against the full second iteration would, which keeps the repair targeted and reduces manual inspection.
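One way to picture the comparison between the first and third iterations is a line-level diff. The sketch below uses Python's standard `difflib` as a stand-in; the patent does not say which comparison technique is used, and the function name `changed_line_ranges` is hypothetical.

```python
import difflib


def changed_line_ranges(first_iteration: str, third_iteration: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive line ranges of the third iteration that differ from the first."""
    a = first_iteration.splitlines()
    b = third_iteration.splitlines()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    ranges = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "replace" and "insert" leave lines in the third iteration; "delete" does not.
        if tag in ("replace", "insert"):
            ranges.append((j1 + 1, j2))
    return ranges
```

The returned ranges are candidate sub-portions. A repair step would then restrict its modifications to those lines.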

Claim 15

Original Legal Text

15. A system comprising: one or more computer-readable storage media configured to store instructions; and one or more processors communicatively coupled to the one or more computer-readable storage media and configured to, in response to execution of the instructions, cause the system to perform operations, the operations comprising: generating a first abstract syntax tree with respect to a first iteration of first source code of a first software program, the first iteration excluding a particular change in a particular portion of the first source code; generating a second abstract syntax tree with respect to a second iteration of the first source code, the second iteration including the particular change in the particular portion, the particular change including a plurality of modifications made with respect to the particular portion of the first source code; identifying a first sub-tree of the first abstract syntax tree that corresponds to the particular portion with respect to the first iteration of the first source code; identifying a plurality of second sub-trees of the second abstract syntax tree that correspond to the particular portion with respect to the second iteration of the first source code; generating a first textual representation of the first sub-tree; generating a plurality of second textual representations in which a respective second textual representation is generated for each of the second sub-trees; performing a difference determination between the first textual representation and each of the second textual representations; identifying, from the second textual representations based on the difference determination, one or more differing textual representations that differ from the first textual representation, each differing textual representation corresponding to one or more respective modifications of the particular change; determining a smallest-sized set of the differing textual representations that corresponds to a same particular event as the particular change, the particular event occurring with respect to the first source code from the first iteration to the second iteration; identifying, as secondary textual representations, the differing textual representations that are outside of the smallest-sized set, the secondary textual representations corresponding to secondary modifications of the plurality of modifications; identifying, as secondary trees, the second sub-trees that correspond to the secondary textual representations; modifying the second abstract syntax tree by removing the secondary trees from the second abstract syntax tree; obtaining a third iteration of the first source code by regenerating the first source code based on the modified second abstract syntax tree; and performing repair operations with respect to one or more of the first source code and second source code of a second software program based on the third iteration of the first source code.

Plain English Translation

This system analyzes a code change by comparing two iterations of the same source code and separating the modifications that matter from those that do not. It takes a first iteration without a particular change and a second iteration with that change, where the change consists of multiple modifications. It generates an abstract syntax tree (AST) for each iteration, identifies one sub-tree in the first AST and a plurality of sub-trees in the second AST that correspond to the changed portion, and converts these sub-trees into textual representations. Comparing the first representation against each of the second representations yields the differing representations. The system then determines the smallest set of differing representations that still corresponds to the same event as the full change, and treats every differing representation outside that set as a secondary modification. The sub-trees for the secondary modifications are removed from the second AST, and the source code is regenerated from the modified tree to produce a third iteration. This third iteration is then used to perform repair operations on the first program, on a second program, or on both. Filtering out the extraneous modifications means the repair is based only on the part of the change that is tied to the event.
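The steps above can be sketched with Python's built-in `ast` module (Python 3.9 or later for `ast.unparse`). Several simplifications here are assumptions, not details from the patent: the changed portion is taken to be one named function, each top-level statement in that function is treated as a sub-tree, `ast.dump` serves as the textual representation, and the caller supplies a hypothetical `is_primary` predicate standing in for the smallest-sized-set determination.

```python
import ast


def find_function(tree: ast.Module, name: str) -> ast.FunctionDef:
    """Locate the function that stands in for the 'particular portion'."""
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == name:
            return node
    raise ValueError(f"function {name!r} not found")


def strip_secondary(first_src: str, second_src: str, func_name: str, is_primary) -> str:
    """Regenerate the second iteration with secondary modifications removed."""
    first_func = find_function(ast.parse(first_src), func_name)
    second_tree = ast.parse(second_src)
    second_func = find_function(second_tree, func_name)

    # Textual representations; ast.dump omits line numbers by default,
    # so unchanged statements compare equal even if they moved.
    first_texts = {ast.dump(stmt) for stmt in first_func.body}

    kept = []
    for stmt in second_func.body:
        text = ast.dump(stmt)
        differs = text not in first_texts
        if differs and not is_primary(text):
            continue  # secondary tree: drop it from the second AST
        kept.append(stmt)

    # A function body may not be empty, so fall back to `pass`.
    second_func.body = kept or [ast.Pass()]
    return ast.unparse(ast.fix_missing_locations(second_tree))
```

Given a first iteration containing `y = x + 1` and a second iteration that both changes it to `y = x + 2` and adds a `print('debug')` call, a predicate that accepts only the changed assignment yields a third iteration that keeps `y = x + 2` and drops the `print` call.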

Claim 16

Original Legal Text

16. The system of claim 15, wherein performing the repair operations with respect to the second source code includes: identifying one or more errors in the second source code of based on executing a test suite with respect to the second source code; and identifying one or more repair candidates for the one or more errors based on the third iteration of the first source code.

Plain English Translation

This claim describes how the system of claim 15 repairs a second software program. The system executes a test suite against the second program's source code to identify one or more errors. It then identifies repair candidates for those errors based on the third iteration of the first source code, that is, the version of the first program regenerated after the secondary modifications were removed. In effect, a cleaned-up change from one program serves as the reference for proposing fixes in another, which reduces the manual effort needed to find candidate repairs.
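A toy version of these two steps might look as follows. The patent does not describe how repair candidates are derived from the third iteration, so this sketch assumes a hypothetical `(before, after)` text pattern extracted from it; `run_test_suite` and `repair_candidates` are illustrative names, and tests are represented as assertion strings purely for brevity.

```python
def run_test_suite(source: str, tests: dict[str, str]) -> list[str]:
    """Return the names of tests that fail (or raise) against `source`."""
    failed = []
    for name, assertion in tests.items():
        namespace: dict = {}
        try:
            exec(source, namespace)     # load the program under test
            exec(assertion, namespace)  # run one test against it
        except Exception:
            failed.append(name)
    return failed


def repair_candidates(second_source: str, pattern: tuple[str, str]) -> list[tuple[int, str]]:
    """Propose (line number, replacement line) pairs wherever the 'before' text occurs."""
    before, after = pattern
    return [
        (lineno, line.replace(before, after))
        for lineno, line in enumerate(second_source.splitlines(), start=1)
        if before in line
    ]
```

With a second program `def total(a, b): return a - b` written across two lines, a test `assert total(2, 3) == 5` is reported as failing, and the assumed pattern `("a - b", "a + b")` produces one candidate for line 2.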

Claim 17

Original Legal Text

17. The system of claim 15 , wherein the operations further comprise: identifying a particular second sub-tree that corresponds to a particular differing textual representation that is included in the smallest-sized set, the identifying of the particular second sub-tree being based on the particular second sub-tree having a larger number of levels than the other second sub-trees that correspond to the other differing textual representations included in the smallest-sized set; identifying a plurality of additional sub-trees that are sub-trees of the particular second sub-tree; generating a plurality of additional textual representations in which a respective additional textual representation is generated for each of the additional sub-trees; performing an additional difference determination between the first textual representation and each of the additional textual representations; identifying, based on the additional difference determination, one or more additional differing textual representations that differ from the first textual representation, each additional differing textual representation corresponding to one or more respective modifications of the particular change; determining an additional smallest-sized set of the differing textual representations that corresponds to the same particular event as the first textual representation; identifying, as additional secondary textual representations, the additional differing textual representations that are outside of the additional smallest-sized set, the additional secondary textual representations corresponding to the secondary modifications of the plurality of modifications; and identifying, as additional secondary trees, the additional sub-trees that correspond to the additional secondary textual representations; wherein modifying the second abstract syntax tree further includes removing the additional secondary trees from the second abstract syntax tree.

Plain English Translation

This claim adds a refinement pass to the system of claim 15. Among the second sub-trees whose differing textual representations are in the smallest-sized set, the system picks the one with the most levels. It identifies the sub-trees of that sub-tree, generates a textual representation for each, and compares each against the first textual representation to find additional differing representations. It then determines an additional smallest-sized set that corresponds to the same event. Additional differing representations outside that set are treated as additional secondary representations, and their sub-trees are removed from the second abstract syntax tree along with the secondary trees already identified. The effect is that secondary modifications nested inside a large changed sub-tree are also stripped out, so the regenerated code retains only the modifications tied to the event.
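The selection of the sub-tree with the most levels, and the enumeration of its nested sub-trees, can be sketched with Python's `ast` module. The helper names are hypothetical, and treating the statements in a node's `body` and `orelse` as its sub-trees is an assumption made for illustration.

```python
import ast


def depth(node: ast.AST) -> int:
    """Number of levels in the sub-tree rooted at `node`."""
    return 1 + max((depth(child) for child in ast.iter_child_nodes(node)), default=0)


def deepest(subtrees: list) -> ast.AST:
    """Pick the sub-tree with the most levels."""
    return max(subtrees, key=depth)


def additional_subtrees(node: ast.AST) -> list:
    """Statements nested directly under `node`, treated as the additional sub-trees."""
    nested = []
    for field in ("body", "orelse"):
        value = getattr(node, field, None)
        if isinstance(value, list):  # skip nodes whose body is a single expression
            nested.extend(value)
    return nested
```

Each additional sub-tree would then go through the same textual-representation, difference, and smallest-set steps as the original sub-trees.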

Claim 18

Original Legal Text

18. The system of claim 15, wherein determining the smallest-sized set includes: performing an event correspondence determination with respect to the particular change, the event correspondence determination identifying the particular event as corresponding to the particular change; performing the event correspondence determination with respect to each possible set of a plurality of possible sets of differing textual representations in which each possible set of differing textual representations includes one or more differing textual representation; identifying, as matching sets and based on the event correspondence determinations made with respect to the plurality of possible sets, which of the plurality of possible sets of differing textual representations correspond to the particular event; and identifying, as the smallest-sized set, a particular matching set of the plurality of possible sets that includes the fewest number of differing textual representations.

Plain English Translation

This claim describes how the system of claim 15 determines the smallest-sized set. The system first performs an event correspondence determination on the full change to identify which event the change corresponds to. It then performs the same determination on each possible set of differing textual representations, where each set contains one or more of them. Sets that correspond to the same event are the matching sets. The matching set with the fewest differing textual representations is selected as the smallest-sized set. The result is the minimal group of modifications that still reproduces the event, with redundant modifications left out.
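The search for the smallest matching set can be sketched as a brute-force enumeration. The function name and the `corresponds_to_event` callback are hypothetical. The sketch also takes a shortcut relative to the claim wording: rather than collecting every matching set and then choosing the smallest, it tries sets in order of increasing size and returns the first match, which gives a set of the same minimal size.

```python
from itertools import combinations
from typing import Callable, Optional, Sequence


def smallest_matching_set(
    differing: Sequence[str],
    corresponds_to_event: Callable[[frozenset], bool],
) -> Optional[frozenset]:
    """Return a smallest subset of `differing` that corresponds to the event, or None."""
    for size in range(1, len(differing) + 1):
        for subset in combinations(differing, size):
            candidate = frozenset(subset)
            if corresponds_to_event(candidate):
                return candidate
    return None
```

The number of possible sets grows exponentially with the number of differing representations, so a practical implementation would likely need to bound or prune the search. In practice the callback would apply the candidate set and re-run the tests, static analysis, or builds.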

Claim 19

Original Legal Text

19. The system of claim 18, wherein performing the event correspondence determination with respect to the particular change includes: identifying the particular event as a fault introduction event that corresponds to the particular change based on identifying a first software test of the first source code that passed without the particular change included in the first source code and that failed with the particular change included in the first source code; identifying the particular event as a fault correction event that corresponds to the particular change based on identifying a second software test of the first source code that failed without the particular change included in the first source code and that passed with the particular change included in the first source code; identifying the particular event as a defect introduction event that corresponds to the particular change based on a first defect not being identified from a first static analysis performed on the first source code without the particular change being included in the first source code and based on the first defect being identified from a second static analysis performed on the first source code with the particular change included in the first source code; identifying the particular event as a defect correction event that corresponds to the particular change based on a second defect that is identified from a third static analysis performed on the first source code with the particular change included in the first source code and based on the second defect not being identified from a fourth static analysis performed on the first source code with the particular change included in the first source code; or identifying the particular event as a platform migration event that corresponds to the particular change based on a first build of the first source code with the particular change included therein having an error that is omitted with respect to a second build of the first source code with the particular change included therein, the first build being performed using a first version of a particular platform and the second build being performed using a second version of the particular platform.

Plain English Translation

The system determines the impact of code changes by analyzing events related to software testing, static analysis, and platform migration. It identifies whether a change introduces or corrects faults, defects, or migration issues. For fault detection, the system checks if a test passes before a change and fails after, indicating a fault introduction. Conversely, if a test fails before a change and passes after, it indicates a fault correction. For defect detection, the system compares static analysis results before and after a change. If a defect appears after the change, it is a defect introduction; if a defect disappears, it is a defect correction. For platform migration, the system checks if a build error occurs in one platform version but not another, indicating a migration event. This analysis helps developers understand the consequences of code changes, improving software reliability and maintainability. The system automates the correlation between changes and events, reducing manual effort and enhancing accuracy in tracking software quality.
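The platform migration check compares two builds of the same changed code. A minimal sketch, assuming a hypothetical `BuildResult` record that holds the platform version and the set of build error identifiers:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildResult:
    platform_version: str  # e.g. a compiler or runtime version (assumed format)
    errors: frozenset      # build error identifiers (assumed format)


def is_platform_migration_event(first: BuildResult, second: BuildResult) -> bool:
    """True if the builds used different platform versions and the first
    build has at least one error that the second build does not."""
    return (
        first.platform_version != second.platform_version
        and bool(first.errors - second.errors)
    )
```

Both `BuildResult` values are meant to describe builds of the source code with the change included, as the claim requires. Only the platform version differs between them.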

Claim 20

Original Legal Text

20. The system of claim 15, wherein the particular change introduces a particular error in the first source code and the operations further comprise: determining that a sub-portion of the particular portion corresponds to the particular error based on a comparison between the first iteration of the first source code and the third iteration of the first source code; wherein performing the repair operations includes modifying the sub-portion in response to determining that the sub-portion corresponds to the particular error.

Plain English Translation

This claim applies to the system of claim 15 when the change introduces an error into the source code. The system compares the first iteration of the source code (without the change) against the third iteration (regenerated after the secondary modifications were removed) and, from that comparison, determines which sub-portion of the changed portion corresponds to the error. The repair operations then modify that sub-portion. Restricting the repair to the identified sub-portion means only the relevant code is altered, which limits unintended side effects and reduces the need for manual inspection.

Patent Metadata

Filing Date

Unknown

Publication Date

September 1, 2020

Inventors

Hiroaki YOSHIDA
Mukul R. PRASAD

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “AUTOMATED SOFTWARE PROGRAM REPAIR” (10761962). https://patentable.app/patents/10761962
