An optimized test data selection strategy references a sampling file that identifies data attributes that serve as the basis of the test data selection strategy. By analyzing fields and the corresponding field values of the sampling file, a total number of test data selected for inclusion into a sample dataset is reduced. The test data selection strategy provides an efficient methodology for implementing a data comparison testing process.
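The reduction described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a sampling file can be modeled as a mapping from dimension field names to field values and that a test data record "matches" when its attributes equal every one of those values. The names `select_matching`, `batch`, and `sampling_file`, and the sample records, are hypothetical and do not come from the patent.

```python
from typing import Any

Record = dict[str, Any]


def select_matching(batch: list[Record], dimension_fields: dict[str, Any]) -> list[Record]:
    """Keep only the records whose attributes equal every dimension field value."""
    return [
        record
        for record in batch
        if all(record.get(field) == value for field, value in dimension_fields.items())
    ]


# Hypothetical batch of test data; each record carries several data attributes.
batch = [
    {"claim_type": "medical", "state": "TX", "amount": 120},
    {"claim_type": "medical", "state": "CA", "amount": 75},
    {"claim_type": "dental", "state": "TX", "amount": 40},
    {"claim_type": "medical", "state": "TX", "amount": 310},
]

# Assumed in-memory form of a sampling file: dimension field -> field value.
sampling_file = {"claim_type": "medical", "state": "TX"}

sample_dataset = select_matching(batch, sampling_file)
print(len(batch), len(sample_dataset))  # prints "4 2"
```

In this sketch only the records that agree with the sampling file on every dimension field reach the sample dataset, so the comparison test runs on two records rather than four.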
Legal claims defining the scope of protection, as filed with the USPTO.
1. A computing device comprising: a non-transitory machine-readable medium storing instructions, the instructions configured to, when executed, cause a processing circuitry to: generate a first sampling file, the first sampling file including: a plurality of dimension fields and a line level portion, the line level portion including: a first quantity of dimension fields; and a header portion, the header portion including a second quantity of dimension fields, and wherein the second quantity of dimension fields exceeding the first quantity of dimension fields; receive a command input configured to receive a selection strategy instruction; adjust a quantity of dimension fields included in the plurality of dimension fields of the first sampling file based on the selection strategy instruction; and adjust a value of at least one of the plurality of dimension fields of the first sampling file based on the selection strategy instruction.
2. The computing device of claim 1, wherein each dimension field included in the first sampling file includes a dimension field value identifying a corresponding data attribute; and wherein the instructions are further configured to, when executed, cause the processing circuitry to: parse a batch file, the batch file including test data, and wherein each test data includes a plurality of data attributes; determine a first set of test data from the parsed batch file, the first set of test data including data attributes matching the plurality of dimension fields in the first sampling file; and merge the first set of test data into a sample dataset.
3. The computing device of claim 2, wherein the instructions are configured to, when executed, cause the processing circuitry to: determine the first set of test data by selecting a predetermined number of test data from the batch file, where the selected number of test data includes data attributes matching the plurality of dimension fields in the first sampling file.
4. The computing device of claim 2, wherein the instructions are configured to, when executed, cause the processing circuitry to: determine the first set of test data by selecting, from the batch file, a predetermined percentage of test data including data attributes matching the plurality of dimension fields in the first sampling file.
5. The computing device of claim 1, wherein the instructions are further configured to, when executed, cause the processing circuitry to: generate a second sampling file, the second sampling file including a different number of dimension fields than the plurality of dimension fields in the first sampling file, wherein at least one dimension field type included in the second sampling file is not included in the first sampling file.
6. The computing device of claim 5, wherein the instructions are further configured to, when executed, cause the processing circuitry to: determine a second set of test data from the batch file, the second set of test data including data attributes matching the dimension fields in the second sampling file; and merge the second set of test data into the sample dataset.
7. The computing device of claim 6, wherein a number of test data determined for the second set of test data is greater than the number of test data determined for the first set of test data based on the second sampling file including a fewer number of dimension fields than the plurality of dimension fields in the first sampling file.
8. The computing device of claim 6, wherein a number of test data determined for the second set of test data is less than the number of test data determined for the first set of test data based on the second sampling file including a greater number of dimension fields than the plurality of dimension fields in the first sampling file.
9. A method comprising: generating, by a processing circuitry, a first sampling file, the first sampling file including: a plurality of dimension fields and a line level portion, the line level portion including: a first quantity of dimension fields; and a header portion, the header portion including a second quantity of dimension fields, and wherein the second quantity of dimension fields exceeding the first quantity of dimension fields; receiving, by the processing circuitry, a command input configured to receive a selection strategy instruction; adjusting, by the processing circuitry, a quantity of dimension fields included in the plurality of dimension fields of the first sampling file based on the selection strategy instruction; and adjusting, by the processing circuitry, a value of at least one of the plurality of dimension fields of the first sampling file based on the selection strategy instruction.
10. The method of claim 9, wherein each dimension field included in the first sampling file includes a dimension field value identifying a corresponding data attribute; and wherein the method further comprising: parsing, by the processing circuitry, a batch file, the batch file including test data, and wherein each test data includes a plurality of data attributes; determining, by the processing circuitry, a first set of test data from the batch file, the first set of test data including data attributes matching the plurality of dimension fields in the first sampling file; and merging, by the processing circuitry, the first set of test data into a sample dataset.
11. The method of claim 10, wherein determining, by the processing circuitry, the first set of test data comprises selecting a predetermined number of test data from the batch file, where the selected number of test data includes data attributes matching the plurality of dimension fields in the first sampling file.
12. The method of claim 10, wherein determining, by the processing circuitry, the first set of test data comprises selecting, from the batch file, a predetermined percentage of test data including data attributes matching the plurality of dimension fields in the first sampling file.
13. The method of claim 9, wherein the method further comprising: generating, by the processing circuitry, a second sampling file, the second sampling file including a different number of dimension fields than the plurality of dimension fields in the first sampling file, wherein at least one dimension field type included in the second sampling file is not included in the first sampling file.
14. The method of claim 13, the method further comprising: determining, by the processing circuitry, a second set of test data from the batch file, the second set of test data including data attributes matching the dimension fields in the second sampling file; and merging, by the processing circuitry, the second set of test data into the sample dataset.
15. The method of claim 14, wherein a number of test data determined for the second set of test data is greater than the number of test data determined for the first set of test data based on the second sampling file including a fewer number of dimension fields than the plurality of dimension fields in the first sampling file.
16. The method of claim 14, wherein a number of test data determined for the second set of test data is less than the number of test data determined for the first set of test data based on the second sampling file including a greater number of dimension fields than the plurality of dimension fields in the first sampling file.
17. A non-transitory computer-readable medium storing a set of processor executable instructions that, when executed, cause a processing circuitry to: generate a first sampling file, the first sampling file including: a plurality of dimension fields and a line level portion, the line level portion including: a first quantity of dimension fields; and a header portion, the header portion including a second quantity of dimension fields, and wherein the second quantity of dimension fields exceeding the first quantity of dimension fields; receive a command input configured to receive a selection strategy instruction; adjust a quantity of dimension fields included in the plurality of dimension fields of the first sampling file based on the selection strategy instruction; and adjust a value of at least one of the plurality of dimension fields of the first sampling file based on the selection strategy instruction.
18. The non-transitory computer-readable medium of claim 17, wherein each dimension field included in the first sampling file includes a dimension field value identifying a corresponding data attribute; and wherein the instructions are further configured to, when executed, cause the processing circuitry to: parse a batch file, the batch file including test data, and wherein each test data includes a plurality of data attributes; determine a first set of test data from the parsed batch file, the first set of test data including data attributes matching the plurality of dimension fields in the first sampling file; and merge the first set of test data into a sample dataset.
19. The non-transitory computer-readable medium of claim 17, wherein the instructions are further configured to, when executed, cause the processing circuitry to: generate a second sampling file, the second sampling file including a different number of dimension fields than the plurality of dimension fields in the first sampling file, wherein at least one dimension field type included in the second sampling file is not included in the first sampling file.
20. The non-transitory computer-readable medium of claim 19, wherein the instructions are further configured to, when executed, cause the processing circuitry to: determine a second set of test data from the batch file, the second set of test data including data attributes matching the dimension fields in the second sampling file; and merge the second set of test data into the sample dataset.
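As a reading aid for the dependent claims, the following is a minimal sketch of selecting a predetermined number or percentage of matching test data, and of merging the sets produced by two sampling files with different numbers of dimension fields. It is one possible interpretation under stated assumptions, not the claimed implementation: sampling files are modeled as plain mappings, percentages are rounded up, and merging skips records already present in the sample dataset. The function names `select` and `merge` and all records are hypothetical.

```python
import math
from typing import Any, Optional

Record = dict[str, Any]


def select(
    batch: list[Record],
    dimension_fields: dict[str, Any],
    count: Optional[int] = None,
    percentage: Optional[float] = None,
) -> list[Record]:
    """Return matching records, optionally capped by a number or a percentage."""
    matches = [
        record
        for record in batch
        if all(record.get(field) == value for field, value in dimension_fields.items())
    ]
    if count is not None:
        return matches[:count]  # predetermined number of test data
    if percentage is not None:
        # Assumption: round up so a non-empty match set yields at least one record.
        return matches[: math.ceil(len(matches) * percentage / 100)]
    return matches


def merge(sample_dataset: list[Record], selected: list[Record]) -> list[Record]:
    """Append selected records; skipping duplicates is an assumption of this sketch."""
    for record in selected:
        if record not in sample_dataset:
            sample_dataset.append(record)
    return sample_dataset


# Hypothetical batch file contents after parsing.
batch = [
    {"plan": "A", "region": "east", "service": "lab", "channel": "web"},
    {"plan": "A", "region": "east", "service": "xray", "channel": "web"},
    {"plan": "A", "region": "west", "service": "lab", "channel": "mail"},
    {"plan": "B", "region": "east", "service": "lab", "channel": "web"},
]

first_sampling_file = {"plan": "A", "region": "east", "service": "lab"}
# Fewer dimension fields, and one field type ("channel") absent from the first file.
second_sampling_file = {"plan": "A", "channel": "web"}

first_set = select(batch, first_sampling_file)
second_set = select(batch, second_sampling_file)
sample_dataset = merge(merge([], first_set), second_set)
print(len(first_set), len(second_set), len(sample_dataset))  # prints "1 2 2"
```

Here the second sampling file constrains fewer dimension fields and happens to match more records than the first. Calling `select(batch, {"plan": "A"}, count=1)` or `select(batch, {"plan": "A"}, percentage=50)` shows the number-based and percentage-based variants on the same batch.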
July 10, 2019
January 5, 2021