Legal claims defining the scope of protection, as filed with the USPTO.
1. A computer implemented method for performing extract, transform, load (ETL) operation using stream data, comprising: receiving ETL specification for processing stream data, the ETL specification comprising a transform operation represented using at least a database query specification for transforming the stream data; generating a dataflow graph for executing the transform operation, the dataflow graph comprising a sequence of database queries, the generating of the dataflow graph comprising: traversing the database query specification to determine whether the database query specification includes one or more operations from a predefined set of operations; responsive to determining that a database query of the database query specification includes an operation from the predefined set of operations, decomposing the database query into: a first database query that generates an intermediate results table, wherein data stored in the intermediate results table is determined based on the operation, and a second database query that receives as input the intermediate results table and outputs data used for performing the transform operation of the ETL specification; receiving stream data from a source; and executing the sequence of database queries of the dataflow graph for performing the transform operation on the stream data received from the source.
2. The computer implemented method of claim 1, wherein the stream data includes an incremental data set, and executing the sequence of database queries comprises: performing the sequence of database queries on the incremental data set.
3. The computer implemented method of claim 2, wherein executing the sequence of database queries of the dataflow graph comprises: executing the first database query on the incremental data set; receiving a change set as output from the execution of the first database query; and integrating the data stored in the intermediate results table with the change set as the input to the second database query.
4. The computer implemented method of claim 2, wherein executing the sequence of database queries of the dataflow graph comprises: executing the second database query on the incremental data set; receiving a change set as output from the execution of the second database query; receiving as input the intermediate results table and the change set; and outputting the data used for performing the transform operation of the ETL specification.
5. The computer implemented method of claim 1, wherein executing the sequence of database queries of the dataflow graph comprises: recursively performing the sequence of database queries on the stream data to obtain a result for the transform operation.
6. The computer implemented method of claim 1, wherein the database query specification includes a structured query language (SQL).
7. The computer implemented method of claim 1, wherein the predefined set of operations includes one or more operations that correspond to a partition-based dataflow, an append-only dataflow, or a row ID based flow.
8. A non-transitory computer readable storage medium for performing an extract, transform, load (ETL) operation using stream data, comprising stored program code, the program code comprising instructions, the instructions when executed by one or more computer processors cause the one or more computer processors to: receive ETL specification for processing stream data, the ETL specification comprising a transform operation represented using at least a database query specification for transforming the stream data; generate a dataflow graph for executing the transform operation, the dataflow graph comprising a sequence of database queries, the generating of the dataflow graph comprising: traversing the database query specification to determine whether the database query specification includes one or more operations from a predefined set of operations; responsive to determining that a database query of the database query specification includes an operation from the predefined set of operations, decomposing the database query into: a first database query that generates an intermediate results table, wherein data stored in the intermediate results table is determined based on the operation, and a second database query that receives as input the intermediate results table and outputs data used for performing the transform operation of the ETL specification; receive stream data from a source; and execute the sequence of database queries of the dataflow graph for performing the transform operation on the stream data received from the source.
9. The non-transitory computer readable storage medium of claim 8, wherein the stream data includes an incremental data set, and the instructions that cause the one or more computer processors to execute the sequence of database queries, when executed cause a processor system to: perform the sequence of database queries on the incremental data set.
10. The non-transitory computer readable storage medium of claim 9, wherein the instructions that cause the one or more computer processors to execute the sequence of database queries, when executed cause a processor system to: execute the first database query on the incremental data set; receive a change set as output from the execution of the first database query; and integrate the data stored in the intermediate results table with the change set as the input to the second database query.
11. The non-transitory computer readable storage medium of claim 9, wherein the instructions that cause the one or more computer processors to execute the sequence of database queries, when executed cause a processor system to: execute the second database query on the incremental data set; receive a change set as output from the execution of the second database query; receive as input the intermediate results table and the change set; and output the data used for performing the transform operation of the ETL specification.
12. The non-transitory computer readable storage medium of claim 8, wherein the instructions that cause the one or more computer processors to execute the sequence of database queries, when executed cause a processor system to: recursively perform the sequence of database queries on the stream data to obtain a result for the transform operation.
13. The non-transitory computer readable storage medium of claim 8, wherein the database query specification includes a structured query language (SQL).
14. The non-transitory computer readable storage medium of claim 8, wherein the predefined set of operations includes one or more operations that correspond to a partition-based dataflow, an append-only dataflow, or a row ID based flow.
15. A computer system for performing an extract, transform, load (ETL) operation using stream data comprising: one or more computer processors; and a non-transitory computer-readable storage medium for storing instructions that when executed by the one or more computer processors cause the one or more computer processors to: receive ETL specification for processing stream data, the ETL specification comprising a transform operation represented using at least a database query specification for transforming the stream data; generate a dataflow graph for executing the transform operation, the dataflow graph comprising a sequence of database queries, the generating of the dataflow graph comprising: traversing the database query specification to determine whether the database query specification includes one or more operations from a predefined set of operations; responsive to determining that a database query of the database query specification includes an operation from the predefined set of operations, decomposing the database query into: a first database query that generates an intermediate results table, wherein data stored in the intermediate results table is determined based on the operation, and a second database query that receives as input the intermediate results table and outputs data used for performing the transform operation of the ETL specification; receive stream data from a source; and execute the sequence of database queries of the dataflow graph for performing the transform operation on the stream data received from the source.
16. The system of claim 15, wherein the stream data includes an incremental data set, and the instructions that cause the one or more computer processors to execute the sequence of database queries, when executed cause a processor system to: perform the sequence of database queries on the incremental data set.
17. The system of claim 16, wherein the instructions that cause the one or more computer processors to execute the sequence of database queries, when executed cause a processor system to: execute the first database query on the incremental data set; receive a change set as output from the execution of the first database query; and integrate the data stored in the intermediate results table with the change set as the input to the second database query.
18. The system of claim 16, wherein the instructions that cause the one or more computer processors to execute the sequence of database queries, when executed cause a processor system to: execute the second database query on the incremental data set; receive a change set as output from the execution of the second database query; receive as input the intermediate results table and the change set; and output the data used for performing the transform operation of the ETL specification.
19. The system of claim 15, wherein the instructions that cause the one or more computer processors to execute the sequence of database queries, when executed cause a processor system to: recursively perform the sequence of database queries on the stream data to obtain a result for the transform operation.
20. The system of claim 15, wherein the database query specification includes a structured query language (SQL).
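Illustrative sketch (not part of the claims). The decomposition recited in claim 1 and the incremental execution recited in claims 2 and 3 can be pictured with a small Python example using the standard sqlite3 module. Everything named here is an assumption made for illustration: the operation tag SUM_GROUP_BY standing in for the "predefined set of operations", the tables batch and intermediate, the columns user_id and amount, and the helper functions build_dataflow and process_batch. The claims do not specify any of these, and this is not presented as the patented implementation.

```python
import sqlite3

# Hypothetical stand-in for the claims' "predefined set of operations".
DECOMPOSABLE_OPS = {"SUM_GROUP_BY"}


def build_dataflow(query_spec):
    """Traverse the query specification; return ordered (role, sql) steps."""
    steps = []
    for query in query_spec:
        if query["op"] in DECOMPOSABLE_OPS:
            # First query: per-key partial sums over the incoming batch only.
            steps.append(("first",
                          "SELECT user_id, SUM(amount) FROM batch GROUP BY user_id"))
            # Second query: reads the intermediate results table for the output.
            steps.append(("second",
                          "SELECT user_id, total FROM intermediate ORDER BY user_id"))
        else:
            # Not in the predefined set: keep the query as a single step.
            steps.append(("direct", query["sql"]))
    return steps


def process_batch(conn, dataflow, rows):
    """Run the dataflow on one incremental batch and return the output rows."""
    conn.execute("DELETE FROM batch")
    conn.executemany("INSERT INTO batch (user_id, amount) VALUES (?, ?)", rows)
    output = []
    for role, sql in dataflow:
        result = conn.execute(sql).fetchall()
        if role == "first":
            # The first query's result is the change set; integrate it with
            # the data already stored in the intermediate results table.
            for user_id, delta in result:
                conn.execute(
                    "INSERT OR IGNORE INTO intermediate (user_id, total) VALUES (?, 0)",
                    (user_id,))
                conn.execute(
                    "UPDATE intermediate SET total = total + ? WHERE user_id = ?",
                    (delta, user_id))
        else:
            output = result
    return output


def make_connection():
    """Create an in-memory database with the two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE batch (user_id TEXT, amount INTEGER)")
    conn.execute("CREATE TABLE intermediate (user_id TEXT PRIMARY KEY, total INTEGER)")
    return conn


conn = make_connection()
dataflow = build_dataflow([
    {"op": "SUM_GROUP_BY",
     "sql": "SELECT user_id, SUM(amount) FROM batch GROUP BY user_id"},
])
first_output = process_batch(conn, dataflow, [("a", 5), ("b", 2), ("a", 1)])
second_output = process_batch(conn, dataflow, [("a", 4), ("c", 7)])
print(first_output)   # [('a', 6), ('b', 2)]
print(second_output)  # [('a', 10), ('b', 2), ('c', 7)]
```

In this sketch the second batch touches only the rows it contains, yet the output reflects both batches because the running totals live in the intermediate table. A real system would presumably derive the two queries by analyzing the SQL itself rather than from a hand-written tag.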
August 12, 2025