US-11526505

Enabling cross-platform query optimization via expressive markup language

Published December 13, 2022

Assignee not available in USPTO data we have

Inventors not available in USPTO data we have

Technical Abstract

A database system receives a request from a user. The request invokes a data set function (DSF) and uses a property to be provided by the DSF. The database system determines that a function descriptor is available for the DSF. The function descriptor is expressed as markup language instructions. The function descriptor defines the property of the DSF. The database system uses the function descriptor to define a property for the DSF.
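The flow in the abstract can be sketched roughly as follows. This is a minimal illustration, assuming the function descriptor is an XML document. The element and attribute names (`function`, `property`, `name`, `value`), the `DESCRIPTORS` registry, and the `resolve_property` helper are all hypothetical and are not taken from the patent.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry mapping a DSF name to its markup-language descriptor.
DESCRIPTORS = {
    "read_remote": """
        <function name="read_remote">
            <property name="row_count_estimate" value="1000"/>
            <property name="deterministic" value="true"/>
        </function>
    """,
}


def resolve_property(dsf_name, property_name):
    """Return the property value defined by the DSF's descriptor, or None."""
    descriptor = DESCRIPTORS.get(dsf_name)
    if descriptor is None:
        return None  # no function descriptor is available for this DSF
    root = ET.fromstring(descriptor.strip())
    for prop in root.findall("property"):
        if prop.get("name") == property_name:
            return prop.get("value")
    return None  # descriptor exists but does not define this property
```

In this sketch, a request that invokes `read_remote` and needs `deterministic` would get the value `"true"` from the descriptor, while a DSF with no registered descriptor yields `None`.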

Patent Claims

13 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 4

Original Legal Text

4. The method of claim 1, wherein the markup language is an instruction-based language.

Plain English Translation

As summarized in the abstract, the method of claim 1 has a database system use a function descriptor, expressed as markup language instructions, to define a property of a data set function (DSF) that a user's request invokes. Claim 4 narrows that method by requiring the markup language to be an instruction-based language. In other words, the descriptor is written as instructions that the database system processes to work out the DSF's property, rather than as purely descriptive tags. The claim does not specify which instructions the language provides.

Claim 5

Original Legal Text

5. The method of claim 1, further comprising using the function descriptor to define an output schema for the DSF.

Plain English Translation

Building on the method of claim 1, the database system also uses the function descriptor to define an output schema for the data set function (DSF). The output schema describes the structure of the data the DSF returns, such as field names and data types. Because the schema is stated in the descriptor, the database system can know the shape of the DSF's result when it handles the request, without that information being built into the database system itself.
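As a minimal sketch of reading an output schema from a descriptor for a data set function (DSF), the example below assumes an XML descriptor with an `output` element containing `column` entries. That layout and the `output_schema` helper are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptor: the <output>/<column> layout is an assumption.
DESCRIPTOR = """
<function name="read_remote">
    <output>
        <column name="id" type="INTEGER"/>
        <column name="city" type="VARCHAR"/>
    </output>
</function>
"""


def output_schema(descriptor_xml):
    """Return the DSF's output schema as a list of (name, type) pairs."""
    root = ET.fromstring(descriptor_xml.strip())
    return [
        (col.get("name"), col.get("type"))
        for col in root.findall("./output/column")
    ]
```

An input schema could be read the same way from a corresponding `input` element, again under the assumption that the descriptor is laid out like this.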

Claim 6

Original Legal Text

6. The method of claim 1, further comprising using the function descriptor to define an input schema for the DSF.

Plain English Translation

Building on the method of claim 1, the database system also uses the function descriptor to define an input schema for the data set function (DSF). The input schema describes the structure of the data the DSF expects to receive, such as its format, data types, and constraints. With the input schema stated in the descriptor, the database system can check that the data it passes to the DSF matches what the DSF expects, and the DSF's interface is documented explicitly rather than left implicit.

Claim 7

Original Legal Text

7. The method of claim 1, further comprising using the function descriptor to determine to push a predicate in the request from the DSF's output to the input of the DSF.

Plain English Translation

Building on the method of claim 1, the database system uses the function descriptor to decide whether to push a predicate (a filter condition) in the request from the output of the data set function (DSF) to its input. When the predicate is pushed, the filter is applied to the data before it enters the DSF instead of to the DSF's results, which reduces the amount of data the DSF has to process. The function descriptor supplies the information about the DSF's behavior that the system needs to judge whether moving the predicate is valid, so the push is made only when it does not change the results of the request. The function descriptor's metadata guides this decision.
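One way such a decision could be made for a data set function (DSF) is sketched below. The sketch assumes the descriptor lists "pass-through" columns, meaning columns whose values the DSF copies unchanged from input to output, and treats a predicate as pushable only if every column it references is pass-through. The `passthrough` element and the `can_push_predicate` helper are hypothetical; the patent text shown here does not describe the actual rule.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptor: columns the DSF passes through unchanged.
DESCRIPTOR = """
<function name="enrich">
    <passthrough column="region"/>
    <passthrough column="order_date"/>
</function>
"""


def can_push_predicate(descriptor_xml, predicate_columns):
    """Return True if a predicate over these columns may move to the DSF input.

    Assumed rule: filtering on a column before the DSF gives the same rows
    as filtering after it only when the DSF does not alter that column.
    """
    root = ET.fromstring(descriptor_xml.strip())
    passthrough = {p.get("column") for p in root.findall("passthrough")}
    return set(predicate_columns) <= passthrough
```

Under these assumptions, a filter such as `region = 'EU'` could be applied before the DSF runs, while a filter on a column the DSF computes (say, a hypothetical `score`) would have to stay on the output.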

Claim 8

Original Legal Text

8. The method of claim 1, further comprising using the function descriptor to determine to push a projection in the request from the input of the DSF to the output of the DSF.

Plain English Translation

Building on the method of claim 1, the database system uses the function descriptor to decide whether to push a projection (a selection of a subset of fields) in the request from the input of the data set function (DSF) to the output of the DSF. The descriptor supplies the information about the DSF that the system needs to make that decision. Placing projections well lets the system avoid carrying fields that the request does not need, which reduces memory use and processing effort.

Claim 9

Original Legal Text

9. The method of claim 1, further comprising using the function descriptor to estimate a cardinality of the property.

Plain English translation pending...

Claim 10

Original Legal Text

10. The method of claim 1, further comprising using the function descriptor to determine if the DSF inherits or obeys specific ordering or partitioning schemes.

Plain English Translation

Building on the method of claim 1, the database system uses the function descriptor to determine whether the data set function (DSF) inherits or obeys specific ordering or partitioning schemes. Ordering concerns how rows are sorted; partitioning concerns how rows are divided into groups. The descriptor may, for example, carry attributes that indicate whether the DSF keeps an ordering or partitioning that already exists (inherits) or conforms to a scheme required of it (obeys). Knowing this lets the database system rely on those schemes when it handles the request instead of assuming nothing about how the DSF organizes its data.

Claim 14

Original Legal Text

14. The computer program of claim 11, wherein the method further comprises using the function descriptor to define an output schema for the DSF.

Plain English Translation

This claim covers the computer program of claim 11 and mirrors method claim 5. The method carried out by the program further uses the function descriptor to define an output schema for the data set function (DSF). The output schema describes the structure of the data the DSF returns, such as field names and data types, so the database system can know the shape of the DSF's result when it handles the request.

Claim 15

Original Legal Text

15. The computer program of claim 11, wherein the method further comprises using the function descriptor to define an input schema for the DSF.

Plain English Translation

This claim covers the computer program of claim 11 and mirrors method claim 6. The method carried out by the program further uses the function descriptor to define an input schema for the data set function (DSF). The input schema specifies the structure, data types, and constraints of the data the DSF expects, so the database system can check that the data it passes to the DSF is in the expected format.

Claim 16

Original Legal Text

16. The computer program of claim 11, wherein the method further comprises using the function descriptor to determine to push a predicate in the request from the DSF's output to the input of the DSF.

Plain English Translation

This claim covers the computer program of claim 11 and mirrors method claim 7. The method carried out by the program further uses the function descriptor to decide whether to push a predicate (a filter condition) in the request from the output of the data set function (DSF) to its input. Applying the predicate before the data enters the DSF, rather than to the DSF's results, reduces the amount of data the DSF has to process. The descriptor supplies the information about the DSF's behavior on which that decision is based.

Claim 17

Original Legal Text

17. The computer program of claim 11, wherein the method further comprises using the function descriptor to determine to push a projection in the request from the input of the DSF to the output of the DSF.

Plain English Translation

This claim covers the computer program of claim 11 and mirrors method claim 8. The method carried out by the program further uses the function descriptor to decide whether to push a projection (a selection of specific fields) in the request from the input of the data set function (DSF) to the output of the DSF. The descriptor supplies the information about the DSF, such as its input and output schemas, that the system needs to make that decision. Placing projections well avoids carrying fields that the request does not need.

Claim 18

Original Legal Text

18. The computer program of claim 11, wherein the method further comprises using the function descriptor to estimate a cardinality of the property.

Plain English Translation

This claim covers the computer program of claim 11 and mirrors method claim 9. The method carried out by the program further uses the function descriptor to estimate a cardinality of the property that the request uses. Cardinality is a measure of size, such as a count of rows or of distinct values. An estimate taken from the descriptor helps the database system predict how much data is involved and plan the request accordingly, without first running the data set function (DSF) or scanning its data. The claim does not say how the estimate is computed.
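A cardinality estimate driven by a descriptor for a data set function (DSF) might look like the sketch below. It assumes, purely for illustration, that the descriptor carries a `cardinality` element with a scaling `factor` and an optional `min_rows` floor, and that the estimate is the input row count multiplied by the factor. None of these names or the formula come from the patent.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptor: the DSF is assumed to return about 10% of its input.
DESCRIPTOR = (
    '<function name="sample_rows">'
    '<cardinality factor="0.1" min_rows="1"/>'
    "</function>"
)


def estimate_output_rows(descriptor_xml, input_rows):
    """Estimate the DSF's output row count from its descriptor."""
    root = ET.fromstring(descriptor_xml)
    card = root.find("cardinality")
    if card is None:
        # No hint in the descriptor: assume output size equals input size.
        return input_rows
    factor = float(card.get("factor", "1.0"))
    min_rows = int(card.get("min_rows", "0"))
    return max(min_rows, round(input_rows * factor))
```

The point of the sketch is that the estimate comes from declared metadata alone, so it can be produced at planning time without invoking the DSF.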

Claim 19

Original Legal Text

19. The computer program of claim 11, wherein the method further comprises using the function descriptor to determine if the DSF inherits or obeys specific ordering or partitioning schemes.

Plain English Translation

This claim covers the computer program of claim 11 and mirrors method claim 10. The method carried out by the program further uses the function descriptor to determine whether the data set function (DSF) inherits or obeys specific ordering or partitioning schemes. Ordering concerns how rows are sorted; partitioning concerns how rows are divided into groups. If the descriptor indicates that the DSF maintains a specific ordering or follows a given partitioning, the database system can rely on that when it handles the request instead of assuming nothing about how the DSF organizes its data.

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06F

Patent Metadata

Filing Date

December 5, 2019

Publication Date

December 13, 2022
