Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A method implemented by a slave database server, the method comprising: receiving a first log sent by a master database server, the first log including a data table identifier, a match identifier, a log identifier, and a data change to a data table corresponding to the first log; determining a match tree based at least in part on the data table identifier of the first log; fragmenting the first log based at least in part on the match tree; and while fragmenting: concurrently replaying multiple fragments obtained through a log fragmentation, wherein the first log is replayed according to the log identifier of the first log and the data change, increasing a number of layers of branch nodes included in the match tree if a number of target logs distributed to the fragment exceeds a preset threshold, matching the match identifier of the first log with the match tree, and determining a target sub-node that matches the match identifier of the first log based at least in part on a tree structure of the match tree, the match identifier of the first log being within a range of sub-match values of the target sub-node, and the target sub-node being a sub-node at a bottom layer of the tree structure.
This invention relates to database replication systems, specifically optimizing log processing in slave database servers to improve efficiency and scalability. The problem addressed is the inefficiency in handling large volumes of data changes from a master database server, particularly when logs must be fragmented and replayed in a distributed manner. The method involves a slave database server receiving a log from a master server, where the log contains a data table identifier, a match identifier, a log identifier, and a data change. The slave server constructs or accesses a match tree based on the data table identifier to organize and distribute log fragments. The log is then fragmented according to the match tree structure. During fragmentation, multiple fragments are concurrently replayed based on their log identifiers and associated data changes. The match tree dynamically adjusts by increasing branch node layers if the number of logs assigned to a fragment exceeds a preset threshold. The match identifier in the log is matched against the tree to locate a target sub-node at the bottom layer, ensuring the identifier falls within the sub-node's range of values. This approach optimizes log distribution and replay, reducing bottlenecks in high-throughput database replication environments.
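The matching step described above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the `Log` fields mirror the four items the claim lists, while `BOTTOM_UPPER_BOUNDS`, `target_sub_node`, and `fragment` are hypothetical names, and the bottom layer of the match tree is reduced to a sorted list of range boundaries.

```python
from bisect import bisect_right
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Log:
    table_id: str   # data table identifier
    match_id: int   # match identifier, routed through the match tree
    log_id: int     # log identifier, used to order replay
    change: str     # the data change, kept opaque in this sketch


# Hypothetical bottom layer of a match tree: exclusive upper bounds of each
# sub-node's range of sub-match values. Together they cover [0, 400).
BOTTOM_UPPER_BOUNDS = [100, 200, 300, 400]


def target_sub_node(match_id: int) -> int:
    """Return the index of the bottom-layer sub-node whose range holds match_id."""
    if not 0 <= match_id < BOTTOM_UPPER_BOUNDS[-1]:
        raise ValueError("match identifier is outside the root node's range")
    return bisect_right(BOTTOM_UPPER_BOUNDS, match_id)


def fragment(logs):
    """Group logs into fragments, one fragment per bottom-layer sub-node."""
    fragments = defaultdict(list)
    for log in logs:
        fragments[target_sub_node(log.match_id)].append(log)
    return dict(fragments)


logs = [
    Log("orders", 42, 1, "insert"),
    Log("orders", 250, 2, "update"),
    Log("orders", 57, 3, "delete"),
]
fragments = fragment(logs)
```

Here logs 1 and 3 fall in the range [0, 100) and share a fragment, while log 2 falls in [200, 300) and gets its own. A binary search over sorted boundaries is one convenient way to do the range lookup; the claim itself only requires that the match identifier lie within the target sub-node's range.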
2. The method of claim 1, wherein: the tree structure of the match tree includes a root node and one or more layers of sub-nodes, a total range of sub-match values for matching of sub-nodes at a same layer is equal to a range of match values of the root node, and respective ranges of sub-match values of sub-nodes at the bottom layer of the tree structure of the match tree are set such that a difference between data of respective target logs distributed to the sub-nodes at the bottom layer by matching in a log fragmentation via the match tree is less than a preset difference.
This claim narrows the method of claim 1 by describing how the match tree is laid out. The tree has a root node and one or more layers of sub-nodes, and the ranges of sub-match values of the sub-nodes at any one layer together equal the root node's range of match values. The ranges at the bottom layer are set so that the amounts of target-log data distributed to the bottom-layer sub-nodes differ by less than a preset difference. The effect is that fragments stay roughly balanced, so no single fragment receives a disproportionate share of the logs to be replayed.
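One way to picture the balancing requirement is to choose the bottom-layer boundaries from a sample of match identifiers instead of cutting the range into equal widths. The sketch below is an assumption-laden illustration: `balanced_upper_bounds` and `counts_per_range` are hypothetical helpers, the quantile-based choice is a stand-in technique that the claim does not prescribe, and the sample identifiers are assumed to be distinct.

```python
from bisect import bisect_right


def balanced_upper_bounds(sample_ids, parts):
    """Pick exclusive upper bounds so each range holds a near-equal share of sample_ids."""
    ordered = sorted(sample_ids)
    n = len(ordered)
    bounds = [ordered[(i * n) // parts] for i in range(1, parts)]
    bounds.append(ordered[-1] + 1)  # last range ends just past the largest identifier
    return bounds


def counts_per_range(ids, bounds):
    """Count how many identifiers fall into each range defined by bounds."""
    counts = [0] * len(bounds)
    for value in ids:
        counts[bisect_right(bounds, value)] += 1
    return counts


# Skewed sample: most identifiers are small, a few are very large.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 900, 950, 990, 1000]

equal_width = [250, 500, 750, 1001]
balanced = balanced_upper_bounds(sample, 4)

skewed_counts = counts_per_range(sample, equal_width)   # [8, 0, 0, 4]
balanced_counts = counts_per_range(sample, balanced)    # [3, 3, 3, 3]

PRESET_DIFFERENCE = 2  # hypothetical limit on the spread between fragments
assert max(balanced_counts) - min(balanced_counts) < PRESET_DIFFERENCE
```

With equal-width ranges the first fragment receives eight of the twelve logs, whereas the sample-driven boundaries give each fragment three. The claim measures the difference in terms of the data of the distributed target logs; a simple log count is used here as a proxy.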
3. The method of claim 1, further comprising: allocating the first log to a fragment corresponding to the target sub-node.
This claim adds one step to the method of claim 1. Once the target sub-node has been determined, the first log is allocated to the fragment that corresponds to that sub-node. The target sub-node is the bottom-layer sub-node of the match tree whose range of sub-match values contains the log's match identifier, so the allocation follows directly from the matching step. The fragments produced this way are the units that the slave database server replays concurrently.
4. The method of claim 3, wherein the match tree comprises a partial tree structure of a balanced tree that describes the data table, a root node of the match tree is a root node of the balanced tree, and at least one layer of sub-nodes of the match tree is at least one layer of branch nodes of the balanced tree.
This claim specifies where the match tree used in claim 3 comes from. The match tree is a partial tree structure of the balanced tree that describes the data table: its root node is the root node of that balanced tree, and at least one of its layers of sub-nodes is a layer of the balanced tree's branch nodes. In other words, the match tree is formed from the upper layers of a tree that already describes the table, and the branch nodes at the bottom of that partial tree supply the value ranges used to fragment logs.
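A small sketch may help show what "a partial tree structure of a balanced tree" means for fragmentation. The `Node` class and `match_tree_bottom_ranges` function are hypothetical, and the toy tree below stands in for whatever balanced tree describes the data table; in a real balanced tree the lowest nodes shown here would have further children, which are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    low: int    # inclusive lower bound of the node's key range
    high: int   # exclusive upper bound of the node's key range
    children: list = field(default_factory=list)  # empty when nothing lies below


def match_tree_bottom_ranges(root, layers):
    """Return the key ranges found `layers` levels below the root.

    layers=1 keeps the root plus its direct children, layers=2 adds the
    grandchildren, and so on. A node with no children is kept as-is, since
    the tree cannot be cut any deeper at that point.
    """
    level = [root]
    for _ in range(layers):
        next_level = []
        for node in level:
            next_level.extend(node.children or [node])
        level = next_level
    return [(n.low, n.high) for n in level]


# Toy balanced tree over keys [0, 400): a root and two layers below it.
root = Node(0, 400, [
    Node(0, 200, [Node(0, 100), Node(100, 200)]),
    Node(200, 400, [Node(200, 300), Node(300, 400)]),
])

one_layer = match_tree_bottom_ranges(root, 1)   # two ranges, two fragments
two_layers = match_tree_bottom_ranges(root, 2)  # four ranges, four fragments
```

Taking one layer yields the ranges [0, 200) and [200, 400); taking two yields four ranges of width 100. In both cases the root of the match tree is the root of the balanced tree, as the claim requires.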
5. The method of claim 4, wherein increasing the number of layers of the branch nodes included in the match tree includes increasing the number of layers of the branch nodes of the balanced tree included in the match tree.
This claim clarifies the tree-growing step of claim 1 for the match tree of claim 4. Because the match tree is a partial structure of the balanced tree that describes the data table, increasing the number of layers of branch nodes in the match tree means including more layers of the balanced tree's branch nodes in it. When the number of target logs distributed to a fragment exceeds the preset threshold, the match tree is extended by a further layer of the balanced tree's branch nodes, which typically yields more bottom-layer sub-nodes with narrower ranges and therefore more fragments.
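The threshold-triggered growth can be sketched as follows. Everything here is illustrative: `BALANCED_TREE_LAYERS` is a hypothetical stand-in for the branch layers of the balanced tree, `MatchTree` is not a name from the patent, and resetting the counters after growth is a simplification, since the claim does not say what happens to logs that were already distributed.

```python
from bisect import bisect_right

# Hypothetical branch layers of a balanced tree over keys [0, 400), listed
# top to bottom as exclusive upper bounds of each node's range.
BALANCED_TREE_LAYERS = [
    [200, 400],
    [100, 200, 300, 400],
    [50, 100, 150, 200, 250, 300, 350, 400],
]


class MatchTree:
    def __init__(self, threshold):
        self.threshold = threshold  # preset threshold on logs per fragment
        self.depth = 1              # number of branch layers currently included
        self.counts = {}            # logs routed to each fragment at this depth

    @property
    def bounds(self):
        """Range boundaries of the current bottom layer."""
        return BALANCED_TREE_LAYERS[self.depth - 1]

    def route(self, match_id):
        """Return the fragment index for match_id, growing the tree if needed."""
        idx = bisect_right(self.bounds, match_id)
        self.counts[idx] = self.counts.get(idx, 0) + 1
        too_full = self.counts[idx] > self.threshold
        if too_full and self.depth < len(BALANCED_TREE_LAYERS):
            self.depth += 1    # include one more layer of branch nodes
            self.counts = {}   # simplification: counting restarts for the finer fragments
        return idx


tree = MatchTree(threshold=2)
first_three = [tree.route(m) for m in (10, 20, 30)]  # third log exceeds the threshold
depth_after = tree.depth
next_fragment = tree.route(150)
```

The first three logs all land in fragment 0 of the two-way split. The third one pushes that fragment past the threshold, so the tree takes on the next layer, and the following log with match identifier 150 is routed against the four-way split instead.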
6. The method of claim 1, wherein the match identifier of the first log identifies a first data record, and the first data record includes a data record in a data table corresponding to the first log.
This claim specifies what the match identifier refers to. The match identifier of the first log identifies a first data record, and that record is a data record in the data table corresponding to the first log. The value used to route a log through the match tree is therefore tied to a specific record in the table to which the log's data change applies.
7. The method of claim 6, wherein the match identifier of the first log includes a primary key value of the first data record.
This claim narrows claim 6: the match identifier of the first log includes a primary key value of the first data record. Matching the log against the match tree then amounts to finding the bottom-layer sub-node whose range of sub-match values contains that primary key value.
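A brief sketch of using the primary key value as the match identifier is given below. The column name `id`, the dictionary-shaped log, and the helpers `make_log` and `fragment_of` are all assumptions made for illustration. One consequence that follows from routing by range is that every log for the same record lands in the same fragment.

```python
from bisect import bisect_right

# Hypothetical bottom-layer ranges over primary key values: [0, 1000),
# [1000, 2000) and [2000, 3000), given as exclusive upper bounds.
BOTTOM_UPPER_BOUNDS = [1000, 2000, 3000]


def make_log(log_id, table, row, change):
    """Build a log whose match identifier is the row's primary key value.

    The primary key column is assumed to be called "id".
    """
    return {"log_id": log_id, "table": table, "match_id": row["id"], "change": change}


def fragment_of(log):
    """Return the index of the fragment whose range contains the match identifier."""
    return bisect_right(BOTTOM_UPPER_BOUNDS, log["match_id"])


logs = [
    make_log(1, "users", {"id": 1500, "name": "a"}, "insert"),
    make_log(2, "users", {"id": 42, "name": "b"}, "insert"),
    make_log(3, "users", {"id": 1500, "name": "c"}, "update"),
]

# Both changes to the record with primary key 1500 go to the same fragment.
same_record = [fragment_of(log) for log in logs if log["match_id"] == 1500]
```

Because the insert and the later update of record 1500 share a fragment, replaying that fragment in log-identifier order applies them in their original sequence even while other fragments are replayed in parallel.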
8. One or more computer-readable media storing executable instructions that, when executed by one or more processors, cause the one or more processors to perform acts comprising: receiving a first log sent by a master database server, the first log including a data table identifier, a match identifier, a log identifier, and a data change to a data table corresponding to the first log; determining a match tree based at least in part on the data table identifier of the first log; fragmenting the first log based at least in part on the match tree; and while fragmenting: concurrently replaying multiple fragments obtained through a log fragmentation, wherein the first log is replayed according to the log identifier of the first log and the data change, increasing a number of layers of branch nodes included in the match tree if a number of target logs distributed to the fragment exceeds a preset threshold, matching the match identifier of the first log with the match tree, and determining a target sub-node that matches the match identifier of the first log based at least in part on a tree structure of the match tree, the match identifier of the first log is within a range of sub-match values of the target sub-node, and the target sub-node being a sub-node at a bottom layer of the tree structure.
This invention relates to database systems, specifically to optimizing log processing and replication in distributed database environments. The problem addressed is the inefficient handling of large transaction logs, which can slow down data replication and synchronization between master and replica databases. The solution involves a method for dynamically fragmenting and replaying logs to improve performance. The system receives a log from a master database server, which includes a data table identifier, a match identifier, a log identifier, and a data change to a corresponding data table. The log is processed by first determining a match tree based on the data table identifier. The log is then fragmented according to this match tree, which organizes data changes in a hierarchical structure. During fragmentation, multiple fragments are concurrently replayed based on the log identifier and the data change. If the number of target logs distributed to a fragment exceeds a preset threshold, the system increases the number of layers in the match tree to improve distribution efficiency. The match identifier of the log is matched against the match tree to determine a target sub-node, ensuring the log is routed to the correct sub-node at the bottom layer of the tree. This dynamic fragmentation and replay mechanism enhances log processing speed and scalability in distributed database systems.
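The idea of replaying fragments concurrently while fragmentation is still in progress can be sketched with one queue and one worker thread per fragment. This is a simplified model under stated assumptions: logs are `(log_id, key, value)` tuples, replay just writes `value` into an in-memory dictionary, equal-width key ranges stand in for the match tree, and the names `replay_worker` and `fragment_and_replay` are hypothetical.

```python
import queue
import threading


def replay_worker(q, table, lock):
    """Replay the logs of one fragment in the order they were queued."""
    while True:
        item = q.get()
        if item is None:        # sentinel: no more logs for this fragment
            break
        _log_id, key, value = item
        with lock:
            table[key] = value  # apply the data change


def fragment_and_replay(logs, num_fragments, key_space):
    """Fragment logs by key range while worker threads replay them."""
    table, lock = {}, threading.Lock()
    queues = [queue.Queue() for _ in range(num_fragments)]
    workers = [
        threading.Thread(target=replay_worker, args=(q, table, lock))
        for q in queues
    ]
    for w in workers:
        w.start()               # replay begins before fragmentation finishes

    width = key_space // num_fragments
    for log_id, key, value in sorted(logs):   # hand out logs in log-identifier order
        idx = min(key // width, num_fragments - 1)
        queues[idx].put((log_id, key, value))

    for q in queues:
        q.put(None)
    for w in workers:
        w.join()
    return table


logs = [(1, 10, "a"), (2, 250, "b"), (3, 10, "c"), (4, 390, "d")]
table = fragment_and_replay(logs, num_fragments=4, key_space=400)
```

Each queue is first-in, first-out, and a given key always maps to the same fragment, so the two changes to key 10 are applied in log-identifier order and the later value wins. Changes in different fragments touch different keys and can be applied in any interleaving.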
9. The one or more computer-readable media of claim 8, wherein: the tree structure of the match tree includes a root node and one or more layers of sub-nodes, a total range of sub-match values for matching of sub-nodes at a same layer is equal to a range of match values of the root node, and respective ranges of sub-match values of sub-nodes at the bottom layer of the tree structure of the match tree are set such that a difference between data of respective target logs distributed to the sub-nodes at the bottom layer by matching in a log fragmentation via the match tree is less than a preset difference.
This claim applies the tree layout of claim 2 to the computer-readable media of claim 8. The match tree has a root node and one or more layers of sub-nodes, and the ranges of sub-match values of the sub-nodes at any one layer together equal the root node's range of match values. The ranges at the bottom layer are set so that the amounts of target-log data distributed to the bottom-layer sub-nodes during log fragmentation differ by less than a preset difference. This keeps the fragments roughly balanced and reduces the risk that one fragment becomes a replay bottleneck.
10. The one or more computer-readable media of claim 8, wherein the acts further comprise: allocating the first log to a fragment corresponding to the target sub-node.
This claim adds one act to those of claim 8. After the target sub-node has been determined by matching the first log's match identifier against the match tree, the first log is allocated to the fragment corresponding to that sub-node. The target sub-node is a bottom-layer sub-node whose range of sub-match values contains the match identifier, and the resulting fragments are the units that are replayed concurrently.
11. The one or more computer-readable media of claim 10, wherein the match tree comprises a partial tree structure of a balanced tree that describes the data table, a root node of the match tree is a root node of the balanced tree, and at least one layer of sub-nodes of the match tree is at least one layer of branch nodes of the balanced tree.
This claim specifies the origin of the match tree for the computer-readable media of claim 10. The match tree is a partial tree structure of the balanced tree that describes the data table. Its root node is the root node of that balanced tree, and at least one of its layers of sub-nodes is a layer of the balanced tree's branch nodes. The value ranges used to fragment logs are thus taken from the upper layers of a tree that already describes the table.
12. The one or more computer-readable media of claim 11, wherein increasing the number of layers of the branch nodes included in the match tree includes increasing the number of layers of the branch nodes of the balanced tree included in the match tree.
This claim clarifies the tree-growing act of claim 8 for the match tree of claim 11. Since the match tree is a partial structure of the balanced tree that describes the data table, increasing its number of layers of branch nodes means including more layers of the balanced tree's branch nodes in it. This happens when the number of target logs distributed to a fragment exceeds the preset threshold, and it typically produces more bottom-layer sub-nodes with narrower ranges.
13. The one or more computer-readable media of claim 8, wherein the match identifier of the first log identifies a first data record, and the first data record includes a data record in a data table corresponding to the first log.
This claim specifies what the match identifier refers to for the computer-readable media of claim 8. The match identifier of the first log identifies a first data record, and that record is a data record in the data table corresponding to the first log. The value matched against the match tree is therefore tied to a specific record in the table to which the log's data change applies.
14. The one or more computer-readable media of claim 13, wherein the match identifier of the first log includes a primary key value of the first data record.
This claim narrows claim 13: the match identifier of the first log includes a primary key value of the first data record. Matching the log against the match tree then amounts to finding the bottom-layer sub-node whose range of sub-match values contains that primary key value.
15. An apparatus comprising: one or more processors; memory; a receiving unit stored in the memory and executable by the one or more processors to receive a first log sent by a master database server, the first log including a data table identifier, a match identifier, a log identifier, and a data change to a data table corresponding to the first log; a determination unit stored in the memory and executable by the one or more processors to determine a match tree based at least in part on the data table identifier of the first log; a fragmentation unit stored in the memory and executable by the one or more processors to fragment the first log based at least in part on the match tree, match the match identifier of the first log with the match tree, and determine a target sub-node that matches the match identifier of the first log based at least in part on a tree structure of the match tree, the match identifier of the first log being within a range of sub-match values of the target sub-node, and the target sub-node being a sub-node at a bottom layer of the tree structure, and a replay unit stored in the memory and executable by the one or more processors to concurrently replaying multiple fragments obtained through a log fragmentation while the fragmentation unit fragments the first log, wherein the first log is replayed according to the log identifier of the first log and the data change, wherein, while the fragmentation unit fragments the first log, a number of layers of branch nodes included in the match tree is increased if a number of target logs distributed to the fragment exceeds a preset threshold.
This apparatus is designed for efficient log processing in distributed database systems, addressing challenges in managing and replaying large volumes of transaction logs. The system includes a receiving unit that captures logs from a master database server, where each log contains a data table identifier, a match identifier, a log identifier, and a data change to a specific data table. A determination unit then constructs or accesses a match tree structure based on the data table identifier. A fragmentation unit processes the log by fragmenting it according to the match tree, matching the log's match identifier to a target sub-node within the tree, and identifying the appropriate sub-node for handling the log. The target sub-node is selected from the bottom layer of the tree, ensuring the match identifier falls within the sub-node's range of values. Concurrently, a replay unit replays multiple log fragments in parallel, using the log identifier and data change to reconstruct the original transaction. If the number of logs assigned to a fragment exceeds a preset threshold, the match tree dynamically expands by adding more layers of branch nodes to improve distribution efficiency. This approach optimizes log processing by reducing bottlenecks and enhancing parallelism in distributed database environments.
16. The apparatus of claim 15, further comprising a match tree setting unit configured to obtain the match tree according to a target table set, wherein: the target table set includes at least one data table in a database, a tree structure of the match tree includes a root node and at least one layer of sub-nodes of the root node, a range of match values for matching the root node is determined according to ranges of match identifiers included in multiple target logs, the multiple target logs being data logs corresponding to the at least one data table, and the first log being a target log of the multiple target logs, a total range of sub-match values for matching of sub-nodes at each layer is equal to the range of match values of the root node, respective ranges of sub-match values of sub-nodes at a bottom layer of the tree structure of the match tree are set such that a difference between data of respective target logs distributed to the sub-nodes at the bottom layer by matching during a log fragmentation via the match tree is less than a preset difference, a sub-node at the bottom layer of the tree structure of the match tree corresponding to a fragment.
This invention relates to database systems and specifically to optimizing log fragmentation for efficient data processing. The problem addressed is the need to distribute data logs across multiple nodes in a balanced manner to minimize processing imbalances and improve efficiency. The apparatus includes a match tree setting unit that constructs a match tree based on a target table set, which consists of one or more data tables in a database. The match tree has a hierarchical structure with a root node and multiple layers of sub-nodes. The root node's match value range is determined by analyzing the ranges of match identifiers in multiple target logs associated with the target tables. The total range of sub-match values for all sub-nodes at each layer equals the root node's range. The bottom layer of the tree is configured such that when logs are fragmented using the match tree, the data distribution across sub-nodes is balanced, ensuring the difference in data volume between any two sub-nodes is below a preset threshold. Each sub-node at the bottom layer corresponds to a specific data fragment, ensuring efficient and balanced log fragmentation. This approach optimizes data processing by reducing skew and improving parallel processing efficiency.
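Two of the constraints above lend themselves to a short sketch: the root node's range is derived from the match identifiers seen in the target logs, and the sub-node ranges at each layer together equal that root range. The helpers `root_range` and `split_layer` are hypothetical, and the equal-width split is only a placeholder; it illustrates coverage, not the balancing of the bottom layer.

```python
def root_range(match_ids):
    """Derive the root node's range [low, high) from observed match identifiers."""
    return min(match_ids), max(match_ids) + 1


def split_layer(low, high, parts):
    """Split [low, high) into `parts` contiguous ranges that cover it exactly."""
    step, extra = divmod(high - low, parts)
    ranges, start = [], low
    for i in range(parts):
        end = start + step + (1 if i < extra else 0)  # spread the remainder
        ranges.append((start, end))
        start = end
    return ranges


def covers(ranges, low, high):
    """Check that ranges are contiguous and together equal [low, high)."""
    contiguous = all(a[1] == b[0] for a, b in zip(ranges, ranges[1:]))
    return contiguous and ranges[0][0] == low and ranges[-1][1] == high


match_ids = [17, 250, 42, 399, 120]   # match identifiers of sample target logs
low, high = root_range(match_ids)     # (17, 400)

layer_1 = split_layer(low, high, 2)
layer_2 = split_layer(low, high, 4)
```

Both layers pass the `covers` check against the root range, which is the property the claim states for every layer. Each range in the bottom layer would then correspond to one fragment.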
17. The apparatus of claim 15, wherein the fragmentation unit further allocates the first log to a fragment corresponding to the target sub-node.
This claim adds one function to the fragmentation unit of the apparatus of claim 15. After determining the target sub-node, which is the bottom-layer sub-node of the match tree whose range of sub-match values contains the first log's match identifier, the fragmentation unit allocates the first log to the fragment corresponding to that sub-node. The replay unit of claim 15 replays the resulting fragments concurrently.
18. The apparatus of claim 15, wherein: the match tree comprises a partial tree structure of a balanced tree that describes the data table, a root node of the match tree is a root node of the balanced tree, and at least one layer of sub-nodes of the match tree is at least one layer of branch nodes of the balanced tree.
This claim specifies the origin of the match tree used by the apparatus of claim 15. The match tree is a partial tree structure of the balanced tree that describes the data table. Its root node is the root node of that balanced tree, and at least one of its layers of sub-nodes is a layer of the balanced tree's branch nodes. The value ranges used to fragment logs are thus taken from the upper layers of a tree that already describes the table.
19. The apparatus of claim 15, wherein the match identifier of the first log identifies a first data record, and the first data record includes a data record in the data table corresponding to the first log.
This claim specifies what the match identifier refers to for the apparatus of claim 15. The match identifier of the first log identifies a first data record, and that record is a data record in the data table corresponding to the first log. The value that the fragmentation unit matches against the match tree is therefore tied to a specific record in the table to which the log's data change applies.
20. The apparatus of claim 19, wherein the match identifier of the first log includes a primary key value of the first data record.
This claim narrows claim 19: the match identifier of the first log includes a primary key value of the first data record. When the fragmentation unit matches the log against the match tree, it therefore looks for the bottom-layer sub-node whose range of sub-match values contains that primary key value.
December 3, 2019