10877684

Changing a Distributed Storage Volume from Non-Replicated to Replicated

Published: December 29, 2020
Assignee: not available in the USPTO data we have
Patent Claims
19 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A method comprising: providing a distributed storage system including a plurality of nodes coupled to one another by a network, each nodes of the plurality of nodes being a computing device, the plurality of nodes further including a plurality of storage nodes, each storage node of the plurality of storage nodes including a storage device; providing a storage volume including a plurality of slices stored on two or more first storage nodes of the plurality of storage nodes; receiving, by a storage manager executing in the distributed storage system, a first instruction to convert the storage volume to a replicated storage volume; in response to the first instruction, generating, by the storage manager, a second instruction to each first storage node of the two or more first storage node to create a replica slice of each slice stored on the each first storage node on a corresponding second storage node of the plurality of storage nodes; performing, by each first storage node in response to the second instruction for each slice of the plurality of slices stored on the each first storage node: (a) fencing a primary open segment allocated to the each slice that is not full and is open for writing of additional data; (b) while performing (a), copying the primary open segment to a replica open segment on the second storage node corresponding to the each first storage node; (c) after performing (b) unfencing the primary open segment; (d) after performing (c) executing write operations to the each slice in both the primary open segment and the replica open segment; and (e) after performing (c) copying a plurality of primary final segments to replica final segments on the second storage node corresponding to the each first storage node.

Plain English Translation

A distributed storage system includes multiple computing devices (nodes) connected by a network; some of them are storage nodes, each with a storage device. A storage volume is divided into slices stored on two or more of these storage nodes (the first storage nodes). A storage manager receives an instruction to convert the volume to a replicated volume and, in response, instructs each first storage node to create a replica of each of its slices on a corresponding second storage node. For each slice, the first storage node (a) fences the primary open segment, meaning the segment that is not full and is still open for writing; (b) while the fence is in place, copies that segment to a replica open segment on the second storage node; (c) unfences the primary open segment once the copy is done; (d) from then on executes writes to the slice in both the primary and the replica open segment; and (e) also after unfencing, copies the primary final segments to replica final segments on the second storage node. The fence covers only the copy of the single open segment, so the final segments are replicated after the slice has resumed accepting writes.
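
The ordering of steps (a) through (e) can be sketched in a few lines of Python. This is a minimal single-process illustration, not the patented implementation: `SliceCopy`, `PrimarySlice`, and `convert_to_replicated` are hypothetical names, segments are modeled as plain lists, and the network copy to the second storage node is replaced by an in-memory copy.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SliceCopy:
    """One node's copy of a slice (hypothetical model, not from the patent)."""
    open_segment: List[bytes] = field(default_factory=list)
    final_segments: List[List[bytes]] = field(default_factory=list)


class PrimarySlice:
    """The slice as seen by the first storage node."""

    def __init__(self, local: SliceCopy) -> None:
        self.local = local
        self.replica: Optional[SliceCopy] = None
        self.fenced = False
        self.events: List[str] = []

    def write(self, payload: bytes) -> bool:
        """Apply a write; mirror it to the replica once one is attached."""
        if self.fenced:
            # Assumption of this sketch: a fenced slice simply does not
            # complete the write. The patent only says acks are suppressed.
            return False
        self.local.open_segment.append(payload)
        if self.replica is not None:
            self.replica.open_segment.append(payload)  # step (d)
        return True

    def convert_to_replicated(self, target: SliceCopy) -> None:
        self.fenced = True                                    # (a) fence
        self.events.append("fence")
        target.open_segment = list(self.local.open_segment)   # (b) copy open segment
        self.events.append("copy_open")
        self.replica = target
        self.fenced = False                                   # (c) unfence
        self.events.append("unfence")
        for segment in self.local.final_segments:             # (e) copy final segments
            target.final_segments.append(list(segment))
        self.events.append("copy_final")
```

After `convert_to_replicated` returns, every call to `write` lands in both open segments, which corresponds to step (d). In a real system step (e) would run in the background rather than inline.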

Claim 2

Original Legal Text

2. The method of claim 1 , further comprising: (f) after performing (e) notifying the storage manager that a replica of the each slice is complete.

Plain English Translation

Building on claim 1: after a first storage node has copied the primary final segments of a slice to the second storage node (step (e)), it performs a further step (f) and notifies the storage manager that the replica of that slice is complete. The claim adds only this notification; it does not add a separate verification step.

Claim 3

Original Legal Text

3. The method of claim 2 , further comprising: in response to (f), incrementing, by the storage manager, a replica count of the each slice in an entry corresponding to the each slice and adding a reference to the second storage node corresponding to the each first storage node to the entry.

Plain English Translation

Building on claim 2: when the storage manager receives the completion notification of step (f), it updates the entry it keeps for that slice. It increments the slice's replica count and adds to the entry a reference to the second storage node that now holds the replica. The manager's record then shows both how many copies of the slice exist and where the new copy is stored.
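
The bookkeeping in claim 3 amounts to a counter and a list of node references per slice. The sketch below is one hedged way to model it; `SliceEntry`, `StorageManager`, and `on_replica_complete` are hypothetical names, and the patent does not specify the entry's actual layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SliceEntry:
    """Per-slice record kept by the storage manager (assumed layout)."""
    replica_count: int = 0
    replica_nodes: List[str] = field(default_factory=list)


class StorageManager:
    def __init__(self) -> None:
        self.entries: Dict[str, SliceEntry] = {}

    def on_replica_complete(self, slice_id: str, second_node: str) -> SliceEntry:
        """Handle the step (f) notification for one slice."""
        entry = self.entries.setdefault(slice_id, SliceEntry())
        entry.replica_count += 1                 # increment the replica count
        entry.replica_nodes.append(second_node)  # add a reference to the second node
        return entry
```

For example, calling `on_replica_complete("slice-7", "node-B")` on a fresh manager leaves the entry for `"slice-7"` with a replica count of 1 and a single reference to `"node-B"`.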

Claim 4

Original Legal Text

4. The method of claim 1 , further comprising: performing (d) and (e) in parallel.

Plain English Translation

Building on claim 1: steps (d) and (e) are performed in parallel. While new writes to a slice are being executed in both the primary and the replica open segment, the first storage node is at the same time copying the slice's primary final segments to the second storage node. The slice therefore does not have to wait for the final segments to finish copying before it handles replicated writes.

Claim 5

Original Legal Text

5. The method of claim 1 , wherein (d) comprises performing, by the each first storage node: receiving a first write operation; executing the first write operation on the primary open segment; transmitting the first write operation to the second storage node corresponding to the each first storage node; receiving an acknowledgment of completion of the first write operation from the second storage node corresponding to the each first storage node; and transmitting an acknowledgment of completion of the first write operation to a source of the first write operation only after receiving the acknowledgment of completion of the first write operation from the second storage node corresponding to the each first storage node.

Plain English Translation

Building on claim 1, this claim spells out how a write is handled in step (d). The first storage node receives a write operation, executes it on the primary open segment, and transmits it to its corresponding second storage node. It then waits for the second storage node to acknowledge that the write is complete. Only after that acknowledgment arrives does the first storage node acknowledge completion to the source of the write. A write that has been acknowledged to its source has therefore been completed on both nodes.
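
The acknowledgment ordering in claim 5 (repeated in claims 11 and 18) can be illustrated with a short function. This is a sketch under assumptions: `replicated_write`, `send_to_replica`, and `ack_source` are hypothetical names, and the network round trip to the second storage node is modeled as a synchronous callback that returns `True` when the replica acknowledges.

```python
from typing import Callable, List


def replicated_write(
    payload: bytes,
    primary_open_segment: List[bytes],
    send_to_replica: Callable[[bytes], bool],
    ack_source: Callable[[bytes], None],
) -> bool:
    """Execute one write in the order the claim describes."""
    # Execute the write on the primary open segment.
    primary_open_segment.append(payload)
    # Transmit the write to the second storage node and wait for its ack.
    replica_acked = send_to_replica(payload)
    # Acknowledge to the source only after the replica has acknowledged.
    if replica_acked:
        ack_source(payload)
    return replica_acked
```

If `send_to_replica` never reports success, `ack_source` is never called, so the source is not told the write completed. What the node does next in that case (retry, fail the write) is outside the claim and outside this sketch.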

Claim 6

Original Legal Text

6. The method of claim 1 , wherein fencing the primary open segment comprises suppressing acknowledgment of write operations to the each slice.

Plain English Translation

Building on claim 1: fencing the primary open segment means suppressing acknowledgment of write operations to the slice. While the fence is in place during the copy of step (b), the source of a write receives no confirmation that the write has completed. The claim does not say whether such writes are held, retried, or rejected, only that they are not acknowledged.
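
One possible reading of fencing by suppressing acknowledgments is sketched below. It assumes, beyond what the claim states, that writes arriving during the fence are held in a queue and are applied and acknowledged once the fence is lifted. `FencedSegment` and its members are hypothetical names.

```python
from typing import List


class FencedSegment:
    """Open segment whose write acks can be suppressed (illustrative only)."""

    def __init__(self) -> None:
        self.data: List[str] = []
        self.acked: List[str] = []    # writes whose completion was acknowledged
        self.pending: List[str] = []  # writes held while the fence is up
        self.fenced = False

    def fence(self) -> None:
        self.fenced = True

    def write(self, payload: str) -> None:
        if self.fenced:
            # Assumption: hold the write and send no acknowledgment.
            self.pending.append(payload)
            return
        self._apply(payload)

    def unfence(self) -> None:
        self.fenced = False
        held, self.pending = self.pending, []
        for payload in held:
            self._apply(payload)

    def _apply(self, payload: str) -> None:
        self.data.append(payload)
        self.acked.append(payload)
```

Holding the writes keeps the segment unchanged while it is being copied, which is why this sketch queues them instead of applying them without an acknowledgment.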

Claim 7

Original Legal Text

7. A method comprising: receiving, by a storage node including a memory device, a storage device, and a processing device coupled to the memory device and the storage device, an instruction to convert a storage unit into a replicated storage unit, the storage unit storing a plurality of segments allocated to the storage unit and including a primary open segment that is not full and open for writing of additional data and a plurality of primary final segments that store data written to the storage unit and not available for writing of additional data; in response to the instruction: (a) fencing, by the storage node, a primary open segment allocated to the storage unit on the storage device; (b) while performing (a), copying the primary open segment to a replica open segment on a second storage node; (c) after performing (b) unfencing the primary open segment; (d) after performing (c) executing write operations to the storage unit in both the primary open segment and the replica open segment; and (e) after performing (c) copying the plurality of primary final segments to replica final segments on the second storage node.

Plain English Translation

This claim restates the conversion of claim 1 from the point of view of a single storage node with a memory device, a storage device, and a processing device. The node receives an instruction to convert a storage unit into a replicated storage unit. The storage unit holds a primary open segment (not full and still open for writing) and a plurality of primary final segments (already written and closed to further writes). In response, the node (a) fences the primary open segment; (b) while the fence is in place, copies it to a replica open segment on a second storage node; (c) unfences the primary open segment after the copy; (d) after unfencing, executes writes to the storage unit in both the primary and the replica open segment; and (e) after unfencing, copies the primary final segments to replica final segments on the second storage node. Steps (d) and (e) both follow (c); the claim does not require one to finish before the other begins.

Claim 8

Original Legal Text

8. The method of claim 7 , further comprising: (f) after performing (e) notifying, by the storage node, a storage manager that a replica of the storage unit is complete.

Plain English Translation

Building on claim 7: after the storage node has copied the primary final segments to the second storage node (step (e)), it performs a further step (f) and notifies a storage manager that a replica of the storage unit is complete.

Claim 9

Original Legal Text

9. The method of claim 8 , further comprising: in response to (f), incrementing, by the storage manager, a replica count of the storage unit in an entry corresponding to the storage unit and adding a reference to the second storage node to the entry.

Plain English Translation

Building on claim 8: in response to the completion notification of step (f), the storage manager updates the entry corresponding to the storage unit. It increments the storage unit's replica count in that entry and adds a reference to the second storage node. The entry then records how many replicas of the storage unit exist and which node holds the new one.

Claim 10

Original Legal Text

10. The method of claim 7 , further comprising: performing (d) and (e) in an interleaved manner.

Plain English Translation

Building on claim 7: steps (d) and (e) are performed in an interleaved manner. The storage node alternates between executing incoming writes in both the primary and the replica open segment and copying primary final segments to the second storage node, rather than finishing one activity before starting the other.
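
Interleaving steps (d) and (e) can be pictured as a loop that alternates between the two kinds of work. The sketch below uses a strict one-for-one alternation purely for illustration; the claim does not prescribe a schedule, and `interleave` and its parameters are hypothetical names.

```python
from collections import deque
from typing import List


def interleave(
    final_segments: List[List[str]],
    incoming_writes: List[str],
    primary_open: List[str],
    replica_open: List[str],
    replica_final: List[List[str]],
) -> List[str]:
    """Alternate mirrored writes (d) with final-segment copies (e)."""
    writes = deque(incoming_writes)
    to_copy = deque(final_segments)
    order: List[str] = []
    while writes or to_copy:
        if writes:
            payload = writes.popleft()
            primary_open.append(payload)   # (d) write to the primary open segment
            replica_open.append(payload)   # (d) and to the replica open segment
            order.append("write")
        if to_copy:
            replica_final.append(list(to_copy.popleft()))  # (e) copy one final segment
            order.append("copy")
    return order
```

The returned list records the order in which the two kinds of work ran, which makes the alternation easy to inspect.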

Claim 11

Original Legal Text

11. The method of claim 7 , wherein (d) comprises performing, by the storage node: receiving a first write operation; executing the first write operation on the primary open segment; transmitting the first write operation to the second storage node; receiving an acknowledgment of completion of the first write operation from the second storage node; and transmitting an acknowledgment of completion of the first write operation to a source of the first write operation only after receiving the acknowledgment of completion of the first write operation from the second storage node.

Plain English Translation

Building on claim 7, this claim spells out how a write is handled in step (d). The storage node receives a write operation, executes it on the primary open segment, and transmits it to the second storage node. It then waits for the second storage node to acknowledge that the write is complete. Only after receiving that acknowledgment does the storage node acknowledge completion to the source of the write, so an acknowledged write has been completed on both nodes.

Claim 12

Original Legal Text

12. The method of claim 7 , wherein fencing the primary open segment comprises suppressing acknowledgment of write operations to the storage unit.

Plain English Translation

Building on claim 7: fencing the primary open segment means suppressing acknowledgment of write operations to the storage unit. While the primary open segment is being copied to the second storage node, the source of a write receives no confirmation that the write has completed. The claim says nothing about read operations and does not specify what happens to the unacknowledged writes.

Claim 13

Original Legal Text

13. The method of claim 7 , further comprising: (f) receiving, by the second storage node, an instruction to convert the storage unit to a non-replicated storage unit; in response to (f), faulting storage node with respect to the storage unit and deleting the replica final segments from the second storage node.

Plain English Translation

A distributed storage system manages data across multiple storage nodes, where data is initially stored in a replicated manner for redundancy. The system includes a first storage node that stores a primary copy of a storage unit and a second storage node that stores a replica of the storage unit. The storage unit is divided into segments, and the second storage node maintains final segments of the replica. To optimize storage efficiency, the system allows converting the storage unit from a replicated state to a non-replicated state. When an instruction is received to convert the storage unit to non-replicated, the second storage node is faulted with respect to the storage unit, and the replica final segments stored on the second storage node are deleted. This ensures that the storage unit is no longer replicated, reducing storage overhead while maintaining data integrity on the primary storage node. The conversion process is triggered by an explicit instruction, allowing dynamic adjustment of replication based on system requirements or operational conditions. The method ensures that the replica data is properly cleaned up to avoid inconsistencies and wasted storage space.

Claim 14

Original Legal Text

14. The method of claim 13 , further comprising: in response to write operations received after (f), performing, by the storage node, the write operations received after (f) without requesting replicating of the write operations received after (f) on the second storage unit.

Plain English Translation

Building on claim 13: once the instruction (f) to convert the storage unit back to a non-replicated storage unit has been received, the storage node performs later write operations on its own, without requesting that they be replicated on the second storage node (the claim text says "second storage unit"). Writes received after the conversion are no longer mirrored.
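
Claims 13 and 14 together describe reverting a replicated storage unit to a non-replicated one. The sketch below is one hedged interpretation: "faulting" is modeled as marking the second storage node as faulted for that unit, which is an assumption, and `StorageUnit`, `ReplicaNode`, `convert_to_non_replicated`, and `write` are hypothetical names.

```python
from typing import List, Set


class StorageUnit:
    """Primary copy of the storage unit on the first storage node."""

    def __init__(self) -> None:
        self.open_segment: List[str] = []
        self.replicated = True


class ReplicaNode:
    """State held by the second storage node (illustrative only)."""

    def __init__(self) -> None:
        self.open_segment: List[str] = []
        self.final_segments: List[List[str]] = []
        self.faulted_units: Set[str] = set()


def convert_to_non_replicated(unit_id: str, unit: StorageUnit, replica: ReplicaNode) -> None:
    replica.faulted_units.add(unit_id)  # assumed meaning of "faulting" (claim 13)
    replica.final_segments.clear()      # delete the replica final segments (claim 13)
    unit.replicated = False


def write(unit: StorageUnit, replica: ReplicaNode, payload: str) -> None:
    unit.open_segment.append(payload)
    if unit.replicated:
        replica.open_segment.append(payload)  # mirrored only while replicated (claim 14)
```

Once `convert_to_non_replicated` has run, `write` touches only the primary open segment.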

Claim 15

Original Legal Text

15. A system comprising: a memory device, a storage device, and a processing device coupled to the memory device and the storage device, the memory device storing instructions that, when executed by the processing device, cause the processing device to: receive an instruction to convert a storage unit stored on the storage device into a replicated storage unit, the storage unit storing a plurality of segments allocated to the storage unit and including a primary open segment that is not full and open for writing of additional data and a plurality of primary final segments that store data written to the storage unit and not available for writing of additional data; in response to the instruction: (a) fence a primary open segment allocated to the storage unit on the storage device; (b) while performing (a), copy the primary open segment to a replica open segment on a second storage node; (c) after performing (b) unfence the primary open segment; (d) after performing (c) execute write operations to the storage unit in both the primary open segment and the replica open segment; and (e) after performing (c) copy the plurality of primary final segments to replica final segments on the second storage node.

Plain English Translation

This claim covers the same conversion as claim 7, expressed as a system: a memory device, a storage device, and a processing device that executes instructions stored in the memory device. The storage unit on the storage device holds a primary open segment (not full and still open for writing) and a plurality of primary final segments (already written and closed to further writes). On receiving an instruction to convert the storage unit into a replicated storage unit, the system (a) fences the primary open segment; (b) while the fence is in place, copies it to a replica open segment on a second storage node; (c) unfences the primary open segment after the copy; (d) after unfencing, executes writes to the storage unit in both the primary and the replica open segment; and (e) after unfencing, copies the primary final segments to replica final segments on the second storage node. Handling the active open segment separately from the final segments limits the fenced period to the copy of a single segment.

Claim 16

Original Legal Text

16. The system of claim 15 , wherein the memory device further stores instructions that, when executed by the processing device, cause the processing device to: (f) after performing (e), notify a storage manager that a replica of the storage unit is complete.

Plain English Translation

Building on claim 15: the stored instructions also cause the processing device, after copying the primary final segments to the second storage node (step (e)), to perform a further step (f) and notify a storage manager that a replica of the storage unit is complete.

Claim 17

Original Legal Text

17. The system of claim 15 , wherein the memory device further stores instructions that, when executed by the processing device, cause the processing device to: perform (d) and (e) in an interleaved manner.

Plain English Translation

Building on claim 15: the stored instructions also cause the processing device to perform steps (d) and (e) in an interleaved manner. The system alternates between executing incoming writes in both the primary and the replica open segment and copying primary final segments to the second storage node, rather than completing one before starting the other.

Claim 18

Original Legal Text

18. The system of claim 15 , wherein the memory device further stores instructions that, when executed by the processing device, cause the processing device to perform (d) by: receiving a first write operation; executing the first write operation on the primary open segment; transmitting the first write operation to the second storage node; receiving an acknowledgment of completion of the first write operation from the second storage node; and transmitting an acknowledgment of completion of the first write operation to a source of the first write operation only after receiving the acknowledgment of completion of the first write operation from the second storage node.

Plain English Translation

Building on claim 15, this claim spells out how the system performs step (d). The processing device receives a write operation, executes it on the primary open segment, and transmits it to the second storage node. It then waits for the second storage node to acknowledge that the write is complete. Only after receiving that acknowledgment does it acknowledge completion to the source of the write, so an acknowledged write has been completed on both the primary and the replica open segment.

Claim 19

Original Legal Text

19. The system of claim 15 , wherein the memory device further stores instructions that, when executed by the processing device, cause the processing device to fence the primary open segment by suppressing acknowledgment of write operations to the storage unit.

Plain English Translation

Building on claim 15: the stored instructions cause the processing device to fence the primary open segment by suppressing acknowledgment of write operations to the storage unit. While the fence is in place, the source of a write receives no confirmation that the write has completed. The claim does not describe failover, a secondary processing device, or any other mechanism beyond withholding these acknowledgments.

Patent Metadata

Filing Date

Unknown

Publication Date

December 29, 2020

Inventors

Ripulkumar Hemantbhai Patel
Dhanashankar Venkatesan
Jagadish Kumar Mukku

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “CHANGING A DISTRIBUTED STORAGE VOLUME FROM NON-REPLICATED TO REPLICATED” (10877684). https://patentable.app/patents/10877684