US-12638975-B2

Method and system for distributing and managing IO in a disaggregated storage architecture

Published: May 26, 2026
Assignee: not available in USPTO data we have
Inventors: not available in USPTO data we have
Technical Abstract

A method for distributing and managing an Input/Output (IO) request in a disaggregated storage architecture includes receiving the IO request including IO data to be distributed in the disaggregated storage architecture, generating IO metadata corresponding to the IO data included in the received IO request, determining one or more controller parameters for each of a plurality of controllers of the disaggregated storage architecture, determining a first priority weight of each controller parameter of the one or more controller parameters based on a network type of the disaggregated storage architecture, determining a first IO management weight for each of the plurality of controllers based on the one or more controller parameters and corresponding first priority weights, and statically mapping each of the IO metadata and the IO data to at least one controller of the plurality of controllers based on the first IO management weights of the plurality of controllers.

Patent Claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method for distributing and managing an Input/Output (IO) request in a disaggregated storage architecture, the method comprising:

2. The method of, wherein the one or more controller parameters correspond to configurational parameters, and

3. The method of, further comprising:

4. The method of, further comprising:

5. The method of, wherein re-mapping each of the IO metadata and the IO data to at least one controller of the plurality of controllers comprises:

6. The method of, wherein the first set of relative comparison values comprises an indication of overall capability of each of the plurality of storage nodes.

7. The method of, further comprising:

8. The method of, further comprising:

9. The method of, wherein migrating the IO metadata comprises:

10. The method of, further comprising:

11. The method of, wherein the one or more storage node parameters corresponds to configurational parameters, and

12. A system for distributing and managing an Input/Output (IO) request in a disaggregated storage architecture, the system comprising:

13. The system of, wherein the one or more controller parameters correspond to configurational parameters, the configurational parameters comprising at least one of:

14. The system of, wherein the one or more cluster management modules are configured to:

15. The system of,

16. The system of, wherein the one or more cluster management modules are further configured to:

17. The system of, wherein the one or more cluster management modules are further configured to:

18. A method of a disaggregated storage architecture, comprising:

Detailed Description

Complete technical specification and implementation details from the patent document.

This application is based on and claims priority to Indian Complete Patent Application No. 202141061007, filed on Dec. 21, 2022, in the Indian Patent Office, and Indian Provisional Patent Application No. 202141061007, filed on Dec. 27, 2021, in the Indian Patent Office, the disclosures of which are incorporated herein by reference in their entireties.

Example embodiments of the present disclosure relate to distributed storage management, and in particular, to methods and systems for distributing and managing Input/Output (IO) including IO metadata and IO data in a disaggregated storage architecture.

New challenges, such as the emergence of new age workloads, e-commerce demands, asymmetric scaling, and increased adoption of flash storage devices, have necessitated development of a disaggregated storage system architecture. Also, workloads associated with e-commerce platforms are becoming unpredictable, and demands on storage and computing capabilities vary with time. Further, consumers are willing to use available off-the-shelf servers and storage devices, which offer flexibility to add resources of any capacity. Conventional data storage systems fail to meet the aforesaid requirements. Moreover, the emergence of the quad-level cell (QLC) solid-state drive (SSD) as a cheaper alternative to hard disk drives (HDD) has resulted in increasing adoption of flash devices in enterprise storage devices. Due to the lower endurance of such devices, cluster level endurance management becomes more challenging. The aforementioned challenges demand a disaggregated storage system architecture which can manage flash devices at scale to cater to low endurance flash such as QLC SSD and penta-level cell (PLC) SSD, provide flexibility in resource addition to cater to e-commerce needs, manage heterogeneous storages and controllers to allow for asymmetric scaling, and provide flexibility in deployment to cater to the new age workloads.

A distributed storage architecture includes a plurality of storage nodes for a plurality of clients with a mechanism for data synchronization and coordination among such storage nodes. Therefore, the distributed storage architecture provides remote management of storage nodes through disaggregation. Further, such disaggregation enables cluster level flash management. However, the challenges in the cluster level flash management lie in providing efficiency aware, flash aware, resource aware, and capacity aware distribution and scaling methods.

In general, such an architecture follows one of two paradigms: a share nothing paradigm or a share everything paradigm. In the share nothing paradigm, data and associated metadata are statically mapped to the storage nodes and controllers. However, the share nothing paradigm is not flash friendly. Further, in the share everything paradigm, the data and associated metadata are mapped to any of the storage nodes or controllers. However, the share everything paradigm creates challenges in synchronization for IO access. The synchronization challenges may be overcome using various kinds of synchronization locks, but this incurs additional network traffic or expense.

Accordingly, there is a need to overcome at least the above challenges in a distributed storage architecture.

Information disclosed in this Background section has already been known to or derived by the inventors before or during the process of achieving the embodiments of the present application, or is technical information acquired in the process of achieving the embodiments. Therefore, it may contain information that does not form the prior art that is already known to the public.

One or more example embodiments provide a method and system for flash-aware distributed storage disaggregation.

Additional aspects will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the presented embodiments.

According to an aspect of an example embodiment, a method for distributing and managing an Input/Output (IO) request in a disaggregated storage architecture may include receiving the IO request including IO data to be distributed in the disaggregated storage architecture, generating IO metadata corresponding to the IO data included in the received IO request, determining one or more controller parameters for each of a plurality of controllers of the disaggregated storage architecture, determining a first priority weight of each controller parameter of the one or more controller parameters based on a network type of the disaggregated storage architecture, determining a first IO management weight for each of the plurality of controllers based on the one or more controller parameters and corresponding first priority weights, and statically mapping each of the IO metadata and the IO data to at least one controller of the plurality of controllers based on the first IO management weights of the plurality of controllers.

According to an aspect of an example embodiment, a system for distributing and managing an IO request in a disaggregated storage architecture may include a plurality of client devices configured to generate the IO request including IO data to be distributed in the disaggregated storage architecture, a plurality of controllers coupled with the plurality of client devices, the plurality of controllers configured to receive the IO request from the plurality of client devices and generate IO metadata corresponding to the IO data included in the received IO request, a plurality of storage nodes coupled with the plurality of client devices and the plurality of controllers, and one or more cluster management modules coupled with the plurality of client devices, the plurality of controllers and the plurality of storage nodes, the one or more cluster management modules configured to determine one or more controller parameters for each of the plurality of controllers of the disaggregated storage architecture, determine a first priority weight of each controller parameter of the one or more controller parameters based on a network type of the disaggregated storage architecture, determine a first IO management weight for each of the plurality of controllers based on the one or more controller parameters and corresponding first priority weights, and statically map each of the IO metadata and the IO data to at least one controller of the plurality of controllers based on the first IO management weights of the plurality of controllers.

According to an aspect of an example embodiment, a method of a disaggregated storage architecture may include receiving an IO request including IO data, generating IO metadata corresponding to the IO data, determining at least one controller parameter for each of a plurality of controllers, determining a priority weight for the at least one controller parameter, determining an IO management weight based on the at least one controller parameter and corresponding priority weights, and mapping the IO metadata and the IO data to at least one controller of the plurality of controllers based on the IO management weight.

Hereinafter, example embodiments of the disclosure will be described in detail with reference to the accompanying drawings. The same reference numerals are used for the same components in the drawings, and redundant descriptions thereof will be omitted. The embodiments described herein are example embodiments, and thus, the disclosure is not limited thereto and may be realized in various other forms.

As used herein, expressions such as “at least one of,” when preceding a list of elements, modify the entire list of elements and do not modify the individual elements of the list. For example, the expression, “at least one of a, b, and c,” should be understood as including only a, only b, only c, both a and b, both a and c, both b and c, or all of a, b, and c.

The term “some” as used herein may be defined as “none, or one, or more than one, or all.” Accordingly, the terms “none,” “one,” “more than one,” “more than one, but not all” or “all” would all fall under the definition of “some.”

The terminology and structure employed herein is for describing, teaching, and illuminating embodiments and their specific features and elements, and does not limit, restrict, or reduce the spirit and scope of the claims or their equivalents.

Further, skilled artisans will appreciate that elements in the drawings are illustrated for simplicity and may not have been necessarily drawn to scale. For example, the flow charts illustrate the method in terms of the most prominent steps involved to help to improve understanding of aspects of the present invention. Furthermore, in terms of the construction of the device, one or more components of the device may have been represented in the drawings by conventional symbols, and the drawings may show only those specific details that are pertinent to understanding the embodiments of the present invention so as not to obscure the drawings with details that will be readily apparent to those of ordinary skill in the art having the benefit of the description herein.

Unless otherwise defined, all terms, and especially any technical and/or scientific terms, used herein may be taken to have the same meaning as commonly understood by one having ordinary skill in the art.

Example embodiments of the present disclosure will be described below in detail with reference to the accompanying drawings.

Example embodiments of the present disclosure provide a method and a system for distributing and managing Input/Output (IO) including IO metadata and IO data in a disaggregated storage architecture based on IO management weights determined for each controller and each storage node in the disaggregated storage architecture.

The drawings illustrate an environment of a disaggregated storage architecture/system (interchangeably referred to as “the architecture” or “the system”), according to an example embodiment of the present disclosure. The architecture may include a plurality of client devices (collectively referred to as “the client device”), a plurality of controllers (collectively referred to as “the controllers”) and a plurality of storage nodes (collectively referred to as “the storage nodes”). The architecture may also include a cluster manager (CM). In an example embodiment, the CM may be coupled to the client device, the controllers, or the storage nodes via control paths that may not interfere with the IO path.

The CM may also be referred to as the cluster management module. Further, in the illustrated embodiment, only a single CM is shown. However, any number of CMs required to implement example embodiments of the present disclosure may be included. The architecture may include the plurality of storage nodes to provide multiple storage areas to the client device. Further, the client device may be configured to access the plurality of storage nodes via the plurality of controllers.

In an example embodiment, the client device, the controllers, and the storage nodes may be operatively coupled to each other via a network. For example, the plurality of client devices, the plurality of controllers, and the plurality of storage nodes may be coupled using a nonvolatile memory express over Fabrics (NVMeOF) network. Further, the client device may correspond to a device configured to generate an IO request for the controllers. The IO request may require IO metadata and IO data to be stored in the storage nodes. The client device may be configured to forward IO requests to the controllers for performing read and/or write requests using the storage nodes. The client device may include any suitable communication device such as, but not limited to, a mobile phone, a smart watch, a laptop computer, a desktop computer, a Personal Computer (PC), a notebook, a tablet, a server, and/or any other device configured to store and access data in the disaggregated storage architecture. In example embodiments, the client device may also correspond to an application server configured to store and access data to and from the storage nodes. The client device may include any suitable components such as, but not limited to, applications, hardware, and/or software drivers, configured to enable the client device to access the disaggregated storage architecture. In an example embodiment, the client device may include a client driver configured to act as an interface to an application installed at the client device. The client device may be operatively coupled to the controllers via the network to distribute and manage IO at the storage nodes.

The controllers may be configured to act as an interface between the client device and the storage nodes. In an example embodiment, a controller may be configured to receive an IO request from the client device and to process and distribute the corresponding IO data and IO metadata to the storage nodes. The controllers may correspond to any suitable computing devices such as, but not limited to, a laptop computer, a desktop computer, a server, and/or any other device configured to store and access data in the disaggregated storage architecture. In an example embodiment, a controller may be configured to implement logic to process IO requests from the client device and maintain IO metadata. The controller may be configured to distribute the IO data and IO metadata among the plurality of storage nodes.

The storage nodes may be configured to provide storage space to store IO data and IO metadata. Examples of the storage nodes may include any suitable non-volatile memory, such as, but not limited to, read-only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, magnetic tapes, and so forth.

The CM may be operatively coupled to each of the client device, the controllers, and the storage nodes. In an example embodiment, the CM may be configured to manage one or more clusters comprising the controllers and the storage nodes. The CM may be configured to perform cluster management of the cluster, which includes operations/events such as, but not limited to, addition and/or deletion of controllers, addition and/or deletion of storage nodes, IO distribution, and controller/storage node failure. The CM and the client driver may facilitate storage disaggregation. The various operations of the CM are explained in detail in the following description.

To perform flash aware distribution in a manner which avoids the need for any synchronization mechanism between controllers, the system may be configured to distribute IO metadata statically among the controllers and the storage nodes, while at the same time distributing IO data dynamically across the storage nodes. The system may also be configured to consider controller and storage capabilities while performing the static distribution and also to provide mechanisms to change the static distribution at run time when there is a controller or storage addition/deletion. Further, the system may perform global data distribution using flash aware central allocation. The global data distribution may allow distribution of data across all storage nodes to efficiently use flash resources. The above stated objectives may be achieved using central allocation management, run time resource awareness, and capacity aware data segment distribution. The various operations of the system are explained in the following description.

A block diagram of the CM is illustrated in the drawings, according to an example embodiment of the present disclosure. In an example embodiment, the CM may be implemented independently and remotely coupled with the client device, the controllers, and the storage nodes. In example embodiments, the CM may be implemented at any suitable device such as the client device, a controller, or a storage node.

The CM may include a processor, an IO interface, a memory storing data, and modules. As an example, the processor may be a single processing unit or a number of units, all of which could include multiple computing units. The processor may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor may be configured to fetch and execute computer-readable instructions and data stored in the memory. The processor may be implemented as one processor or a plurality of processors. The processor may be a general-purpose processor, such as a central processing unit (CPU), an application processor (AP), or the like, a graphics-only processing unit such as a graphics processing unit (GPU), a visual processing unit (VPU), and/or an artificial intelligence (AI)-dedicated processor such as a neural processing unit (NPU). The processor may control the processing of the input data in accordance with a predefined operating rule stored in the non-volatile memory and/or the volatile memory, i.e., the memory. The predefined operating rule may be provided through training or learning.

The processor may be in communication with one or more input/output (I/O) devices via the I/O interface. The I/O interface may employ communication protocols such as code-division multiple access (CDMA), high-speed packet access (HSPA+), global system for mobile communications (GSM), long-term evolution (LTE), WiMAX, or the like. In an example embodiment, using the I/O interface, the CM may communicate with one or more I/O devices such as the client devices which are configured to generate the IO requests. The processor may be in communication with a communication network via a network interface. In an example embodiment, the network interface may be the I/O interface.

The memory may include any non-transitory computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic RAM (DRAM), and/or non-volatile memory, such as ROM, erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. The memory may be configured to store data. The data may include controller-related data, storage node related data, controller parameters, storage node parameters, and other additional information which may be required to implement the desired functionality of the CM.

The modules may be configured to perform one or more desired functions of the CM. The modules may include a distribution module, a segmentation allocation module, an orchestration module, and one or more additional modules based on the requirement. The distribution module may be configured to implement logic to perform static IO distribution to the controllers and the storage nodes. The segmentation module may be configured to implement logic to perform flash aware segmentation of the storage nodes. The orchestration module may be configured to implement logic to perform controller scaling and/or storage node scaling, according to embodiments of the present disclosure. Further, the modules may include one or more additional modules configured to implement any additional logic to achieve the desired objective of the CM.

The accompanying flowcharts illustrate a method for distributing and managing IO in the disaggregated storage architecture, according to an example embodiment of the present disclosure. The operations of the method may be performed by at least one of the controllers, the storage nodes, or the CM of the system.

The method may be explained based on an assumption that the client device has generated an IO request including IO data. In an operation, the method may include receiving the IO request including IO data to be distributed in the disaggregated storage architecture. In an example embodiment, a controller and/or the CM may receive the IO request from the client device. In an operation, the method may include generating IO metadata corresponding to the IO data included in the received IO request. In an example embodiment, IO metadata may include information corresponding to storage of IO data to the storage nodes.

In an operation, the method may include determining one or more controller parameters for each of the plurality of controllers of the disaggregated storage architecture. The controller parameters may correspond to configurational parameters of the controllers which may define the capability of the controllers. Examples of the controller parameters may include, but are not limited to, a number of CPU cores, a capacity of random access memory (RAM), a capacity of a network interface card (NIC), a frequency of CPUs, and a type of cache memory. In an example embodiment, the CM may determine the one or more controller parameters corresponding to each of the plurality of controllers.

In an operation, the method may include determining a first priority weight of each controller parameter of the one or more controller parameters based at least on a network type of the disaggregated storage architecture. For example, Table 1 illustrates a difference in priority weight of different controller parameters based on different networks:

Table 1 illustrates that for a disaggregated—transmission control protocol (TCP) based network, each of the controller parameters may have equal weightage. However, for a disaggregated—remote direct memory access (RDMA) based network, RAM capacity may have higher priority weightage than the NIC capacity and the cores of CPUs. Further, for the disaggregated-RDMA, NIC capacity may have higher priority weightage than the cores of CPUs. In example embodiments, the priority weight of different controller parameters may be determined based on other factors such as, but not limited to, deployment type, workload, and transport structure.
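The ordering rules described for Table 1 can be sketched as a lookup keyed by network type. This is only an illustration: the table itself is not reproduced here, so the numeric values, the parameter keys, and the name `priority_weights` are assumptions. Only the relative ordering (equal weights for TCP; RAM above NIC above CPU cores for RDMA) comes from the text.

```python
# Hypothetical priority weights per network type. The numbers are
# illustrative assumptions; only their relative ordering follows the text.
PRIORITY_WEIGHTS = {
    "tcp": {"cpu_cores": 1 / 3, "ram": 1 / 3, "nic": 1 / 3},
    "rdma": {"cpu_cores": 0.2, "ram": 0.5, "nic": 0.3},
}


def priority_weights(network_type):
    """Return the per-parameter priority weights for a network type."""
    key = network_type.lower()
    if key not in PRIORITY_WEIGHTS:
        raise ValueError(f"unknown network type: {network_type!r}")
    return PRIORITY_WEIGHTS[key]
```

Other factors mentioned in the text, such as deployment type or workload, could be added as further lookup keys in the same way.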

In an operation, the method may include determining a first IO management weight for each of the plurality of controllers based at least on the one or more controller parameters and corresponding first priority weights. In an example embodiment, to determine the first IO management weight, the method may include generating an attribute matrix “A” for the controllers based on the controller parameters, as shown in Equation (1).
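As a rough illustration of what an attribute matrix could look like, the sketch below builds one row per controller and one column per parameter. The layout, the `ControllerParams` record, its field names, and the units are assumptions made for this example; Equation (1) is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControllerParams:
    """Hypothetical record of one controller's configurational parameters."""
    name: str
    cpu_cores: int
    ram_gib: int
    nic_gbps: int


def attribute_matrix(controllers):
    """Build A with one row per controller and columns (cpu, ram, nic)."""
    return [[c.cpu_cores, c.ram_gib, c.nic_gbps] for c in controllers]


A = attribute_matrix([
    ControllerParams("c1", 16, 64, 25),
    ControllerParams("c2", 32, 128, 50),
])
```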

The controller parameters illustrated in the attribute matrix A are exemplary in nature and the attribute matrix A may include any number of the controller parameters. Further, the method may include normalizing the controller parameters and calculating a weighted average. Further, the method may include generating a weight matrix $W = (w_i)$, based on Equation (2), in which $w_i$ may represent the first IO management weight of the $i$-th controller.

$$w_i = \alpha_1 \cdot cpu_i + \alpha_2 \cdot ram_i + \alpha_3 \cdot nic_i \qquad (2)$$

$\alpha_1$, $\alpha_2$, and $\alpha_3$ may represent priority weights of the corresponding controller parameters which may be determined based on Table 1.
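A minimal sketch of the weighted sum in Equation (2) follows. The text does not say how the parameters are normalized, so dividing each parameter by its maximum across controllers is an assumption here, as are the function name `io_management_weights` and the sample values. Parameter values are assumed to be positive.

```python
def io_management_weights(attributes, alphas):
    """Compute one weight per controller as a weighted sum of normalized parameters.

    attributes: {controller name: {parameter: raw value}}
    alphas:     {parameter: priority weight}
    """
    params = list(alphas)
    # Assumed normalization: divide by the per-parameter maximum.
    col_max = {p: max(row[p] for row in attributes.values()) for p in params}
    return {
        name: sum(alphas[p] * row[p] / col_max[p] for p in params)
        for name, row in attributes.items()
    }


controllers = {
    "c1": {"cpu": 16, "ram": 64, "nic": 25},
    "c2": {"cpu": 32, "ram": 128, "nic": 50},
}
alphas = {"cpu": 0.2, "ram": 0.5, "nic": 0.3}
weights = io_management_weights(controllers, alphas)
```

With these sample values, `c2` has the maximum of every parameter and so receives the sum of the priority weights, while `c1` has half of every parameter and receives half of that.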

In an operation, the method may include performing a comparison of the first IO management weights of the plurality of controllers to determine a first set of relative comparison values of the first IO management weights. The first set of relative comparison values may indicate a difference in the first IO management weights of the controllers and therefore may indicate a difference in capabilities of the different controllers.
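One possible reading of "relative comparison values" is each controller's share of the total weight. The sketch below takes that reading as an assumption; the text does not define the comparison, and the name `relative_comparison` is hypothetical.

```python
def relative_comparison(weights):
    """Express each controller's weight as a fraction of the total weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {name: w / total for name, w in weights.items()}


shares = relative_comparison({"c1": 0.5, "c2": 1.0, "c3": 0.5})
```

Under this reading, a controller with twice the weight of another gets twice the share, which gives the later mapping step a direct proportion to work with.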

In an operation, the method may include statically mapping each of the IO metadata and the IO data to at least one controller from the plurality of controllers based on the first IO management weights of the plurality of controllers. Specifically, the method may include statically mapping each of the IO metadata and the IO data to at least one controller from the plurality of controllers based on the first set of relative comparison values.
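To make the idea of a capability-proportional static mapping concrete, the sketch below divides a fixed number of slots among controllers according to their shares and hashes an IO key to a slot. The slot table, the SHA-256 hashing, and the names `build_slot_table` and `controller_for` are assumptions chosen for illustration; the text does not specify the mapping mechanism.

```python
import hashlib


def build_slot_table(shares, num_slots=64):
    """Assign slots to controllers in proportion to their shares."""
    names = sorted(shares)
    counts = {n: int(shares[n] * num_slots) for n in names}
    # Hand any slots lost to rounding down to the highest-share controllers.
    leftover = max(num_slots - sum(counts.values()), 0)
    by_share = sorted(names, key=lambda n: shares[n], reverse=True)
    for n in by_share[:leftover]:
        counts[n] += 1
    table = []
    for n in names:
        table.extend([n] * counts[n])
    return table


def controller_for(key, table):
    """Map an IO key to a controller; the same key always maps the same way."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    slot = int.from_bytes(digest[:8], "big") % len(table)
    return table[slot]


table = build_slot_table({"c1": 0.25, "c2": 0.5, "c3": 0.25})
owner = controller_for("volume-7/block-42", table)
```

Because the table is fixed once built, every client resolves a given key to the same controller without coordinating with the others, which is the property that makes a static mapping lock-free.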

In an operation, the method may include determining one or more storage node parameters for each of a plurality of storage nodes of the disaggregated storage architecture. The storage node parameters may correspond to configurational parameters of the storage nodes which may define the capability of the storage nodes. Examples of the storage node parameters may include, but are not limited to, space availability of the storage node, a capacity of RAM, a capacity of NIC, and metadata space availability. In an example embodiment, the CM may determine the one or more storage node parameters corresponding to each of the plurality of storage nodes.

In an operation, the method may include determining a second priority weight of each storage node parameter of the one or more storage node parameters based at least on a network type of the disaggregated storage architecture. For example, Table 2 illustrates a difference in priority weight of different storage node parameters based on different networks:

Table 2 illustrates that for a disaggregated—TCP based network, the storage node parameters CPU cores and NIC capacity may have equal weightage, however each of the storage node parameters CPU cores and NIC capacity may have higher weightage than availability of the space. In example embodiments, the priority weight of different storage node parameters may be determined based on other factors such as, but not limited to, deployment type, workload, and transport structure.

In an operation, the method may include determining a second IO management weight for each of the plurality of storage nodes based at least on the one or more storage node parameters and corresponding second priority weights. In an example embodiment, to determine the second IO management weight, the method may include generating an attribute matrix “A” for the storage nodes based on the storage node parameters, as shown in Equation (3).
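The same weighting idea can be applied to storage nodes to rank them by capability. In the sketch below, the weight values, the max-based normalization, the parameter keys, and the name `rank_storage_nodes` are assumptions. Only the ordering for a TCP-based network (CPU cores and NIC capacity equal, both above space availability) follows the description of Table 2.

```python
# Hypothetical storage-node priority weights for a TCP-based network.
STORAGE_WEIGHTS_TCP = {"cpu_cores": 0.4, "nic": 0.4, "space": 0.2}


def rank_storage_nodes(nodes, alphas):
    """Return node names sorted by weighted, max-normalized score, best first."""
    col_max = {p: max(n[p] for n in nodes.values()) for p in alphas}
    score = {
        name: sum(alphas[p] * n[p] / col_max[p] for p in alphas)
        for name, n in nodes.items()
    }
    return sorted(score, key=score.get, reverse=True)


nodes = {
    "s1": {"cpu_cores": 8, "nic": 25, "space": 4000},
    "s2": {"cpu_cores": 16, "nic": 50, "space": 1000},
}
ranking = rank_storage_nodes(nodes, STORAGE_WEIGHTS_TCP)
```

In this sample, `s2` ranks first despite having less free space, because the two higher-priority parameters outweigh space availability.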

