US-20250342175-A1

Techniques for Dynamically Scaling Hardware Capacity Used to Host Data Partitions of a Database

Published: November 6, 2025

Assignee: not available in USPTO data we have

Inventors: not available in USPTO data we have

Technical Abstract

Some embodiments provide a system for optimizing the operational efficiency of a distributed database system configured to store data divided among a plurality of data partitions. The distributed database system comprises database hardware for hosting the plurality of data partitions. The system determines, for each of multiple data partitions, a hardware capacity for hosting the data partition. The system configures the database hardware based on hardware capacities determined for hosting the data partitions.

Patent Claims

Legal claims defining the scope of protection, as filed with the USPTO.

. A system for scaling hardware capacity in a distributed database system configured to store data divided among a plurality of data partitions, the distributed database system comprising database hardware for hosting the plurality of data partitions, the system comprising:

. The system of, wherein:

. The system of, wherein determining the first hardware capacity for hosting the first data partition comprises:

. The system of, wherein determining the first hardware capacity for hosting the first data partition comprises modifying a previous hardware capacity determined for hosting the first data partition.

. The system of, wherein modifying the previous hardware capacity for hosting the first data partition comprises increasing the previous hardware capacity for hosting the first data partition.

. The system of, wherein modifying the previous hardware capacity for hosting the first data partition comprises decreasing the previous hardware capacity for hosting the first data partition.

. The system of, wherein:

. The system of, wherein determining the first hardware capacity for hosting the first data partition comprises:

. The system of, wherein each of the plurality of hardware capacities comprises a specification of at least one of:

. A method for scaling hardware capacity in a distributed database system configured to store data divided among a plurality of data partitions, the distributed database system comprising database hardware for hosting the plurality of data partitions, the method comprising:

. The method of, wherein:

. The method of, wherein determining the first hardware capacity for hosting the first data partition comprises:

. The method of, wherein determining the first hardware capacity for hosting the first data partition comprises modifying a previous hardware capacity determined for hosting the first data partition.

. The method of, wherein determining the first hardware capacity for hosting the first data partition comprises:

. A distributed database system configured to store data divided among a plurality of data partitions, the distributed database system comprising:

. The distributed database system of, wherein the at least one processor is configured to modify a configuration of the database hardware to update the hardware capacity used to host the particular data partition based on a record of operations performed on the particular data partition over a time period.

. The distributed database system of, wherein the at least one processor is configured to modify the configuration of the database hardware to update the hardware capacity used to host the particular data partition based on the record of operations performed on the particular data partition over the time period by performing:

. The distributed database system of, wherein modifying the configuration of the database hardware to update the hardware capacity used to host the particular data partition comprises:

Detailed Description

Complete technical specification and implementation details from the patent document.

This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application Ser. No. 63/640,986, entitled “TECHNIQUES FOR DYNAMICALLY SCALING HARDWARE CAPACITY USED TO HOST DATA PARTITIONS OF A DATABASE,” filed on May 1, 2024. This application also claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application Ser. No. 63/640,978, entitled “SYSTEMS AND METHODS FOR DISTRIBUTED CATCHALL DATABASE,” filed on May 1, 2024. Each of these applications is herein incorporated by reference in its entirety.

Database sharding involves dividing data of the database across multiple different database servers. A single database server has a limited amount of storage capacity. Thus, as the amount of data stored in the database grows larger, the data needs to be divided into multiple partitions (which also may be referred to as “chunks”). The smaller data partitions are stored across multiple database servers (“shards”). Each shard may store a respective data partition in its storage hardware and execute operations on the data partition (e.g., to generate responses to queries on data in the data partition).

Today, users leverage sharding as a way to horizontally scale their database. Sharding allows users to spread their collection data and workload across multiple servers/shards. Users can shard one or more collections of data. The more collections they shard, the better the distribution of data and workload across the shards.

In practice, only a limited number of collections of data in a database are sharded. All unsharded collections for a database may live on the same shard. These unsharded collections can lead to some shards needing more resources than others, especially if the unsharded collections are relatively large or “hot”. However, currently, computing resources (e.g., cluster tiers and disk performance) have to be uniform since they are applied at the cluster level. An auto-scaler applies the same symmetry when it scales a cluster, meaning it scales all the nodes within the cluster to the same cluster tier on all the shards.

Some embodiments provide a system for optimizing operational efficiency of a distributed database system configured to store data divided among a plurality of data partitions. The distributed database system comprises database hardware for hosting the plurality of data partitions. The system determines, for each of multiple data partitions, a hardware capacity for hosting the data partition. The system configures the database hardware based on hardware capacities determined for hosting the data partitions.

Some embodiments provide a system for scaling hardware capacity in a distributed database system configured to store data divided among a plurality of data partitions. The distributed database system comprises database hardware for hosting the plurality of data partitions. The system comprises at least one processor; and at least one non-transitory computer-readable storage medium storing instructions. The instructions, when executed by the at least one processor, cause the at least one processor to: determine, for each of at least some of the plurality of data partitions, a hardware capacity for hosting the data partition, the determining comprising: determine a first hardware capacity for hosting a first data partition of the plurality of data partitions; and determine a second hardware capacity, different from the first hardware capacity, for hosting a second data partition of the plurality of data partitions; and configure the database hardware based on hardware capacities determined for hosting the at least some data partitions, the configuring comprising: configure a first set of the database hardware, having the first hardware capacity, to host the first data partition; and configure a second set of the database hardware having the second hardware capacity to host the second data partition.

Some embodiments provide a method for scaling hardware capacity in a distributed database system configured to store data divided among a plurality of data partitions. The distributed database system comprises database hardware for hosting the plurality of data partitions. The method comprises using at least one processor to perform: determining, for each of at least some of the plurality of data partitions, a hardware capacity for hosting the data partition, the determining comprising: determining a first hardware capacity for hosting a first data partition of the plurality of data partitions; and determining a second hardware capacity, different from the first hardware capacity, for hosting a second data partition of the plurality of data partitions; and configuring the database hardware based on hardware capacities determined for hosting the at least some data partitions, the configuring comprising: configuring a first set of the database hardware, having the first hardware capacity, to host the first data partition; and configuring a second set of the database hardware having the second hardware capacity to host the second data partition.

Some embodiments provide a distributed database system configured to store data divided among a plurality of data partitions. The distributed database system comprises: database hardware configured to host the plurality of data partitions, wherein the database hardware is configurable to provide different hardware capacities for hosting different data partitions; and at least one processor configured to dynamically modify a configuration of the database hardware to update a hardware capacity used to host a particular data partition of the plurality of data partitions.

The foregoing summary is non-limiting.

The inventors have developed techniques for scaling hardware capacity in a distributed database system configured to store data divided among multiple data partitions (also referred to as “chunks”). Hardware capacity may refer to computing performance and/or storage (e.g., disk) input/output performance provided by a set of database hardware (e.g., one or more servers). The techniques determine a hardware capacity for each data partition and configure database hardware to provide the hardware capacity determined for the data partition. The techniques may configure the database hardware to host different data partitions with different hardware capacities.

Database sharding allows users of a database to divide the database into multiple data partitions that are hosted by different systems (e.g., database servers). Sharding may be used, for example, when storage on a given server is nearing capacity and/or to more uniformly distribute data across multiple servers. Sharding further allows the volume of data stored in a database to increase without overloading a single machine. For example, in the context of a MongoDB database, a collection of documents in the database can be divided into multiple different shards, where each shard is hosted by a different set of one or more servers. Each set of server(s) may store the shard data and execute operations involving data in the shard (e.g., execute queries targeting data of the shard).

The inventors have recognized a problem that often occurs when a database is sharded into multiple data partitions. When data of a database is sharded, either in part or whole, all of the data partitions are hosted with database hardware having the same hardware capacity (e.g., processing power, amount of memory, and/or disk read/write speed). Typically, one or more data partitions require more hardware capacity than other data partitions because the data partition(s) store a larger volume of data, operations are executed more frequently on the data partition(s), and/or operations executed on the data partition(s) are more complex than those executed on other data partitions. To ensure that there is sufficient hardware capacity to support the data partition(s) that demand higher hardware capacity, the database is configured with the higher hardware capacity to host all data partitions, including those that do not require the hardware capacity to operate (e.g., because the data partitions do not store as much data, operations are executed less frequently on the data partitions, and/or operations executed on the data partitions are not generally as complex as those executed on the other data partition). This leads to a database system using higher hardware capacity than it needs to host much of its data, resulting in a waste of computing and/or storage resources as well as higher operating costs for users of the database system.

Accordingly, the inventors have developed techniques that address the above-described problem that often occurs in sharded database systems. The techniques allow a database system to use different hardware capacities to host different data partitions. This allows data partitions that require a higher hardware capacity (e.g., higher compute performance and/or storage device performance) to be hosted using hardware with higher capacity. Likewise, data partitions that require a lower hardware capacity (e.g., lower compute performance and/or storage device performance) can be hosted using hardware with lower hardware capacity. By allowing variability in the hardware capacity used to host different data partitions, embodiments described herein assign hardware capacity to data partitions with higher granularity, which reduces the waste of computing and/or storage resources.

Some embodiments provide a system for automatically scaling hardware capacity in a distributed database system that stores data divided among multiple data partitions. The system automatically determines different hardware capacities (e.g., computing performance tiers and/or disk read/write performance) for data partitions by analyzing operations performed on the data partitions. The system may be configured to determine different hardware capacities for different data partitions and to configure database hardware accordingly to host the different data partitions. For example, a data partition storing near-term transaction data that is frequently updated by an application may be hosted using database hardware with higher hardware capacity than another data partition storing historical transaction data that is less frequently accessed by the application.

Some embodiments provide a distributed database system configured to store data divided among a plurality of data partitions. The distributed database system comprises database hardware configured to host the plurality of data partitions, wherein the database hardware is configurable with different hardware capacities for hosting different data partitions. The distributed database system may be configured to dynamically modify a configuration of the database hardware to update a hardware capacity used to host a particular data partition of the plurality of data partitions.

Following below are more detailed descriptions of various concepts related to, and embodiments of, hardware capacity scaling systems and methods developed by the inventors. It should be appreciated that various aspects described herein may be implemented in any of numerous ways. Examples of specific implementations are provided herein for illustrative purposes only. In addition, the various aspects described in the embodiments below may be used alone or in any combination and are not limited to the combinations explicitly described herein.

An example database system including a hardware capacity scaling system is illustrated, according to some embodiments of the technology described herein. As shown, the database system stores data divided into multiple data partitions A, B, C that are hosted by respective sets of database hardware A, B, C. The database system has database hardware assets A, B, C with respective hardware capacities A, B, C. The hardware capacity scaling system may be configured to configure database hardware (e.g., from hardware asset sets A, B, C) to host data partitions A, B, C.

In some embodiments, the hardware capacity scaling system may be configured to determine a hardware capacity for hosting each of data partitions A, B, C. The hardware capacity scaling system may be configured to determine different hardware capacities for different ones of the data partitions A, B, C, and configure database hardware based on the different hardware capacities. The hardware capacity scaling system may be configured to configure a set of database hardware for each of the data partitions A, B, C that has the hardware capacity determined for the data partition. To illustrate, database hardware A may have a higher hardware capacity than database hardware B. For example, servers of database hardware A may have higher performing central processing unit (CPU) hardware than servers of database hardware B. As another example, servers of database hardware A may have more CPU cores than servers of database hardware B. As another example, servers of database hardware A may have more random access memory (RAM) per CPU core than servers of database hardware B. As another example, the disks of database hardware A may have better read/write performance than the disks of database hardware B.

In some embodiments, the hardware capacity scaling system may be configured to automatically determine a hardware capacity for hosting a particular data partition. In some embodiments, the hardware capacity scaling system may be configured to determine the hardware capacity for the particular data partition based on a history of operations performed on the data partition over a time period (also referred to as a “look-back window”). For example, the hardware capacity scaling system may determine CPU utilization and/or memory usage of database hardware configured to host the data partition over a time period. In some embodiments, the time period may be 0-1 hours, 1-2 hours, 2-3 hours, 3-4 hours, 4-5 hours, 5-6 hours, 6-12 hours, 12-24 hours, 1-2 days, 1-7 days, 7-30 days, 1-6 months, 6-12 months, 1 to 5 years, or another time period. For example, the time period may be 1 hour. In some embodiments, the hardware capacity scaling system may be configured to: (1) determine whether a CPU utilization and/or memory usage for the data partition has reached a threshold level; and (2) modify the hardware capacity assigned for the data partition when it is determined that the CPU utilization and/or memory usage has reached the threshold level. For example, the hardware capacity scaling system may be configured to increase the hardware capacity if CPU utilization and/or memory usage is above a threshold (e.g., a threshold in one of ranges 70-80%, 80-90%, 90-100%, or another threshold). As another example, the hardware capacity scaling system may be configured to decrease the hardware capacity for the data partition if CPU utilization and/or memory usage is below a threshold (e.g., a threshold in one of ranges 10-20%, 20-30%, 30-40%, 40-50%, 50-60%, or another threshold).
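The threshold comparison described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name `scaling_decision`, the use of the mean over the look-back window, and the default thresholds of 90% and 30% are assumptions chosen from the example ranges in the text.

```python
from statistics import mean


def scaling_decision(samples, high=0.90, low=0.30):
    """Return 'increase', 'decrease', or 'hold' for one data partition.

    `samples` holds utilization fractions (0.0 to 1.0) observed during the
    look-back window. Averaging them is an assumption of this sketch; the
    source only says utilization is compared against a threshold.
    """
    if not samples:
        return "hold"  # no operation history, so leave capacity unchanged
    average = mean(samples)
    if average >= high:
        return "increase"
    if average <= low:
        return "decrease"
    return "hold"
```

A partition whose window averages above the upper threshold would be flagged for more capacity, one below the lower threshold for less, and anything in between is left alone.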

In some embodiments, the hardware capacity scaling system may be configured to increase a hardware capacity for hosting a data partition and/or decrease a hardware capacity for hosting the data partition. In some embodiments, the hardware capacity scaling system may be configured to determine whether to increase hardware capacity based on a history of operations performed on the data partition over a first time period (e.g., 1 hour). For example, the hardware capacity scaling system may determine CPU utilization and/or memory usage during the first time period and determine whether to increase hardware capacity based on the CPU utilization and/or memory usage (e.g., by determining whether the CPU utilization and/or memory usage has reached a threshold level in the first time period). In some embodiments, the hardware capacity scaling system may be configured to determine whether to decrease hardware capacity based on a history of operations performed on the data partition over a second time period (e.g., 24 hours). For example, the hardware capacity scaling system may determine CPU utilization and/or memory usage during the second time period and determine whether to decrease hardware capacity based on the CPU utilization and/or memory usage (e.g., by determining whether the CPU utilization and/or memory usage has reached a threshold level in the second time period). In some embodiments, the first time period is different from the second time period. For example, the first time period (e.g., 1 hour) may be shorter than the second time period (e.g., 24 hours). In some embodiments, the first time period and the second time period are the same.
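A sketch of using separate look-back windows for scaling up and scaling down is shown below. The function name `decide`, the `(timestamp_seconds, utilization)` sample format, and the rule that every sample in the window must be beyond the threshold are assumptions made for illustration; the 1-hour and 24-hour windows come from the examples in the text.

```python
def decide(samples, now, up_window=3600, down_window=86400,
           high=0.90, low=0.30):
    """Decide on a capacity change using two look-back windows.

    `samples` is a list of (timestamp_seconds, utilization) pairs.
    Scale-up looks at the short window; scale-down looks at the long one.
    """
    recent = [u for t, u in samples if now - t <= up_window]
    longer = [u for t, u in samples if now - t <= down_window]
    # Increase only if utilization stayed at or above `high` all window long.
    if recent and min(recent) >= high:
        return "increase"
    # Decrease only if utilization stayed at or below `low` all window long.
    if longer and max(longer) <= low:
        return "decrease"
    return "hold"
```

With a shorter window for increases than for decreases, a brief quiet spell does not trigger a scale-down, while a sustained spike in the last hour can still trigger a scale-up.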

In some embodiments, the hardware capacity scaling system may be configured to configure database hardware of the database system to initially host a data partition with a default hardware capacity. In some embodiments, the hardware capacity scaling system may be configured to configure database hardware of the database system to host a data partition (e.g., a new data partition for which there is no operation history) with a hardware capacity based on an expected operation load determined from other data partitions (e.g., for which there is a history of operations).

In some embodiments, the hardware capacity scaling system may be configured to determine a hardware capacity for hosting a data partition based on user input. For example, the hardware capacity scaling system may receive user input indicating a minimum and/or maximum hardware capacity for a data partition (e.g., a minimum and/or maximum IOPS performance, a minimum and/or maximum amount of RAM, and/or a minimum and/or maximum CPU performance). The hardware capacity scaling system may configure database hardware of the database system to be within the hardware capacity limit(s) (e.g., a minimum and/or maximum hardware capacity parameter) when scaling hardware capacity for the data partition.
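Keeping a computed capacity within user-supplied limits amounts to a clamp. The sketch below assumes a single numeric capacity parameter (for example, an amount of RAM or an IOPS figure); the `Limits` class and `clamp` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limits:
    """Optional user-specified bounds for one capacity parameter."""
    min_value: Optional[float] = None
    max_value: Optional[float] = None


def clamp(target: float, limits: Limits) -> float:
    """Return `target` adjusted to lie within the user-specified limits."""
    if limits.min_value is not None:
        target = max(target, limits.min_value)
    if limits.max_value is not None:
        target = min(target, limits.max_value)
    return target
```

Either bound may be omitted, which matches the "minimum and/or maximum" wording: a partition with only a maximum set can still be scaled down freely.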

In some embodiments, the hardware capacity scaling system may be configured to dynamically scale hardware capacity used to host data partitions A, B, C. The hardware capacity scaling system may analyze a log of operations executed on each data partition A, B, C and determine a measure of hardware resource utilization (e.g., percentage of CPU utilization and/or memory utilization) of database hardware that is currently being used to host the data partition. The hardware capacity scaling system may be configured to update hardware capacity for the data partition in response to detecting a change in hardware capacity needed for the data partition. An accompanying figure shows the hardware capacity scaling system dynamically configuring database hardware for hosting data partition B, according to some embodiments of the technology described herein.

In this example, the hardware capacity scaling system may determine that the hardware capacity for hosting data partition B is to change (e.g., increase or decrease). The hardware capacity scaling system may determine that the previous hardware capacity is insufficient or excessive. For example, the hardware capacity scaling system may determine that CPU and/or memory utilization of hardware currently hosting data partition B has been above a threshold level (e.g., 95%, 90%, 85%, 80%, 75%, 70%, or a suitable threshold between 70-100%) for a time period (e.g., the past 1 hour, 1.5 hours, 2 hours, 3 hours, 4 hours, 5 hours, or a suitable time period between 0-5 hours). As another example, the hardware capacity scaling system may determine that CPU and/or memory utilization of hardware currently hosting data partition B has been below a threshold level (e.g., 40%, 35%, 30%, 25%, 20%, 15%, 10%, or a suitable threshold between 0-40%) for a time period (e.g., 6 hours, 5 hours, 4 hours, 3 hours, 2 hours, 1 hour, or a suitable time period between 1-6 hours). When the hardware capacity scaling system determines that hardware capacity for hosting data partition B is to change, the hardware capacity scaling system may configure a different set of database hardware to host data partition B. In this example, the hardware capacity scaling system configures database hardware from hardware assets A with hardware capacity A to host data partition B.

In some embodiments, the hardware capacity scaling system may determine a hardware capacity for hosting a data partition by selecting one of multiple tiers of hardware capacity (e.g., based on CPU and/or memory utilization). For example, the multiple tiers of hardware capacity may be different cluster tiers that each provide a level of memory, storage, CPU performance, and/or IOPS performance. In some embodiments, the hardware capacity scaling system may be configured to automatically increase or decrease a hardware capacity for hosting a data partition from one of the tiers to another tier (e.g., a higher tier or a lower tier). Accordingly, the hardware capacity scaling system may dynamically transition database hardware used to host the data partition (e.g., based on changing operational use of the data partition and/or content of data stored in the data partition). The hardware capacity scaling system may thus evolve the database hardware configuration over time to mitigate wasted hardware capacity (e.g., on data partitions that use little capacity) and ensure that performance requirements are met (e.g., for data partitions that require high performance).
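Moving a partition between tiers can be sketched as stepping through an ordered list. The tier names below are the example cluster tiers mentioned elsewhere in this document; the function name `step_tier` and the one-tier-at-a-time policy are assumptions of this sketch.

```python
# Ordered from lowest to highest hardware capacity.
TIERS = ["M10", "M20", "M30", "M40"]


def step_tier(current: str, decision: str) -> str:
    """Move one tier up or down, staying within the defined tiers."""
    index = TIERS.index(current)
    if decision == "increase":
        index = min(index + 1, len(TIERS) - 1)  # already at top: stay put
    elif decision == "decrease":
        index = max(index - 1, 0)               # already at bottom: stay put
    return TIERS[index]
```

Stepping one tier per evaluation is only one possible policy; a system could equally jump directly to a computed target tier.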

In some embodiments, the database system may be a cloud-based database system in which hardware resources are designated by a cloud provider system. The hardware capacity scaling system may be configured to configure database hardware for hosting data partitions by transmitting requests to a cloud provider system (e.g., Amazon Web Services (AWS) cloud storage system, Google cloud storage system, and/or another cloud provider system). For example, the hardware capacity scaling system may transmit an application programming interface (API) call to the cloud provider system to change a hardware capacity for hosting data partition B. The cloud provider system may abstract the assignment of physical hardware resources from the database system. For example, the hardware capacity scaling system may transmit an indication of a hardware capacity tier to use for hosting data partition B and the cloud provider system may handle the coordination of physical hardware resources to host data partition B.
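The request carrying a tier indication might look like the sketch below. The source does not describe any request format, so the function name `build_scale_request` and the field names `partition` and `capacity_tier` are entirely hypothetical and do not correspond to any real cloud provider API. The sketch only builds the request body and sends nothing.

```python
import json


def build_scale_request(partition_id: str, tier: str) -> str:
    """Serialize a hypothetical tier-change request for one partition.

    Field names are illustrative only; a real provider defines its own.
    """
    if not partition_id or not tier:
        raise ValueError("partition_id and tier are required")
    payload = {"partition": partition_id, "capacity_tier": tier}
    return json.dumps(payload, sort_keys=True)
```

The point of the sketch is that the scaling system names only a partition and a tier; mapping that tier to physical hardware is left to the provider.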

After the configuring performed by the hardware capacity scaling system, database hardware of the database system may be reconfigured to host data partition B with the updated hardware capacity. For example, data partition B may now be hosted using hardware of higher computing performance and/or disk storage performance. This may facilitate scaling up operations on data partition B (e.g., to account for greater traffic in an application that stores data in data partition B).

An example process for optimizing operational efficiency of a distributed database system configured to store data divided among multiple data partitions is illustrated, according to some embodiments of the technology described herein. In some embodiments, the process may be performed by the hardware capacity scaling system described herein. For example, the process may be performed by the hardware capacity scaling system to configure database hardware for hosting data partitions A, B. In some embodiments, the process may be performed periodically by the system to dynamically update hardware capacity for the data partitions to mitigate the waste of hardware capacity on a data partition and/or ensure adequate hardware capacity for a data partition. In one example, the process may be performed to automatically scale the hardware capacity of shards of a MongoDB Atlas database system. In this example, the system may assign a given shard to one of multiple tiers (e.g., “cluster tiers”) to scale the shard to a particular hardware capacity.

The process begins at a block where the system obtains information about operations performed on multiple data partitions (e.g., two or more data partitions). In some embodiments, the system may be configured to obtain information about hardware utilization for performing operations. For example, the system may obtain CPU and/or memory utilization of database hardware currently configured to host the data partitions in performing the operations. As another example, the system may obtain logs of the operations and compute statistics indicating hardware usage of the database hardware currently configured to host the data partitions. In some embodiments, the information about hardware utilization may include information about memory utilization. For example, the system may use a formula to calculate the memory utilization.

Next, the process proceeds to the next block, where the system determines a hardware capacity for a first data partition (e.g., a first shard). The system may determine the hardware capacity for the first data partition based on information about operations performed on the first data partition. For example, the system may determine CPU utilization and/or memory utilization of the database hardware performing the operations on a particular data partition, and determine a hardware capacity for the data partition based on the CPU and/or memory utilization. Example techniques of using CPU and/or memory utilization to determine a hardware capacity are described herein. To illustrate, the system may determine that the CPU and/or memory utilization has exceeded 90% in the past hour and, in response, determine to increase the hardware capacity to a higher one (e.g., the next tier up) of multiple tiers of hardware capacity. As another example, the system may determine that the CPU and/or memory utilization has been below 30% for 4 hours and determine to decrease the hardware capacity to a lower one (e.g., the next tier below) of multiple tiers of hardware capacity. As another example, the system may determine the hardware capacity for the first data partition based on user input (e.g., limiting the hardware capacity by a minimum and/or maximum hardware capacity specified by a user). As another example, the system may determine the hardware capacity for the first data partition by calculating a target hardware capacity based on collected statistics.

Next, the process proceeds to the next block, where the system determines a hardware capacity for hosting a second data partition based on information about operations performed on the second data partition (e.g., obtained from a set of database hardware currently configured to host the second data partition). The system may determine the hardware capacity as described herein with reference to the preceding block. For example, the system may determine CPU utilization and/or memory utilization of the database hardware performing the operations on the second data partition, and determine a hardware capacity for the second data partition based on the CPU and/or memory utilization. Example techniques of using CPU and/or memory utilization to determine a hardware capacity are described herein. To illustrate, the system may determine that the CPU and/or memory utilization has exceeded 90% in the past hour and, in response, determine to increase the hardware capacity to a higher one (e.g., the next tier up) of multiple tiers of hardware capacity. As another example, the system may determine that the CPU and/or memory utilization has been below 30% for 4 hours and determine to decrease the hardware capacity to a lower one (e.g., the next tier below) of multiple tiers of hardware capacity. As another example, the system may determine the hardware capacity for the second data partition based on user input (e.g., specifying a particular CPU performance and/or disk performance). As another example, the system may determine the hardware capacity for the second data partition by calculating a target hardware capacity based on collected statistics.

Next, the process proceeds to the next block, where the system configures a first set of database hardware having the first hardware capacity determined for the first data partition to host the first data partition. For example, the system may designate a first set of servers, having the first hardware capacity, to host the first data partition. As another example, the system may transmit a request (e.g., an API call) to a cloud provider system to configure cloud-based hardware resources having the first hardware capacity to host the first data partition. In some embodiments, the system may be configured to configure the first set of database hardware for the first data partition by assigning a particular hardware capacity tier (e.g., a cluster tier) to the first data partition (e.g., the first shard). Assignment of the particular hardware capacity tier may trigger configuration of the first set of database hardware having the first hardware capacity determined for the first data partition to host the first data partition.

Next, the process proceeds to a block where the system configures a second set of database hardware having the second hardware capacity determined for the second data partition to host the second data partition. For example, the system may designate a second set of servers, having the second hardware capacity, to host the second data partition. As another example, the system may transmit a request (e.g., an API call) to a cloud provider system to configure cloud-based hardware resources having the second hardware capacity to host the second data partition. In some embodiments, the system may configure the second set of database hardware for the second data partition by assigning a particular hardware capacity tier (e.g., a cluster tier) to the second data partition (e.g., the second shard). Assignment of the particular hardware capacity tier may trigger configuration of the second set of database hardware having the second hardware capacity determined for the second data partition to host the second data partition.

The accompanying figure shows an example graphical user interface (GUI) through which hardware capacity scaling can be configured for shards of a database system, according to some embodiments of the technology described herein. As shown, the GUI includes selectable options to configure automatic hardware capacity scaling for the shards. The options include an option to enable scaling of hardware capacity. In this example, the hardware capacity for each of the shards can be scaled to different cluster tiers. Each of the cluster tiers may provide a different hardware capacity (e.g., RAM, amount of storage, and/or number of vCPUs). Example cluster tiers include tiers M10, M20, M30, and M40 as listed in the GUI. The options in the GUI further include an option that enables predictive scaling using artificial intelligence (AI). The options further include an option to allow a cluster tier for a shard to be scaled down. The GUI provides an input field to configure a minimum cluster tier and an input field to configure a maximum cluster tier to which a shard can be scaled. In this example, the user has set the minimum cluster tier to M10 and the maximum cluster tier to M30. Accordingly, a shard may be scaled to the minimum cluster tier, the maximum cluster tier, or a cluster tier in between the minimum and maximum cluster tiers.

Another figure shows an example GUI through which hardware capacity scaling can be configured for shards of a database system, according to some embodiments of the technology described herein. The GUI's options include an option to enable automatic hardware capacity scaling for the shards. Thus, each shard may be scaled to a different cluster tier. The options further include an option to enable AI-based predictive scaling of cluster tiers for a shard. The options further include an option that allows a cluster tier for a shard to be scaled down. The GUI provides an input field to configure a minimum cluster tier and an input field to configure a maximum cluster tier to which a shard can be scaled. In this example, the user has set the minimum cluster tier to M10 and the maximum cluster tier to M30. Accordingly, a shard may be scaled to the minimum cluster tier, the maximum cluster tier, or a cluster tier in between the minimum and maximum cluster tiers.
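As a rough sketch, the GUI options described above could be modeled as a small settings object that gates each proposed tier change. The class name ScalingSettings, its field names, and the function allowed_target are hypothetical and do not come from the described system. The predictive (AI-based) option is omitted, and a proposal outside the configured range is simply rejected rather than clamped, which is one of several plausible readings.

```python
from dataclasses import dataclass

# Hypothetical ordered tiers, lowest capacity first (as listed in the GUI example).
TIERS = ["M10", "M20", "M30", "M40"]


@dataclass
class ScalingSettings:
    enabled: bool = True           # option to enable hardware capacity scaling
    allow_scale_down: bool = True  # option to allow a shard's tier to be scaled down
    min_tier: str = "M10"          # minimum cluster tier input field
    max_tier: str = "M30"          # maximum cluster tier input field


def allowed_target(settings: ScalingSettings, current: str, proposed: str) -> str:
    """Return the tier the shard may move to, or `current` if the move is not permitted."""
    if not settings.enabled:
        return current
    cur, new = TIERS.index(current), TIERS.index(proposed)
    # Respect the "allow scale down" option.
    if new < cur and not settings.allow_scale_down:
        return current
    low, high = TIERS.index(settings.min_tier), TIERS.index(settings.max_tier)
    # Reject any proposal outside the configured [min, max] range.
    if new < low or new > high:
        return current
    return proposed
```

With the example settings (minimum M10, maximum M30), a shard on M30 that is proposed for M40 stays on M30.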

Some embodiments of the technology described herein may be implemented in a MongoDB Atlas database system. Hardware capacity of shards in an Atlas database system may be automatically scaled using techniques described herein. A shard may be scaled independently of other shards (e.g., other shards in a cluster to which the shard belongs). Workload on the shard may be used to determine its hardware capacity. For example, the workload on the shard may be used to determine which of multiple hardware capacity tiers to assign to the shard. In some embodiments, the cluster tier ranges that Atlas uses to automatically scale cluster tier, storage capacity, or both in response to cluster usage may be configured. Atlas auto-scaling adjusts cluster tier based on real-time resource usage. For upscaling decisions, the auto-scaling engine can detect both sustained higher demand and short-term peak traffic. Similarly, Atlas can make downscaling decisions promptly, which improves resource utilization and cost profile.

To help control costs, in some embodiments, a user can specify a range of minimum and maximum cluster sizes to which shards in a cluster of shards can automatically scale. In some embodiments, auto-scaling works on a rolling basis, and the process does not incur any downtime for the cluster. Atlas may maintain a primary during this process, but the nodes are upgraded one at a time, and each node is unavailable while it is being upgraded.

In some embodiments, Atlas analyzes the following cluster metrics to determine when to scale a cluster, and whether to scale the cluster tier up or down: normalized System CPU Utilization and System Memory Utilization. For example, Atlas calculates System Memory Utilization based on available node memory and total memory as follows: (memoryTotal − (memoryFree + memoryBuffers + memoryCached)) / memoryTotal * 100. In this equation, memoryFree, memoryBuffers, and memoryCached are amounts of available memory that Atlas can reclaim for other purposes. In some embodiments, Atlas won't scale a cluster tier if the new cluster tier would fall outside of the specified minimum and maximum cluster size range.
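The System Memory Utilization formula above translates directly into code. The following is a minimal sketch; the function and parameter names are illustrative, and all four inputs are assumed to be in the same unit (for example, bytes).

```python
def system_memory_utilization(memory_total: float,
                              memory_free: float,
                              memory_buffers: float,
                              memory_cached: float) -> float:
    """Return memory utilization in percent, treating free, buffer, and
    cached memory as reclaimable (i.e., not in use)."""
    if memory_total <= 0:
        raise ValueError("memory_total must be positive")
    reclaimable = memory_free + memory_buffers + memory_cached
    return (memory_total - reclaimable) / memory_total * 100
```

For instance, a node with 16 GB total memory, 2 GB free, 1 GB in buffers, and 1 GB cached has 4 GB reclaimable, so its utilization is (16 − 4) / 16 * 100 = 75%.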

In some embodiments, Atlas may be configured to scale a cluster to another tier in the same class. For example, Atlas may be configured to scale general clusters to other general cluster classes, but doesn't scale general clusters to low-CPU cluster classes. In some embodiments, auto-scaling criteria are subject to change in order to ensure appropriate cluster resource utilization.

As an illustrative example, if the next cluster tier is within a specified maximum cluster size range, Atlas scales operational nodes in a cluster up to the next tier if at least one of the following criteria is true for any cluster node of this type.

The above thresholds ensure that a cluster scales up quickly in response to high loads, and an application can handle spikes in traffic or usage, maintaining its performance and reliability. In some embodiments, for analytics nodes on any cloud provider, Atlas may scale them up to the next tier if the average normalized System CPU Utilization or the System Memory Utilization has exceeded 75% of resources available to any cluster node for the past one hour.

In some embodiments, to achieve optimal resource utilization and cost profile, Atlas avoids scaling up the cluster to the next tier if: (1) the M10 or M20 cluster has been scaled up in the past 20 minutes or one hour, depending on thresholds, or (2) the M30+ cluster has been scaled up in the past 10 minutes or one hour, depending on thresholds. For example, if the cluster tier has not been changed since 12:00, Atlas will scale an M30+ cluster at 12:10, if the cluster's current normalized System CPU Utilization is greater than 90%.
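The cool-down behavior in the 12:00/12:10 example can be sketched as follows. Only the shorter window (20 minutes for M10 and M20, 10 minutes for M30 and above) and the 90% CPU threshold from the example are modeled; the one-hour window is left out because the thresholds it pairs with are not given in this passage. The function names and the parsing of the tier name are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def short_cooldown(tier: str) -> timedelta:
    """Return the assumed short cool-down window for a tier name such as 'M30'."""
    size = int(tier[1:])  # numeric part of the tier name, e.g. 30
    return timedelta(minutes=20) if size <= 20 else timedelta(minutes=10)


def may_scale_up(tier: str,
                 last_tier_change: datetime,
                 now: datetime,
                 cpu_percent: float) -> bool:
    """Allow a scale-up only if CPU is above 90% and the cool-down has elapsed."""
    cooled_down = now - last_tier_change >= short_cooldown(tier)
    return cpu_percent > 90.0 and cooled_down
```

Under these assumptions, an M30 cluster last changed at 12:00 with CPU above 90% may scale up at 12:10, while an M20 cluster in the same situation would wait until 12:20.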

In some embodiments, scaling up to a greater cluster tier requires enough time to prepare backing resources. Automatic scaling may not occur when a cluster receives a burst of activity, such as a bulk insert. To reduce the risk of running out of resources, plan to scale up clusters before bulk inserts and other workload spikes.

In some embodiments, Atlas scales down nodes in a cluster under certain conditions. For example, if the next lowest cluster tier is within a specified minimum cluster size range, Atlas scales the nodes in a cluster down to the next lowest tier if all of the following criteria are true for all nodes of the specified cluster type:

In some embodiments, Atlas measures the current memory usage and replaces the current WiredTiger cache usage with 80% of the WiredTiger cache size on the new, lower-tier cluster. Next, Atlas checks whether the projected total memory usage on the new tier size would have been below 60% for at least the last 4 hours and at least the last 10 minutes. In some embodiments, Atlas includes the WiredTiger cache in its memory calculation to make it more likely that clusters with a full cache, but otherwise low traffic, will scale down. In other words, Atlas examines the size of the WiredTiger cache to determine that it can safely scale down an otherwise idle cluster with low normalized System CPU Utilization in cases where the cluster's WiredTiger caches might reach 90% of the cluster's maximum WiredTiger cache size. The above conditions ensure that Atlas scales down operational nodes in a cluster only when doing so avoids high-utilization states.
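The memory projection described above can be sketched as follows. This is an illustrative reading rather than a definitive implementation: the function names, the parameter names, and the representation of each time window as a list of projected-percentage samples are assumptions. All memory amounts are assumed to share one unit.

```python
from typing import Sequence


def projected_memory_percent(memory_used: float,
                             wt_cache_used: float,
                             new_tier_wt_cache_size: float,
                             new_tier_memory_total: float) -> float:
    """Project memory utilization (percent) on the lower tier by swapping the
    current WiredTiger cache usage for 80% of the lower tier's cache size."""
    projected_used = memory_used - wt_cache_used + 0.8 * new_tier_wt_cache_size
    return projected_used / new_tier_memory_total * 100


def memory_allows_scale_down(projected_4h: Sequence[float],
                             projected_10m: Sequence[float]) -> bool:
    """True if every projected sample in both windows is below 60%."""
    samples = list(projected_4h) + list(projected_10m)
    return bool(samples) and all(p < 60.0 for p in samples)
```

Substituting a fixed share of the smaller cache for the current cache usage means that a cluster with a full cache but little other memory pressure can still qualify for a scale-down.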

Some example considerations for downward auto-scaling of cluster tier and storage include:

For example, the auto-scaling bounds are set to M20-M60 and the current cluster tier is M40 with a disk capacity of 200 GB. Atlas triggers a disk auto-scaling event to increase capacity to 320 GB because current disk usage exceeds 180 GB, which is more than 90% of the 200 GB capacity.

Atlas may:

In some embodiments, Atlas auto-scales the cluster tier for sharded clusters using the same criteria as replica sets. Atlas may, for example, apply the following rules:

In some embodiments, the cluster tier of each shard can be scaled individually. An API is capable of describing asymmetric clusters. For example, each shard is specified by a separate replicationSpec.
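To illustrate what an asymmetric cluster description might look like, the sketch below builds a request body with one replication spec per shard, each carrying its own tier. The text above states only that each shard is specified by a separate replicationSpec; every other field name here (replicationSpecs, regionConfigs, electableSpecs, instanceSize, nodeCount, clusterType) is an assumption modeled loosely on cluster-description APIs and should be checked against the actual API reference before use.

```python
import json
from typing import Dict, List


def shard_spec(instance_size: str, node_count: int = 3) -> Dict:
    """Build one hypothetical replication spec describing a single shard."""
    return {
        "regionConfigs": [
            {"electableSpecs": {"instanceSize": instance_size,
                                "nodeCount": node_count}}
        ]
    }


def asymmetric_cluster(name: str, shard_tiers: List[str]) -> Dict:
    """Describe a sharded cluster with one replication spec per shard."""
    return {
        "name": name,
        "clusterType": "SHARDED",
        "replicationSpecs": [shard_spec(tier) for tier in shard_tiers],
    }


if __name__ == "__main__":
    # Two shards on different tiers: a busy shard on M40, a quieter one on M30.
    print(json.dumps(asymmetric_cluster("demo", ["M40", "M30"]), indent=2))
```

Keeping one spec per shard lets a scaling decision for a single shard be expressed by changing only that shard's entry.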

Patent Metadata

Filing Date

Unknown

Publication Date

November 6, 2025

Inventors

Unknown
